Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 137 tok/s
Gemini 2.5 Pro 48 tok/s Pro
GPT-5 Medium 29 tok/s Pro
GPT-5 High 31 tok/s Pro
GPT-4o 90 tok/s Pro
Kimi K2 207 tok/s Pro
GPT OSS 120B 425 tok/s Pro
Claude Sonnet 4.5 36 tok/s Pro
2000 character limit reached

Performance of Bayesian linear regression in a model with mismatch (2107.06936v2)

Published 14 Jul 2021 in math.PR, cond-mat.dis-nn, cs.IT, cs.LG, math-ph, math.IT, math.MP, math.ST, and stat.TH

Abstract: In this paper we analyze, for a model of linear regression with gaussian covariates, the performance of a Bayesian estimator given by the mean of a log-concave posterior distribution with gaussian prior, in the high-dimensional limit where the number of samples and the covariates' dimension are large and proportional. Although the high-dimensional analysis of Bayesian estimators has been previously studied for Bayesian-optimal linear regression where the correct posterior is used for inference, much less is known when there is a mismatch. Here we consider a model in which the responses are corrupted by gaussian noise and are known to be generated as linear combinations of the covariates, but the distributions of the ground-truth regression coefficients and of the noise are unknown. This regression task can be rephrased as a statistical mechanics model known as the Gardner spin glass, an analogy which we exploit. Using a leave-one-out approach we characterize the mean-square error for the regression coefficients. We also derive the log-normalizing constant of the posterior. Similar models have been studied by Shcherbina and Tirozzi and by Talagrand, but our arguments are much more straightforward. An interesting consequence of our analysis is that in the quadratic loss case, the performance of the Bayesian estimator is independent of a global "temperature" hyperparameter and matches the ridge estimator: sampling and optimizing are equally good.

Citations (21)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.