Performance of Bayesian linear regression in a model with mismatch (2107.06936v2)

Published 14 Jul 2021 in math.PR, cond-mat.dis-nn, cs.IT, cs.LG, math-ph, math.IT, math.MP, math.ST, and stat.TH

Abstract: In this paper we analyze, for a model of linear regression with Gaussian covariates, the performance of a Bayesian estimator given by the mean of a log-concave posterior distribution with Gaussian prior, in the high-dimensional limit where the number of samples and the covariates' dimension are large and proportional. Although the high-dimensional analysis of Bayesian estimators has been previously studied for Bayesian-optimal linear regression where the correct posterior is used for inference, much less is known when there is a mismatch. Here we consider a model in which the responses are corrupted by Gaussian noise and are known to be generated as linear combinations of the covariates, but the distributions of the ground-truth regression coefficients and of the noise are unknown. This regression task can be rephrased as a statistical mechanics model known as the Gardner spin glass, an analogy which we exploit. Using a leave-one-out approach we characterize the mean-square error for the regression coefficients. We also derive the log-normalizing constant of the posterior. Similar models have been studied by Shcherbina and Tirozzi and by Talagrand, but our arguments are much more straightforward. An interesting consequence of our analysis is that in the quadratic loss case, the performance of the Bayesian estimator is independent of a global "temperature" hyperparameter and matches the ridge estimator: sampling and optimizing are equally good.
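The last claim of the abstract can be illustrated at finite size with a small numerical sketch. It is not the paper's proof, and it rests on assumptions that are not stated in the abstract: the posterior is taken proportional to exp(-beta * (0.5*||y - Xw||^2 + 0.5*lam*||w||^2)), with the hypothetical "temperature" beta multiplying the whole exponent. The names ridge_estimator and posterior_mean, the regularization lam, and the sizes n and d are all illustrative choices. Under these assumptions the posterior is Gaussian, so its mean (the "sampling" estimator) coincides with its mode (the "optimizing" ridge estimator) for every beta.

```python
import numpy as np

def ridge_estimator(X, y, lam):
    """Minimizer of 0.5*||y - X w||^2 + 0.5*lam*||w||^2 (the 'optimizing' estimator)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def posterior_mean(X, y, lam, beta):
    """Mean of the Gaussian density proportional to
    exp(-beta * (0.5*||y - X w||^2 + 0.5*lam*||w||^2)).
    ASSUMPTION: beta scales the whole exponent (loss plus prior)."""
    d = X.shape[1]
    precision = beta * (X.T @ X + lam * np.eye(d))  # posterior precision matrix
    linear_term = beta * (X.T @ y)                  # linear coefficient of the exponent
    return np.linalg.solve(precision, linear_term)

rng = np.random.default_rng(0)
n, d = 200, 100                                # samples and dimension, proportional
X = rng.standard_normal((n, d)) / np.sqrt(d)   # Gaussian covariates
w_true = rng.uniform(-1.0, 1.0, size=d)        # non-Gaussian truth: mismatched prior
y = X @ w_true + 0.5 * rng.standard_normal(n)  # Gaussian noise of unknown variance

lam = 1.0
w_ridge = ridge_estimator(X, y, lam)
betas = (0.1, 1.0, 10.0)
means = {beta: posterior_mean(X, y, lam, beta) for beta in betas}

# Mean-square error of the estimator for the regression coefficients.
mse_ridge = float(np.mean((w_ridge - w_true) ** 2))
mse_by_beta = {beta: float(np.mean((m - w_true) ** 2)) for beta, m in means.items()}
```

In this sketch beta cancels exactly in the linear system, which is why mse_by_beta agrees with mse_ridge for every value tried. The paper's result concerns the high-dimensional limit and a more general log-concave setting, which this toy check does not cover.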

Authors (4)

Jean Barbier
Wei-Kuo Chen
Dmitry Panchenko
Manuel Sáenz

