Emergent Mind

High-Dimensional Regression with Binary Coefficients. Estimating Squared Error and a Phase Transition

Published Jan 16, 2017 in stat.ML, math.PR, math.ST, and stat.TH


We consider a sparse linear regression model $Y=X\beta^{*}+W$, where $X$ has Gaussian entries, $W$ is the noise vector with mean-zero Gaussian entries, and $\beta^{*}$ is a binary vector with support size (sparsity) $k$. Using a novel conditional second moment method, we obtain an approximation, tight up to a multiplicative constant, of the optimal squared error $\min_{\beta}\|Y-X\beta\|_{2}$, where the minimization is over all $k$-sparse binary vectors $\beta$. The approximation reveals interesting structural properties of the underlying regression problem. In particular: a) We establish that $n^{*}=2k\log p/\log(2k/\sigma^{2}+1)$ is a phase transition point with the following "all-or-nothing" property. When $n$ exceeds $n^{*}$, $(2k)^{-1}\|\beta_{2}-\beta^{*}\|_{0}\approx 0$, and when $n$ is below $n^{*}$, $(2k)^{-1}\|\beta_{2}-\beta^{*}\|_{0}\approx 1$, where $\beta_{2}$ is the optimal solution achieving the smallest squared error. With this we prove that $n^{*}$ is the asymptotic information-theoretic threshold for recovering $\beta^{*}$. b) We compute the squared error for an intermediate problem $\min_{\beta}\|Y-X\beta\|_{2}$, where the minimization is restricted to vectors $\beta$ with $\|\beta-\beta^{*}\|_{0}=2k\zeta$, for $\zeta\in[0,1]$. We show that the lower-bound part $\Gamma(\zeta)$ of the estimate, which corresponds to the estimate based on the first moment method, undergoes a phase transition at three different thresholds: first at $n_{\text{inf,1}}=\sigma^{2}\log p$, which is the information-theoretic bound for recovering $\beta^{*}$ when $k=1$ and $\sigma$ is large, then at $n^{*}$, and finally at $n_{\text{LASSO/CS}}$. c) We establish a certain Overlap Gap Property (OGP) on the space of all binary vectors $\beta$ when $n\le ck\log p$ for a sufficiently small constant $c$. We conjecture that OGP is the source of algorithmic hardness of solving the minimization problem $\min_{\beta}\|Y-X\beta\|_{2}$ in the regime $n<n_{\text{LASSO/CS}}$.
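To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' code. It evaluates the two closed-form thresholds stated above and solves the minimization over $k$-sparse binary vectors by exhaustive search on a toy instance. The function names (`n_star`, `n_inf_1`, `brute_force_fit`, `simulate`) are hypothetical, and the sketch assumes i.i.d. standard normal entries for $X$ and noise variance $\sigma^{2}$ for $W$; the abstract says only that the entries are Gaussian. The threshold $n_{\text{LASSO/CS}}$ is not defined in the abstract, so it is left out.

```python
import itertools
import math
import random


def n_star(k, p, sigma2):
    """Threshold n* = 2k log p / log(2k / sigma^2 + 1) from the abstract.

    Requires sigma2 > 0.
    """
    return 2 * k * math.log(p) / math.log(2 * k / sigma2 + 1)


def n_inf_1(p, sigma2):
    """Threshold n_{inf,1} = sigma^2 log p from the abstract."""
    return sigma2 * math.log(p)


def brute_force_fit(X, Y, k):
    """Minimize ||Y - X beta||_2 over all k-sparse binary beta by enumeration.

    A binary k-sparse beta is identified with its support, so X beta is the
    sum of the selected columns. Cost grows like C(p, k): toy sizes only.
    """
    n, p = len(X), len(X[0])
    best_support, best_err = None, float("inf")
    for support in itertools.combinations(range(p), k):
        err = math.sqrt(sum(
            (Y[i] - sum(X[i][j] for j in support)) ** 2 for i in range(n)
        ))
        if err < best_err:
            best_support, best_err = support, err
    return best_support, best_err


def simulate(n, p, k, sigma2, seed=0):
    """Draw one instance and return (normalized mismatch, optimal error).

    Assumption: X has i.i.d. N(0, 1) entries and W has i.i.d. N(0, sigma2)
    entries. The mismatch is (2k)^{-1} ||beta_hat - beta*||_0, in [0, 1].
    """
    rng = random.Random(seed)
    true_support = tuple(sorted(rng.sample(range(p), k)))
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    sigma = math.sqrt(sigma2)
    Y = [
        sum(X[i][j] for j in true_support) + rng.gauss(0, sigma)
        for i in range(n)
    ]
    est_support, err = brute_force_fit(X, Y, k)
    mismatch = len(set(est_support) ^ set(true_support)) / (2 * k)
    return mismatch, err
```

Calling `simulate(n, p, k, sigma2)` for values of `n` on either side of `n_star(k, p, sigma2)` gives a rough feel for the all-or-nothing behavior, although the statement in the abstract is asymptotic and instances small enough for exhaustive search need not show a sharp transition.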
