Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably (2202.03535v1)
Abstract: We investigate the role of noise in optimization algorithms for learning over-parameterized models. Specifically, we consider the recovery of a rank one matrix $Y^*\in \mathbb{R}^{d\times d}$ from a noisy observation $Y$ using an over-parameterized model. We parameterize the rank one matrix $Y^*$ by $XX^\top$, where $X\in \mathbb{R}^{d\times d}$. We then show that under mild conditions, the estimator, obtained by the randomly perturbed gradient descent algorithm using the square loss function, attains a mean square error of $O(\sigma^2/d)$, where $\sigma^2$ is the variance of the observational noise. In contrast, the estimator obtained by gradient descent without random perturbation only attains a mean square error of $O(\sigma^2)$. Our result partially justifies the implicit regularization effect of noise when learning over-parameterized models, and provides new understanding of training over-parameterized neural networks.
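Below is a minimal sketch of the setup described in the abstract: recovering a rank one matrix $Y^* = x^* {x^*}^\top$ from a noisy observation $Y = Y^* + \text{noise}$ by running gradient descent on the over-parameterized factorization $XX^\top$ with a small random perturbation injected at each step. This is not the paper's exact algorithm or analysis; the step size, perturbation scale, initialization scale, and iteration count are illustrative assumptions.

```python
# Sketch of randomly perturbed gradient descent on the square loss
# f(X) = ||X X^T - Y||_F^2 / 4 for over-parameterized rank-one recovery.
# All hyperparameters below are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)
d = 50
sigma = 0.1            # observation-noise standard deviation (assumed)

# Ground-truth rank-one matrix and its noisy observation.
x_star = rng.standard_normal(d)
x_star /= np.linalg.norm(x_star)
Y_star = np.outer(x_star, x_star)
Y = Y_star + sigma * rng.standard_normal((d, d))
Y_sym = (Y + Y.T) / 2  # symmetrized observation

# Over-parameterized variable: X is d x d, so X X^T can have rank up to d.
X = 1e-3 * rng.standard_normal((d, d))   # small initialization (assumed scale)

eta = 0.01             # step size (assumed)
perturb = 1e-3         # perturbation scale (assumed)
steps = 2000

for _ in range(steps):
    # Gradient of f(X) = ||X X^T - Y_sym||_F^2 / 4 is (X X^T - Y_sym) X.
    grad = (X @ X.T - Y_sym) @ X
    # Gradient step plus an isotropic Gaussian perturbation.
    X = X - eta * grad + perturb * rng.standard_normal((d, d))

mse = np.mean((X @ X.T - Y_star) ** 2)
print(f"mean square error vs. Y*: {mse:.3e}")
```

Comparing this run against the same loop with `perturb = 0.0` gives a quick empirical feel for the paper's claim that the injected noise regularizes the over-parameterized solution.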