
Memory-Sample Tradeoffs for Linear Regression with Small Error

(1904.08544)
Published Apr 18, 2019 in cs.LG and stat.ML

Abstract

We consider the problem of performing linear regression over a stream of $d$-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples $(a_1,b_1), (a_2,b_2),\ldots,$ with $a_i$ drawn independently from a $d$-dimensional isotropic Gaussian, and where $b_i = \langle a_i, x\rangle + \eta_i,$ for a fixed $x \in \mathbb{R}^d$ with $\|x\|_2 = 1$ and with independent noise $\eta_i$ drawn uniformly from the interval $[-2^{-d/5},2^{-d/5}].$ We show that any algorithm with at most $d^2/4$ bits of memory requires at least $\Omega(d \log \log \frac{1}{\epsilon})$ samples to approximate $x$ to $\ell_2$ error $\epsilon$ with probability of success at least $2/3$, for $\epsilon$ sufficiently small as a function of $d$. In contrast, for such $\epsilon$, $x$ can be recovered to error $\epsilon$ with probability $1-o(1)$ with memory $O\left(d^2 \log(1/\epsilon)\right)$ using $d$ examples. This represents the first nontrivial lower bound for regression with super-linear memory, and may open the door to strong memory/sample tradeoffs for continuous optimization.
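To make the setup concrete, here is a minimal NumPy sketch (not from the paper; dimension and seed are illustrative, and $d$ is kept small so the script runs quickly) of the data model and of the memory-unconstrained baseline the abstract contrasts against: with $d$ examples the system is square and almost surely invertible, so solving it recovers $x$ up to the noise level.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50  # illustrative dimension; the noise level 2^{-d/5} shrinks rapidly with d

# Hidden unit-norm regressor x with ||x||_2 = 1.
x = rng.standard_normal(d)
x /= np.linalg.norm(x)

# Stream of d labeled examples: a_i ~ N(0, I_d) (isotropic Gaussian),
# b_i = <a_i, x> + eta_i with eta_i ~ Uniform[-2^{-d/5}, 2^{-d/5}].
noise = 2.0 ** (-d / 5)
A = rng.standard_normal((d, d))
b = A @ x + rng.uniform(-noise, noise, size=d)

# Memory-unconstrained baseline: store all d examples (O(d^2 log(1/eps))
# bits) and solve the resulting square linear system directly.
x_hat = np.linalg.solve(A, b)
print("l2 error:", np.linalg.norm(x_hat - x))
```

The printed $\ell_2$ error is on the order of the noise magnitude, illustrating why $d$ examples suffice without memory constraints; the paper's lower bound shows that any algorithm restricted to $d^2/4$ bits cannot match this and needs $\Omega(d \log \log \frac{1}{\epsilon})$ samples.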
