
Fast Regression with an $\ell_\infty$ Guarantee (1705.10723v1)

Published 30 May 2017 in cs.DS and cs.LG

Abstract: Sketching has emerged as a powerful technique for speeding up problems in numerical linear algebra, such as regression. In the overconstrained regression problem, one is given an $n \times d$ matrix $A$, with $n \gg d$, as well as an $n \times 1$ vector $b$, and one wants to find a vector $\hat{x}$ so as to minimize the residual error $\|Ax - b\|_2$. Using the sketch-and-solve paradigm, one first computes $S \cdot A$ and $S \cdot b$ for a randomly chosen matrix $S$, then outputs $x' = (SA)^{\dagger} Sb$ so as to minimize $\|SAx' - Sb\|_2$. The sketch-and-solve paradigm gives a bound on $\|x' - x^*\|_2$ when $A$ is well-conditioned. Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a "random" direction within this bound: for any fixed direction $a \in \mathbb{R}^d$, we have with probability $1 - d^{-c}$ that $$\langle a, x' - x^* \rangle \lesssim \frac{\|a\|_2 \, \|x' - x^*\|_2}{d^{\frac{1}{2} - \gamma}}, \quad (1)$$ where $c, \gamma > 0$ are arbitrary constants. This implies $\|x' - x^*\|_{\infty}$ is a factor $d^{\frac{1}{2} - \gamma}$ smaller than $\|x' - x^*\|_2$. It also gives a better bound on the generalization of $x'$ to new examples: if rows of $A$ correspond to examples and columns to features, then our result gives a better bound for the error introduced by sketch-and-solve when classifying fresh examples. We show that not all oblivious subspace embeddings $S$ satisfy these properties. In particular, we give counterexamples showing that matrices based on Count-Sketch or leverage score sampling do not satisfy these properties. We also provide lower bounds, both on how small $\|x' - x^*\|_2$ can be, and for our new guarantee (1), showing that the subsampled randomized Fourier/Hadamard transform is nearly optimal.
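
The abstract describes the sketch-and-solve recipe: apply a sketch $S$ to both $A$ and $b$, then solve the much smaller least-squares problem. The following is a minimal, illustrative Python sketch (not the authors' code) of that recipe with a subsampled randomized Hadamard transform; it assumes NumPy/SciPy, a power-of-two $n$, and the dense toy Hadamard from scipy.linalg, and the function name srht_sketch and the sizes n, m are ours for illustration.

```python
# Toy sketch-and-solve for overconstrained regression with an SRHT sketch.
import numpy as np
from scipy.linalg import hadamard


def srht_sketch(M, m, rng):
    """Return S @ M with m rows, where S = sqrt(n/m) * P * H * D (an SRHT)."""
    n = M.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)       # random sign flips (D)
    H = hadamard(n) / np.sqrt(n)                   # normalized Hadamard (H); n must be a power of 2
    rows = rng.choice(n, size=m, replace=False)    # uniform row subsampling (P)
    return np.sqrt(n / m) * (H @ (signs[:, None] * M))[rows]


rng = np.random.default_rng(0)
n, d_cols, m = 1024, 20, 200                       # n >> d; sketch size m is arbitrary here
A = rng.standard_normal((n, d_cols))
x_true = rng.standard_normal(d_cols)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Sketch [A | b] once so both see the same S, then solve the small problem.
SAb = srht_sketch(np.hstack([A, b[:, None]]), m, rng)
SA, Sb = SAb[:, :-1], SAb[:, -1]
x_sketch = np.linalg.lstsq(SA, Sb, rcond=None)[0]  # x' = (SA)^† Sb
x_opt = np.linalg.lstsq(A, b, rcond=None)[0]       # x* = argmin ||Ax - b||_2

err = x_sketch - x_opt
print("||x' - x*||_2  :", np.linalg.norm(err))
print("||x' - x*||_inf:", np.linalg.norm(err, np.inf))  # the paper's l_inf guarantee says this
                                                        # is roughly a d^(1/2-γ) factor smaller
```

With an SRHT sketch, the paper's guarantee (1) says the error $x' - x^*$ has no preferred direction, so its largest coordinate is much smaller than its Euclidean norm; the two printed norms give a rough empirical sense of that gap.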

Citations (7)
