Hardness of Agnostically Learning Halfspaces from Worst-Case Lattice Problems (2207.14030v2)

Published 28 Jul 2022 in cs.LG, cs.CC, math.ST, stat.ML, and stat.TH

Abstract: We show hardness of improperly learning halfspaces in the agnostic model, in both the distribution-independent and the distribution-specific settings, based on the assumption that worst-case lattice problems, such as GapSVP or SIVP, are hard. In particular, we show that under this assumption there is no efficient algorithm that outputs any binary hypothesis, not necessarily a halfspace, achieving misclassification error better than $\frac{1}{2} - \gamma$ even if the optimal misclassification error is as small as $\delta$. Here, $\gamma$ can be smaller than the inverse of any polynomial in the dimension and $\delta$ as small as $\exp(-\Omega(\log^{1-c}(d)))$, where $0 < c < 1$ is an arbitrary constant and $d$ is the dimension. For the distribution-specific setting, we show that if the marginal distribution is standard Gaussian, for any $\beta > 0$ learning halfspaces up to error $OPT_{LTF} + \epsilon$ takes time at least $d^{\tilde{\Omega}(1/\epsilon^{2-\beta})}$ under the same hardness assumptions. Similarly, we show that learning degree-$\ell$ polynomial threshold functions up to error $OPT_{{PTF}_\ell} + \epsilon$ takes time at least $d^{\tilde{\Omega}(\ell^{2-\beta}/\epsilon^{2-\beta})}$. $OPT_{LTF}$ and $OPT_{{PTF}_\ell}$ denote the best error achievable by any halfspace or polynomial threshold function, respectively. Our lower bounds qualitatively match algorithmic guarantees and (nearly) recover known lower bounds based on non-worst-case assumptions. Previously, such hardness results [Daniely16, DKPZ21] were based on average-case complexity assumptions or restricted to the statistical query model. Our work gives the first hardness results basing these fundamental learning problems on worst-case complexity assumptions. It is inspired by a sequence of recent works showing hardness of learning well-separated Gaussian mixtures based on worst-case lattice problems.

Citations (13)

Authors (1)