ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels (1910.10923v2)

Published 24 Oct 2019 in math.ST, stat.ML, and stat.TH

Abstract: We study Empirical Risk Minimizers (ERM) and Regularized Empirical Risk Minimizers (RERM) for regression problems with convex and $L$-Lipschitz loss functions. We consider a setting where $|\mathcal{O}|$ malicious outliers contaminate the labels. In that case, under a local Bernstein condition, we show that the $L_2$-error rate is bounded by $r_N + AL|\mathcal{O}|/N$, where $N$ is the total number of observations, $r_N$ is the $L_2$-error rate in the non-contaminated setting, and $A$ is a parameter coming from the local Bernstein condition. When $r_N$ is minimax-rate-optimal in the non-contaminated setting, the rate $r_N + AL|\mathcal{O}|/N$ is also minimax-rate-optimal when $|\mathcal{O}|$ outliers contaminate the labels. The main results of the paper can be applied to many non-regularized and regularized procedures under weak assumptions on the noise. We present results for Huber's M-estimators (without penalization or regularized by the $\ell_1$-norm) and for general regularized learning problems in reproducing kernel Hilbert spaces when the noise can be heavy-tailed.
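
To make the setting concrete, the following is a minimal simulation sketch, not taken from the paper. It fits an unpenalized Huber M-estimator (a convex loss that is Lipschitz with constant `delta`) by plain gradient descent on a linear model in which a small fraction of the labels is overwritten with arbitrarily large values, and compares it with ordinary least squares. The linear Gaussian design, the sample sizes, the outlier value, the step size, and all function names are illustrative assumptions.

```python
import numpy as np

def huber_grad(residual, delta=1.0):
    """Derivative of the Huber loss in the residual; it is bounded by delta,
    which is what makes the loss delta-Lipschitz."""
    return np.clip(residual, -delta, delta)

def huber_erm(X, y, delta=1.0, steps=2000, lr=0.1):
    """Gradient descent on the empirical Huber risk over linear predictors.
    (Assumed optimizer; any convex solver would do.)"""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residual = X @ w - y
        w -= lr * X.T @ huber_grad(residual, delta) / len(y)
    return w

def simulate(n=1000, d=5, n_outliers=50, seed=0):
    """Return the Euclidean estimation errors of Huber ERM and least squares
    when n_outliers labels are corrupted (hypothetical toy setup)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    w_star = np.ones(d)
    y = X @ w_star + rng.standard_normal(n)
    y[:n_outliers] = 1e4  # corrupted labels: arbitrarily large values
    w_huber = huber_erm(X, y)
    w_ls = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.linalg.norm(w_huber - w_star), np.linalg.norm(w_ls - w_star)

err_huber, err_ls = simulate()
print(f"Huber ERM error: {err_huber:.3f}, least-squares error: {err_ls:.3f}")
```

Because each corrupted label can move the Huber gradient by at most `delta` times the norm of its feature vector, the outliers' total influence scales with the fraction of corrupted labels, which is in the spirit of the additive outlier term in the bound above. The squared loss has no such cap, so its error grows with the size of the corruption.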
