
ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels

(1910.10923)
Published Oct 24, 2019 in math.ST, stat.ML, and stat.TH

Abstract

We study Empirical Risk Minimizers (ERM) and Regularized Empirical Risk Minimizers (RERM) for regression problems with convex and $L$-Lipschitz loss functions. We consider a setting where $|\mathcal{O}|$ malicious outliers contaminate the labels. In that case, under a local Bernstein condition, we show that the $L_2$-error rate is bounded by $r_N + AL|\mathcal{O}|/N$, where $N$ is the total number of observations, $r_N$ is the $L_2$-error rate in the non-contaminated setting and $A$ is a parameter coming from the local Bernstein condition. When $r_N$ is minimax-rate-optimal in a non-contaminated setting, the rate $r_N + AL|\mathcal{O}|/N$ is also minimax-rate-optimal when $|\mathcal{O}|$ outliers contaminate the labels. The main results of the paper can be used for many non-regularized and regularized procedures under weak assumptions on the noise. We present results for Huber's M-estimators (without penalization or regularized by the $\ell_1$-norm) and for general regularized learning problems in reproducing kernel Hilbert spaces when the noise can be heavy-tailed.
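To make the setting concrete, here is a minimal sketch of unpenalized Huber ERM on a linear model whose labels are partly corrupted. It is an illustration only, not the paper's experiment: the Gaussian design, the sample sizes, the outlier value of 1000, the threshold `delta = 1.345`, the IRLS solver, and the names `huber_irls` and `run_experiment` are all assumptions made here. The Huber loss with threshold `delta` is convex and `delta`-Lipschitz, so it plays the role of the $L$-Lipschitz loss. With an identity-covariance design, the $L_2$ error of a linear predictor equals the Euclidean distance between parameter vectors, which is what the function returns.

```python
import numpy as np


def huber_irls(X, y, delta=1.345, n_iter=50):
    """Minimize the empirical Huber risk of a linear model.

    Uses iteratively reweighted least squares (an assumed solver choice):
    each residual r gets the weight min(1, delta / |r|).
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = y - X @ theta
        w = delta / np.maximum(np.abs(r), delta)  # equals min(1, delta / |r|)
        theta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return theta


def run_experiment(n=500, d=5, n_outliers=25, seed=0):
    """Return (Huber error, least-squares error) under label contamination."""
    rng = np.random.default_rng(seed)
    theta_star = np.ones(d)                      # hypothetical true parameter
    X = rng.standard_normal((n, d))              # identity-covariance design
    y = X @ theta_star + rng.standard_normal(n)  # clean labels
    y[:n_outliers] = 1000.0                      # corrupted labels (assumed value)

    theta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    theta_huber = huber_irls(X, y)
    return (np.linalg.norm(theta_huber - theta_star),
            np.linalg.norm(theta_ols - theta_star))


if __name__ == "__main__":
    huber_err, ols_err = run_experiment()
    print(f"Huber ERM error:     {huber_err:.3f}")
    print(f"Least-squares error: {ols_err:.3f}")
```

Varying `n_outliers` while holding `n` fixed gives a rough feel for how the Huber error grows with the contamination fraction $|\mathcal{O}|/N$, whereas the squared loss is not Lipschitz and its minimizer can be moved arbitrarily far by the corrupted labels.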
