Emergent Mind

Statistically Optimal Robust Mean and Covariance Estimation for Anisotropic Gaussians

(2301.09024)
Published Jan 21, 2023 in math.ST, cs.DS, cs.LG, and stat.TH

Abstract

Assume that $X_{1}, \ldots, X_{N}$ is an $\varepsilon$-contaminated sample of $N$ independent Gaussian vectors in $\mathbb{R}^d$ with mean $\mu$ and covariance $\Sigma$. In the strong $\varepsilon$-contamination model we assume that the adversary replaced an $\varepsilon$ fraction of vectors in the original Gaussian sample by any other vectors. We show that there is an estimator $\widehat \mu$ of the mean satisfying, with probability at least $1 - \delta$, a bound of the form
\[ \|\widehat{\mu} - \mu\|_2 \le c\left(\sqrt{\frac{\operatorname{Tr}(\Sigma)}{N}} + \sqrt{\frac{\|\Sigma\|\log(1/\delta)}{N}} + \varepsilon\sqrt{\|\Sigma\|}\right), \]
where $c > 0$ is an absolute constant and $\|\Sigma\|$ denotes the operator norm of $\Sigma$. In the same contaminated Gaussian setup, we construct an estimator $\widehat \Sigma$ of the covariance matrix $\Sigma$ that satisfies, with probability at least $1 - \delta$,
\[ \left\|\widehat{\Sigma} - \Sigma\right\| \le c\left(\sqrt{\frac{\|\Sigma\|\operatorname{Tr}(\Sigma)}{N}} + \|\Sigma\|\sqrt{\frac{\log(1/\delta)}{N}} + \varepsilon\|\Sigma\|\right). \]
Both results are optimal up to multiplicative constant factors. Despite the recent significant interest in robust statistics, achieving both dimension-free bounds in the canonical Gaussian case remained open. In fact, several previously known results were either dimension-dependent and required $\Sigma$ to be close to identity, or had a sub-optimal dependence on the contamination level $\varepsilon$. As a part of the analysis, we derive sharp concentration inequalities for central order statistics of Gaussian, folded normal, and chi-squared distributions.
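To make the two rates concrete, the sketch below evaluates the right-hand sides of the mean and covariance bounds for a covariance matrix described by its eigenvalues. It is only an illustration of the formulas in the abstract, not the paper's estimators: the function names are hypothetical, and the absolute constant $c$ is not specified in the abstract, so it is set to 1 here as an assumption.

```python
import math


def mean_error_rate(eigenvalues, n, delta, eps, c=1.0):
    """Right-hand side of the mean bound; c=1.0 is a placeholder, not the paper's constant."""
    trace = sum(eigenvalues)      # Tr(Sigma) is the sum of the eigenvalues
    op_norm = max(eigenvalues)    # operator norm of a PSD matrix is its largest eigenvalue
    return c * (
        math.sqrt(trace / n)
        + math.sqrt(op_norm * math.log(1.0 / delta) / n)
        + eps * math.sqrt(op_norm)
    )


def covariance_error_rate(eigenvalues, n, delta, eps, c=1.0):
    """Right-hand side of the covariance bound, with the same placeholder constant."""
    trace = sum(eigenvalues)
    op_norm = max(eigenvalues)
    return c * (
        math.sqrt(op_norm * trace / n)
        + op_norm * math.sqrt(math.log(1.0 / delta) / n)
        + eps * op_norm
    )


if __name__ == "__main__":
    d, n, delta, eps = 1000, 10_000, 0.01, 0.05
    isotropic = [1.0] * d                    # Tr(Sigma) = d
    anisotropic = [1.0] + [0.01] * (d - 1)   # same operator norm, much smaller trace
    for name, spectrum in [("isotropic", isotropic), ("anisotropic", anisotropic)]:
        print(name,
              mean_error_rate(spectrum, n, delta, eps),
              covariance_error_rate(spectrum, n, delta, eps))
```

Both spectra in the usage example have operator norm 1, so the $\log(1/\delta)$ and $\varepsilon$ terms coincide; only the first term changes, because it depends on $\operatorname{Tr}(\Sigma)$ rather than on the ambient dimension $d$. This is the sense in which the bounds are dimension-free.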
