Emergent Mind

Testing $k$-Modal Distributions: Optimal Algorithms via Reductions

(1112.5659)
Published Dec 23, 2011 in cs.DS , math.PR , math.ST , and stat.TH

Abstract

We give highly efficient algorithms, and almost matching lower bounds, for a range of basic statistical problems that involve testing and estimating the L1 distance between two k-modal distributions $p$ and $q$ over the discrete domain ${1,\dots,n}$. More precisely, we consider the following four problems: given sample access to an unknown k-modal distribution $p$, Testing identity to a known or unknown distribution: 1. Determine whether $p = q$ (for an explicitly given k-modal distribution $q$) versus $p$ is $\eps$-far from $q$; 2. Determine whether $p=q$ (where $q$ is available via sample access) versus $p$ is $\eps$-far from $q$; Estimating $L1$ distance ("tolerant testing'') against a known or unknown distribution: 3. Approximate $d{TV}(p,q)$ to within additive $\eps$ where $q$ is an explicitly given k-modal distribution $q$; 4. Approximate $d{TV}(p,q)$ to within additive $\eps$ where $q$ is available via sample access. For each of these four problems we give sub-logarithmic sample algorithms, that we show are tight up to additive $\poly(k)$ and multiplicative $\polylog\log n+\polylog k$ factors. Thus our bounds significantly improve the previous results of \cite{BKR:04}, which were for testing identity of distributions (items (1) and (2) above) in the special cases k=0 (monotone distributions) and k=1 (unimodal distributions) and required $O((\log n)3)$ samples. As our main conceptual contribution, we introduce a new reduction-based approach for distribution-testing problems that lets us obtain all the above results in a unified way. Roughly speaking, this approach enables us to transform various distribution testing problems for k-modal distributions over ${1,\dots,n}$ to the corresponding distribution testing problems for unrestricted distributions over a much smaller domain ${1,\dots,\ell}$ where $\ell = O(k \log n).$

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.