
Estimating the Effective Support Size in Constant Query Complexity (2211.11344v1)

Published 21 Nov 2022 in cs.DS, math.ST, and stat.TH

Abstract: Estimating the support size of a distribution is a well-studied problem in statistics. Motivated by the fact that this problem is highly non-robust (small perturbations in the distribution can drastically affect the support size) and thus hard to estimate, Goldreich [ECCC 2019] studied the query complexity of estimating the $\epsilon$-effective support size $\text{Ess}_\epsilon$ of a distribution $P$, which is the smallest support size of a distribution within total variation distance $\epsilon$ of $P$. In his paper, he shows an algorithm in the dual access setting (where we may both receive random samples and query the sampling probability $p(x)$ for any $x$) for a bicriteria approximation, giving an answer in $[\text{Ess}_{(1+\beta)\epsilon}, (1+\gamma)\text{Ess}_{\epsilon}]$ for some values $\beta, \gamma > 0$. However, his algorithm has either query complexity that is super-constant in the support size or a super-constant approximation ratio $1+\gamma = \omega(1)$. He then asked whether this is necessary, or whether it is possible to get a constant-factor approximation with a number of queries independent of the support size. We answer his question by showing that complexity independent of the support size $n$ is possible not only for $\gamma > 0$ but also for $\gamma = 0$; that is, the bicriteria relaxation is not necessary. Specifically, we show an algorithm with query complexity $O\left(\frac{1}{\beta^3 \epsilon^3}\right)$: for any $0 < \epsilon, \beta < 1$, we output in this complexity a number $\tilde{n} \in [\text{Ess}_{(1+\beta)\epsilon}, \text{Ess}_{\epsilon}]$. We also show that it is possible to solve the approximate version with approximation ratio $1+\gamma$ in complexity $O\left(\frac{1}{\beta^2 \epsilon} + \frac{1}{\beta \epsilon \gamma^2}\right)$. Our algorithm is very simple, and has $4$ short lines of pseudocode.
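
To make the definition of $\text{Ess}_\epsilon$ concrete, here is a minimal Python sketch that computes it exactly when the whole probability vector is known. It relies on the standard observation that the best approximation of $P$ supported on a set $S$ is at total variation distance $1 - P(S)$, so $\text{Ess}_\epsilon$ is the smallest number of heaviest elements whose mass reaches $1 - \epsilon$. This is not the paper's constant-query algorithm, which works with samples and probability queries only; the function name `effective_support_size` and the rounding tolerance are illustrative assumptions.

```python
from typing import Sequence


def effective_support_size(probs: Sequence[float], eps: float) -> int:
    """Exact eps-effective support size of an explicitly given distribution.

    Illustrative helper (not from the paper): reads every probability,
    so its cost grows with the support size.
    """
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    target = 1.0 - eps
    tolerance = 1e-12  # assumption: absorbs floating-point rounding in the sums
    mass = 0.0
    # Greedily take the heaviest elements until they cover mass 1 - eps.
    for k, p in enumerate(sorted(probs, reverse=True), start=1):
        mass += p
        if mass >= target - tolerance:
            return k
    return len(probs)


if __name__ == "__main__":
    p = [0.5, 0.3, 0.1, 0.05, 0.05]
    for eps in (0.0, 0.1, 0.2):
        print(eps, effective_support_size(p, eps))
```

Because $\text{Ess}_\epsilon$ is non-increasing in $\epsilon$, the interval $[\text{Ess}_{(1+\beta)\epsilon}, \text{Ess}_\epsilon]$ in the abstract is well-formed: relaxing the distance budget from $\epsilon$ to $(1+\beta)\epsilon$ can only shrink the required support.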
