
Estimating Entropy of Distributions in Constant Space (1911.07976v1)

Published 18 Nov 2019 in cs.IT, cs.DS, cs.LG, and math.IT

Abstract: We consider the task of estimating the entropy of $k$-ary distributions from samples in the streaming model, where space is limited. Our main contribution is an algorithm that requires $O\left(\frac{k \log^2 (1/\varepsilon)}{\varepsilon^3}\right)$ samples and a constant $O(1)$ memory words of space and outputs a $\pm\varepsilon$ estimate of $H(p)$. Without space limitations, the sample complexity has been established as $S(k,\varepsilon)=\Theta\left(\frac{k}{\varepsilon\log k}+\frac{\log^2 k}{\varepsilon^2}\right)$, which is sub-linear in the domain size $k$, and the current algorithms that achieve optimal sample complexity also require nearly-linear space in $k$. Our algorithm partitions $[0,1]$ into intervals and estimates the entropy contribution of probability values in each interval. The intervals are designed to trade off the bias and variance of these estimates.
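To make the constant-space setting concrete, the following is a minimal sketch of a naive streaming estimator. It is not the interval-partitioning algorithm described in the abstract. It relies only on the identity $H(p) = \mathbb{E}_{X \sim p}[\log(1/p(X))]$: draw one symbol, count how often it recurs in a window of later samples to estimate its probability, and average $\log(1/\hat{p})$ over many rounds. The function name `estimate_entropy_stream`, the parameters `rounds` and `window`, and the add-one smoothing are illustrative assumptions, not taken from the paper.

```python
import math
import random
from typing import Callable, Hashable


def estimate_entropy_stream(
    draw: Callable[[], Hashable], rounds: int, window: int
) -> float:
    """Naive O(1)-memory entropy estimate in nats (illustrative sketch only).

    `draw` returns the next i.i.d. sample from the stream. Only a handful of
    scalars are kept at any time: the current symbol, a counter, a running sum.
    """
    total = 0.0
    for _ in range(rounds):
        x = draw()  # symbol whose probability is estimated in this round
        count = 0
        for _ in range(window):
            if draw() == x:
                count += 1
        # Add-one smoothing (an assumption of this sketch) counts x itself,
        # so p_hat is never zero and the logarithm is always defined.
        p_hat = (count + 1) / (window + 1)
        total += math.log(1.0 / p_hat)
    return total / rounds


if __name__ == "__main__":
    rng = random.Random(0)
    k = 8
    estimate = estimate_entropy_stream(lambda: rng.randrange(k), 300, 1000)
    print(f"estimate = {estimate:.3f} nats, true = {math.log(k):.3f} nats")
```

This sketch consumes `rounds * (window + 1)` samples and makes no attempt to balance bias against variance across probability ranges, which is the role the abstract assigns to the interval design.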

Authors (4)
  1. Jayadev Acharya (45 papers)
  2. Sourbh Bhadane (7 papers)
  3. Piotr Indyk (66 papers)
  4. Ziteng Sun (29 papers)
Citations (10)
