Adaptive Estimation of Shannon Entropy (1502.00326v2)

Published 1 Feb 2015 in cs.IT and math.IT

Abstract: We consider estimating the Shannon entropy of a discrete distribution $P$ from $n$ i.i.d. samples. Recently, Jiao, Venkat, Han, and Weissman, and Wu and Yang constructed approximation theoretic estimators that achieve the minimax $L_2$ rates in estimating entropy. Their estimators are consistent given $n \gg \frac{S}{\ln S}$ samples, where $S$ is the alphabet size, which is the best possible sample complexity. In contrast, the Maximum Likelihood Estimator (MLE), which is the empirical entropy, requires $n\gg S$ samples. In the present paper, we significantly refine the minimax results of existing work. To alleviate the pessimism of minimaxity, we adopt the adaptive estimation framework, and show that the minimax rate-optimal estimator in Jiao, Venkat, Han, and Weissman achieves the minimax rates simultaneously over a nested sequence of subsets of distributions $P$, without knowing the alphabet size $S$ or which subset $P$ lies in. In other words, their estimator is adaptive with respect to this nested sequence of the parameter space, which is characterized by the entropy of the distribution. We also characterize the maximum risk of the MLE over this nested sequence, and show, for every subset in the sequence, that the performance of the minimax rate-optimal estimator with $n$ samples is essentially that of the MLE with $n\ln n$ samples, thereby further substantiating the generality of the phenomenon identified by Jiao, Venkat, Han, and Weissman.
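
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal Python sketch of the empirical (plug-in) entropy, i.e. the MLE. The function name `empirical_entropy` and the choice of natural logarithms are assumptions made here for illustration. This is not the approximation theoretic estimator of Jiao, Venkat, Han, and Weissman, which the abstract does not specify in enough detail to reproduce.

```python
import math
from collections import Counter


def empirical_entropy(samples):
    """Plug-in (MLE) estimate of Shannon entropy, in nats.

    Replaces the unknown distribution P with the empirical
    frequencies of the observed symbols and evaluates
    H = -sum_i p_i * ln(p_i) on those frequencies.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)  # symbol -> number of occurrences
    return -sum((c / n) * math.log(c / n) for c in counts.values())


if __name__ == "__main__":
    # Two symbols seen equally often: the estimate equals ln 2.
    print(empirical_entropy(["a", "b", "a", "b"]))
```

Because $n$ samples contain at most $n$ distinct symbols, this estimate can never exceed $\ln n$, which gives some intuition for why the plug-in approach needs the sample size to be large relative to the alphabet size $S$.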

Citations (28)
