A Lower Bound for Estimating High Moments of a Data Stream (1201.0253v1)

Published 31 Dec 2011 in cs.DS and cs.CC

Abstract: We show an improved lower bound for the Fp estimation problem in a data stream setting for p>2. A data stream is a sequence of items from the domain [n] with possible repetitions. The frequency vector x is an n-dimensional non-negative integer vector such that x(i) is the number of occurrences of i in the sequence. Given an accuracy parameter Omega(n^{-1/p}) < \epsilon < 1, the problem of estimating Fp is to estimate \norm{x}_p^p = \sum_{i \in [n]} \abs{x(i)}^p correctly to within a relative accuracy of 1\pm \epsilon with high constant probability in an online fashion and using as little space as possible. The current space lower bound for this problem is Omega(n^{1-2/p} \epsilon^{-2/p} + n^{1-2/p} \epsilon^{-4/p}/\log^{O(1)}(n) + (\epsilon^{-2} + \log (n))). The first term in the lower bound expression was proved in \cite{B-YJKS:stoc02,cks:ccc03}, the second in \cite{wz:arxiv11} and the third in \cite{wood:soda04}. In this note, we show an Omega(p^2 n^{1-2/p} \epsilon^{-2}/\log (n)) bits space bound, for Omega(pn^{-1/p}) \le \epsilon \le 1/10.
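
To make the definitions in the abstract concrete, the following minimal Python sketch computes the frequency vector x and the exact value of Fp for a small stream, and checks whether an estimate falls within relative accuracy 1 \pm \epsilon. It is an offline, linear-space baseline for illustration only, not a small-space streaming algorithm and not a method from the paper; the function names and the example stream are hypothetical.

```python
from collections import Counter


def frequency_vector(stream, n):
    """Return x as a list where x[i-1] is the number of occurrences of item i."""
    counts = Counter(stream)
    return [counts.get(i, 0) for i in range(1, n + 1)]


def exact_fp(stream, n, p):
    """Exact F_p = sum over i in [n] of x(i)^p, using O(n) space (offline baseline)."""
    return sum(c ** p for c in frequency_vector(stream, n))


def within_relative_accuracy(estimate, truth, eps):
    """True if estimate lies in [(1 - eps) * truth, (1 + eps) * truth]."""
    return (1 - eps) * truth <= estimate <= (1 + eps) * truth


# Hypothetical stream over the domain [n] with n = 4 (items are 1-indexed).
stream = [1, 3, 3, 2, 3, 1]
n, p = 4, 3

x = frequency_vector(stream, n)  # item 1 twice, item 2 once, item 3 three times
f3 = exact_fp(stream, n, p)      # 2^3 + 1^3 + 3^3 + 0^3
print(x, f3)
```

The lower bound in the abstract concerns algorithms that must output such an estimate in one pass while storing far fewer bits than the full vector x; the sketch above only fixes the quantity being estimated.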