Optimal Lempel-Ziv based lossy compression for memoryless data: how to make the right mistakes (1210.4700v2)

Published 17 Oct 2012 in cs.IT and math.IT

Abstract: Compression refers to encoding data in bits so that the representation uses as few bits as possible. Compression can be lossless (the encoded data can be recovered exactly from its representation) or lossy (the data is compressed further than in the lossless case, but can still be recovered to within a prespecified distortion). In this paper, we prove the optimality of Codelet Parsing, a quasi-linear time algorithm for lossy compression of sequences of independent and identically distributed (i.i.d.) bits under Hamming distortion. Codelet Parsing extends the lossless Lempel-Ziv algorithm to the lossy case, a task that has been a focus of the source coding literature for the better part of two decades. Given an i.i.d. sequence $x$, the expected length of the shortest lossy representation from which $x$ can be reconstructed to within distortion $D$ is given by the rate-distortion function $R(D)$. Codelet Parsing splits the input sequence naturally into phrases and represents each phrase by a codelet, a potentially distorted phrase of the same length. The codelets in the lossy representation of a length-$n$ string $x$ have length roughly $(\log n)/R(D)$, and, like the lossless Lempel-Ziv algorithm, Codelet Parsing constructs codebooks logarithmic in the sequence length.
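To make the quantities in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the standard closed form of the rate-distortion function for a Bernoulli($p$) source under Hamming distortion, $R(D) = h(p) - h(D)$ for $D < \min(p, 1-p)$ and $0$ otherwise, with $h$ the binary entropy in bits. It also assumes the logarithm in $(\log n)/R(D)$ is base 2. The function names are hypothetical and chosen for illustration only.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rate_distortion_bernoulli(p: float, d: float) -> float:
    """R(D) in bits per symbol for a Bernoulli(p) source, Hamming distortion.

    Uses the standard closed form h(p) - h(D) for D < min(p, 1 - p),
    and 0 once the allowed distortion reaches min(p, 1 - p).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if d < 0.0:
        raise ValueError("distortion must be non-negative")
    if d >= min(p, 1.0 - p):
        return 0.0
    return binary_entropy(p) - binary_entropy(d)


def approx_codelet_length(n: int, p: float, d: float) -> float:
    """Rough codelet length (log2 n) / R(D); the base-2 log is an assumption."""
    rate = rate_distortion_bernoulli(p, d)
    if rate == 0.0:
        raise ValueError("R(D) is zero, so the estimate is undefined")
    return math.log2(n) / rate


if __name__ == "__main__":
    # Fair coin source, about 11% of bits allowed to be wrong.
    print(rate_distortion_bernoulli(0.5, 0.11))
    print(approx_codelet_length(2**20, 0.5, 0.11))
```

Under these assumptions, a fair-coin source with $D = 0.11$ has $R(D)$ close to half a bit per symbol, so codelets for a string of about a million bits would be on the order of 40 bits long.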
