
Bounds on the Number of Huffman and Binary-Ternary Trees

Published Mar 21, 2013 in cs.IT and math.IT


Huffman coding is a widely used method for lossless data compression because it encodes data optimally according to how often each character occurs, with the code represented as a Huffman tree. An $n$-ary Huffman tree is a connected, acyclic graph in which each vertex has either $n$ child vertices or none. Vertices with no children are called leaves. We let $h_n(q)$ denote the total number of $n$-ary Huffman trees with $q$ leaves. In this paper, we use a recursive method to generate upper and lower bounds on $h_n(q)$ and, for $n=2$, obtain $h_2(q) \approx (0.1418532)(1.7941471)^q + (0.0612410)(1.2795491)^q$. This matches the best results achieved by Elsholtz, Heuberger, and Prodinger in August 2011. Our approach reveals patterns in Huffman trees that we used in our analysis of the Binary-Ternary (BT) trees we created. Extending the study of Huffman trees to BT trees opens a new direction in data compression: it paves the way for designing data-specific trees that minimize the storage space Huffman coding can waste. We prove a recursive formula for the number of BT trees with $q$ leaves. We then provide further analysis and proofs to obtain numeric bounds. Our discoveries have broad applications in computer data compression. These results also improve graphical representations of protein sequences that facilitate in-depth genome analysis used in researching evolutionary patterns.
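To make the quantity $h_2(q)$ concrete, here is a minimal Python sketch that counts binary Huffman trees level by level. It rests on an assumption the abstract does not spell out: two trees are treated as the same when they have the same number of leaves at every depth, which is the counting convention under which the growth rate $1.7941471$ is usually stated. The function names `h2` and `count` are hypothetical, and this is not necessarily the recursive method used in the paper.

```python
from functools import lru_cache


def h2(q: int) -> int:
    """Count binary Huffman trees with q leaves.

    Assumption (not stated in the abstract): trees are identified by
    how many leaves they have at each depth.
    """

    @lru_cache(maxsize=None)
    def count(k: int, remaining: int) -> int:
        # k = vertices on the current level,
        # remaining = leaves still to be placed on this level or below.
        if k > remaining:
            return 0  # every vertex leads to at least one leaf
        total = 1 if k == remaining else 0  # all k vertices are leaves
        for j in range(1, k + 1):
            # j internal vertices on this level, k - j leaves;
            # the next level then has 2 * j vertices.
            total += count(2 * j, remaining - (k - j))
        return total

    return count(1, q)


def approx_h2(q: int) -> float:
    """Two-term approximation quoted in the abstract."""
    return 0.1418532 * 1.7941471 ** q + 0.0612410 * 1.2795491 ** q


if __name__ == "__main__":
    for q in range(1, 21):
        print(q, h2(q), round(approx_h2(q), 2))
```

Under this convention the first values are 1, 1, 1, 2, 3, 5, 9 for $q = 1, \dots, 7$, and running the script prints the exact count next to the two-term approximation for comparison.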
