- The paper introduces Integer Discrete Flows, a novel flow-based generative model tailored for ordinal discrete data.
- It employs integer discrete coupling layers and tractable discrete distributions to bypass quantization errors and optimize compression.
- Experiments on benchmarks such as CIFAR10 and ImageNet show IDFs achieving better lossless compression rates than the baselines they are compared against.
Integer Discrete Flows and Lossless Compression
Introduction
The paper "Integer Discrete Flows and Lossless Compression" (arXiv:1905.07376) proposes an approach to lossless compression based on flow-based generative models designed specifically for ordinal discrete data. Lossless compression matters wherever information must be preserved exactly, such as in medical imaging and archival storage. Conventional flow models assume continuous data, which makes them a poor fit for discrete settings. This work introduces Integer Discrete Flows (IDFs), which model high-dimensional ordinal data directly and report strong empirical results on image datasets such as CIFAR10, ImageNet32, and ImageNet64.
Integer Discrete Flows
IDFs adapt flow-based generative modeling to discrete data through integer discrete coupling layers. Traditional flows rely on the change-of-variables formula for continuous densities, so using them for compression requires quantizing the latent representation, which introduces reconstruction error. IDFs instead learn a bijective map between integer-valued data and integer-valued latents, so the data can be recovered exactly after compression. This avoids the reconstruction errors seen when continuous models are repurposed for discrete data.
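To make the bijectivity concrete, here is a minimal sketch of an additive coupling step on integers, assuming the layer shifts one half of the input by a rounded translation computed from the other half. The function `toy_translation` is a hypothetical stand-in for the learned network, and the names and constants are illustrative only, not the paper's implementation.

```python
from typing import Callable, List, Tuple

def toy_translation(x_a: List[int]) -> List[float]:
    # Hypothetical stand-in for a learned network t(x_a); any deterministic
    # real-valued function of x_a works for this illustration.
    return [0.37 * v - 1.2 for v in x_a]

def coupling_forward(
    x_a: List[int],
    x_b: List[int],
    t: Callable[[List[int]], List[float]] = toy_translation,
) -> Tuple[List[int], List[int]]:
    # x_a passes through unchanged; x_b is shifted by the rounded translation,
    # so the output is still a vector of integers.
    shift = [round(s) for s in t(x_a)]
    z_b = [b + s for b, s in zip(x_b, shift)]
    return list(x_a), z_b

def coupling_inverse(
    z_a: List[int],
    z_b: List[int],
    t: Callable[[List[int]], List[float]] = toy_translation,
) -> Tuple[List[int], List[int]]:
    # Because z_a == x_a, the same integer shift can be recomputed and
    # subtracted, which recovers x_b exactly.
    shift = [round(s) for s in t(z_a)]
    x_b = [b - s for b, s in zip(z_b, shift)]
    return list(z_a), x_b

x_a = [12, 200, 7]
x_b = [255, 0, 31]
z_a, z_b = coupling_forward(x_a, x_b)
recovered = coupling_inverse(z_a, z_b)
```

Here `recovered` equals `(x_a, x_b)`: rounding happens before the shift is added, so the inverse subtracts exactly the same integers and no information is lost.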
Figure 1: Overview of IDF-based lossless compression. An image x is transformed to a latent representation z with a tractable distribution p_Z(·). An entropy encoder takes z and p_Z(·) as input, and produces a bitstream c. To obtain x, the decoder uses p_Z(·) and c to reconstruct z. Subsequently, z is mapped to x using the inverse of the IDF.
Methodology
The core of the IDF approach is the integer discrete coupling layer. These layers are invertible mappings that keep the data within its discrete space, so the resulting latents can be handed directly to an efficient entropy coder such as rANS to achieve high compression rates. Unlike continuous flow models, whose latent densities must be discretized before entropy coding, IDFs can be applied directly to integer pixel values without any quantization step.
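As a rough illustration of how integer latents can be entropy coded, the sketch below implements a toy, non-streaming variant of rANS that keeps the whole coder state in a single Python big integer. The frequency table `freqs` is a hypothetical quantized stand-in for the model's latent distribution, and a practical coder would add renormalization to emit bits incrementally; this is not the implementation used in the paper.

```python
from typing import Dict, List, Tuple

def build_table(freqs: Dict[int, int]) -> Tuple[Dict[int, Tuple[int, int]], int]:
    # Map each symbol to (cumulative start, frequency); also return the total.
    table: Dict[int, Tuple[int, int]] = {}
    start = 0
    for sym in sorted(freqs):
        table[sym] = (start, freqs[sym])
        start += freqs[sym]
    return table, start

def rans_encode(symbols: List[int], freqs: Dict[int, int]) -> int:
    table, total = build_table(freqs)
    state = 1
    # rANS behaves like a stack, so symbols are pushed in reverse order and
    # the decoder pops them in the original order.
    for sym in reversed(symbols):
        start, f = table[sym]
        state = (state // f) * total + start + (state % f)
    return state

def rans_decode(state: int, n: int, freqs: Dict[int, int]) -> List[int]:
    table, total = build_table(freqs)
    out: List[int] = []
    for _ in range(n):
        slot = state % total
        # Find the symbol whose cumulative range contains the slot.
        for sym, (start, f) in table.items():
            if start <= slot < start + f:
                state = f * (state // total) + slot - start
                out.append(sym)
                break
    return out

# Hypothetical frequency table over integer latent values (assumption).
freqs = {-1: 2, 0: 5, 1: 1}
z = [0, 0, -1, 0, 1, 0, -1]
code = rans_encode(z, freqs)
decoded = rans_decode(code, len(z), freqs)
```

Symbols with higher frequency grow the state less, which is how the coder spends fewer bits on likely latents; `decoded` reproduces `z` exactly.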
Other components include tractable discrete distributions over the integer latents and lower triangular coupling, which support efficient encoding and decoding across different flow depths and network architectures.
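One way to picture a tractable discrete distribution is a logistic density integrated over unit-width bins, which gives every integer a closed-form probability. The sketch below assumes that family and integer-spaced bins; the summary above does not name the distribution or its parameterization, so `mu`, `scale`, and the function names are illustrative assumptions rather than the paper's exact choices.

```python
import math
from typing import List

def sigmoid(v: float) -> float:
    # Numerically stable logistic function.
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def discretized_logistic_pmf(z: int, mu: float, scale: float) -> float:
    # Probability mass of the bin [z - 0.5, z + 0.5] under a logistic
    # distribution with location mu and scale `scale` (assumed family).
    upper = sigmoid((z + 0.5 - mu) / scale)
    lower = sigmoid((z - 0.5 - mu) / scale)
    return upper - lower

def ideal_code_length_bits(zs: List[int], mu: float, scale: float) -> float:
    # Shannon code length: an entropy coder can approach this many bits.
    return sum(-math.log2(discretized_logistic_pmf(z, mu, scale)) for z in zs)

latents = [0, 1, -1, 0, 3]
bits = ideal_code_length_bits(latents, mu=0.0, scale=1.0)
total_mass = sum(discretized_logistic_pmf(z, 0.3, 2.0) for z in range(-200, 201))
```

Because the bin probabilities telescope, `total_mass` is 1 up to floating-point error, and `bits` is the code length an ideal entropy coder would need for `latents` under this model.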


Figure 2: Left: An example from the ER + BCa histology dataset. Right: 625 IDF samples of size 80×80px.
Experimental Results
IDFs were evaluated on several standard datasets and achieved better compression rates than the baselines considered. The gains are most noticeable on CIFAR10 and on image patches from the histology dataset, where IDFs surpassed baselines such as the JPEG2000 codec and the latent-variable method Bit-Swap.
Figure 3: Progressive display of the data stream for images taken from the test set of ImageNet64. From top to bottom row, each image uses approximately 15%, 30%, 60%, and 100% of the stream, where the remaining dimensions are sampled. Best viewed electronically.
Theoretical and Practical Implications
The research outlines a promising avenue for lossless compression through neural networks that respect the discrete nature of digital data. The potential applications extend beyond images to video and audio, offering a new perspective on digital media handling. Future work may explore the expansion of IDF frameworks to accommodate more complex data structures or enhance current coupling mechanisms to reduce computational demands further.
Conclusion
Integer Discrete Flows show that a flow-based generative model can operate entirely on integers, combining a learned likelihood model with standard entropy coding to achieve efficient lossless compression. The work makes a case for designing generative models that respect the discrete nature of the data, with potential benefits for digital media processing across a range of applications.