
Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer (2403.03736v1)

Published 6 Mar 2024 in cs.CV, cs.LG, and eess.IV

Abstract: Recent progress in generative compression technology has significantly improved the perceptual quality of compressed data. However, these advancements primarily focus on producing high-frequency details, often overlooking the ability of generative models to capture the prior distribution of image content, thus impeding further bitrate reduction in extreme compression scenarios (<0.05 bpp). Motivated by the capabilities of predictive LLMs for lossless compression, this paper introduces a novel Unified Image Generation-Compression (UIGC) paradigm, merging the processes of generation and compression. A key feature of the UIGC framework is the adoption of vector-quantized (VQ) image models for tokenization, alongside a multi-stage transformer designed to exploit spatial contextual information for modeling the prior distribution. As such, the dual-purpose framework effectively utilizes the learned prior for entropy estimation and assists in the regeneration of lost tokens. Extensive experiments demonstrate the superiority of the proposed UIGC framework over existing codecs in perceptual quality and human perception, particularly in ultra-low bitrate scenarios (<=0.03 bpp), pioneering a new direction in generative compression.
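The abstract's core idea — using a learned prior over VQ tokens both for entropy estimation and for regenerating dropped tokens — can be illustrated with a toy sketch. The `toy_prior` below is a hypothetical stand-in for the paper's multi-stage transformer (the real model conditions on spatial context of VQ-GAN token grids); the point is only to show how a predictive distribution yields a Shannon code length per token, which is what drives bitrate in this paradigm.

```python
import math

def toy_prior(context, vocab_size=4):
    """Hypothetical autoregressive prior p(token | context) over a tiny
    VQ codebook. Illustrative rule: favor repeating the previous token.
    This stands in for the paper's learned transformer prior."""
    probs = [1.0 / (vocab_size + 2)] * vocab_size
    if context:
        probs[context[-1]] += 2.0 / (vocab_size + 2)
    return probs

def ideal_bits(tokens):
    """Ideal (Shannon) code length of a token sequence under the prior:
    -sum_i log2 p(t_i | t_<i). An entropy coder driven by the same prior
    would approach this bitrate."""
    total, ctx = 0.0, []
    for t in tokens:
        total += -math.log2(toy_prior(ctx)[t])
        ctx.append(t)
    return total

def regenerate(tokens, dropped_positions):
    """Sketch of the dual use of the prior: positions whose tokens were
    dropped (not transmitted) are filled in with the prior's argmax."""
    out, ctx = [], []
    for i, t in enumerate(tokens):
        if i in dropped_positions:
            p = toy_prior(ctx)
            t = max(range(len(p)), key=p.__getitem__)
        out.append(t)
        ctx.append(t)
    return out

tokens = [2, 2, 2, 1, 2]
print(round(ideal_bits(tokens), 2))   # repeated tokens are cheap to code
print(regenerate([2, 2, 2, 1, 2], {1}))  # dropped position 1 refilled as 2
```

The sketch makes the trade-off concrete: tokens the prior predicts well cost few bits, and the same prior can plausibly restore tokens that were dropped to save bits, at the cost of fidelity where its prediction is wrong.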
