Single-step Diffusion for Image Compression at Ultra-Low Bitrates
Chanung Park, Joo Chan Lee, Jong Hwan Ko

TL;DR
This paper introduces a single-step diffusion model for ultra-low bitrate image compression that achieves high perceptual quality and significantly faster decoding, making generative codecs more practical.
Contribution
The paper presents a novel single-step diffusion approach with VQ-Residual training and rate-aware noise modulation for efficient, high-quality image compression at ultra-low bitrates.
Findings
Decodes images 50x faster than previous diffusion methods.
Maintains competitive compression performance with state-of-the-art techniques.
Enhances perceptual quality at extremely low bitrates.
Abstract
Although there have been significant advancements in image compression techniques, such as standard and learned codecs, these methods still suffer from severe quality degradation at extremely low bits per pixel. While recent diffusion-based models provide enhanced generative performance at low bitrates, they often yield limited perceptual quality and prohibitive decoding latency due to multiple denoising steps. In this paper, we propose a single-step diffusion model for image compression that delivers high perceptual quality and fast decoding at ultra-low bitrates. Our approach incorporates two key innovations: (i) Vector-Quantized Residual (VQ-Residual) training, which factorizes a structural base code and a learned residual in latent space, capturing both global geometry and high-frequency details; and (ii) rate-aware noise modulation, which tunes denoising strength to match the…
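The base-plus-residual factorization mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' VQ-Residual training procedure: it only shows the general idea of splitting a latent vector into a nearest-codebook "base" entry and the leftover residual. The function names, the tiny hand-written codebook, and the 2-D latents are all hypothetical, chosen for illustration; in the paper the residual is learned, whereas here it is simply the quantization error.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(entry):
        return sum((v - e) ** 2 for v, e in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def vq_residual_split(latents, codebook):
    """Split each latent into a discrete base index and a continuous residual."""
    indices, residuals = [], []
    for vector in latents:
        idx = nearest_code(vector, codebook)
        base = codebook[idx]
        indices.append(idx)
        # The residual is whatever the quantized base failed to capture.
        residuals.append([v - b for v, b in zip(vector, base)])
    return indices, residuals


def reconstruct(indices, residuals, codebook):
    """Recombine base entries and residuals into latent vectors."""
    return [
        [b + r for b, r in zip(codebook[i], res)]
        for i, res in zip(indices, residuals)
    ]


# Hypothetical toy data: 3 codebook entries and 3 latent vectors, all 2-D.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
latents = [[0.9, 1.2], [-0.8, 0.7], [0.1, -0.2]]

indices, residuals = vq_residual_split(latents, codebook)
restored = reconstruct(indices, residuals, codebook)
```

In this toy setting the indices carry the coarse structure cheaply (one small integer per vector), while the residuals hold the fine detail that quantization discards.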
Taxonomy
Topics: Advanced Data Compression Techniques · Image and Signal Denoising Methods · Digital Filter Design and Implementation
Methods: Diffusion · Balanced Selection · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
