Lossy Compression with Pretrained Diffusion Models
Jeremy Vonderfecht, Feng Liu

TL;DR
This paper demonstrates that pretrained diffusion models like Stable Diffusion can be effectively used for lossy image compression, achieving competitive results at ultra-low bitrates with fast processing times.
Contribution
The authors provide the first complete implementation of DiffC for pretrained diffusion models, enabling rapid lossy image compression without additional training.
Findings
Capable of compressing and decompressing images in under 10 seconds.
Achieves competitive performance at ultra-low bitrates.
Utilizes pretrained models without additional training.
Abstract
We apply the DiffC algorithm (Theis et al. 2022) to Stable Diffusion 1.5, 2.1, XL, and Flux-dev, and demonstrate that these pretrained models are remarkably capable lossy image compressors. A principled algorithm for lossy compression using pretrained diffusion models has been understood since at least Ho et al. 2020, but challenges in reverse-channel coding have prevented such algorithms from ever being fully implemented. We introduce simple workarounds that lead to the first complete implementation of DiffC, which is capable of compressing and decompressing images using Stable Diffusion in under 10 seconds. Despite requiring no additional training, our method is competitive with other state-of-the-art generative compression methods at ultra-low bitrates.
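The abstract names reverse-channel coding as the obstacle to a full implementation. As a rough illustration of what that primitive does, the sketch below lets a sender communicate a sample from a Gaussian q = N(mu, sigma^2 I) to a receiver who only knows the prior p = N(0, I) and a shared random seed. It uses a simple importance-sampling scheme chosen for brevity, not the coding scheme from the paper, and every name in it (`shared_candidates`, `encode`, `decode`) is hypothetical. The selected candidate is only approximately distributed as q, and the approximation improves as the number of candidates grows.

```python
import math
import numpy as np

def shared_candidates(seed, n, dim):
    # Sender and receiver both regenerate the same n draws from the prior N(0, I).
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n, dim))

def encode(mu, sigma, seed, n, rng):
    # Assumption: target q = N(mu, sigma^2 I), prior p = N(0, I).
    cands = shared_candidates(seed, n, mu.shape[0])
    # Unnormalized log importance weights log q(x) - log p(x);
    # constants shared by all candidates cancel after normalization.
    log_w = (-0.5 * np.sum(((cands - mu) / sigma) ** 2, axis=1)
             + 0.5 * np.sum(cands ** 2, axis=1))
    log_w -= log_w.max()  # subtract the max for numerical stability
    probs = np.exp(log_w)
    probs /= probs.sum()
    # Pick one candidate index at random, in proportion to its weight.
    return int(rng.choice(n, p=probs))

def decode(index, seed, n, dim):
    # The receiver rebuilds the candidate list and looks up the sent index.
    return shared_candidates(seed, n, dim)[index]

def bits_per_index(n):
    # Cost of sending one index with a fixed-length code.
    return math.log2(n)
```

With `n = 1024` candidates the message is a single 10-bit index, regardless of the dimension of the sample. The number of candidates needed for a good approximation grows exponentially with the KL divergence between q and p, which gives some intuition for why practical reverse-channel coding is hard at scale.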
Peer Reviews
Decision·ICLR 2025 Poster
The paper applies state-of-the-art Stable Diffusion models to single-image compression and also releases an implementation of DiffC to the public.
First, I don't think the main contribution is significant. The major idea had already been proposed several years ago. Second, I failed to find the comparisons of this idea with other SOTA image compression methods.
Originality: The authors try their best to implement the DiffC and apply it to existing pre-trained diffusion models, such as Stable Diffusion 1.5, 2, and XL. In addition, they propose a greedy optimization technique to speed up the diffusion process and to select the best denoising timestep schedule. Quality: The manuscript is well organized and written. Clarity: The authors have explained their method in detail. Significance: The significance of this work is profound because it addresses th
(1) The manuscript looks more like a technology report than an academic paper. (2) The authors do not provide quantitative comparisons with state-of-the-art extreme image compression methods (VQ-based methods, diffusion-based methods) and show the advantages of the proposed method. (3) Although the main contribution of the paper is the implementation of the DiffC algorithm, the author should provide the model complexity of the proposed method (e.g., network parameters, FLOPs, encoding/decoding
1. The paper extends DiffC to stable diffusion, provides an open-source implementation, and accelerates RCC with CUDA. These contributions promote the practicality of the method and pave the way for further exploration. 2. The paper is well written and easy to follow. Enough background information is provided to general readers. 3. The paper provides guidance on potential research directions in the Future Work section, offering insights for subsequent studies.
1. The paper lacks sufficient innovation and resembles more of an engineering improvement on existing methods. 2. The diffusion model requires multiple inference steps for both encoding and decoding, resulting in significant computational overhead. 3. Although the authors claim that their method is competitive with other state-of-the-art compression methods, no comparisons are provided in the paper. To substantiate this claim, the authors should compare their method with existing approaches su
Taxonomy
Topics: Advanced Mathematical Modeling in Engineering
Methods: Diffusion
