PerCo (SD): Open Perceptual Compression
Nikolai Körber, Eduard Kromer, Andreas Siebert, Sascha Hauke, Daniel Mueller-Gritschneder, Björn Schuller

TL;DR
PerCo (SD) introduces an open perceptual image compression method based on Stable Diffusion v2.1, offering a competitive alternative to proprietary methods by emphasizing perceptual quality at ultra-low bit rates.
Contribution
This work adapts PerCo to the Stable Diffusion ecosystem, releases the result as open source, provides a comprehensive quantitative and qualitative comparison with the original PerCo, and analyzes its perceptual advantages and limitations.
Findings
On the MSCOCO-30k dataset, PerCo (SD) shows improved perceptual characteristics at the cost of higher distortion.
The authors partly attribute this gap to the different model capacities being used (866M vs. 1.4B).
The open-source release facilitates further research and development in perceptual compression.
Abstract
We introduce PerCo (SD), a perceptual image compression method based on Stable Diffusion v2.1, targeting the ultra-low bit range. PerCo (SD) serves as an open and competitive alternative to the state-of-the-art method PerCo, which relies on a proprietary variant of GLIDE and remains closed to the public. In this work, we review the theoretical foundations, discuss key engineering decisions in adapting PerCo to the Stable Diffusion ecosystem, and provide a comprehensive comparison, both quantitatively and qualitatively. On the MSCOCO-30k dataset, PerCo (SD) demonstrates improved perceptual characteristics at the cost of higher distortion. We partly attribute this gap to the different model capacities being used (866M vs. 1.4B). We hope our work contributes to a deeper understanding of the underlying mechanisms and paves the way for future advancements in the field. Code and trained…
Taxonomy
Topics: Advanced Data Compression Techniques · Computer Graphics and Visualization Techniques
Methods: Guided Language to Image Diffusion for Generation and Editing · Diffusion
