Semantics-Guided Generative Image Compression

Cheng-Lin Wu; Hyomin Choi; Ivan V. Bajić

arXiv:2505.24015 · eess.IV · June 2, 2025

TL;DR

This paper extends multimodal image semantic compression (MISC) with semantic segmentation guidance and content-adaptive diffusion, improving image quality at extremely low bit rates while cutting encoding and decoding time by more than 36%.

Contribution

The paper adds two components to MISC: semantic segmentation guidance for the generative decoder, and content-adaptive diffusion, which sets the number of diffusion steps based on image characteristics. Together they improve both image quality and efficiency.

Findings

01 Improved PSNR and perceptual metrics over baseline MISC

02 Reduced encoding and decoding time by over 36%

03 Outperforms mainstream codecs in perceptual quality
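For readers unfamiliar with the PSNR figure cited in the findings, the sketch below shows the standard definition, PSNR = 10 · log10(MAX² / MSE), on flat lists of pixel values. It is a generic illustration, not code from the paper's repository; the function name `psnr` and the 8-bit default `max_value=255.0` are assumptions made for this example.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Illustrative helper (hypothetical name); max_value=255.0 assumes 8-bit pixels.
    """
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must contain the same number of pixels")
    if len(reference) == 0:
        raise ValueError("inputs must be non-empty")

    # Mean squared error over all pixel positions.
    mse = sum((r - d) ** 2 for r, d in zip(reference, reconstructed)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the reconstruction is closer to the original in a pixel-wise sense. PSNR does not capture perceptual quality, which is why the findings report perceptual metrics separately.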

Abstract

Advancements in text-to-image generative AI with large multimodal models are spreading into the field of image compression, creating high-quality representations of images at extremely low bit rates. This work introduces novel components to the existing multimodal image semantic compression (MISC) approach, enhancing the quality of the generated images in terms of PSNR and perceptual metrics. The new components include semantic segmentation guidance for the generative decoder, as well as content-adaptive diffusion, which controls the number of diffusion steps based on image characteristics. The results show that our newly introduced methods significantly improve the baseline MISC model while also decreasing the complexity. As a result, both the encoding and decoding time are reduced by more than 36%. Moreover, the proposed compression framework outperforms mainstream codecs in terms of…
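The abstract says content-adaptive diffusion controls the number of diffusion steps based on image characteristics, but it does not specify which characteristics or what mapping is used. The sketch below illustrates the general idea only, under stated assumptions: the activity measure (mean absolute horizontal gradient), the direction of the mapping (more steps for busier images), the names `mean_abs_gradient` and `choose_diffusion_steps`, and all numeric defaults are hypothetical and are not taken from the paper or its repository.

```python
def mean_abs_gradient(gray):
    """Average absolute difference between horizontally adjacent pixels.

    `gray` is a list of rows of grayscale values in 0..255.
    This activity measure is an assumption chosen for illustration.
    """
    total, count = 0.0, 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            total += abs(right - left)
            count += 1
    return total / count if count else 0.0


def choose_diffusion_steps(gray, min_steps=10, max_steps=50, saturation=32.0):
    """Map image activity to a step count in [min_steps, max_steps].

    Hypothetical rule: activity is normalized by `saturation` and clipped to 1,
    then interpolated linearly between the two step limits.
    """
    activity = min(mean_abs_gradient(gray) / saturation, 1.0)
    return min_steps + round(activity * (max_steps - min_steps))


flat_image = [[128] * 4 for _ in range(4)]            # no texture
busy_image = [[0, 255, 0, 255] for _ in range(4)]     # strong texture
steps_flat = choose_diffusion_steps(flat_image)
steps_busy = choose_diffusion_steps(busy_image)
```

Under this rule a flat image gets the minimum step count and a highly textured one gets the maximum, so the decoder spends fewer sampling iterations where less detail has to be synthesized. Any per-image step selection of this kind would lower average decoding time relative to a fixed step count, which is consistent with the complexity reduction the abstract reports.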

Code & Models

Repositories

CrashedBboy/SGGIC (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Data Compression Techniques

Methods: Diffusion