Generative Latent Coding for Ultra-Low Bitrate Image and Video Compression

Linfeng Qi; Zhaoyang Jia; Jiahao Li; Bin Li; Houqiang Li; Yan Lu

arXiv:2505.16177 · eess.IV · May 23, 2025

TL;DR

This paper introduces GLC, a novel generative latent coding approach for ultra-low bitrate image and video compression that leverages latent space transform coding to improve perceptual quality and efficiency.

Contribution

The paper proposes GLC models that perform transform coding in the latent space of a VQ-VAE for better perceptual alignment, with enhancements such as improved hyper modules and a semantic loss, achieving state-of-the-art ultra-low bitrate compression.

Findings

01

GLC-image operates below 0.04 bpp while keeping FID low (lower FID is better)

02

GLC-video saves 65.3% bitrate over PLVC

03

Enhanced hyper modules improve compression quality
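
To give a feel for the numbers in the findings above, the following is a minimal arithmetic sketch. It assumes a hypothetical 768x512 image (the size is chosen for illustration and does not come from the paper) and reads the 65.3% figure as a relative saving against the anchor's bitrate at matched quality. The function names are illustrative only.

```python
def bit_budget(width: int, height: int, bpp: float) -> float:
    """Total bits available for one image at a given bits-per-pixel rate."""
    return width * height * bpp


def rate_after_saving(anchor_rate: float, saving_percent: float) -> float:
    """Rate that remains after a relative bitrate saving against an anchor."""
    return anchor_rate * (1.0 - saving_percent / 100.0)


# Hypothetical image size, used only to make 0.04 bpp concrete.
bits = bit_budget(768, 512, 0.04)
kilobytes = bits / 8 / 1000

# A 65.3% saving leaves roughly 34.7% of the anchor's bitrate.
remaining_fraction = rate_after_saving(1.0, 65.3)

print(f"budget: {bits:.0f} bits (about {kilobytes:.2f} kB)")
print(f"remaining fraction of anchor bitrate: {remaining_fraction:.3f}")
```

Under these assumptions the whole image has to fit in roughly 2 kB, which is why the findings are described as ultra-low bitrate.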

Abstract

Most existing approaches for image and video compression perform transform coding in the pixel space to reduce redundancy. However, due to the misalignment between pixel-space distortion and human perception, such schemes often struggle to achieve both high realism and high fidelity at ultra-low bitrates. To solve this problem, we propose Generative Latent Coding (GLC) models for image and video compression, termed GLC-image and GLC-video. The transform coding of GLC is conducted in the latent space of a generative vector-quantized variational auto-encoder (VQ-VAE). Compared to the pixel space, such a latent space offers greater sparsity, richer semantics, and better alignment with human perception, and shows its advantages in achieving high-realism and high-fidelity compression. To further enhance performance, we improve the hyper…
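
As background for the VQ-VAE latent space mentioned in the abstract, here is a minimal sketch of the generic nearest-codebook lookup that makes a VQ-VAE latent discrete. It is not the GLC transform coding pipeline or the authors' implementation. The toy codebook, the latent vectors, and the function names are all hypothetical, and the fixed-length index cost is only a baseline without entropy coding.

```python
import math


def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for i, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def vector_quantize(latents, codebook):
    """Replace each latent vector by its nearest codebook entry."""
    indices = [nearest_code(v, codebook) for v in latents]
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Hypothetical toy codebook with 4 two-dimensional codes.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical latent vectors, standing in for encoder outputs.
latents = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.7]]

indices, quantized = vector_quantize(latents, codebook)

# Fixed-length cost of sending one index, with no entropy coding.
bits_per_index = math.ceil(math.log2(len(codebook)))

print(indices, bits_per_index)
```

Each latent vector is represented by a single integer index, so a small codebook over a low-resolution latent grid already yields a compact discrete representation before any further coding is applied.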

Taxonomy

Topics: Advanced Data Compression Techniques · Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis