Diffusion-aided Extreme Video Compression with Lightweight Semantics Guidance
Maojun Zhang, Haotian Wu, Richeng Jin, Deniz Gunduz, Krystian Mikolajczyk

TL;DR
This paper introduces a novel video compression method that combines semantic representations with diffusion models and motion characterization to achieve high-quality reconstruction at extremely low bit-rates.
Contribution
It presents a new framework integrating generative priors, semantic compression, and motion modeling to drastically improve low-bit-rate video reconstruction.
Findings
Enables high-fidelity video reconstruction at extremely low bit-rates.
Uses semantic representations and diffusion models for efficient compression.
Characterizes motion with camera trajectories and segmentation masks.
Abstract
Modern video codecs and learning-based approaches struggle with semantic reconstruction at extremely low bit-rates because they rely on low-level spatiotemporal redundancies. Generative models, especially diffusion models, offer a new paradigm for video compression by leveraging high-level semantic understanding and powerful visual synthesis. This paper proposes a video compression framework that integrates generative priors to drastically reduce bit-rate while maintaining reconstruction fidelity. Specifically, our method compresses high-level semantic representations of the video, then uses a conditional diffusion model to reconstruct frames from these semantics. To further improve compression, we characterize motion information with global camera trajectories and foreground segmentation: background motion is compactly represented by camera pose parameters while foreground dynamics by…
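To give a feel for why camera pose parameters are a compact description of background motion, the following back-of-the-envelope sketch compares the raw size of a per-frame pose vector with the raw size of a dense per-pixel motion field. It is not the paper's codec: the function names, the 6-degree-of-freedom pose, and the 16-bit precision are all illustrative assumptions, and both counts are uncompressed sizes with no entropy coding.

```python
def pose_bits_per_frame(dof: int = 6, bits_per_param: int = 16) -> int:
    """Raw bits for one camera pose (assumed: 6 DoF, 16 bits per parameter)."""
    return dof * bits_per_param


def dense_flow_bits_per_frame(width: int, height: int,
                              bits_per_component: int = 16) -> int:
    """Raw bits for a dense motion field with two components (dx, dy) per pixel."""
    return width * height * 2 * bits_per_component


def bits_per_pixel(bits_per_frame: int, width: int, height: int) -> float:
    """Normalize a per-frame bit count by the frame resolution."""
    return bits_per_frame / (width * height)


if __name__ == "__main__":
    # Hypothetical resolution, chosen only for illustration.
    w, h = 1280, 720
    pose = pose_bits_per_frame()
    flow = dense_flow_bits_per_frame(w, h)
    print(f"pose: {pose} bits/frame, {bits_per_pixel(pose, w, h):.6f} bpp")
    print(f"flow: {flow} bits/frame, {bits_per_pixel(flow, w, h):.1f} bpp")
```

Under these assumptions the pose description has a fixed size that does not depend on resolution, whereas the dense field grows with the pixel count. A practical codec would compress both representations much further.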
Taxonomy
Topics: Advanced Data Compression Techniques · Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis
