Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion
Amirhosein Javadi, Shirin Saeedi Bidokhti, Tara Javidi

TL;DR
This paper introduces ActDiff-VC, a diffusion-based video compression method that conditions on sparse signals (adaptively selected keyframes and tracked point trajectories) to reconstruct perceptually realistic video at ultra-low bitrates, with better perceptual rate-distortion trade-offs than existing codecs.
Contribution
The work presents a novel diffusion-based framework with content-adaptive keyframe and trajectory selection for ultra-low-bitrate video compression.
Findings
Achieves up to 64.6% bitrate reduction at matched NIQE.
Improves KID and FID scores significantly at comparable bitrates.
Provides favorable perceptual rate-distortion trade-offs in experiments.
Abstract
Diffusion models provide a powerful generative prior for perceptual reconstruction at ultra-low bitrates, but effective video compression requires controlling the generative process using highly compact conditioning signals. In this work, we present ActDiff-VC, a diffusion-based video compression framework for the ultra-low-bitrate regime. Our method partitions videos into variable-length segments, transmits keyframes only when needed, and summarizes temporal dynamics using a compact set of tracked point trajectories. Conditioned on these sparse signals, a conditional diffusion decoder synthesizes the remaining frames, enabling perceptually realistic reconstruction under severe rate constraints. To support this design, we introduce two mechanisms: content-adaptive keyframe selection and budget-aware sparse trajectory selection, which together enable compact yet effective conditioning…
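The two selection mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the selection criteria, so the drift test (mean absolute pixel difference from the last keyframe), the path-length ranking, and the names `select_keyframes`, `select_trajectories`, `threshold`, `bits_per_track`, and `bit_budget` are all assumptions made for illustration.

```python
import numpy as np


def select_keyframes(frames, threshold):
    """Greedy variable-length segmentation (illustrative assumption).

    A new segment starts, and a new keyframe is sent, whenever the mean
    absolute difference between the current frame and the last keyframe
    exceeds `threshold`. Frame 0 is always a keyframe.
    """
    keyframes = [0]
    ref = frames[0].astype(np.float64)
    for t in range(1, len(frames)):
        cur = frames[t].astype(np.float64)
        drift = np.mean(np.abs(cur - ref))
        if drift > threshold:
            keyframes.append(t)
            ref = cur
    return keyframes


def select_trajectories(tracks, bits_per_track, bit_budget):
    """Budget-aware sparse trajectory selection (illustrative assumption).

    tracks: array of shape (N, T, 2) holding N point tracks over T frames.
    Tracks are ranked by total path length, used here as a stand-in for
    how informative a track is, and the top ones that fit in the bit
    budget are kept. Returns the kept track indices in ascending order.
    """
    steps = np.diff(tracks, axis=1)                        # (N, T-1, 2)
    path_len = np.linalg.norm(steps, axis=2).sum(axis=1)   # (N,)
    k = min(len(tracks), bit_budget // bits_per_track)
    order = np.argsort(-path_len, kind="stable")           # longest first
    return sorted(order[:k].tolist())
```

In this sketch, a static scene yields a single long segment with one keyframe, while a scene cut triggers a new keyframe at the cut. The fixed per-track cost is a simplification; a real codec would account for the entropy-coded size of each trajectory.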
