Controllable Generative Video Compression

Ding Ding; Daowen Li; Ying Chen; Yixin Gao; Ruixiao Dong; Kai Li; Li Li

arXiv:2604.06655 · cs.CV · April 9, 2026

TL;DR

This paper introduces CGVC, a new controllable generative video compression method that balances perceptual quality and signal fidelity by using keyframes and control priors.

Contribution

The paper proposes a novel CGVC paradigm that uses structural and dense control priors for faithful, controllable non-keyframe video reconstruction.

Findings

01. CGVC outperforms previous methods in perceptual quality.

02. CGVC achieves higher signal fidelity in reconstructed videos.

03. Color-distance-guided keyframe selection improves color accuracy.
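
To make the idea of color-distance-guided keyframe selection more concrete, here is a minimal sketch. It is not the paper's algorithm: the summary does not give the actual distance measure or selection rule, so the mean-RGB descriptor, the Euclidean distance, the greedy rule, the threshold value, and the names `mean_color` and `select_keyframes` are all assumptions made for illustration.

```python
import math

def mean_color(frame):
    """Average (R, G, B) of a frame given as a list of (r, g, b) tuples.
    Assumed descriptor; the paper may use a different color statistic."""
    n = len(frame)
    return tuple(sum(px[c] for px in frame) / n for c in range(3))

def select_keyframes(frames, threshold=30.0):
    """Greedy, hypothetical selection rule: start a new keyframe whenever a
    frame's mean color is farther than `threshold` (Euclidean distance in
    RGB) from the mean color of the most recent keyframe."""
    if not frames:
        return []
    keyframes = [0]                  # the first frame is always a keyframe
    ref = mean_color(frames[0])
    for i in range(1, len(frames)):
        cur = mean_color(frames[i])
        if math.dist(cur, ref) > threshold:
            keyframes.append(i)      # colors drifted: code this frame as a keyframe
            ref = cur
    return keyframes

# Toy 4-frame "video": two dark frames followed by two reddish frames.
dark_a = [(10, 10, 10)] * 4
dark_b = [(12, 11, 10)] * 4
red_a = [(200, 20, 20)] * 4
red_b = [(205, 22, 18)] * 4
video = [dark_a, dark_b, red_a, red_b]

print(select_keyframes(video))  # → [0, 2]
```

In this toy case a new keyframe is chosen only where the colors change sharply, so each non-keyframe has a nearby keyframe with similar colors to draw on during generation.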

Abstract

Perceptual video compression adopts generative video modeling to improve perceptual realism but frequently sacrifices signal fidelity, diverging from the goal of video compression, which is to faithfully reproduce the visual signal. To alleviate this tension between perception and fidelity, we propose the Controllable Generative Video Compression (CGVC) paradigm, which faithfully generates details guided by multiple visual conditions. Under this paradigm, representative keyframes of the scene are coded and used to provide structural priors for non-keyframe generation. A dense per-frame control prior is additionally coded to better preserve the finer structure and semantics of each non-keyframe. Guided by these priors, non-keyframes are reconstructed by a controllable video generation model with temporal and content consistency. Furthermore, to accurately recover the color information of the video, we develop…
