GVCC: Zero-Shot Video Compression via Codebook-Driven Stochastic Rectified Flow
Ziyue Zeng, Xun Su, Haoyuan Liu, Bingyu Lu, Yui Tatsumi, Hiroshi Watanabe

TL;DR
GVCC is a zero-shot video compression framework that uses a pretrained video generative model as the decoder and stochastic flow sampling to achieve high-fidelity reconstruction at ultra-low bitrates.
Contribution
GVCC converts the deterministic sampler of a rectified-flow video model into an equivalent marginal-preserving stochastic process, so that the bitstream can carry information by encoding the per-step stochastic innovations.
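For orientation, one standard way to write a marginal-preserving conversion of this kind is sketched below. The symbols $v$ (learned velocity), $p_t$ (marginal density at time $t$), $g_t$ (a free noise scale), and $W_t$ (a Wiener process) are notation assumed here; the paper's exact parameterization may differ.

```latex
\begin{aligned}
\text{ODE:}\quad & \mathrm{d}x_t = v(x_t, t)\,\mathrm{d}t, \\
\text{SDE:}\quad & \mathrm{d}x_t = \Big[\, v(x_t, t) + \tfrac{g_t^2}{2}\,\nabla_x \log p_t(x_t) \Big]\mathrm{d}t + g_t\,\mathrm{d}W_t, \\
\text{Fokker--Planck:}\quad & \partial_t p_t
  = -\nabla\!\cdot\!\Big( p_t \big[ v + \tfrac{g_t^2}{2}\nabla_x \log p_t \big] \Big) + \tfrac{g_t^2}{2}\,\Delta p_t
  = -\nabla\!\cdot\!\big( p_t\, v \big).
\end{aligned}
```

The score term and the diffusion term cancel because $p_t \nabla_x \log p_t = \nabla_x p_t$, so both processes share the same marginals $p_t$ for any choice of $g_t$. The equations are written for increasing $t$; when the sampler is integrated in the opposite time direction, the score term enters with the opposite sign.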
Findings
GVCC achieves the lowest LPIPS among evaluated baselines at ultra-low bitrates.
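GVCC reduces LPIPS (lower is better) by 65% relative to DCVC-RT at similar bitrate levels.
GVCC supports multiple practical conditioning modes, including text-to-video (T2V), image-to-video (I2V), and first-last-frame-to-video (FLF2V).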
Abstract
At ultra-low bitrates, high-fidelity reconstruction requires sampling plausible videos from the posterior rather than regressing to oversmoothed conditional means. We propose Generative Video Codebook Codec (GVCC), a zero-shot framework in which a pretrained video generative model serves directly as the decoder, and the transmitted bitstream specifies its generation trajectory. Modern rectified-flow video models are typically sampled with deterministic ODE solvers, which leave no per-step stochastic channel for transmitting compressed information. GVCC addresses this by converting the deterministic flow sampler into an equivalent marginal-preserving stochastic process, so that information can be transmitted by encoding the per-step stochastic innovations. Unlike images, videos introduce longer temporal dependencies and more diverse conditioning modes. We instantiate GVCC in three…
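The idea of transmitting information through per-step innovations can be illustrated with a toy sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: a 1D Gaussian "data" distribution with closed-form velocity and score stands in for the pretrained video model, the noise scale `G` is constant, time runs from noise at t = 1 to data at t = 0, and the encoder uses a greedy rule that picks, at each step, the codebook entry whose resulting clean-sample estimate is closest to the target. The names `codebook`, `encode`, and `decode` are hypothetical.

```python
import math
import random

MU, S = 2.0, 0.5   # toy data distribution N(MU, S^2) at t = 0 (assumed)
G = 1.0            # constant noise scale g (assumed)

def mean_var(t):
    """Mean and variance of x_t = (1 - t) * x0 + t * x1, with x1 ~ N(0, 1)."""
    return (1 - t) * MU, (1 - t) ** 2 * S ** 2 + t ** 2

def velocity(x, t):
    """Closed-form velocity E[x1 - x0 | x_t = x] for the Gaussian toy."""
    m, var = mean_var(t)
    return -MU + (t - (1 - t) * S ** 2) / var * (x - m)

def score(x, t):
    """Closed-form score d/dx log p_t(x) for the Gaussian toy."""
    m, var = mean_var(t)
    return -(x - m) / var

def sde_step(x, t, dt, z):
    """One Euler-Maruyama step from t to t - dt, driven by innovation z."""
    drift = -velocity(x, t) + 0.5 * G ** 2 * score(x, t)
    return x + drift * dt + G * math.sqrt(dt) * z

def sample(rng, steps=100):
    """Unconditional stochastic sampling with fresh Gaussian innovations."""
    x, dt = rng.gauss(0.0, 1.0), 1.0 / steps
    for i in range(steps):
        x = sde_step(x, 1.0 - i * dt, dt, rng.gauss(0.0, 1.0))
    return x

def codebook(seed, step, k):
    """K candidate innovations for one step, regenerated from a shared seed."""
    rng = random.Random(seed * 1_000_003 + step)
    return [rng.gauss(0.0, 1.0) for _ in range(k)]

def x0_estimate(x, t):
    """Posterior-mean estimate of the clean sample: x0 = x_t - t * v."""
    return x - t * velocity(x, t)

def encode(target, steps=100, k=16, seed=0):
    """Greedy encoder: returns the per-step codebook indices and the reconstruction."""
    x, dt = random.Random(seed).gauss(0.0, 1.0), 1.0 / steps
    indices = []
    for i in range(steps):
        t = 1.0 - i * dt
        cands = [sde_step(x, t, dt, z) for z in codebook(seed, i + 1, k)]
        best = min(range(k), key=lambda j: abs(x0_estimate(cands[j], t - dt) - target))
        indices.append(best)
        x = cands[best]
    return indices, x

def decode(indices, k=16, seed=0):
    """Decoder: replays the transmitted indices through the same sampler."""
    x, dt = random.Random(seed).gauss(0.0, 1.0), 1.0 / len(indices)
    for i, j in enumerate(indices):
        x = sde_step(x, 1.0 - i * dt, dt, codebook(seed, i + 1, k)[j])
    return x

if __name__ == "__main__":
    indices, recon = encode(target=2.3)
    bits = len(indices) * math.log2(16)
    print("reconstruction:", recon, "decoded:", decode(indices), "bits:", bits)
```

In this sketch the bitstream is just the list of indices (`steps * log2(k)` bits), and the decoder never sees the target. With a deterministic ODE sampler there would be no per-step choice to index, which is the gap the stochastic conversion is meant to fill.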
