YODA: Yet Another One-step Diffusion-based Video Compressor
Xingchen Li, Junzhe Zhang, Junqi Shi, Ming Lu, Zhan Ma

TL;DR
YODA is a one-step diffusion-based video compressor that embeds multiscale features from temporal references and uses a linear Diffusion Transformer for denoising, reporting state-of-the-art perceptual quality on LPIPS, DISTS, FID, and KID.
Contribution
The paper proposes YODA, a one-step diffusion model for video compression that embeds multiscale features from temporal references into both latent generation and latent coding to better exploit spatial-temporal correlations, and uses a linear Diffusion Transformer (DiT) for efficient one-step denoising.
Findings
YODA outperforms traditional and deep-learning baselines on perceptual metrics.
YODA achieves state-of-the-art results on LPIPS, DISTS, FID, and KID.
The source code will be publicly available.
Abstract
While one-step diffusion models have recently excelled in perceptual image compression, their application to video remains limited. Prior efforts typically rely on pretrained 2D autoencoders that generate per-frame latent representations independently, thereby neglecting temporal dependencies. We present YODA--Yet Another One-step Diffusion-based Video Compressor--which embeds multiscale features from temporal references for both latent generation and latent coding to better exploit spatial-temporal correlations for more compact representation, and employs a linear Diffusion Transformer (DiT) for efficient one-step denoising. YODA achieves state-of-the-art perceptual performance, consistently outperforming traditional and deep-learning baselines on LPIPS, DISTS, FID, and KID. Source code will be publicly available at https://github.com/NJUVISION/YODA.
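The abstract does not say which attention formulation the linear DiT uses. As a rough illustration of what "linear" usually means in this setting, the sketch below assumes kernelized linear attention with an elu(x) + 1 feature map, a common choice in the literature and not confirmed to be YODA's. The function names (feature_map, linear_attention, quadratic_attention) are hypothetical and are not taken from the YODA code base. The sketch computes the same result two ways: with a fixed-size key/value summary, and with explicit pairwise weights as a reference.

```python
import math


def feature_map(x):
    # elu(x) + 1: strictly positive, so the normalizer below is never zero.
    # Assumption: this kernel is illustrative, not YODA's confirmed choice.
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(Q, K, V):
    """Kernelized attention in O(n * d * m) time.

    Q, K: lists of n token vectors of length d. V: list of n vectors of length m.
    """
    d, m = len(Q[0]), len(V[0])
    phi_q = [[feature_map(x) for x in row] for row in Q]
    phi_k = [[feature_map(x) for x in row] for row in K]

    # Summaries shared by all queries:
    #   S = sum_j phi(k_j) v_j^T  (d x m),   z = sum_j phi(k_j)  (d)
    S = [[0.0] * m for _ in range(d)]
    z = [0.0] * d
    for k_row, v_row in zip(phi_k, V):
        for a in range(d):
            z[a] += k_row[a]
            for b in range(m):
                S[a][b] += k_row[a] * v_row[b]

    out = []
    for q_row in phi_q:
        denom = sum(q_row[a] * z[a] for a in range(d))
        out.append([
            sum(q_row[a] * S[a][b] for a in range(d)) / denom
            for b in range(m)
        ])
    return out


def quadratic_attention(Q, K, V):
    """Reference version: same kernel, explicit n x n weights, O(n^2) time."""
    m = len(V[0])
    phi_q = [[feature_map(x) for x in row] for row in Q]
    phi_k = [[feature_map(x) for x in row] for row in K]
    out = []
    for q_row in phi_q:
        weights = [sum(a * b for a, b in zip(q_row, k_row)) for k_row in phi_k]
        total = sum(weights)
        out.append([
            sum(w * v_row[b] for w, v_row in zip(weights, V)) / total
            for b in range(m)
        ])
    return out
```

The two functions agree up to floating-point rounding. The linear version never forms the n x n weight matrix, so its cost grows linearly with the number of tokens, which is the property that matters when tokens span both space and time in video.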
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment · Advanced Data Compression Techniques
