Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation
Yuyang You, Yongzhi Li, Jiahui Li, Yadong Mu, Quan Chen, Peng Jiang

TL;DR
This paper introduces a specialized distillation framework for video diffusion models that reduces artifacts, mitigates temporal collapse, and improves efficiency in few-step video generation, outperforming existing methods.
Contribution
The paper presents a novel video-specific distillation approach with adaptive loss functions and inference strategies, addressing oversaturation and temporal collapse issues in video diffusion models.
Findings
Achieves stable few-step video synthesis with high perceptual quality.
Significantly outperforms existing distillation baselines on benchmarks.
Reduces sampling overhead while maintaining motion realism.
Abstract
Video generation has recently emerged as a central task in the field of generative AI. However, the substantial computational cost inherent in video synthesis makes model distillation a critical technique for efficient deployment. Despite its significance, there is a scarcity of methods specifically designed for video diffusion models. Prevailing approaches often directly adapt image distillation techniques, which frequently lead to artifacts such as oversaturation, temporal inconsistency, and mode collapse. To address these challenges, we propose a novel distillation framework tailored specifically for video diffusion models. Its core innovations include: (1) an adaptive regression loss that dynamically adjusts spatial supervision weights to prevent artifacts arising from excessive distribution shifts; (2) a temporal regularization loss to counteract temporal collapse, promoting smooth…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Human Motion and Animation
