GPD: Guided Progressive Distillation for Fast and High-Quality Video Generation
Xiao Liang, Yunzhu Zhang, Linchao Zhu

TL;DR
GPD accelerates video diffusion models by having a teacher model progressively guide a student model to take larger steps, reducing the number of diffusion steps from 48 to 6 while maintaining competitive visual quality.
Contribution
GPD introduces a progressive distillation training strategy with online targets and frequency constraints, enabling fast, high-quality video generation with fewer diffusion steps.
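The idea of a teacher guiding a student toward larger step sizes can be illustrated with a toy example. This is a minimal sketch of generic progressive distillation, not the authors' implementation: the velocity field `teacher_velocity`, the Euler update, the squared-error loss, and the step-halving schedule in `step_schedule` are all assumptions made for illustration. The source states only that steps are reduced from 48 to 6.

```python
def teacher_velocity(x, t):
    # Hypothetical stand-in for a pretrained teacher model (assumption).
    return -x * (1.0 + t)


def euler_step(velocity_fn, x, t, dt):
    # One explicit Euler update of size dt.
    return x + dt * velocity_fn(x, t)


def online_target(x, t, dt):
    # Computed on the fly: the teacher takes two small steps of size dt.
    x_mid = euler_step(teacher_velocity, x, t, dt)
    return euler_step(teacher_velocity, x_mid, t + dt, dt)


def student_loss(student_velocity, x, t, dt):
    # The student must cover the same interval in ONE step of size 2 * dt.
    pred = euler_step(student_velocity, x, t, 2.0 * dt)
    target = online_target(x, t, dt)
    return (pred - target) ** 2


def step_schedule(start, end):
    # Assumed schedule: halve the step count at each distillation stage.
    steps = [start]
    while steps[-1] > end and steps[-1] % 2 == 0:
        steps.append(steps[-1] // 2)
    return steps
```

Under the assumed halving schedule, `step_schedule(48, 6)` passes through 24 and 12 steps before reaching 6; whether GPD uses these exact stages is not stated in the source.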
Findings
Reduces diffusion steps from 48 to 6
Maintains competitive visual quality on VBench
Outperforms existing distillation methods in simplicity and quality
Abstract
Diffusion models have achieved remarkable success in video generation; however, the high computational cost of the denoising process remains a major bottleneck. Existing approaches have shown promise in reducing the number of diffusion steps, but they often suffer from significant quality degradation when applied to video generation. We propose Guided Progressive Distillation (GPD), a framework that accelerates the diffusion process for fast and high-quality video generation. GPD introduces a novel training strategy in which a teacher model progressively guides a student model to operate with larger step sizes. The framework consists of two key components: (1) an online-generated training target that reduces optimization difficulty while improving computational efficiency, and (2) frequency-domain constraints in the latent space that promote the preservation of fine-grained details and…
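The second component, a frequency-domain constraint in the latent space, can be sketched as a loss that compares the spatial spectra of student and teacher latents. This is an illustrative guess at the general form, not the paper's formulation: the function name `frequency_loss`, the latent layout `(frames, channels, height, width)`, the 2D FFT over the spatial axes, and the radial weighting toward high frequencies are all assumptions.

```python
import numpy as np


def frequency_loss(student_latent, teacher_latent, high_freq_weight=1.0):
    # Assumed latent layout: (frames, channels, height, width),
    # with height or width greater than 1.
    s = np.fft.fft2(student_latent, axes=(-2, -1))
    t = np.fft.fft2(teacher_latent, axes=(-2, -1))

    # Radial frequency magnitude for each spatial frequency bin.
    h, w = student_latent.shape[-2:]
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Hypothetical weighting: emphasize high frequencies (fine detail).
    weight = 1.0 + high_freq_weight * radius / radius.max()

    # Weighted mean squared error between the two spectra.
    return float(np.mean(weight * np.abs(s - t) ** 2))
```

Setting `high_freq_weight=0.0` weights every frequency bin equally; larger values penalize mismatches in fine-grained detail more heavily.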
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment · Image Enhancement Techniques
