GPD: Guided Progressive Distillation for Fast and High-Quality Video Generation

Xiao Liang; Yunzhu Zhang; Linchao Zhu

arXiv:2602.01814 · cs.CV · February 3, 2026

Open Access

TL;DR

GPD accelerates video diffusion models by having a teacher model progressively guide a student model toward larger step sizes, cutting sampling from 48 diffusion steps to 6 while keeping visual quality competitive.

Contribution

GPD introduces a progressive distillation training strategy with online targets and frequency constraints, enabling fast, high-quality video generation with fewer diffusion steps.
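To make the idea of a progressive schedule concrete, here is a minimal sketch in Python. It assumes that each distillation stage halves the step count, as in classic progressive distillation; the source only states the endpoints (48 and 6 steps), so the halving rule and the function name `halving_schedule` are illustrative assumptions, not the authors' documented procedure.

```python
def halving_schedule(start_steps: int, target_steps: int) -> list[int]:
    """Return the step count used at each distillation stage.

    ASSUMPTION: each stage halves the number of steps. The paper summary
    only gives the endpoints (48 -> 6), not the intermediate stages.
    """
    if target_steps < 1 or start_steps < target_steps:
        raise ValueError("need start_steps >= target_steps >= 1")

    schedule = [start_steps]
    while schedule[-1] > target_steps:
        current = schedule[-1]
        if current % 2 != 0:
            raise ValueError(f"cannot halve an odd step count: {current}")
        schedule.append(current // 2)

    if schedule[-1] != target_steps:
        raise ValueError("target_steps is not reachable by repeated halving")
    return schedule


schedule = halving_schedule(48, 6)
print(schedule)                      # [48, 24, 12, 6]
print(schedule[0] / schedule[-1])    # 8.0, i.e. 8x fewer denoising steps
```

Under this assumed rule, going from 48 to 6 steps takes three stages, and in each stage the student from the previous stage could serve as the next teacher.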

Findings

01

Reduces diffusion steps from 48 to 6

02

Maintains competitive visual quality on VBench

03

Outperforms existing distillation methods in generation quality while using a simpler pipeline

Abstract

Diffusion models have achieved remarkable success in video generation; however, the high computational cost of the denoising process remains a major bottleneck. Existing approaches have shown promise in reducing the number of diffusion steps, but they often suffer from significant quality degradation when applied to video generation. We propose Guided Progressive Distillation (GPD), a framework that accelerates the diffusion process for fast and high-quality video generation. GPD introduces a novel training strategy in which a teacher model progressively guides a student model to operate with larger step sizes. The framework consists of two key components: (1) an online-generated training target that reduces optimization difficulty while improving computational efficiency, and (2) frequency-domain constraints in the latent space that promote the preservation of fine-grained details and…
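The abstract is truncated before it specifies the frequency-domain constraint, so the following is only a toy sketch of what such a constraint could look like. It compares the discrete Fourier spectra of a student latent and a teacher latent and weights high-frequency bins more heavily, so that lost fine detail costs more than a low-frequency offset. The 1-D latents, the weighting formula, and the names `dft`, `frequency_loss`, and `high_freq_weight` are all assumptions made for illustration, not the paper's formulation.

```python
import cmath


def dft(signal: list[float]) -> list[complex]:
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def frequency_loss(student: list[float], teacher: list[float],
                   high_freq_weight: float = 1.0) -> float:
    """Weighted squared spectral difference between two toy 1-D latents.

    ASSUMPTION: bin k gets weight 1 + high_freq_weight * (normalized frequency),
    so errors near the Nyquist bin cost more than errors near DC.
    """
    if len(student) != len(teacher) or not student:
        raise ValueError("latents must be non-empty and of equal length")
    n = len(student)
    s_hat, t_hat = dft(student), dft(teacher)
    total = 0.0
    for k in range(n):
        normalized_freq = min(k, n - k) / max(n // 2, 1)  # 0 at DC, 1 at Nyquist
        weight = 1.0 + high_freq_weight * normalized_freq
        total += weight * abs(s_hat[k] - t_hat[k]) ** 2
    return total / n


teacher = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # pure high-frequency detail
blurred = [0.0] * 8                                      # student lost the detail
shifted = [x + 1.0 for x in teacher]                     # student has a DC offset only

# Both errors have the same energy, but the lost detail is penalized more.
print(frequency_loss(blurred, teacher) > frequency_loss(shifted, teacher))
```

With `high_freq_weight=0.0` the loss reduces, by Parseval's theorem, to the ordinary sum of squared differences in the latent domain, which is why the frequency weighting is the part that matters in this sketch.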

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment · Image Enhancement Techniques