Spectral Progressive Diffusion for Efficient Image and Video Generation
Howard Xiao, Brian Chao, Lior Yariv, Gordon Wetzstein

TL;DR
This paper introduces Spectral Progressive Diffusion, a framework that accelerates image and video generation by progressively increasing resolution during denoising, leveraging spectral properties of diffusion models.
Contribution
It presents a spectral noise expansion mechanism and an optimal resolution schedule derived from the model's power spectrum. Together these enable training-free acceleration of pretrained diffusion models, and a fine-tuning recipe further improves efficiency and quality.
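This summary does not spell out the spectral noise expansion mechanism, so the following is only a generic sketch of what growing resolution in the Fourier domain could look like, not the authors' method. It assumes a single-channel 2D sample; the function name `expand_with_fresh_noise` and all parameters are hypothetical. The low-resolution spectrum is kept as the low-frequency band of the larger spectrum, and the newly created high-frequency bins are filled with fresh Gaussian noise.

```python
import numpy as np

def expand_with_fresh_noise(x_low, out_shape, sigma, rng):
    """Hypothetical sketch: grow a noisy 2D sample to a higher resolution.

    The spectrum of x_low becomes the central (low-frequency) block of the
    larger spectrum; every other bin holds fresh white noise of std sigma.
    Assumption: orthonormal FFTs, so noise stays white at std sigma, while
    the pixel amplitude of the signal shrinks by sqrt(h*w / (H*W)). A real
    sampler would have to account for that change in signal-to-noise ratio.
    """
    h, w = x_low.shape
    H, W = out_shape
    if H < h or W < w:
        raise ValueError("out_shape must be at least as large as x_low.shape")

    # Spectrum of fresh real noise at the target size (Hermitian by construction).
    fresh = rng.normal(0.0, sigma, size=(H, W))
    spec = np.fft.fftshift(np.fft.fft2(fresh, norm="ortho"))

    # Overwrite the central block with the low-resolution spectrum.
    low = np.fft.fftshift(np.fft.fft2(x_low, norm="ortho"))
    top, left = H // 2 - h // 2, W // 2 - w // 2
    spec[top:top + h, left:left + w] = low

    # For even h or w the embedded Nyquist row/column is not exactly Hermitian,
    # so the tiny imaginary part of the inverse transform is dropped.
    out = np.fft.ifft2(np.fft.ifftshift(spec), norm="ortho")
    return out.real

rng = np.random.default_rng(0)
x_low = rng.normal(0.0, 1.0, size=(32, 32))   # stand-in for a noisy low-res sample
x_high = expand_with_fresh_noise(x_low, (64, 64), sigma=1.0, rng=rng)
print(x_high.shape)
```

Building the large spectrum from real-valued fresh noise keeps the added high-frequency bins conjugate-symmetric, which is why the output is real up to the Nyquist bins of the embedded block.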
Findings
Achieves significant speedups on state-of-the-art pretrained models
Preserves visual quality while improving efficiency
Supports both image and video generation
Abstract
Diffusion models have been shown to implicitly generate visual content autoregressively in the frequency domain, where low-frequency components are generated earlier in the denoising process while high-frequency details emerge only in later timesteps. This structure offers a natural opportunity for efficient generation, as high-resolution computation on noise-dominated frequencies is largely redundant. We propose Spectral Progressive Diffusion, a general framework that progressively grows resolution along the denoising trajectory of pretrained diffusion models. To this end, we develop a spectral noise expansion mechanism and derive an optimal resolution schedule from the model's power spectrum. Our framework supports training-free acceleration and a novel fine-tuning recipe that further improves efficiency and quality. We demonstrate significant speedups on state-of-the-art pretrained…
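The redundancy argument in the abstract can be illustrated with a small toy calculation. This is a sketch under stated assumptions, not the paper's derivation: it assumes the signal power spectrum follows a power law `amplitude * f**(-alpha)` (a common model for natural images), that the noise is white with power `sigma**2`, and that a frequency is worth computing only once signal power exceeds noise power. The names `cutoff_frequency` and `resolution_schedule` and all default values are hypothetical.

```python
import numpy as np

def cutoff_frequency(sigma, amplitude=1e4, alpha=2.0):
    """Highest frequency (cycles per image) at which the assumed power-law
    signal spectrum amplitude * f**(-alpha) still exceeds noise power sigma**2."""
    sigma = np.asarray(sigma, dtype=float)
    return (amplitude / sigma**2) ** (1.0 / alpha)

def resolution_schedule(sigmas, full_res, min_res=8, amplitude=1e4, alpha=2.0):
    """Toy schedule: smallest resolution per step whose Nyquist limit
    (res / 2 cycles per image) covers the signal-dominated band."""
    f_cut = cutoff_frequency(sigmas, amplitude, alpha)
    res = np.ceil(2.0 * f_cut)
    return np.clip(res, min_res, full_res).astype(int)

# Noise levels decreasing along the denoising trajectory (illustrative values).
sigmas = [80.0, 10.0, 1.0, 0.1]
print(resolution_schedule(sigmas, full_res=256))
```

Because the cutoff frequency grows as the noise level falls, the resulting schedule starts at the minimum resolution and never decreases, which matches the coarse-to-fine ordering the abstract describes.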
