Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye, Zhiyang Chen, Tiancheng Li, Zemin Huang, Weijian Luo, and Guo-Jun Qi

TL;DR
This paper introduces TPDM, an adaptive diffusion scheduling method that predicts optimal denoising steps on the fly, improving image quality and efficiency in text-to-image generation.
Contribution
The paper proposes a plug-and-play Time Prediction Module trained with reinforcement learning to adaptively determine diffusion steps, enhancing image quality and reducing computation.
Findings
Achieves higher aesthetic and human preference scores.
Uses around 50% fewer denoising steps.
Improves image quality and efficiency in diffusion models.
Abstract
Diffusion and flow matching models have achieved remarkable success in text-to-image generation. However, these models typically rely on predetermined denoising schedules that are shared across all prompts. The multi-step reverse diffusion process can be regarded as a kind of chain-of-thought for generating high-quality images step by step. Therefore, diffusion models should reason about each instance to adaptively determine the optimal noise schedule, achieving high generation quality together with sampling efficiency. To this end, we introduce the Time Prediction Diffusion Model (TPDM). TPDM employs a plug-and-play Time Prediction Module (TPM) that predicts the next noise level based on current latent features at each denoising step. We train the TPM using reinforcement learning to maximize a reward that encourages high final image quality while penalizing excessive denoising steps. With such an…
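To make the mechanism in the abstract concrete, the following is a minimal sketch of a sampling loop in which a predictor chooses the next noise level from the current latent at every step, together with a reward that trades final quality against step count. It is not the authors' implementation. Everything named here is a hypothetical stand-in: `toy_denoiser` replaces the diffusion or flow matching backbone, `toy_time_predictor` replaces the learned TPM, the multiplicative update `t_next = t * ratio` is just one possible parameterization, and the linear step penalty in `trajectory_reward` is an assumed form, since the abstract does not spell out the reward.

```python
from typing import Callable, List, Tuple

Latent = List[float]

# Assumption: a fixed "clean image" so the toy denoiser has something to recover.
TARGET: Latent = [0.2, -0.1, 0.4]


def toy_denoiser(x: Latent, t: float) -> Latent:
    """Hypothetical backbone: velocity of a straight path from TARGET (t=0) to noise (t=1)."""
    return [(xi - ti) / t for xi, ti in zip(x, TARGET)]


def toy_time_predictor(x: Latent, t: float) -> float:
    """Hypothetical TPM stand-in: maps latent features to a decay ratio in (0, 1]."""
    spread = sum(abs(xi) for xi in x) / len(x)
    return 1.0 / (1.0 + spread)


def adaptive_sample(
    x: Latent,
    denoise: Callable[[Latent, float], Latent],
    predict_ratio: Callable[[Latent, float], float],
    t_min: float = 1e-3,
    max_steps: int = 50,
) -> Tuple[Latent, List[float]]:
    """Denoise from t=1 to t=0, letting the predictor pick each next noise level."""
    t = 1.0
    schedule = [t]
    for step in range(max_steps):
        # Clamp so that every step makes progress toward t=0.
        ratio = min(max(predict_ratio(x, t), 0.0), 0.95)
        t_next = t * ratio
        # Finish in one jump when close enough to zero or out of step budget.
        if t_next < t_min or step == max_steps - 1:
            t_next = 0.0
        v = denoise(x, t)
        x = [xi + (t_next - t) * vi for xi, vi in zip(x, v)]  # Euler step
        t = t_next
        schedule.append(t)
        if t == 0.0:
            break
    return x, schedule


def trajectory_reward(quality: float, num_steps: int, step_penalty: float = 0.01) -> float:
    """Assumed reward shape: final quality score minus a linear cost per denoising step."""
    return quality - step_penalty * num_steps


if __name__ == "__main__":
    image, schedule = adaptive_sample([1.0, -2.0, 0.5], toy_denoiser, toy_time_predictor)
    steps = len(schedule) - 1
    # Toy quality score: negative L1 distance to TARGET (a real setup would use a learned image scorer).
    quality = -sum(abs(a - b) for a, b in zip(image, TARGET))
    print("schedule:", [round(s, 4) for s in schedule])
    print("steps:", steps, "reward:", trajectory_reward(quality, steps))
```

In a reinforcement learning setup of the kind the abstract describes, the predictor would be a trainable policy and `trajectory_reward` would be computed once per finished image, so the policy is pushed toward schedules that reach high quality in few steps. The schedule printed by the toy run differs with the starting latent, which is the per-instance adaptivity the paper argues for.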
Taxonomy
Topics: Medical Image Segmentation Techniques · Computer Graphics and Visualization Techniques
Methods: Diffusion
