Rethinking Timesteps Samplers and Prediction Types
Bin Xie, Gady Agam

TL;DR
This paper investigates the training challenges of diffusion models under limited resources, identifying key issues related to loss variation across timesteps and prediction types, and proposes mixed-prediction strategies to improve training efficiency.
Contribution
It introduces an analysis of training difficulties with limited resources and proposes mixed-prediction approaches to enhance diffusion model training.
Findings
Loss variation across timesteps disrupts training progress.
Different prediction types have varying effectiveness depending on the task.
Mixed-prediction strategies could improve training stability and efficiency.
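For readers unfamiliar with the term, "prediction type" refers to what the network is trained to output at a noisy timestep. The sketch below is not the paper's mixed-prediction method, which this summary does not describe in detail. It only illustrates the three commonly used targets (noise, clean sample, and velocity) under a standard DDPM-style forward process, and shows that each one can be converted back to an estimate of the clean sample. The linear beta schedule, the function names, and the scalar toy data are all assumptions made for illustration.

```python
import math

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def add_noise(x0, eps, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    return math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps

def training_targets(x0, eps, alpha_bar_t):
    """Regression target for each prediction type at one timestep."""
    a, s = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return {
        "epsilon": eps,         # predict the added noise
        "x0": x0,               # predict the clean sample
        "v": a * eps - s * x0,  # predict the velocity
    }

def x0_from_prediction(x_t, pred, alpha_bar_t, kind):
    """Convert a prediction of the given kind into an estimate of x0."""
    a, s = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    if kind == "x0":
        return pred
    if kind == "epsilon":
        return (x_t - s * pred) / a
    if kind == "v":
        return a * x_t - s * pred
    raise ValueError(f"unknown prediction type: {kind}")

# Toy scalar example: with exact targets, every type recovers the same x0.
alpha_bar = make_alpha_bar()
x0, eps = 0.7, -1.3
for t in (0, 500, 999):
    x_t = add_noise(x0, eps, alpha_bar[t])
    targets = training_targets(x0, eps, alpha_bar[t])
    estimates = {k: x0_from_prediction(x_t, v, alpha_bar[t], k) for k, v in targets.items()}
    print(t, estimates)
```

With exact targets the three estimates coincide. With an imperfect network they generally differ, because the epsilon conversion divides by sqrt(alpha_bar_t), which becomes very small at high noise levels and amplifies prediction error there. This is one plausible reading of why effectiveness can depend on the timestep.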
Abstract
Diffusion models consume enormous amounts of time and resources to train. For example, a high-resolution generative task can require hundreds of GPUs running for several weeks to meet the demands of an extremely large number of iterations and a large batch size. Training diffusion models has become a millionaire's game. With limited resources that only fit a small batch size, training a diffusion model always fails. In this paper, we investigate the key reasons behind the difficulties of training diffusion models with limited resources. Through numerous experiments and demonstrations, we identified a major factor: the significant variation in the training losses across different timesteps, which can easily disrupt the progress made in previous iterations. Moreover, different prediction types exhibit varying effectiveness depending on the task and…
Taxonomy
Topics: Advanced Database Systems and Queries · Business Process Modeling and Analysis · Data Quality and Management
Methods: Diffusion
