Towards Faster Training of Diffusion Models: An Inspiration of A Consistency Phenomenon

Tianshuo Xu; Peng Mi; Ruilin Wang; Yingcong Chen

arXiv:2404.07946 · cs.LG · April 12, 2024 · 1 citation

TL;DR

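This paper identifies a consistency phenomenon in diffusion models: models with different initializations, or even different architectures, produce very similar outputs from the same noise input. The authors use this insight to motivate two training strategies, one based on curriculum learning and one on momentum decay, which they report speed up training and improve image quality.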

Contribution

The paper offers an explanation for the stability of diffusion models and, building on it, proposes two strategies for accelerating training.

Findings

01. The proposed strategies significantly reduce training time.

02. The proposed strategies also improve the quality of generated images.

03. Diffusion models are highly stable and produce similar outputs across different initializations.

Abstract

Diffusion models (DMs) are a powerful generative framework that has attracted significant attention in recent years. However, the high computational cost of training DMs limits their practical applications. In this paper, we start with a consistency phenomenon of DMs: we observe that DMs with different initializations or even different architectures can produce very similar outputs given the same noise inputs, which is rare in other generative models. We attribute this phenomenon to two factors: (1) the learning difficulty of DMs is lower when the noise-prediction diffusion model approaches the upper bound of the timestep (the input becomes pure noise), where the structural information of the output is usually generated; and (2) the loss landscape of DMs is highly smooth, which implies that the model tends to converge to similar local minima and exhibit similar behavior patterns. This…
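
To make the curriculum idea concrete, here is a minimal sketch of a timestep sampler that draws high-noise timesteps less often as training progresses, following the observation above that those timesteps are easier to learn. The excerpt does not give the paper's actual schedule, so everything here is an assumption for illustration: the function names `timestep_weights` and `sample_timesteps`, the linear down-weighting rule, the `min_weight` floor, and the use of `t / (T - 1)` as a stand-in for the noise rate.

```python
import random


def timestep_weights(num_timesteps, progress, min_weight=0.1):
    """Hypothetical sampling probabilities over timesteps 0..num_timesteps-1.

    progress is the fraction of training completed, in [0, 1].
    At progress=0 every timestep is equally likely. As progress grows,
    high-noise (large t) timesteps are sampled less often, down to a
    relative weight of min_weight for the noisiest timestep.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    weights = []
    for t in range(num_timesteps):
        # Assumed proxy for the noise rate: 0 = nearly clean, 1 = pure noise.
        noise_level = t / max(num_timesteps - 1, 1)
        # Linear down-weighting of noisy timesteps, scaled by progress.
        weights.append(1.0 - progress * (1.0 - min_weight) * noise_level)
    total = sum(weights)
    return [w / total for w in weights]


def sample_timesteps(num_timesteps, progress, batch_size, rng=None):
    """Draw a batch of timesteps according to the weights above."""
    rng = rng or random.Random()
    probs = timestep_weights(num_timesteps, progress)
    return rng.choices(range(num_timesteps), weights=probs, k=batch_size)
```

In a training loop, one would call `sample_timesteps(T, step / total_steps, batch_size)` in place of uniform timestep sampling. Whether a linear rule like this matches the schedule the authors propose cannot be determined from the text shown here.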

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion