A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training

Kai Wang; Mingjia Shi; Yukun Zhou; Zekai Li; Zhihang Yuan; Yuzhang Shang; Xiaojiang Peng; Hanwang Zhang; Yang You

arXiv:2405.17403·cs.LG·March 26, 2025·1 cites

TL;DR

This paper presents SpeeD, a simple, architecture-agnostic method that accelerates diffusion model training by focusing on the importance of different time steps, achieving threefold speed-up with minimal overhead.

Contribution

The paper introduces a novel asymmetric sampling and weighting strategy based on time step analysis, significantly improving training efficiency for diffusion models.

Findings

1. Achieves 3x acceleration across various architectures and datasets

2. Identifies imbalance in time step importance during diffusion training

3. Reduces training costs with minimal additional overhead

Abstract

Training diffusion models is always a computation-intensive task. In this paper, we introduce a novel speed-up method for diffusion model training, called SpeeD, which is based on a closer look at time steps. Our key findings are: i) Time steps can be empirically divided into acceleration, deceleration, and convergence areas based on the process increment. ii) These time steps are imbalanced, with many concentrated in the convergence area. iii) The concentrated steps provide limited benefits for diffusion training. To address this, we design an asymmetric sampling strategy that reduces the frequency of steps from the convergence area while increasing the sampling probability for steps from other areas. Additionally, we propose a weighting strategy to emphasize the importance of time steps with rapid-change process increments. As a plug-and-play and architecture-agnostic approach, SpeeD…
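The asymmetric sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the step-shaped distribution, the function names, and the parameters `boundary` (the index where the convergence area is assumed to begin) and `boost` (how much more often earlier steps are drawn) are all hypothetical choices made for illustration.

```python
import random


def asymmetric_timestep_probs(num_steps, boundary, boost):
    """Return a step-shaped sampling distribution over time steps 0..num_steps-1.

    Assumption (not from the paper): steps with index >= boundary stand in for
    the convergence area and get relative weight 1, while earlier steps get
    relative weight `boost` (>= 1), so they are sampled more often.
    """
    if not 0 < boundary < num_steps:
        raise ValueError("boundary must lie strictly between 0 and num_steps")
    if boost < 1:
        raise ValueError("boost must be at least 1")
    weights = [boost if t < boundary else 1.0 for t in range(num_steps)]
    total = sum(weights)
    # Normalize so the weights form a probability distribution.
    return [w / total for w in weights]


def sample_timesteps(probs, batch_size, seed=None):
    """Draw a batch of time step indices according to `probs`."""
    rng = random.Random(seed)
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


# Hypothetical settings: 1000 steps, convergence area assumed to start at 700.
probs = asymmetric_timestep_probs(num_steps=1000, boundary=700, boost=2.0)
batch = sample_timesteps(probs, batch_size=64, seed=0)
```

In a training loop, `batch` would replace the usual uniform draw of time steps. Setting `boost=1.0` recovers uniform sampling, which makes the sketch easy to compare against a baseline.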

Taxonomy

Topics: Model Reduction and Neural Networks

Methods: Diffusion