Improved Noise Schedule for Diffusion Training
Tiankai Hang, Shuyang Gu, Xin Geng, Baining Guo

TL;DR
This paper introduces a novel noise schedule for diffusion models that improves training efficiency and performance by focusing on the critical transition point between signal and noise, demonstrated through empirical results on ImageNet.
Contribution
The paper proposes a new noise schedule based on importance sampling of log SNR, enhancing diffusion model training efficiency and accuracy over standard schedules.
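As a rough illustration of what "importance sampling of log SNR" can look like in practice, the Python sketch below maps a uniform time variable to log SNR through the inverse CDF of a Laplace density, then converts log SNR to mixing coefficients. The Laplace form is suggested by the reviews' mention of a "Laplace schedule". The function names, the variance-preserving parameterization, and the values `mu=0.0` and `b=0.5` are assumptions made for illustration, not details stated on this page.

```python
import numpy as np

def laplace_log_snr(t, mu=0.0, b=0.5):
    """Map t in (0, 1) to log SNR via the inverse CDF of Laplace(mu, b).

    t near 0 gives high log SNR (almost clean), t near 1 gives low log SNR
    (almost pure noise). mu and b are illustrative values (assumption).
    """
    t = np.clip(np.asarray(t, dtype=np.float64), 1e-6, 1.0 - 1e-6)
    return mu - b * np.sign(0.5 - t) * np.log(1.0 - 2.0 * np.abs(t - 0.5))

def vp_coefficients(log_snr):
    """Variance-preserving coefficients (assumed parameterization):
    alpha^2 = sigmoid(log_snr), sigma^2 = sigmoid(-log_snr)."""
    alpha = np.sqrt(1.0 / (1.0 + np.exp(-log_snr)))
    sigma = np.sqrt(1.0 / (1.0 + np.exp(log_snr)))
    return alpha, sigma

def add_noise(x0, t, rng, mu=0.0, b=0.5):
    """Noise a batch x0 (leading batch axis) at per-sample times t."""
    lam = laplace_log_snr(t, mu, b)
    alpha, sigma = vp_coefficients(lam)
    shape = (-1,) + (1,) * (x0.ndim - 1)  # broadcast over non-batch axes
    eps = rng.standard_normal(x0.shape)
    x_t = alpha.reshape(shape) * x0 + sigma.reshape(shape) * eps
    return x_t, eps, lam

rng = np.random.default_rng(0)
x0 = np.ones((4, 3, 8, 8))
t = rng.uniform(size=4)  # uniform t yields Laplace-distributed log SNR
x_t, eps, lam = add_noise(x0, t, rng)
```

Because `t` is drawn uniformly and pushed through an inverse CDF, choosing the density of log SNR and choosing the schedule are two views of the same design decision, which is the equivalence the contribution refers to.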
Findings
Superiority of the proposed noise schedule over the cosine schedule
Consistent benefits across different prediction targets
Improved training efficiency on ImageNet benchmark
Abstract
Diffusion models have emerged as the de facto choice for generating high-quality visual signals across various domains. However, training a single model to predict noise across various levels poses significant challenges, necessitating numerous iterations and incurring significant computational costs. Various approaches, such as loss weighting strategy design and architectural refinements, have been introduced to expedite convergence and improve model performance. In this study, we propose a novel approach to design the noise schedule for enhancing the training of diffusion models. Our key insight is that the importance sampling of the logarithm of the Signal-to-Noise ratio ($\log \text{SNR}$), theoretically equivalent to a modified noise schedule, is particularly beneficial for training efficiency when increasing the sample frequency around $\log \text{SNR} = 0$. This strategic sampling…
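To make "increasing the sample frequency around log SNR = 0" concrete, the following Python sketch compares how much of a uniform time grid lands within one unit of log SNR = 0 under a cosine schedule and under a Laplace-shaped one. It only measures where training samples concentrate and says nothing about model quality. The cosine form `alpha = cos(pi t / 2)`, the Laplace scale `b=0.5`, and the window width of 1 are assumptions chosen for illustration.

```python
import numpy as np

def cosine_log_snr(t):
    # Assumed cosine schedule: alpha = cos(pi t / 2), sigma = sin(pi t / 2),
    # so log SNR = log(alpha^2 / sigma^2) = -2 log tan(pi t / 2).
    return -2.0 * np.log(np.tan(0.5 * np.pi * t))

def laplace_log_snr(t, mu=0.0, b=0.5):
    # Inverse CDF of Laplace(mu, b); mu and b are illustrative values.
    return mu - b * np.sign(0.5 - t) * np.log(1.0 - 2.0 * np.abs(t - 0.5))

def fraction_near_zero(log_snr, width=1.0):
    # Share of samples whose log SNR lies in (-width, width).
    return float(np.mean(np.abs(log_snr) < width))

n = 100_000
t = (np.arange(n) + 0.5) / n  # midpoint grid, avoids t = 0 and t = 1

frac_cosine = fraction_near_zero(cosine_log_snr(t))
frac_laplace = fraction_near_zero(laplace_log_snr(t))
print(f"cosine:  {frac_cosine:.3f} of samples with |log SNR| < 1")
print(f"laplace: {frac_laplace:.3f} of samples with |log SNR| < 1")
```

Under these assumed settings the Laplace-shaped schedule places a larger share of training samples near log SNR = 0 than the cosine schedule does. A smaller `b` concentrates samples further, and a larger `b` spreads them out.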
Peer Reviews
Decision: Submitted to ICLR 2025
1. The paper proposes a noise schedule design, focusing computational resources on medium noise levels (log SNR = 0). It provides a perspective on optimizing diffusion model training, potentially paving the way for more efficient generative AI methods. 2. The authors conduct a thorough set of experiments, comparing various noise schedules and loss weighting strategies. The experimental setup is robust, testing the methods on multiple resolutions and prediction targets, and demonstrating Laplace
1. While the experiments focus on ImageNet and high-resolution image tasks, the paper does not explore how the proposed noise schedules would perform in other domains or more complex real-world scenarios, which would be useful for understanding the broader applicability of the approach. 2. The figures and tables might not be fully convincing, and it would be great to explain whether the results are statistically significant. In Figure 2, we can see that as the number of training iterations increases to 500k, the gap be
1. The paper presents a novel approach to design the noise schedule for enhancing the training of diffusion models. 2. The proposed method is effective and easy to apply. 3. The findings contribute to the ongoing efforts to optimize diffusion models, potentially paving the way for more efficient and effective training paradigms. 4. The paper is overall well-written and easy to follow.
1. The paper lacks a formal theoretical guarantee for the effectiveness of the proposed method. While the experimental results are promising, providing a theoretical foundation would strengthen the validity of the approach. It would be better to add more analysis on why the Laplace schedule works and why we should increase the sample frequency around $\log \text{SNR} = 0$. 2. The experiments are conducted only on the ImageNet dataset with different resolutions. However, the generalizability of the prop
1. The proposed method enjoys a clean formulation. 2. Judging from the experimental results provided, the proposed method shows consistent improvement over competitors in various scenarios. 3. The formulation relating the noise schedule to noise importance sampling is rather universal, implying potential for future extension.
Generally speaking, while the proposed method has the potential to make great contributions, the presentation of the manuscript makes it hard for the community to learn valuable new knowledge from this work. More concretely: For Writing: 1. What is the most important key insight or takeaway for readers? While method details are extensive, a more general summary of the key insight is lacking, especially in the introduction section. 2. The authors should provide more background for noise schedu
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus · Diffusion
