TL;DR
This paper introduces Internal Guidance (IG), a simple strategy that improves diffusion model training and sampling through auxiliary supervision on an intermediate layer, leading to state-of-the-art image generation quality.
Contribution
The paper proposes Internal Guidance, a novel auxiliary supervision method applied during training that improves diffusion models' training efficiency and output quality.
Findings
IG significantly improves FID scores on ImageNet benchmarks.
Combining IG with classifier-free guidance (CFG) achieves new state-of-the-art results.
IG enhances training efficiency and sample quality across various diffusion model baselines.
Abstract
Diffusion models have a powerful ability to capture the entire (conditional) data distribution. However, lacking sufficient training and data to learn to cover low-probability areas, the model is penalized for failing to generate high-quality images corresponding to these areas. To achieve better generation quality, guidance strategies such as classifier-free guidance (CFG) can guide the samples to the high-probability areas during the sampling stage. However, standard CFG often leads to over-simplified or distorted samples. On the other hand, the alternative line of work that guides a diffusion model with a bad version of itself is limited by carefully designed degradation strategies, extra training, and additional sampling steps. In this paper, we propose a simple yet effective strategy, Internal Guidance (IG), which introduces an auxiliary supervision on the intermediate layer…
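For readers unfamiliar with the CFG step the abstract refers to, here is a minimal sketch of how the guided prediction is commonly formed at sampling time. It illustrates only standard CFG, not IG itself. The function name `cfg_combine`, the toy values, and the scale convention (a scale of 1 recovers the plain conditional prediction) are assumptions made for illustration and are not taken from the paper.

```python
def cfg_combine(eps_uncond, eps_cond, scale):
    """Classifier-free guidance (one common convention, assumed here):
    start from the unconditional prediction and move along the
    direction (eps_cond - eps_uncond), scaled by `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Hypothetical toy stand-ins for a denoiser's two noise predictions.
eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 1.0]

unconditional = cfg_combine(eps_uncond, eps_cond, scale=0.0)  # ignores the condition
no_guidance = cfg_combine(eps_uncond, eps_cond, scale=1.0)    # plain conditional prediction
guided = cfg_combine(eps_uncond, eps_cond, scale=3.0)         # extrapolates past eps_cond
```

With scales above 1 the prediction is pushed beyond the conditional one, which is the mechanism that steers samples toward high-probability areas and, when pushed too far, is associated with the over-simplified or distorted samples the abstract mentions.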
