Disentanglement in T-space for Faster and Distributed Training of Diffusion Models with Fewer Latent-states
Samarth Gupta, Raghudeep Gadde, Rui Chen, Aleix M. Martinez

TL;DR
This paper demonstrates that diffusion models can be effectively trained with fewer latent states, even a single state, by careful noise schedule selection, leading to faster training and high-quality sampling.
Contribution
It introduces a method to train diffusion models with minimal latent states, including a fully disentangled single-state approach, reducing training complexity and time.
Findings
Models trained with fewer latent states match the performance of models trained with many more latent states.
Disentangled single-state models generate high-quality samples.
Training speed improves by 4-6 times across datasets.
Abstract
We challenge a fundamental assumption of diffusion models, namely, that a large number of latent-states or time-steps is required for training so that the reverse generative process is close to a Gaussian. We first show that, with careful selection of a noise schedule, diffusion models trained over a small number of latent states match the performance of models trained over a much larger number of latent states. Second, we push this limit (on the minimum number of latent states required) to a single latent-state, which we refer to as complete disentanglement in T-space. We show that high-quality samples can be easily generated by the disentangled model obtained by combining several independently trained single latent-state models. We provide extensive experiments to show that the proposed disentangled model provides 4-6× faster convergence…
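The idea of combining independently trained single latent-state models can be illustrated with a toy sketch. This is not the authors' implementation: the number of states T = 4, the linear noise schedule, the 1-D Gaussian data, and the helper names (q_sample, train_single_state, sample) are all assumptions made for illustration, and the per-state "model" is a simple affine noise predictor fitted by least squares rather than a neural network. Sampling uses the standard DDPM ancestral step, picking the model that matches the current state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a small number of latent states and a hand-picked schedule.
T = 4
betas = np.linspace(0.2, 0.9, T)      # assumed noise schedule, not from the paper
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)


def q_sample(x0, t, eps):
    """Forward process in closed form: draw x_t given x_0 and noise eps."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps


def train_single_state(x0, t):
    """Fit an affine noise predictor eps ~ w * x_t + b using state t only.

    Each call touches a single latent state, so the T calls are independent
    and could run on separate workers.
    """
    eps = rng.standard_normal(x0.shape)
    xt = q_sample(x0, t, eps)
    design = np.stack([xt, np.ones_like(xt)], axis=1)
    (w, b), *_ = np.linalg.lstsq(design, eps, rcond=None)
    return w, b


def sample(models, n):
    """DDPM ancestral sampling, selecting the single-state model for each t."""
    x = rng.standard_normal(n)
    for t in reversed(range(T)):
        w, b = models[t]
        eps_hat = w * x + b
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(n) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


# Toy 1-D Gaussian data (assumed); one independently trained model per state.
x0 = rng.normal(2.0, 0.5, size=5000)
models = [train_single_state(x0, t) for t in range(T)]
samples = sample(models, 1000)
```

An affine predictor is enough here only because the toy data is Gaussian, in which case the conditional mean of the noise given x_t is affine in x_t; real image data would need a learned network per state.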
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks
