Demystifying Diffusion Objectives: Reweighted Losses are Better Variational Bounds
Jiaxin Shi, Michalis K. Titsias

TL;DR
This paper introduces a theoretical framework for reweighted diffusion losses, demonstrating they serve as improved variational bounds that enhance model training and sample quality in both continuous and discrete diffusion models.
Contribution
The authors develop a cascade of variational bounds that justify reweighted losses, applicable to various diffusion models, and show empirical improvements in image modeling tasks.
Findings
Reweighted losses provide tighter variational bounds than the standard ELBO.
Significant improvements over previous training losses for masked diffusion in pixel-space image modeling, approaching the sample quality of continuous diffusion models.
Theoretical justification for common weighting schemes in masked image models.
Abstract
We derive a new theoretical interpretation of the reweighted losses that are widely used for training diffusion models. Our method is based on constructing a cascade of time-dependent variational lower bounds on the data log-likelihood that provably improve upon the standard evidence lower bound and result in reduced data-model KL-divergences. Combining such bounds gives rise to reweighted objectives that can be applied to any generative diffusion model, including both continuous Gaussian diffusion and masked (discrete) diffusion models. Then, we showcase this framework in masked diffusion and report significant improvements over previous training losses in pixel-space image modeling, approaching sample quality comparable to continuous diffusion models. Our results also provide a theoretical justification for the simple weighting scheme widely used in masked image models.
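To make the idea of a "reweighted" masked diffusion loss concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear masking schedule alpha_t = 1 - t, under which the standard masked diffusion ELBO weights the cross-entropy on masked tokens by 1/t, and it contrasts that with a constant weight as an illustrative stand-in for the simple weighting scheme the abstract mentions. The function names, the random placeholder logits, and the choice of constant weight are all assumptions; the abstract does not state the paper's actual weights.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def masked_cross_entropy(logits, tokens, mask):
    # Sum of -log p(token) over masked positions only.
    logp = log_softmax(logits)
    token_logp = np.take_along_axis(logp, tokens[:, None], axis=-1)[:, 0]
    return float(-(token_logp * mask).sum())

def elbo_weight(t):
    # Assumption: linear schedule alpha_t = 1 - t,
    # so -alpha'_t / (1 - alpha_t) = 1 / t.
    return 1.0 / t

def simple_weight(t):
    # Hypothetical constant weight, standing in for a "simple" scheme.
    return 1.0

def reweighted_loss(logits, tokens, t, mask, weight_fn):
    # Time-dependent weight times the cross-entropy on masked tokens.
    return weight_fn(t) * masked_cross_entropy(logits, tokens, mask)

rng = np.random.default_rng(0)
num_tokens, vocab = 8, 4
tokens = rng.integers(0, vocab, size=num_tokens)
# Placeholder logits; in practice a network would predict these
# from the masked sequence at time t.
logits = rng.normal(size=(num_tokens, vocab))
t = 0.5
# Under the assumed schedule each token is masked with probability t.
mask = (rng.random(num_tokens) < t).astype(np.float64)

loss_elbo = reweighted_loss(logits, tokens, t, mask, elbo_weight)
loss_simple = reweighted_loss(logits, tokens, t, mask, simple_weight)
```

For a single sampled t the two objectives differ only by the scalar weight; averaged over t during training, the choice of weight changes how strongly lightly masked versus heavily masked inputs contribute to the gradient.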
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques
