A unified perspective on fine-tuning and sampling with diffusion and flow models
Carles Domingo-Enrich, Yuanqi Du, Michael S. Albergo

TL;DR
This paper presents a unified theoretical framework for training diffusion and flow models to sample from exponentially tilted target distributions, covering stochastic optimal control and non-equilibrium thermodynamics approaches. It provides bias-variance decompositions and adapted loss functions, validated through experiments on Stable Diffusion.
Contribution
It unifies the stochastic optimal control (adjoint-based and score matching) and non-equilibrium thermodynamics perspectives on training diffusion and flow models, introduces new theoretical insights, and proposes adapted loss functions with validation on real models.
Findings
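Bias-variance decompositions show that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not.
Norm bounds on the lean adjoint ODE give theoretical support for the effectiveness of adjoint-based methods.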
New loss functions and identities improve training stability and performance.
Abstract
We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base density, a formulation that subsumes both sampling from unnormalized densities and reward fine-tuning of pre-trained models. This problem can be approached from a stochastic optimal control (SOC) perspective, using adjoint-based or score matching methods, or from a non-equilibrium thermodynamics perspective. We provide a unified framework encompassing these approaches and make three main contributions: (i) bias-variance decompositions revealing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not; (ii) norm bounds on the lean adjoint ODE that theoretically support the effectiveness of adjoint-based methods; and (iii) adaptations of the CMCD and NETS loss…
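As a sketch of the "exponential tilting of a base density" mentioned in the abstract, the target distribution can be written as follows. The notation is assumed here for illustration and is not taken from the paper: $p^{\mathrm{base}}$ is the base density, $r$ the tilting function, and $Z$ the normalizing constant.

```latex
% Assumed notation (not from the paper): p^{base} = base density,
% r = tilting function, Z = normalizing constant.
p^{*}(x) = \frac{1}{Z}\, p^{\mathrm{base}}(x)\, \exp\bigl(r(x)\bigr),
\qquad
Z = \int p^{\mathrm{base}}(x)\, \exp\bigl(r(x)\bigr)\, \mathrm{d}x
```

Under this reading, reward fine-tuning would take $p^{\mathrm{base}}$ to be the pre-trained model's distribution and $r$ a reward function. Sampling from an unnormalized density proportional to $e^{-E(x)}$ would take a tractable base such as a Gaussian and set $r(x) = -E(x) - \log p^{\mathrm{base}}(x)$, so that the base density cancels in the product.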
