Direct Distributional Optimization for Provable Alignment of Diffusion Models
Ryotaro Kawata, Kazusato Oko, Atsushi Nitanda, Taiji Suzuki

TL;DR
This paper presents a new distribution optimization approach for diffusion models that guarantees convergence and improves sampling efficiency, with broad applicability including reinforcement learning and preference optimization.
Contribution
It introduces a distribution optimization framework with convergence guarantees and a novel sampling method for diffusion models, applicable to various alignment tasks.
Findings
Proven convergence guarantees for the proposed method.
Efficient sampling with end-to-end error bounds.
Validated performance on synthetic and image datasets.
Abstract
We introduce a novel alignment method for diffusion models from distribution optimization perspectives while providing rigorous convergence guarantees. We first formulate the problem as a generic regularized loss minimization over probability distributions and directly optimize the distribution using the Dual Averaging method. Next, we enable sampling from the learned distribution by approximating its score function via Doob's h-transform technique. The proposed framework is supported by rigorous convergence guarantees and an end-to-end bound on the sampling error, which imply that when the original distribution's score is known accurately, the complexity of sampling from shifted distributions is independent of isoperimetric conditions. This framework is broadly applicable to general distribution optimization problems, including alignment tasks in Reinforcement Learning with Human…
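To make the first step of the abstract concrete, the following is a minimal sketch of dual averaging for a regularized loss over probability distributions, on a toy finite state space. It is not the authors' implementation: the quadratic loss F(q) = 0.5 * ||q - target||^2, the KL regularizer toward a reference distribution `p_ref`, the uniform averaging of gradients, and all function names are assumptions chosen for illustration. Each iterate is the Gibbs distribution proportional to `p_ref * exp(-g_bar / beta)`, where `g_bar` is the running average of the loss gradients seen so far.

```python
import math

def grad_F(q, target):
    # Gradient (first variation) of the assumed toy loss F(q) = 0.5 * ||q - target||^2.
    return [qi - ti for qi, ti in zip(q, target)]

def gibbs(p_ref, g_bar, beta):
    # Minimizer of <g_bar, q> + beta * KL(q || p_ref): q proportional to p_ref * exp(-g_bar / beta).
    logits = [math.log(pi) - gi / beta for pi, gi in zip(p_ref, g_bar)]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return [w / z for w in weights]

def objective(q, p_ref, target, beta):
    # Regularized objective F(q) + beta * KL(q || p_ref).
    loss = 0.5 * sum((qi - ti) ** 2 for qi, ti in zip(q, target))
    kl = sum(qi * math.log(qi / pi) for qi, pi in zip(q, p_ref))
    return loss + beta * kl

def dual_averaging(p_ref, target, beta, num_iters):
    # Start from the reference distribution and average gradients uniformly (an assumption).
    q = list(p_ref)
    g_bar = [0.0] * len(p_ref)
    for k in range(1, num_iters + 1):
        g = grad_F(q, target)
        g_bar = [gb + (gi - gb) / k for gb, gi in zip(g_bar, g)]  # running mean of gradients
        q = gibbs(p_ref, g_bar, beta)
    return q
```

For example, `dual_averaging([0.25] * 4, [0.7, 0.1, 0.1, 0.1], beta=1.0, num_iters=2000)` returns a distribution that shifts mass toward the first state while staying close to the uniform reference. In the diffusion setting described in the abstract, the distribution cannot be stored as a vector like this, which is why a separate step is needed to sample from the optimized distribution.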
Taxonomy
Topics: Topology Optimization in Engineering
Methods: Direct Preference Optimization · Diffusion
