Direct Distributional Optimization for Provable Alignment of Diffusion Models

Ryotaro Kawata; Kazusato Oko; Atsushi Nitanda; Taiji Suzuki

arXiv:2502.02954·cs.LG·March 7, 2025

Open Access

TL;DR

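This paper aligns diffusion models by directly optimizing over probability distributions with the Dual Averaging method, then samples from the learned distribution using a score approximation based on Doob's $h$-transform. The method comes with convergence guarantees and an end-to-end bound on the sampling error, and it applies to alignment tasks including reinforcement learning and preference optimization.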

Contribution

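The paper introduces a distribution optimization framework with convergence guarantees, together with a sampling method for diffusion models that approximates the score of the learned distribution via Doob's $h$-transform. The framework applies to a range of alignment tasks.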

Findings

01. Proven convergence guarantees for the proposed method.

02. Efficient sampling with end-to-end error bounds.

03. Validated performance on synthetic and image datasets.

Abstract

We introduce a novel alignment method for diffusion models from distribution optimization perspectives while providing rigorous convergence guarantees. We first formulate the problem as a generic regularized loss minimization over probability distributions and directly optimize the distribution using the Dual Averaging method. Next, we enable sampling from the learned distribution by approximating its score function via Doob's $h$-transform technique. The proposed framework is supported by rigorous convergence guarantees and an end-to-end bound on the sampling error, which imply that when the original distribution's score is known accurately, the complexity of sampling from shifted distributions is independent of isoperimetric conditions. This framework is broadly applicable to general distribution optimization problems, including alignment tasks in Reinforcement Learning with Human…
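As a rough illustration of the first step the abstract describes (regularized loss minimization over distributions, solved by dual averaging), the sketch below runs a dual averaging loop on a small finite state space. It is not the authors' implementation. The discrete setting, the function name `dual_averaging`, the KL regularizer toward a reference distribution `p_ref` with strength `beta`, the weights `w_k = k`, and the update form `q ∝ p_ref · exp(-g_avg / beta)` are all assumptions made for illustration.

```python
import numpy as np


def dual_averaging(p_ref, grad_F, beta, num_iters):
    """Hypothetical sketch: minimize F(q) + beta * KL(q || p_ref) over the simplex.

    p_ref     : reference distribution (1-D array, strictly positive, sums to 1)
    grad_F    : callable returning the first variation of F at q (array like q)
    beta      : KL regularization strength (> 0)
    num_iters : number of dual averaging steps
    """
    q = p_ref.copy()
    g_sum = np.zeros_like(p_ref)  # weighted sum of past first variations
    w_sum = 0.0                   # sum of the weights used so far
    for k in range(1, num_iters + 1):
        w = float(k)              # assumed weighting: later iterates count more
        g_sum = g_sum + w * grad_F(q)
        w_sum += w
        g_avg = g_sum / w_sum
        # Gibbs-type update: tilt the reference by the averaged first variation.
        log_q = np.log(p_ref) - g_avg / beta
        log_q -= log_q.max()      # stabilize the exponential
        q = np.exp(log_q)
        q /= q.sum()
    return q


if __name__ == "__main__":
    # Toy linear loss F(q) = <c, q>, i.e. an expected cost under q.
    p_ref = np.array([0.1, 0.2, 0.3, 0.4])
    c = np.array([1.0, 0.0, 2.0, 0.5])
    q = dual_averaging(p_ref, lambda q: c, beta=0.5, num_iters=5)
    print(q)
```

For a linear loss such as the toy cost above, the first variation does not depend on `q`, so the iterate coincides with the familiar tilted distribution `p_ref · exp(-c / beta)` after normalization. A nonlinear `F` would make the averaging over past iterates matter. Sampling from the resulting distribution with a diffusion model, which the abstract handles via Doob's $h$-transform, is outside the scope of this sketch.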

Taxonomy

Topics: Topology Optimization in Engineering

Methods: Direct Preference Optimization · Diffusion