The tractability landscape of diffusion alignment: regularization, rewards, and computational primitives

Ankur Moitra; Andrej Risteski; Dhruv Rohatgi

arXiv:2605.11361 · cs.LG · May 13, 2026

TL;DR

This paper explores the computational primitives needed for reward alignment in diffusion models, analyzing how different distance measures like KL and Wasserstein influence the algorithms and reward classes that can be efficiently implemented.

Contribution

It introduces a primitive-based framework for reward alignment, characterizing the necessary algorithms for different distribution distances and reward types.

Findings

01

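KL-based alignment targets an exponential tilt of the base law; linear tilts $q(x) \propto p(x) \exp(\langle \theta, x \rangle)$ can be sampled efficiently according to [MRR26].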

02

Wasserstein-based alignment employs proximal transport oracles for certain reward classes.

03

Choice of distance measure determines the computational primitive and reward class tractability.

Abstract

Inference-time reward alignment asks how to turn a pre-trained diffusion model with base law $p$ into a sampler that favors a reward $r$ while remaining close to $p$. Since there is no canonical distributional distance for this closeness constraint, different choices lead to different "reward-aligned" laws and, just as importantly, different algorithmic problems. We develop a primitive-based approach to reward alignment: rather than assuming arbitrary reward-aligned laws can be sampled, we ask which simple algorithmic primitives suffice to implement alignment for non-trivial reward classes. If closeness is measured in KL distance, the target law is $q(x) \propto p(x) \exp(\lambda^{-1} r(x))$. For this setting, we show that linear exponential tilts of the form $q(x) \propto p(x) \exp(\langle \theta, x \rangle)$ -- which according to recent work [MRR26] can be efficiently sampled from -- are…
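
As a small illustration of the KL case described in the abstract, the sketch below builds the tilted law $q(x) \propto p(x) \exp(\lambda^{-1} r(x))$ on a finite support and checks numerically that it maximizes $\mathbb{E}_q[r] - \lambda\,\mathrm{KL}(q \,\|\, p)$. This is not the paper's algorithm: the objective is the standard KL-regularized formulation, assumed here because the abstract does not state it explicitly, and the six-point support, the random base law `p`, the reward vector `r`, and the helper names `kl_tilt` and `objective` are all hypothetical.

```python
import numpy as np

def kl_tilt(p, r, lam):
    """Return q(x) proportional to p(x) * exp(r(x) / lam) on a finite support."""
    logits = np.log(p) + r / lam
    logits -= logits.max()              # stabilize before exponentiating
    w = np.exp(logits)
    return w / w.sum()

def objective(q, p, r, lam):
    """E_q[r] - lam * KL(q || p); assumes q and p have full support."""
    return float(np.dot(q, r) - lam * np.dot(q, np.log(q / p)))

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(6))           # toy base law (hypothetical)
r = rng.normal(size=6)                  # toy reward values (hypothetical)
lam = 0.5                               # regularization strength (hypothetical)

q = kl_tilt(p, r, lam)
best = objective(q, p, r, lam)

# The optimal value should equal lam * log E_p[exp(r / lam)].
log_z = float(np.log(np.dot(p, np.exp(r / lam))))

# Compare against random competitors on the same support.
competitors = [objective(rng.dirichlet(np.ones(6)), p, r, lam) for _ in range(200)]
print(best, lam * log_z, max(competitors))
```

On a finite support the tilt is trivial to normalize and sample; the question the abstract raises is which tilts remain tractable when $p$ is only accessible through a diffusion model.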