Diffusion Controller: Framework, Algorithms and Parameterization

Tong Yang; Moonkyung Ryu; Chih-Wei Hsu; Guy Tennenholtz; Yuejie Chi; Craig Boutilier; Bo Dai

arXiv:2603.06981 · cs.LG · March 10, 2026

Open Access

TL;DR

This paper introduces Diffusion Controller (DiffCon), a unified control-theoretic framework for diffusion models that improves fine-tuning and control by reweighting transition kernels within a stochastic control setting, leading to better alignment and efficiency.

Contribution

The paper formulates reverse diffusion sampling as state-only stochastic control within (generalized) linearly-solvable Markov Decision Processes (LS-MDPs). From this formulation it derives practical RL-based algorithms for diffusion fine-tuning and a model decomposition for gray-box adaptation.

Findings

01 Consistent improvements in preference-alignment win rates.

02 Enhanced quality-efficiency trade-offs over baselines.

03 Effective gray-box adaptation with a lightweight control correction.

Abstract

Controllable diffusion generation often relies on various heuristics that are seemingly disconnected without a unified understanding. We bridge this gap with Diffusion Controller (DiffCon), a unified control-theoretic view that casts reverse diffusion sampling as state-only stochastic control within (generalized) linearly-solvable Markov Decision Processes (LS-MDPs). Under this framework, control acts by reweighting the pretrained reverse-time transition kernels, balancing terminal objectives against an $f$-divergence cost. From the resulting optimality conditions, we derive practical reinforcement learning methods for diffusion fine-tuning: (i) $f$-divergence-regularized policy-gradient updates, including a PPO-style rule, and (ii) a regularizer-determined reward-weighted regression objective with a minimizer-preservation guarantee under the Kullback-Leibler (KL) divergence. The LS-MDP…
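To make the idea of "control by reweighting the pretrained transition kernels" concrete, here is a minimal sketch on a toy finite-state chain. It covers only the textbook KL-regularized LS-MDP case, not the paper's generalized $f$-divergence setting or its algorithms. The function name `kl_controlled_kernels`, the discrete state space, and the temperature `beta` are assumptions made for illustration. With a terminal reward $r$ and a per-step KL penalty weighted by `beta`, the optimal controlled kernel is the pretrained kernel tilted by the exponentiated next-step value, and the value follows a log-sum-exp backward recursion.

```python
import numpy as np

def kl_controlled_kernels(P, r_terminal, beta):
    """Toy KL-regularized LS-MDP solve (illustrative, not the paper's method).

    P          : list of T row-stochastic (S, S) arrays; P[t][x, y] plays the
                 role of the pretrained transition kernel p_t(y | x).
    r_terminal : (S,) terminal reward.
    beta       : positive weight on the per-step KL cost.

    Returns the controlled kernels pi_t(y | x) and the initial value V_0(x).
    """
    # Desirability at the terminal step: z_T(y) = exp(r(y) / beta).
    z = np.exp(np.asarray(r_terminal, dtype=float) / beta)
    controlled = []
    for P_t in reversed(P):
        # Tilt the pretrained kernel by the next-step desirability.
        tilted = P_t * z[None, :]
        # Linear backward recursion: z_t(x) = sum_y p_t(y | x) z_{t+1}(y).
        z_prev = tilted.sum(axis=1)
        # Normalizing each row gives the reweighted (controlled) kernel.
        controlled.append(tilted / z_prev[:, None])
        z = z_prev
    controlled.reverse()
    # Value function recovered from desirability: V_0 = beta * log z_0.
    return controlled, beta * np.log(z)
```

In this sketch a large `beta` leaves the kernels essentially equal to the pretrained ones, while a small `beta` concentrates mass on transitions that lead toward high terminal reward.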

Taxonomy

Topics: Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning