Adjoint Matching through the Lens of the Stochastic Maximum Principle in Optimal Control
Carles Domingo-Enrich, Jiequn Han

TL;DR
This paper rigorously derives and generalizes Adjoint Matching for stochastic optimal control problems, connecting it to the Stochastic Maximum Principle and providing practical algorithms for control learning.
Contribution
It formulates a general Hamiltonian adjoint matching objective for SOC problems with control-dependent drift and diffusion, shows that its expected value has the same first variation as the original SOC objective, and interprets it as a continuous-time approximation method.
Findings
Recovers the lean adjoint matching loss in the case of state- and control-independent diffusion.
Shows that critical points of the Hamiltonian adjoint matching objective satisfy the Hamilton-Jacobi-Bellman (HJB) stationarity conditions.
Provides a practical alternative to classical SMP algorithms that avoids intractable martingale terms.
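For orientation, the objects named above can be written in a standard textbook form. This is a sketch in assumed notation (the symbols b, \sigma, f, g, p_t, q_t are not taken from the paper, and sign conventions for the Hamiltonian vary between references), not the paper's exact formulation.

```latex
% Generic SOC problem: minimize an expected cost under SDE constraints.
\min_{u}\; J(u) = \mathbb{E}\!\left[\int_0^T f(X_t, u_t, t)\,\mathrm{d}t + g(X_T)\right],
\qquad
\mathrm{d}X_t = b(X_t, u_t, t)\,\mathrm{d}t + \sigma(X_t, u_t, t)\,\mathrm{d}W_t .

% Hamiltonian (minimization convention).
H(x, u, p, q, t) = f(x, u, t) + \langle p,\, b(x, u, t)\rangle
                 + \operatorname{tr}\!\big(\sigma(x, u, t)^{\top} q\big).

% First-order adjoint equation: a backward SDE with terminal condition.
\mathrm{d}p_t = -\nabla_x H(X_t, u_t, p_t, q_t, t)\,\mathrm{d}t + q_t\,\mathrm{d}W_t,
\qquad
p_T = \nabla g(X_T).

% Stationarity in the control (unconstrained control, smooth H).
\nabla_u H(X_t, u_t, p_t, q_t, t) = 0 .
```

In this form, q_t \mathrm{d}W_t is the martingale part of the adjoint equation. Solving for the pair (p_t, q_t) backward in time is what makes classical SMP-based schemes costly in practice.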
Abstract
Reward fine-tuning of diffusion and flow models and sampling from tilted or Boltzmann distributions can both be formulated as stochastic optimal control (SOC) problems, where learning an optimal generative dynamics corresponds to optimizing a control under SDE constraints. In this work, we revisit and generalize Adjoint Matching, a recently proposed SOC-based method for learning optimal controls, and place it on a rigorous footing by deriving it from the Stochastic Maximum Principle (SMP). We formulate a general Hamiltonian adjoint matching objective for SOC problems with control-dependent drift and diffusion and convex running costs, and show that its expected value has the same first variation as the original SOC objective. As a consequence, critical points satisfy the Hamilton-Jacobi-Bellman (HJB) stationarity conditions. In the important practical case of state- and…
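To make "optimizing a control under SDE constraints" concrete, the following is a minimal sketch of how the SOC objective can be estimated by Monte Carlo for a given control. It is not the paper's algorithm: the toy problem (one-dimensional dynamics dX = u dt + sigma dW, running cost u^2/2, terminal cost x^2/2), the Euler-Maruyama discretization, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def estimate_soc_cost(control, n_paths=20000, n_steps=100, T=1.0, sigma=1.0, seed=0):
    """Monte Carlo estimate of E[ int_0^T 0.5*u^2 dt + 0.5*X_T^2 ].

    Assumed toy dynamics: dX_t = u(X_t, t) dt + sigma dW_t with X_0 = 0,
    discretized with the Euler-Maruyama scheme.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.zeros(n_paths)        # all paths start at X_0 = 0
    running = np.zeros(n_paths)  # accumulated running cost per path
    for k in range(n_steps):
        t = k * dt
        u = control(x, t)
        running += 0.5 * u**2 * dt
        noise = rng.standard_normal(n_paths)
        x = x + u * dt + sigma * np.sqrt(dt) * noise
    terminal = 0.5 * x**2
    return float(np.mean(running + terminal))


def zero_control(x, t):
    """Uncontrolled baseline: the state is a scaled Brownian motion."""
    return np.zeros_like(x)


def riccati_control(x, t, T=1.0):
    """Linear feedback from the Riccati solution of this toy problem (sigma = 1)."""
    return -x / (1.0 + T - t)


if __name__ == "__main__":
    print("zero control:    ", estimate_soc_cost(zero_control))
    print("riccati feedback:", estimate_soc_cost(riccati_control))
```

For this linear-quadratic toy problem with T = 1 and sigma = 1, the uncontrolled cost is 0.5 in expectation, and the Riccati feedback attains the optimal value 0.5 ln 2, roughly 0.347, so the second estimate should come out lower than the first up to sampling and discretization error. A learning method would replace `riccati_control` with a parametric control and adjust its parameters to lower this cost.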
