Unifying Model Predictive Path Integral Control, Reinforcement Learning, and Diffusion Models for Optimal Control and Planning
Yankai Li, Mo Chen

TL;DR
This paper unifies MPPI control, Reinforcement Learning, and Diffusion Models through a common gradient-based optimization framework, revealing their underlying connections and providing new insights into their similarities and differences.
Contribution
The paper introduces a unified perspective that links MPPI, RL, and Diffusion Models through gradient-based optimization on the Gibbs measure, making the relationships among the three explicit.
Findings
MPPI is gradient ascent on a smoothed energy function.
Policy Gradient methods reduce to MPPI when an exponential transformation is applied to the objective function.
Diffusion model sampling follows the same update rule as MPPI.
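The first finding can be made concrete with a short derivation. The sketch below assumes a trajectory cost S(u), a temperature \lambda, and isotropic Gaussian perturbations with standard deviation \sigma. These symbols are illustrative assumptions and are not taken from the paper, whose notation and exact statement may differ.

```latex
% Assumed notation (not from the paper): cost S, temperature \lambda, noise scale \sigma.
\begin{align*}
J_\sigma(u) &= \log \mathbb{E}_{\epsilon \sim \mathcal{N}(0,\sigma^2 I)}
  \left[ \exp\!\left(-\tfrac{1}{\lambda} S(u+\epsilon)\right) \right], \\
\nabla_u J_\sigma(u) &= \frac{1}{\sigma^2}\,
  \frac{\mathbb{E}_{\epsilon}\!\left[ \epsilon \exp\!\left(-\tfrac{1}{\lambda} S(u+\epsilon)\right) \right]}
       {\mathbb{E}_{\epsilon}\!\left[ \exp\!\left(-\tfrac{1}{\lambda} S(u+\epsilon)\right) \right]}, \\
u^{+} &= u + \sum_{i=1}^{N} w_i \epsilon_i \;\approx\; u + \sigma^2 \nabla_u J_\sigma(u),
\qquad
w_i = \frac{\exp\!\left(-\tfrac{1}{\lambda} S(u+\epsilon_i)\right)}
           {\sum_{j=1}^{N} \exp\!\left(-\tfrac{1}{\lambda} S(u+\epsilon_j)\right)}.
\end{align*}
```

Under these assumptions, the MPPI weighted average is a Monte Carlo estimate of a gradient ascent step of size \sigma^2 on the smoothed objective J_\sigma. Because J_\sigma equals the log-density of the Gaussian-smoothed Gibbs measure up to an additive constant, its gradient is also the score of that density, which is the quantity a diffusion sampler follows in its reverse process.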
Abstract
Model Predictive Path Integral (MPPI) control, Reinforcement Learning (RL), and Diffusion Models have each demonstrated strong performance in trajectory optimization, decision-making, and motion planning. However, these approaches have traditionally been treated as distinct methodologies with separate optimization frameworks. In this work, we establish a unified perspective that connects MPPI, RL, and Diffusion Models through gradient-based optimization on the Gibbs measure. We first show that MPPI can be interpreted as performing gradient ascent on a smoothed energy function. We then demonstrate that Policy Gradient methods reduce to MPPI by applying an exponential transformation to the objective function. Additionally, we establish that the reverse sampling process in diffusion models follows the same update rule as MPPI.
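The MPPI interpretation described in the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation: the names `mppi_step`, `quadratic_cost`, and `run_demo`, the toy quadratic cost, and the values of `sigma`, `lam`, and `num_samples` are all hypothetical choices made for illustration. Each step perturbs the current control vector with Gaussian noise, weights the perturbations by their exponentiated negative cost, and moves along the weighted average.

```python
import numpy as np


def mppi_step(u, cost_fn, sigma=0.5, lam=1.0, num_samples=4096, rng=None):
    """One MPPI-style update: u + sum_i w_i * eps_i (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    # Gaussian perturbations around the current control vector.
    eps = rng.normal(0.0, sigma, size=(num_samples,) + u.shape)
    costs = np.array([cost_fn(u + e) for e in eps])
    # Gibbs weights; subtracting the minimum cost avoids underflow
    # and leaves the normalized weights unchanged.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    # Weighted average of the perturbations is the update direction.
    return u + np.tensordot(weights, eps, axes=1)


def quadratic_cost(u):
    """Hypothetical toy cost with its minimum at u = 1 in every coordinate."""
    return float(np.sum((u - 1.0) ** 2))


def run_demo(steps=30, seed=0):
    """Iterate the update from u = 0 on the toy cost and return the result."""
    rng = np.random.default_rng(seed)
    u = np.zeros(5)
    for _ in range(steps):
        u = mppi_step(u, quadratic_cost, rng=rng)
    return u
```

Calling `run_demo()` should drive the control vector from zero toward the minimizer of the toy cost. The residual error comes from Monte Carlo noise in the weighted average and shrinks as `num_samples` grows.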
Taxonomy
Topics: Advanced Control Systems Optimization · Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
