Sampling-based Model Predictive Control Using Trust Regions
Markus Walker, Marcel Reith-Braun, Daniel Frisch, Uwe D. Hanebeck

TL;DR
This paper introduces a trust region approach to sampling-based MPC that replaces heuristic hyperparameter adaptation with a principled KL divergence bound, improving convergence and sample efficiency, especially when combined with LCD-based sampling.
Contribution
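The paper proposes a trust region formulation for sampling-based MPC that constrains updates of the proposal distribution with a KL divergence bound and, optionally, an entropy lower bound. Hyperparameters such as temperature or momentum then take values that are optimal w.r.t. the underlying Lagrangian instead of being set by heuristics or manual tuning.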
Findings
Faster convergence in benchmark environments.
Improved sample efficiency with trust region updates.
Enhanced performance with LCD-based sampling.
Abstract
Sampling-based model predictive control (MPC) algorithms, such as model predictive path integral (MPPI), enable approximate, gradient-free solutions to optimal control problems by drawing samples from a proposal distribution, evaluating their trajectory costs, and updating the proposal parameters accordingly. However, these approaches typically rely on heuristics for adjusting hyperparameters, such as temperature or momentum, or manual tuning. We propose a trust region formulation for sampling-based MPC that constrains updates of the proposal distribution via a principled Kullback-Leibler (KL) divergence bound and, optionally, an entropy lower bound. This replaces heuristic hyperparameter adaptation with values that are optimal w.r.t. the underlying Lagrangian. We further improve sample efficiency and convergence by combining the trust region update with deterministic localized…
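To make the idea of a KL-bounded update concrete, here is a minimal Python sketch of one way the temperature of an MPPI-style softmin weighting could be chosen from a KL bound instead of being hand-tuned. It is not the authors' algorithm: the function names, the bisection search, and the use of the KL divergence between the reweighted samples and the uniform sample weights as a stand-in for the KL divergence between the new and old proposal are all assumptions made for illustration, and the optional entropy bound is omitted.

```python
import numpy as np

def softmin_weights(costs, temperature):
    """Normalized weights w_i proportional to exp(-c_i / temperature)."""
    z = -(costs - costs.min()) / temperature  # shift by the minimum for numerical stability
    w = np.exp(z)
    return w / w.sum()

def kl_to_uniform(weights):
    """KL divergence from the uniform sample weights to the reweighted samples."""
    n = weights.size
    nz = weights > 0  # treat 0 * log(0) as 0
    return float(np.sum(weights[nz] * np.log(n * weights[nz])))

def temperature_for_kl(costs, kl_bound, lo=1e-6, hi=1e6, iters=100):
    """Bisection in log-space for the smallest temperature that respects the KL bound.

    Assumes the bound is attainable inside [lo, hi]; the KL divergence
    shrinks monotonically as the temperature grows.
    """
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if kl_to_uniform(softmin_weights(costs, mid)) > kl_bound:
            lo = mid  # update too aggressive: raise the temperature
        else:
            hi = mid  # bound satisfied: try a lower temperature
    return hi

def kl_bounded_mean_update(samples, costs, kl_bound):
    """Weighted-mean update of the proposal mean under the KL bound (hypothetical helper)."""
    temperature = temperature_for_kl(costs, kl_bound)
    weights = softmin_weights(costs, temperature)
    return weights @ samples, temperature

# Usage: 256 sampled control vectors of dimension 3 with a quadratic toy cost.
rng = np.random.default_rng(0)
samples = rng.normal(size=(256, 3))
costs = np.sum(samples**2, axis=1)
new_mean, temperature = kl_bounded_mean_update(samples, costs, kl_bound=0.5)
```

In this sketch a smaller `kl_bound` yields a higher temperature and a more conservative step, so the bound, rather than the temperature itself, becomes the quantity the user sets.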
