Computationally efficient Gauss-Newton reinforcement learning for model predictive control
Dean Brandner, Sebastien Gros, and Sergio Lucia

TL;DR
This paper introduces a Gauss-Newton approximation for reinforcement learning in model predictive control, achieving faster convergence and better data efficiency without requiring second-order derivatives.
Contribution
It proposes a Gauss-Newton method for RL with MPC policies that lowers computational cost by avoiding second-order policy derivatives and improves robustness through momentum-based Hessian averaging.
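To make the idea concrete, here is a minimal NumPy sketch of a Gauss-Newton policy update with an exponentially averaged Hessian. It is an illustration under stated assumptions, not the authors' implementation. The function names, the exponential-moving-average form of the "momentum" averaging, the damping term, and the toy quadratic critic are all hypothetical choices made for this example. The Hessian is assumed to take the form mean(J^T (d2Q/da2) J), where J is the policy Jacobian with respect to the parameters, so only first-order policy derivatives are needed.

```python
import numpy as np

def dpg_gradient(jacobians, q_grads):
    """Deterministic policy gradient estimate: mean of J^T dQ/da.

    jacobians: (N, n_a, n_theta) policy Jacobians d pi / d theta
    q_grads:   (N, n_a) critic gradients dQ/da at a = pi(s)
    """
    return np.einsum("nat,na->t", jacobians, q_grads) / len(jacobians)

def gauss_newton_hessian(jacobians, q_hessians):
    """Assumed Gauss-Newton form: mean of J^T (d2Q/da2) J.

    Terms involving second-order policy derivatives are dropped.
    q_hessians: (N, n_a, n_a)
    """
    n = len(jacobians)
    return np.einsum("nat,nab,nbu->tu", jacobians, q_hessians, jacobians) / n

def averaged_hessian(h_avg, h_new, beta=0.9):
    """Exponential moving average of Hessian estimates (assumed form)."""
    if h_avg is None:
        return h_new
    return beta * h_avg + (1.0 - beta) * h_new

def newton_ascent_step(theta, grad, hess, step=1.0, damping=1e-6):
    """Newton step for maximization; hess is expected to be negative definite.

    The damping shifts the Hessian further toward negative definiteness.
    """
    h = hess - damping * np.eye(len(theta))
    return theta - step * np.linalg.solve(h, grad)

# Toy problem (hypothetical): linear policy pi(s) = Phi(s) @ theta and a
# concave quadratic critic Q(s, a) = -0.5 (a - a*)^T W (a - a*).
rng = np.random.default_rng(0)
n_samples, n_a, n_theta = 64, 2, 3
theta_star = np.array([1.0, -2.0, 0.5])
theta = np.zeros(n_theta)
W = np.diag([2.0, 1.0])

Phi = rng.normal(size=(n_samples, n_a, n_theta))  # policy Jacobians
actions = Phi @ theta                              # (N, n_a)
targets = Phi @ theta_star                         # optimal actions a*(s)
q_grads = -(actions - targets) @ W                 # dQ/da, W is symmetric
q_hessians = np.broadcast_to(-W, (n_samples, n_a, n_a))

g = dpg_gradient(Phi, q_grads)
H = averaged_hessian(None, gauss_newton_hessian(Phi, q_hessians))
theta_new = newton_ascent_step(theta, g, H)
```

In this toy setting the policy is linear and the critic is quadratic, so the Gauss-Newton Hessian is exact and a single full step lands essentially on `theta_star`. With an MPC policy the Jacobians would instead come from sensitivities of the optimal control problem, which this sketch does not model.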
Findings
Faster convergence compared to first-order methods.
Improved data efficiency over deep RL approaches.
Effective on a nonlinear continuous stirred-tank reactor (CSTR) control problem.
Abstract
Model predictive control (MPC) is widely used in process control due to its interpretability and ability to handle constraints. As a parametric policy in reinforcement learning (RL), MPC offers strong initial performance and low data requirements compared to black-box policies like neural networks. However, most RL methods rely on first-order updates, which scale well to large parameter spaces but converge at most linearly, making them inefficient when each policy update requires solving an optimal control problem, as is the case with MPC. While MPC policies are typically low parameterized and thus amenable to second-order approaches, existing second-order methods demand second-order policy derivatives, which can be computationally intractable. This work introduces a Gauss-Newton approximation of the deterministic policy Hessian that eliminates the need for second-order policy…
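The abstract is cut off before the approximation is stated, so the following is only a sketch of the general idea in standard deterministic policy gradient notation. The first line is the usual deterministic policy gradient. The second line is one plausible Gauss-Newton form, assumed here for illustration, in which every term containing second-order policy derivatives is dropped. The symbols J, Q, π_θ, and the step size α are notation introduced for this sketch and may differ from the paper's.

```latex
\begin{align}
\nabla_\theta J(\theta)
  &= \mathbb{E}_{s}\!\left[ \nabla_\theta \pi_\theta(s)\,
     \nabla_a Q^{\pi_\theta}(s,a)\big|_{a=\pi_\theta(s)} \right], \\
H_{\mathrm{GN}}(\theta)
  &= \mathbb{E}_{s}\!\left[ \nabla_\theta \pi_\theta(s)\,
     \nabla_a^2 Q^{\pi_\theta}(s,a)\big|_{a=\pi_\theta(s)}\,
     \nabla_\theta \pi_\theta(s)^{\top} \right], \\
\theta &\leftarrow \theta - \alpha\, H_{\mathrm{GN}}(\theta)^{-1}\,
     \nabla_\theta J(\theta).
\end{align}
```

Only the first-order policy sensitivity appears in this form, which is the property the abstract emphasizes. For a return-maximization problem, the approximate Hessian needs to be negative definite for the update to be an ascent step.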
