Policy Optimization with Differentiable MPC: Convergence Analysis under Uncertainty

Riccardo Zuliani; Efe C. Balta; John Lygeros

arXiv:2601.01940 · eess.SY · April 15, 2026

TL;DR

This paper shows that combining gradient-based policy optimization with recursive system identification ensures convergence to an optimal controller design for differentiable MPC policies whose embedded model is uncertain.

Contribution

The paper pairs gradient-based policy optimization with recursive system identification, so that the design of a differentiable MPC policy converges even though both controller performance and algorithm convergence depend critically on model accuracy.

Findings

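01 Combining gradient-based policy optimization with recursive system identification ensures convergence to an optimal controller design.

02 The result is demonstrated on several control examples.

03 The approach addresses the dependence of model-based policy optimization on model accuracy.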

Abstract

Model-based policy optimization is a well-established framework for designing reliable and high-performance controllers across a wide range of control applications. Recently, this approach has been extended to model predictive control policies, where explicit dynamical models are embedded within the control law. However, the performance of the resulting controllers, and the convergence of the associated optimization algorithms, critically depend on the accuracy of the models. In this paper, we demonstrate that combining gradient-based policy optimization with recursive system identification ensures convergence to an optimal controller design and showcase our findings in several control examples.
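
The interplay described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a scalar linear system, uses a static feedback gain `k` as a stand-in for the differentiable MPC policy, and uses recursive least squares (RLS) as the recursive system identification step. All names, constants, and the cost function are hypothetical choices made for illustration only.

```python
import random


def rls_update(theta, P, phi, y):
    """One recursive least-squares step for the model y = theta . phi (2 parameters)."""
    Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = 1.0 + phi[0] * Pphi[0] + phi[1] * Pphi[1]
    gain = [Pphi[0] / denom, Pphi[1] / denom]
    err = y - (theta[0] * phi[0] + theta[1] * phi[1])
    theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
    # P is symmetric, so phi^T P equals (P phi)^T.
    P = [[P[i][j] - gain[i] * Pphi[j] for j in range(2)] for i in range(2)]
    return theta, P


def cost_and_grad(k, a, b, x0=1.0, horizon=10, r=0.1):
    """Closed-loop cost of u = -k x on the model x+ = a x + b u, and dJ/dk.

    The gradient is obtained by forward sensitivity: s tracks dx/dk.
    """
    x, s = x0, 0.0
    J, dJ = 0.0, 0.0
    for _ in range(horizon):
        u = -k * x
        du = -x - k * s                      # du/dk
        J += x * x + r * u * u
        dJ += 2.0 * x * s + 2.0 * r * u * du
        x, s = a * x + b * u, a * s + b * du
    J += x * x                               # terminal cost
    dJ += 2.0 * x * s
    return J, dJ


def run(iters=200, step=0.02, seed=0):
    """Alternate system identification and a policy-gradient step (toy setup)."""
    rng = random.Random(seed)
    a_true, b_true = 1.2, 1.0                # assumed "unknown" plant
    theta = [1.0, 0.5]                       # rough initial model (a_hat, b_hat)
    P = [[1000.0, 0.0], [0.0, 1000.0]]
    k = 0.5                                  # initial stabilizing gain
    for _ in range(iters):
        # 1) Run the current policy on the true system, with a small
        #    excitation signal, and refine the model recursively.
        x = 1.0
        for _ in range(5):
            u = -k * x + rng.uniform(-0.1, 0.1)
            x_next = a_true * x + b_true * u
            theta, P = rls_update(theta, P, [x, u], x_next)
            x = x_next
        # 2) Differentiate the cost through the *estimated* model and
        #    take one gradient step on the policy parameter.
        _, grad = cost_and_grad(k, theta[0], theta[1])
        k -= step * grad
    return k, theta


if __name__ == "__main__":
    k, theta = run()
    print("gain:", k, "model estimate:", theta)
```

In this sketch the gradient is always computed with the current model estimate, so its quality improves as identification proceeds. That is the mechanism the abstract refers to, shown here only in the simplest setting.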
