Receding-Horizon Policy Gradient for Polytopic Controller Synthesis

Shiva Shakeri, Péter Baranyi, and Mehran Mesbahi

arXiv:2603.29283·eess.SY·April 1, 2026

TL;DR

This paper introduces P-RHPG, a receding-horizon policy gradient algorithm for synthesizing PDC controllers. It comes with convergence guarantees, reports near-optimal performance, and avoids the conservativeness of traditional LMI-based methods.

Contribution

The paper develops a receding-horizon policy gradient method for polytopic controller synthesis in which each stage subproblem is a strongly convex quadratic in the vertex gains.

Findings

01

Converges to a unique infinite-horizon optimum.

02

Achieves near-optimal performance compared to Riccati bounds.

03

Gradient descent on each stage subproblem converges linearly from any initialization.
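
The linear-convergence finding rests on a generic property of strongly convex quadratics, which can be illustrated independently of the paper. The sketch below is not the P-RHPG algorithm: the matrix H, the vector g, and the function name are hypothetical stand-ins for a stage subproblem, chosen only to show that gradient descent with step 1/L contracts the error by at least a factor 1 - mu/L per iteration, whatever the starting point.

```python
import numpy as np


def gradient_descent_quadratic(H, g, k0, iters=300):
    """Minimize f(k) = 0.5 k^T H k - g^T k for symmetric positive definite H."""
    eigs = np.linalg.eigvalsh(H)          # eigenvalues in ascending order
    mu, L = eigs[0], eigs[-1]             # strong convexity and smoothness constants
    step = 1.0 / L
    k_star = np.linalg.solve(H, g)        # unique global minimizer
    k = np.asarray(k0, dtype=float).copy()
    errors = []
    for _ in range(iters):
        errors.append(np.linalg.norm(k - k_star))
        k = k - step * (H @ k - g)        # gradient of f is H k - g
    return k, k_star, errors, 1.0 - mu / L


# Hypothetical test problem: eigenvalues 1, 2, 4, 8 in a random orthogonal basis.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((4, 4)))
H = basis @ np.diag([1.0, 2.0, 4.0, 8.0]) @ basis.T
H = 0.5 * (H + H.T)                       # remove rounding asymmetry
g = rng.standard_normal(4)

# Start far from the minimizer to mimic an arbitrary initialization.
k, k_star, errors, rate = gradient_descent_quadratic(H, g, 100.0 * rng.standard_normal(4))
```

Here `rate` is the worst-case per-iteration contraction factor, and `errors` records the distance to the minimizer before each update, so the ratio of consecutive entries can be compared against `rate`.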

Abstract

We propose the Polytopic Receding-Horizon Policy Gradient (P-RHPG) algorithm for synthesizing Parallel Distributed Compensation (PDC) controllers via Tensor Product (TP) model transformation. Standard LMI-based PDC synthesis grows increasingly conservative as model fidelity improves; P-RHPG instead solves a finite-horizon integrated cost via backward-stage decomposition. The key result is that each stage subproblem is a strongly convex quadratic in the vertex gains, a consequence of the linear independence of the HOSVD weighting functions, guaranteeing a unique global minimizer and linear convergence of gradient descent from any initialization. With zero terminal cost, the optimal cost increases monotonically to a finite limit and the gain sequence remains bounded; terminal costs satisfying a mild Lyapunov condition yield non-increasing convergence. Experiments on an aeroelastic wing…
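
The monotonicity statement for zero terminal cost has a familiar analogue in standard finite-horizon LQR, sketched below for a single linear time-invariant system. This is an assumption-laden illustration, not the paper's polytopic setting or its integrated cost: the matrices A, B, Q, R and the function name are hypothetical, and the backward pass is the ordinary discrete-time Riccati recursion.

```python
import numpy as np


def finite_horizon_costs(A, B, Q, R, horizon):
    """Backward Riccati recursion started from zero terminal cost.

    Returns the cost matrices P_0 = 0, P_1, ..., P_horizon, where P_n is the
    optimal cost-to-go matrix for an n-step horizon.
    """
    P = np.zeros_like(Q, dtype=float)     # zero terminal cost
    costs = [P]
    for _ in range(horizon):
        # Optimal stage gain for the current cost-to-go.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update, written as Q + A^T P (A - B K).
        P = Q + A.T @ P @ (A - B @ K)
        P = 0.5 * (P + P.T)               # keep P symmetric despite rounding
        costs.append(P)
    return costs


# Hypothetical discretized double integrator with a 0.1 s sample time.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

costs = finite_horizon_costs(A, B, Q, R, horizon=500)
traces = [float(np.trace(P)) for P in costs]
```

In this special case the sequence `traces` is non-decreasing in the horizon length and levels off at a finite value, which mirrors the behavior the abstract describes for the optimal cost.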
