Challenges in Model Agnostic Controller Learning for Unstable Systems

Mario Sznaier; Mustafa Bozdag

arXiv:2505.11641·math.OC·May 20, 2025·IEEE Control. Syst. Lett.

TL;DR

This paper examines the limitations of model-agnostic controller learning via direct policy optimization, highlighting stability issues and proposing alternative approaches to ensure reliable control in unstable systems.

Contribution

The paper uses a simple example to show that direct policy optimization can destroy internal stability, and it analyzes several alternatives that avoid this outcome.

Findings

01 Direct optimization of a performance index can lead to unstable pole/zero cancellations.

02 The resulting loss of internal stability can produce unbounded outputs in response to arbitrarily small perturbations.

03 Alternative strategies can prevent this loss of stability.

Abstract

Model agnostic controller learning, for instance by direct policy optimization, has been the object of renewed attention lately, since it avoids a computationally expensive system identification step. Indeed, direct policy search has been empirically shown to lead to optimal controllers in a number of cases of practical importance. However, to date, these empirical results have not been backed up with a comprehensive theoretical analysis for general problems. In this paper we use a simple example to show that direct policy optimization is not directly generalizable to other seemingly simple problems. In such cases, direct optimization of a performance index can lead to unstable pole/zero cancellations, resulting in the loss of internal stability and unbounded outputs in response to arbitrarily small perturbations. We conclude the paper by analyzing several alternatives to avoid this…
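
To make the pole/zero cancellation mechanism concrete, the following is a minimal sketch under stated assumptions. The plant P(s) = 1/(s - 1) and controller C(s) = (s - 1)/(s + 2) are hypothetical textbook choices and are not taken from the paper. The sketch compares the poles of the reference-to-output map (after cancellation) with the roots of the uncancelled closed-loop characteristic polynomial, which govern the response to perturbations entering inside the loop.

```python
import numpy as np

# Hypothetical example, not the paper's: unstable plant P(s) = 1/(s - 1)
# and a controller C(s) = (s - 1)/(s + 2) whose zero cancels the plant pole.
num_p, den_p = np.array([1.0]), np.array([1.0, -1.0])
num_c, den_c = np.array([1.0, -1.0]), np.array([1.0, 2.0])

# Unity-feedback characteristic polynomial, formed WITHOUT cancelling:
# den_p * den_c + num_p * num_c = (s - 1)(s + 2) + (s - 1) = (s - 1)(s + 3)
char_poly = np.polyadd(np.polymul(den_p, den_c), np.polymul(num_p, num_c))
internal_poles = np.roots(char_poly)

# Reference-to-output map T(s) = num_p * num_c / char_poly.
# Its numerator (s - 1) divides char_poly exactly, so T(s) = 1 / quotient.
t_num = np.polymul(num_p, num_c)
quotient, remainder = np.polydiv(char_poly, t_num)
io_poles = np.roots(quotient)

# Stability checks: all poles must lie strictly in the open left half-plane.
io_stable = bool(np.all(io_poles.real < 0))
internally_stable = bool(np.all(internal_poles.real < 0))

print("reference-to-output poles:", io_poles)
print("internal closed-loop poles:", internal_poles)
print("input-output stable:", io_stable)
print("internally stable:", internally_stable)
```

In this sketch the reference-to-output map looks stable because the unstable factor (s - 1) cancels, yet that factor remains a root of the characteristic polynomial. A disturbance at the plant input therefore excites the unstable mode, which is the kind of behavior the abstract describes as unbounded outputs in response to arbitrarily small perturbations.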

Taxonomy

Topics: Adaptive Dynamic Programming Control · Model Reduction and Neural Networks · Advanced Control Systems Optimization