Data-Driven Inverse Reinforcement Learning of Linear Systems with Model Uncertainty: A Convex Optimization View

Duc Cuong Nguyen; Phuong Nam Dao

arXiv:2605.09164·eess.SY·May 12, 2026

TL;DR

This paper introduces a convex-optimization framework for data-driven inverse reinforcement learning of discrete-time linear systems with model uncertainty. Compared with methods built on iterative policy/value updates, it aims for better numerical robustness and a simpler computational pipeline.

Contribution

The paper develops a convex, model-free IRL method for uncertain discrete-time linear systems, based on semidefinite programming and stochastic approximation, with improved robustness over iterative schemes.

Findings

01 Accurate recovery of expert behavior in power-system simulations

02 Improved robustness to gain-estimation errors and model mismatch

03 Simpler computational pipeline than classical iterative schemes

Abstract

Inverse reinforcement learning (IRL) for linear systems seeks a cost function whose optimal controller reproduces an expert policy from data. Existing data-driven methods for discrete-time linear systems are largely built on iterative policy/value updates, repeated matrix inversions, and, in some cases, an initial stabilizing controller, which can limit numerical robustness and practical applicability. This paper develops a convex-optimization framework for data-driven inverse reinforcement learning of discrete-time linear systems with model uncertainty. For nominal systems, we derive a semidefinite characterization of inverse optimality and a relaxed formulation that recovers an equivalent state-cost matrix together with a stabilizing controller from expert trajectories. We then obtain a model-free, off-policy reformulation by replacing the unknown system matrices with a regressed…
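To make the inverse-optimality idea in the abstract concrete, here is a minimal scalar sketch. It is not the paper's semidefinite formulation. It assumes a known scalar system x[k+1] = a*x[k] + b*u[k], a stage cost q*x^2 + r*u^2 with the input weight r fixed, and an expert gain k with u = -k*x. All names (`lqr_gain`, `recover_state_cost`) and numeric values are hypothetical. Under those assumptions the Riccati equation can be solved in closed form for the state weight q that makes the expert gain optimal.

```python
def lqr_gain(a, b, q, r, iters=500):
    """Forward problem: iterate the scalar Riccati recursion.

    Returns (p, k), where p is the value-function weight and
    u = -k * x is the optimal feedback for cost q*x^2 + r*u^2.
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return p, k


def recover_state_cost(a, b, k, r=1.0):
    """Inverse problem: find q such that u = -k * x is optimal, with r fixed.

    From k * (r + b^2 p) = a * b * p we get p = k * r / (b * (a - b * k)),
    and the Riccati equation then gives q = p * (1 - a * (a - b * k)).
    Assumes b != 0 and a - b * k != 0.
    """
    a_cl = a - b * k              # closed-loop pole under the expert gain
    p = k * r / (b * a_cl)
    q = p * (1.0 - a * a_cl)
    return q, p


# Hypothetical open-loop-unstable plant and a synthetic "expert".
a, b = 1.2, 0.5
_, k_expert = lqr_gain(a, b, q=2.0, r=1.0)

# Recover a state weight from the expert gain, then check that the
# forward problem with the recovered weight reproduces that gain.
q_hat, _ = recover_state_cost(a, b, k_expert, r=1.0)
_, k_check = lqr_gain(a, b, q_hat, r=1.0)
print(q_hat, k_expert, k_check)
```

In this scalar setting, fixing r pins down q uniquely. For matrix-valued systems the abstract speaks of recovering an equivalent state-cost matrix, and the data-driven setting additionally treats the system matrices as unknown, neither of which this sketch attempts.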
