Consistent inverse optimal control for discrete-time nonlinear stochastic systems

Ziliang Wang; Han Zhang; Axel Ringh

arXiv:2511.22579·math.OC·December 1, 2025

Open Access

TL;DR

This paper introduces a consistent inverse optimal control method for discrete-time nonlinear stochastic systems, utilizing sum-of-squares optimization to recover cost functions from expert data with proven asymptotic and statistical consistency.

Contribution

It develops a novel convex sum-of-squares based IOC algorithm for nonlinear stochastic systems, ensuring consistency and robustness in recovering cost functions.

Findings

01. The proposed IOC method is asymptotically consistent.

02. Numerical experiments validate the theoretical guarantees.

03. The approach demonstrates robustness and generalizability.

Abstract

Inverse Optimal Control (IOC) seeks to recover an unknown cost from expert demonstrations. It provides a systematic way of modeling experts' decision mechanisms while accounting for prior information about the cost function. Nevertheless, the estimators of existing IOC methods suffer from consistency issues in noisy and nonlinear settings. In this paper, we consider a discrete-time nonlinear system with process noise, controlled by an optimal policy that minimizes the expectation of a discounted cumulative cost over an infinite time horizon. In particular, the cost function is a linear combination of a priori known feature functions. In this setting, we first adopt Lasserre's reformulation of the forward problem in terms of occupancy measures. Next, we propose an infinite-dimensional IOC algorithm and further approximate it with Lagrange interpolating…
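
As a rough sketch of the setting the abstract describes, the forward problem and its occupancy-measure form could be written as follows. The symbols $f$, $w_t$, $\pi$, $\gamma$, $\theta$, $\varphi$ and $\mu$ are notation assumed here for illustration and are not taken from the paper.

```latex
% Assumed notation: f = dynamics, w_t = process noise, \pi = policy,
% \gamma = discount factor, \varphi = known feature vector, \theta = unknown weights,
% \mu = discounted state-action occupancy measure.
\begin{aligned}
& x_{t+1} = f(x_t, u_t, w_t), \qquad u_t = \pi(x_t), \\
& \min_{\pi}\; \mathbb{E}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, \theta^{\top}\varphi(x_t, u_t)\Big],
  \qquad \gamma \in (0,1), \\
& \mu(A \times B) = \sum_{t=0}^{\infty} \gamma^{t}\, \mathbb{P}(x_t \in A,\ u_t \in B)
  \;\Longrightarrow\;
  \mathbb{E}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, \theta^{\top}\varphi(x_t, u_t)\Big]
  = \int \theta^{\top}\varphi \,\mathrm{d}\mu .
\end{aligned}
```

Under this assumed notation (and assuming the integrand is integrable, so that sum and expectation can be exchanged), the objective is linear in $\mu$. That linearity is what lets an occupancy-measure reformulation be posed as a linear program over measures, and the inverse problem then amounts to estimating $\theta$ from observed state-action data.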

Taxonomy

Topics: Adaptive Dynamic Programming Control · Advanced Control Systems Optimization · Advanced Bandit Algorithms Research