Consistent inverse optimal control for discrete-time nonlinear stochastic systems
Ziliang Wang, Han Zhang, Axel Ringh

TL;DR
This paper introduces a consistent inverse optimal control method for discrete-time nonlinear stochastic systems, utilizing sum-of-squares optimization to recover cost functions from expert data with proven asymptotic and statistical consistency.
Contribution
The paper develops a convex, sum-of-squares-based IOC algorithm for discrete-time nonlinear stochastic systems that recovers the cost function consistently and robustly.
Findings
The proposed IOC method is asymptotically consistent.
Numerical experiments validate the theoretical guarantees.
The approach demonstrates robustness and generalizability.
Abstract
Inverse Optimal Control (IOC) seeks to recover an unknown cost from expert demonstrations. It provides a systematic way of modeling experts' decision mechanisms while taking prior information about the cost function into account. Nevertheless, the estimators in existing IOC methods suffer from consistency issues in noisy and nonlinear settings. In this paper, we consider a discrete-time nonlinear system with process noise, controlled by an optimal policy that minimizes the expectation of a discounted cumulative cost over an infinite time horizon. In particular, the cost function is a linear combination of a priori known feature functions. In this setting, we first adopt Lasserre's reformulation of the forward problem in terms of occupancy measures. Next, we propose an infinite-dimensional IOC algorithm and further approximate it with Lagrange interpolating…
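The forward setting described in the abstract (noisy nonlinear dynamics, a discounted cost that is linear in known features) can be illustrated with a minimal Monte Carlo sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar dynamics in `step`, the two features in `features`, the fixed linear `policy` (which is not claimed to be optimal), and the finite `horizon` that truncates the infinite sum.

```python
import math
import random


def features(x, u):
    # Hypothetical feature functions phi(x, u); the paper's features are not given here.
    return [x * x, u * u]


def step(x, u, rng, noise_std=0.1):
    # Hypothetical nonlinear dynamics x_{t+1} = f(x_t, u_t) + w_t with Gaussian process noise.
    return 0.5 * math.sin(x) + u + rng.gauss(0.0, noise_std)


def policy(x):
    # Hypothetical fixed feedback policy, used only to generate trajectories.
    return -0.4 * x


def discounted_cost(theta, x0, gamma=0.9, horizon=200, n_rollouts=200, seed=0):
    """Monte Carlo estimate of E[sum_t gamma^t * theta^T phi(x_t, u_t)].

    The infinite horizon is truncated at `horizon`, which is an approximation.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        x, discount, cost = x0, 1.0, 0.0
        for _ in range(horizon):
            u = policy(x)
            phi = features(x, u)
            cost += discount * sum(t * p for t, p in zip(theta, phi))
            x = step(x, u, rng)
            discount *= gamma
        total += cost
    return total / n_rollouts
```

Because the policy in this sketch does not depend on `theta`, the estimated cost is linear in `theta` for a fixed seed: `discounted_cost([1.0, 1.0], 1.0)` equals the sum of the estimates for `[1.0, 0.0]` and `[0.0, 1.0]` up to floating-point error. The sketch covers only the forward evaluation, not the inverse problem or the sum-of-squares relaxation.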
Taxonomy
Topics: Adaptive Dynamic Programming Control · Advanced Control Systems Optimization · Advanced Bandit Algorithms Research
