Inverse linear-quadratic nonzero-sum differential games
Emin Martirosyan, Ming Cao

TL;DR
This paper develops algorithms to infer unknown cost functions in linear-quadratic nonzero-sum differential games from observed trajectories, extending from model-based to model-free settings using inverse optimal control and reinforcement learning techniques.
Contribution
It introduces a novel model-free algorithm for inverse LQ differential games, combining inverse optimal control and reinforcement learning, with convergence and stability analysis.
Findings
The algorithms recover cost function parameters of a synthesized game whose equilibrium feedback laws reproduce the demonstrated trajectories.
Proposed methods are effective in both known and unknown system matrix scenarios.
Simulation results demonstrate the algorithms' convergence and accuracy.
Abstract
This paper addresses the inverse problem for Linear-Quadratic (LQ) nonzero-sum N-player differential games, where the goal is to learn the parameters of an unknown cost function for the game, called the observed game, given demonstrated trajectories that are known to be generated by stationary linear feedback Nash equilibrium laws. To this end, a synthesized game is constructed from the demonstrated data. It is required to be equivalent to the observed game in the sense that the trajectories generated by the players' equilibrium feedback laws in the synthesized game coincide with the demonstrated trajectories. We present a model-based algorithm that accomplishes this task using the given trajectories. We then extend this model-based algorithm to a model-free setting to solve the same problem when the system's matrices are unknown. The algorithms…
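The notion of equivalence used in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a scalar two-player game with dynamics dx/dt = a*x + b1*u1 + b2*u2 and costs J_i = ∫(q_i x² + r_i u_i²) dt with no cross terms. The function names, the numerical values, and the best-response iteration used to find the feedback Nash gains are all hypothetical choices made for illustration. The sketch shows that scaling one player's cost by a positive constant leaves the equilibrium feedback laws, and hence the trajectories, unchanged, so a synthesized game can be equivalent to the observed one without sharing its cost parameters.

```python
import math


def best_response_gain(a_eff, b, q, r):
    """Optimal gain k (u = -k*x) for dx/dt = a_eff*x + b*u, cost q*x^2 + r*u^2.

    Uses the positive root of the scalar algebraic Riccati equation
    2*a_eff*p - (b^2/r)*p^2 + q = 0.
    """
    s = b * b / r
    p = (a_eff + math.sqrt(a_eff * a_eff + s * q)) / s
    return b * p / r


def nash_gains(a, b1, b2, q1, r1, q2, r2, iters=200):
    """Iterate best responses (an assumed solver, not the paper's method).

    Each player solves a scalar LQR problem against the other's current
    feedback law; a fixed point is a linear feedback Nash equilibrium.
    """
    k1 = k2 = 0.0
    for _ in range(iters):
        k1 = best_response_gain(a - b2 * k2, b1, q1, r1)
        k2 = best_response_gain(a - b1 * k1, b2, q2, r2)
    return k1, k2


def trajectory(a, b1, b2, k1, k2, x0, times):
    """Closed-loop state x(t) = x0 * exp((a - b1*k1 - b2*k2) * t)."""
    a_cl = a - b1 * k1 - b2 * k2
    return [x0 * math.exp(a_cl * t) for t in times]


# "Observed" game (illustrative numbers).
a, b1, b2 = -1.0, 1.0, 1.0
k1, k2 = nash_gains(a, b1, b2, q1=1.0, r1=1.0, q2=2.0, r2=1.0)

# "Synthesized" game: player 1's cost scaled by 3, player 2 unchanged.
k1_s, k2_s = nash_gains(a, b1, b2, q1=3.0, r1=3.0, q2=2.0, r2=1.0)

times = [0.1 * i for i in range(50)]
x_obs = trajectory(a, b1, b2, k1, k2, 1.0, times)
x_syn = trajectory(a, b1, b2, k1_s, k2_s, 1.0, times)

# Largest gap between the two trajectories; expected to be at rounding level.
max_gap = max(abs(u - v) for u, v in zip(x_obs, x_syn))
print("observed gains:", k1, k2)
print("synthesized gains:", k1_s, k2_s)
print("max trajectory gap:", max_gap)
```

In this sketch the two games differ in their cost parameters yet produce the same feedback gains, which is why the inverse problem is posed in terms of an equivalent game rather than exact parameter recovery.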
Taxonomy
Topics: Advanced Control Systems Optimization · Adaptive Dynamic Programming Control
