Inverse linear-quadratic nonzero-sum differential games

Emin Martirosyan; Ming Cao

arXiv:2310.05631 · math.OC · October 28, 2024 · 1 citation

Open Access

TL;DR

This paper develops algorithms to infer unknown cost functions in linear-quadratic nonzero-sum differential games from observed trajectories, extending from model-based to model-free settings using inverse optimal control and reinforcement learning techniques.

Contribution

It introduces a novel model-free algorithm for inverse LQ differential games, combining inverse optimal control and reinforcement learning, with convergence and stability analysis.

Findings

01

The algorithms recover, from demonstrated trajectories, cost function parameters of a game equivalent to the observed one.

02

Proposed methods are effective in both known and unknown system matrix scenarios.

03

Simulation results demonstrate the algorithms' convergence and accuracy.

Abstract

This paper addresses the inverse problem for linear-quadratic (LQ) nonzero-sum $N$-player differential games, where the goal is to learn the parameters of an unknown cost function of the game, called the observed game, given demonstrated trajectories that are known to be generated by stationary linear feedback Nash equilibrium laws. To this end, a synthesized game is constructed from the demonstrated data; it is required to be equivalent to the observed game in the sense that the trajectories generated by the equilibrium feedback laws of the $N$ players in the synthesized game coincide with the demonstrated trajectories. We present a model-based algorithm that accomplishes this task using the given trajectories. We then extend this algorithm to a model-free setting, solving the same problem when the system's matrices are unknown. The algorithms…
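As a rough illustration of what "equivalent game" means in the model-based case, the sketch below builds state weights $Q_i$ for which given feedback gains $K_i$ satisfy the standard coupled algebraic Riccati equations of a feedback Nash equilibrium. It is not the paper's algorithm. It assumes costs of the form $\int (x^\top Q_i x + u_i^\top R_i u_i)\,dt$ with no cross-player input weights, control laws $u_i = -K_i x$, chosen input weights $R_i$, and symmetric matrices $P_i$ already found such that $B_i^\top P_i = R_i K_i$. All function and variable names are hypothetical.

```python
import numpy as np

def closed_loop(A, Bs, Ks):
    """Closed-loop matrix A_c = A - sum_j B_j K_j under u_j = -K_j x."""
    return A - sum(B @ K for B, K in zip(Bs, Ks))

def synthesize_state_weights(A, Bs, Ks, Rs, Ps):
    """Return Q_i solving A_c^T P_i + P_i A_c + Q_i + K_i^T R_i K_i = 0.

    Assumption: each P_i is symmetric and consistent with the gain,
    i.e. B_i^T P_i = R_i K_i (checked below, not solved for).
    """
    Ac = closed_loop(A, Bs, Ks)
    Qs = []
    for B, K, R, P in zip(Bs, Ks, Rs, Ps):
        if not np.allclose(B.T @ P, R @ K):
            raise ValueError("P is not consistent with the gain K")
        Qs.append(-(Ac.T @ P + P @ Ac) - K.T @ R @ K)
    return Qs

# Scalar two-player example (illustrative numbers, not from the paper).
A = np.array([[1.0]])
Bs = [np.array([[1.0]]), np.array([[1.0]])]
Ks = [np.array([[2.0]]), np.array([[1.5]])]   # gains assumed estimated from data
Rs = [np.eye(1), np.eye(1)]                   # input weights fixed by choice
Ps = [np.array([[2.0]]), np.array([[1.5]])]   # satisfy B_i^T P_i = R_i K_i

Ac = closed_loop(A, Bs, Ks)
Qs = synthesize_state_weights(A, Bs, Ks, Rs, Ps)
print("closed-loop matrix:", Ac)
print("synthesized state weights:", Qs)
```

In this scalar example the closed loop is stable and both synthesized weights come out positive, so the gains are a feedback Nash equilibrium of the synthesized game. In the matrix case, finding a symmetric $P_i$ that matches the gain and yields a positive semidefinite $Q_i$ is the nontrivial part, and it is not attempted here.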

Taxonomy

Topics: Advanced Control Systems Optimization · Adaptive Dynamic Programming Control