Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence

Jiduan Wu; Anas Barakat; Ilyas Fatkhullin; Niao He

arXiv:2309.04272·eess.SY·August 19, 2025

TL;DR

This paper introduces a simplified nested zeroth-order algorithm for zero-sum linear quadratic games, achieving improved sample complexity and guaranteed last-iterate convergence in both deterministic and model-free settings.

Contribution

The paper presents the first global last-iterate linear convergence result for zero-sum LQ games and improves sample efficiency with a novel nested Zeroth-Order (ZO) algorithm.

Findings

01

Achieves $(1/ϵ)^2$ sample complexity in the model-free setting.

02

Establishes the first last-iterate linear convergence for zero-sum LQ games.

03

Improves sample complexity by several orders of magnitude over previous methods.

Abstract

Zero-sum Linear Quadratic (LQ) games are fundamental in optimal control and can be used (i) as a dynamic game formulation for risk-sensitive or robust control and (ii) as a benchmark setting for multi-agent reinforcement learning with two competing agents in continuous state-control spaces. In contrast to the well-studied single-agent linear quadratic regulator problem, zero-sum LQ games entail solving a challenging nonconvex-nonconcave min-max problem with an objective function that lacks coercivity. Recently, Zhang et al. showed that an $ϵ$-Nash equilibrium (NE) of finite-horizon zero-sum LQ games can be learned via nested model-free Natural Policy Gradient (NPG) algorithms with $\mathrm{poly}(1/ϵ)$ sample complexity. In this work, we propose a simpler nested Zeroth-Order (ZO) algorithm improving sample complexity by several orders of magnitude and guaranteeing convergence of the…
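To make the idea of a nested zeroth-order scheme concrete, here is a minimal sketch under strong simplifying assumptions. It is not the paper's algorithm: the objective `toy_objective` is a hypothetical strongly-convex-strongly-concave quadratic standing in for the LQ game cost, and the function names, step size, smoothing radius, and sample counts are all illustrative choices. The sketch only shows the general pattern: gradients are estimated from function values alone (a two-point estimator with random unit directions), an inner loop approximately solves the maximization, and an outer loop takes a descent step for the minimizing player.

```python
import math
import random


def sphere_sample(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def zo_gradient(f, x, radius, num_samples, rng):
    """Two-point zeroth-order gradient estimate of f at x (function values only)."""
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(num_samples):
        u = sphere_sample(dim, rng)
        x_plus = [xi + radius * ui for xi, ui in zip(x, u)]
        x_minus = [xi - radius * ui for xi, ui in zip(x, u)]
        scale = dim * (f(x_plus) - f(x_minus)) / (2.0 * radius)
        for i in range(dim):
            grad[i] += scale * u[i] / num_samples
    return grad


def toy_objective(x, y):
    """Hypothetical stand-in objective (NOT the LQ game cost); saddle point at 0."""
    return (sum(a * a for a in x)
            + sum(a * b for a, b in zip(x, y))
            - sum(b * b for b in y))


def nested_zo(x0, y0, outer_steps=60, inner_steps=20, lr=0.1,
              radius=1e-3, num_samples=10, seed=0):
    """Nested scheme: inner ZO ascent in y, then one outer ZO descent step in x."""
    rng = random.Random(seed)
    x, y = list(x0), list(y0)
    for _ in range(outer_steps):
        # Inner loop: approximately maximize over y for the current x.
        for _ in range(inner_steps):
            g_y = zo_gradient(lambda v: toy_objective(x, v), y,
                              radius, num_samples, rng)
            y = [yi + lr * gi for yi, gi in zip(y, g_y)]
        # Outer step: descend in x at the approximate best response y.
        g_x = zo_gradient(lambda v: toy_objective(v, y), x,
                          radius, num_samples, rng)
        x = [xi - lr * gi for xi, gi in zip(x, g_x)]
    return x, y


x_final, y_final = nested_zo([1.0, -1.0], [0.5, 0.5])
print("final x:", x_final)
print("final y:", y_final)
```

On this toy problem both iterates shrink toward the saddle point at the origin. The actual LQ setting differs in important ways (the variables are feedback gain matrices, the objective is nonconvex-nonconcave and lacks coercivity, and costs are estimated from sampled trajectories), so the sketch should be read only as an illustration of the nested structure.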

Code & Models

Repositories

wujiduan/zero-sum-lq-games (PyTorch, official)

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research