An Online Learning Approach for Two-Player Zero-Sum Linear Quadratic Games

Shanting Wang; Weihao Sun; Andreas A. Malikopoulos

arXiv:2604.02619·eess.SY·April 6, 2026

TL;DR

This paper introduces an online learning framework for two-player zero-sum linear quadratic games with unknown dynamics, combining model estimation, confidence sets, and surrogate models to ensure convergence and stability.

Contribution

The paper integrates regularized least squares model estimation, high-probability confidence sets, and surrogate model selection to maintain a regular model for policy updates when the dynamics are unknown.

Findings

01

The paper establishes a regret analysis for the convergence of the algorithm.

02

A numerical example illustrates the convergence performance and verifies the regret analysis.

03

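A shrinkage step at each episode identifies a surrogate model in the region where the generalized algebraic Riccati equation admits a stabilizing saddle point solution.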

Abstract

In this paper, we present an online learning approach for two-player zero-sum linear quadratic games with unknown dynamics. We develop a framework combining regularized least squares model estimation, high probability confidence sets, and surrogate model selection to maintain a regular model for policy updates. We apply a shrinkage step at each episode to identify a surrogate model in the region where the generalized algebraic Riccati equation admits a stabilizing saddle point solution. We then establish regret analysis on algorithm convergence, followed by a numerical example to illustrate the convergence performance and verify the regret analysis.
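The regularized least squares estimation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a discrete-time model x_{t+1} = A x_t + B u_t + C w_t + noise, where u and w are the two players' inputs; the function name `rls_estimate`, the regularization weight `lam`, and the example matrices are all hypothetical choices made for illustration.

```python
import numpy as np

def rls_estimate(X, U, W, X_next, lam=1.0):
    """Ridge (regularized least squares) estimate of [A B C].

    Assumed model (an illustration, not taken from the paper):
        x_{t+1} = A x_t + B u_t + C w_t + noise
    Each row of X, U, W, X_next holds one time step.
    """
    Z = np.hstack([X, U, W])                     # regressors, shape (T, n + m1 + m2)
    V = Z.T @ Z + lam * np.eye(Z.shape[1])       # regularized Gram matrix
    Theta = np.linalg.solve(V, Z.T @ X_next)     # shape (n + m1 + m2, n)
    return Theta.T, V                            # Theta.T = [A_hat B_hat C_hat]

# Hypothetical stable system used only to generate example data.
rng = np.random.default_rng(0)
n, m1, m2, T = 2, 1, 1, 500
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.5], [0.0]])

X = np.zeros((T + 1, n))
U = rng.normal(size=(T, m1))                     # exploratory inputs, player 1
W = rng.normal(size=(T, m2))                     # exploratory inputs, player 2
for t in range(T):
    X[t + 1] = A @ X[t] + B @ U[t] + C @ W[t] + 0.01 * rng.normal(size=n)

Theta_hat, V = rls_estimate(X[:-1], U, W, X[1:], lam=1e-3)
A_hat = Theta_hat[:, :n]
B_hat = Theta_hat[:, n:n + m1]
C_hat = Theta_hat[:, n + m1:]
```

The Gram matrix `V` is returned alongside the estimate because confidence sets in this style of analysis are typically ellipsoids shaped by it. How the paper builds its confidence sets and performs the shrinkage step is not specified here, so those parts are left out of the sketch.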
