ε-Optimally Solving Two-Player Zero-Sum POSGs

Erwan Christian Escudie; Matthia Sabatelli; Olivier Buffet; Jilles Steeve Dibangoye

arXiv:2511.11282 · cs.GT · November 17, 2025

Open Access

TL;DR

This paper introduces a lossless reduction from two-player zero-sum partially observable stochastic games (zs-POSGs) to transition-independent zero-sum stochastic games, so that dynamic programming methods developed for stochastic games can be applied to zs-POSGs.

Contribution

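The paper presents the first lossless reduction from zs-POSGs to transition-independent zero-sum stochastic games. The reduction preserves value functions, equilibrium strategies, and optimality structure, which allows dynamic programming to be applied to zs-POSGs in a principled way.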

Findings

01  PBVI algorithms applied via the reduction produce ε-optimal strategies.

02  The reduction enables the transfer of solution techniques from stochastic games.

03  Empirically, the approach outperforms existing methods.

Abstract

We present a novel framework for ε-optimally solving two-player zero-sum partially observable stochastic games (zs-POSGs). These games pose a major challenge due to the absence of a principled connection with dynamic programming (DP) techniques developed for two-player zero-sum stochastic games (zs-SGs). Prior attempts at transferring solution methods have lacked a lossless reduction, defined here as a transformation that preserves value functions, equilibrium strategies, and optimality structure, thereby limiting generalisation to ad-hoc algorithms. This work introduces the first lossless reduction from zs-POSGs to transition-independent zs-SGs, enabling the principled application of a broad class of DP-based methods. We show empirically that point-based value iteration (PBVI) algorithms, applied via this reduction, produce ε-optimal strategies across a range of…
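For readers unfamiliar with the DP techniques for zs-SGs that the abstract refers to, the following is a minimal sketch of classical Shapley-style value iteration on a fully observable, discounted zero-sum stochastic game. It is not the paper's reduction and not its PBVI algorithm. The toy game, the state names, the restriction to two actions per player, and the function names `matrix_game_value` and `shapley_value_iteration` are all assumptions made for illustration.

```python
def matrix_game_value(m):
    """Value of a 2x2 zero-sum matrix game for the row (maximizing) player."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best guaranteed payoff for the row player
    minimax = min(max(a, c), max(b, d))  # best guaranteed cap for the column player
    if maximin == minimax:
        # A pure-strategy saddle point exists.
        return maximin
    # No saddle point: closed-form value of the mixed equilibrium of a 2x2 game.
    return (a * d - b * c) / (a + d - b - c)


def shapley_value_iteration(rewards, transitions, gamma, iterations=200):
    """Repeatedly apply the backup V(s) <- val[r(s,a,b) + gamma * E[V(s')]].

    rewards[s][a][b] is the stage reward to the maximizer (hypothetical layout).
    transitions[s][a][b] maps each next state to its probability.
    """
    values = {s: 0.0 for s in rewards}
    for _ in range(iterations):
        new_values = {}
        for s in rewards:
            # Build the 2x2 stage game whose entries include discounted continuation values.
            stage = [
                [
                    rewards[s][a][b]
                    + gamma * sum(p * values[t] for t, p in transitions[s][a][b].items())
                    for b in range(2)
                ]
                for a in range(2)
            ]
            new_values[s] = matrix_game_value(stage)
        values = new_values
    return values


# Hypothetical toy game with two absorbing states.
rewards = {
    "pennies": [[1.0, -1.0], [-1.0, 1.0]],  # matching pennies at every step
    "bonus": [[1.0, 1.0], [1.0, 1.0]],      # constant reward of 1 at every step
}
transitions = {
    "pennies": [[{"pennies": 1.0}, {"pennies": 1.0}], [{"pennies": 1.0}, {"pennies": 1.0}]],
    "bonus": [[{"bonus": 1.0}, {"bonus": 1.0}], [{"bonus": 1.0}, {"bonus": 1.0}]],
}

values = shapley_value_iteration(rewards, transitions, gamma=0.5)
print(values)
```

In this toy game the "pennies" state has value 0 and the "bonus" state converges to 1 / (1 - 0.5) = 2. Games with more than two actions per player would need a linear program in place of the closed-form 2x2 solver.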

Taxonomy

Topics: Reinforcement Learning in Robotics · Game Theory and Voting Systems · Risk and Portfolio Optimization