ε-Optimally Solving Two-Player Zero-Sum POSGs
Erwan Christian Escudie, Matthia Sabatelli, Olivier Buffet, Jilles Steeve Dibangoye

TL;DR
This paper introduces a lossless reduction from two-player zero-sum partially observable stochastic games to transition-independent zero-sum stochastic games, enabling the use of dynamic programming methods for solving complex game scenarios.
Contribution
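It presents the first lossless reduction from zs-POSGs to transition-independent zs-SGs, one that preserves value functions, equilibrium strategies, and optimality structure, allowing principled application of dynamic programming to solve zs-POSGs.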
Findings
PBVI algorithms produce ε-optimal strategies.
Reduction enables transfer of solution techniques from stochastic games.
Empirical results outperform existing methods.
Abstract
We present a novel framework for ε-optimally solving two-player zero-sum partially observable stochastic games (zs-POSGs). These games pose a major challenge due to the absence of a principled connection with dynamic programming (DP) techniques developed for two-player zero-sum stochastic games (zs-SGs). Prior attempts at transferring solution methods have lacked a lossless reduction, defined here as a transformation that preserves value functions, equilibrium strategies, and optimality structure, thereby limiting generalisation to ad-hoc algorithms. This work introduces the first lossless reduction from zs-POSGs to transition-independent zs-SGs, enabling the principled application of a broad class of DP-based methods. We show empirically that point-based value iteration (PBVI) algorithms, applied via this reduction, produce ε-optimal strategies across a range of…
Taxonomy
Topics: Reinforcement Learning in Robotics · Game Theory and Voting Systems · Risk and Portfolio Optimization
