Decentralized Optimal Equilibrium Learning in Stochastic Games via Single-bit Feedback

Seref Taha Kiremitci; Ahmed Said Donmez; Muhammed O. Sayin

arXiv:2602.12830·cs.GT·February 16, 2026

TL;DR

This paper introduces a decentralized learning method for stochastic games that uses minimal feedback to coordinate agents on welfare-optimizing equilibria, with proven finite-time regret guarantees.

Contribution

It proposes a novel single-bit feedback signaling mechanism for decentralized equilibrium selection that aligns with social welfare objectives in stochastic games.

Findings

01. Achieves logarithmic expected regret under mild conditions.

02. Develops explore-and-commit and online algorithms for general stochastic games.

03. Handles heterogeneous model-based and model-free methods.

Abstract

We study decentralized equilibrium selection in stochastic games under severe information and communication constraints. In such settings, convergence to equilibrium alone is insufficient, as stochastic games typically admit many equilibria with markedly different welfare properties. We address decentralized optimal equilibrium selection, where agents coordinate on equilibria that optimize a designer-specified social welfare objective while allowing heterogeneous tolerance to deviations from strict best responses. Agents observe only the global state trajectory and their realized rewards, and exchange a single randomized bit of feedback per agent per round. This semantic content/discontent signaling mechanism implicitly aligns decentralized learning dynamics with the global welfare objective. We develop explore-and-commit and online variants applicable to general stochastic games,…
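To make the single-bit idea concrete, here is a toy explore-and-commit sketch. It is not the paper's algorithm. Everything in it is an assumption made for illustration: the two-agent, single-state game, the `REWARDS` table, the shared exploration schedule `PROFILES`, the rule that an agent signals "content" with probability equal to its realized reward, and the commit rule that picks the schedule slot with the most unanimous "content" rounds. Under these assumptions, each agent needs only its own reward and the public bits, and the tally favors the profile with the largest product of rewards.

```python
import random

# Hypothetical two-agent, single-state game (an assumption, not from the paper).
# REWARDS[(a0, a1)] = (reward of agent 0, reward of agent 1), each in [0, 1].
REWARDS = {
    (0, 0): (0.9, 0.8),
    (0, 1): (0.1, 0.2),
    (1, 0): (0.2, 0.1),
    (1, 1): (0.5, 0.4),
}

# Assumed shared exploration schedule: slot k corresponds to one joint profile,
# but agent i only needs to know component i of each slot.
PROFILES = sorted(REWARDS)


def feedback_bit(reward, rng):
    """Return one randomized bit: 1 ("content") with probability `reward`."""
    return 1 if rng.random() < reward else 0


def explore_and_commit(rounds_per_profile=200, seed=0):
    """Explore every slot, tally unanimous "content" rounds, then commit."""
    rng = random.Random(seed)
    # The bits are public, so every agent can keep this same tally locally.
    content_counts = [0] * len(PROFILES)
    for slot, profile in enumerate(PROFILES):
        for _ in range(rounds_per_profile):
            # Each agent sees only its own reward and emits a single bit.
            bits = [feedback_bit(r, rng) for r in REWARDS[profile]]
            if all(bits):
                content_counts[slot] += 1
    # Commit to the slot with the most unanimous "content" rounds.
    best_slot = max(range(len(PROFILES)), key=content_counts.__getitem__)
    return PROFILES[best_slot], content_counts


if __name__ == "__main__":
    committed, counts = explore_and_commit()
    print("committed profile:", committed)
    print("content counts per slot:", counts)
```

In this toy, the chance that all agents are content in a round equals the product of their rewards, so the commit step implicitly targets a product-form welfare objective. The paper's designer-specified objective, its tolerance to approximate best responses, and its treatment of state transitions are not modeled here.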

Taxonomy

Topics: Game Theory and Applications · Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics