A Lyapunov Optimization Approach to Repeated Stochastic Games
Michael J. Neely

TL;DR
This paper introduces a Lyapunov optimization framework for repeated stochastic games, enabling online decision-making to maximize utilities under equilibrium constraints without prior probability knowledge.
Contribution
It develops a Lyapunov-based online algorithm for repeated stochastic games that satisfies the equilibrium constraints and maximizes a concave function of time-average utilities without knowledge of the event probability distributions.
Findings
The algorithm has polynomial convergence time.
Players are incentivized to participate through equilibrium constraints.
The method can also compute a correlated equilibrium, at the cost of higher complexity.
Abstract
This paper considers a time-varying game with N players. Every time slot, players observe their own random events and then take a control action. The events and control actions affect the individual utilities earned by each player. The goal is to maximize a concave function of time average utilities subject to equilibrium constraints. Specifically, participating players are provided access to a common source of randomness from which they can optimally correlate their decisions. The equilibrium constraints incentivize participation by ensuring that players cannot earn more utility if they choose not to participate. This form of equilibrium is similar to the notions of Nash equilibrium and correlated equilibrium, but is simpler to attain. A Lyapunov method is developed that solves the problem in an online max-weight fashion by selecting actions based on a set of time-varying…
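As a rough illustration of the max-weight idea mentioned in the abstract, the sketch below runs a generic drift-plus-penalty loop on a toy single-agent problem. It is not the paper's algorithm: the multi-player game, the equilibrium constraints, and the shared randomness are all omitted. The names simulate, V, and budget, the event set {1, 2, 3}, and the utility and cost model are assumptions invented for this example. What it does show is an action chosen each slot by weighing utility against a time-varying virtual-queue weight, with no knowledge of the event probabilities.

```python
import random


def simulate(V=10.0, T=20000, budget=0.5, seed=0):
    """Toy drift-plus-penalty loop (illustrative assumptions throughout).

    Each slot: observe a random event w in {1, 2, 3}, choose a in {0, 1}.
    Utility is w * a, cost is a. Goal: maximize time-average utility
    subject to time-average cost <= budget.
    """
    rng = random.Random(seed)
    Q = 0.0  # virtual queue: the time-varying weight on the constraint
    total_utility = 0.0
    total_cost = 0.0

    for _ in range(T):
        w = rng.choice([1, 2, 3])  # event; its distribution is never used below

        # Max-weight style decision: pick the action a that maximizes
        #   V * utility(w, a) - Q * (cost(a) - budget).
        # For a in {0, 1} this reduces to: act iff V * w > Q.
        a = 1 if V * w > Q else 0

        total_utility += w * a
        total_cost += a

        # Virtual queue update: grows when the slot overspends the budget.
        Q = max(Q + a - budget, 0.0)

    return total_utility / T, total_cost / T, Q


if __name__ == "__main__":
    avg_u, avg_c, final_Q = simulate()
    print(f"avg utility={avg_u:.3f}  avg cost={avg_c:.3f}  final Q={final_Q:.1f}")
```

In this toy model the queue never exceeds 3V + budget, so the average cost overshoots the budget by at most that bound divided by T. A larger V pushes the average utility closer to the best achievable value but lets the queue grow larger, which slows how quickly the constraint is met.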
Taxonomy
Topics: Game Theory and Applications · Game Theory and Voting Systems · Advanced Bandit Algorithms Research
