Multi-agent learning under uncertainty: Recurrence vs. concentration
Kyriakos Lotidis, Panayotis Mertikopoulos, Nicholas Bambos, Jose Blanchet

TL;DR
This paper investigates the long-term behavior of multi-agent regularized learning in continuous games under uncertainty, revealing that dynamics may not converge but tend to concentrate around equilibrium in strongly monotone games.
Contribution
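It analyzes two stochastic models of regularized learning in continuous games, one in continuous time and one in discrete time, and shows that, in strongly monotone games, play returns to the vicinity of equilibrium in finite time and its long-run distribution concentrates around a neighborhood of the equilibrium.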
Findings
Dynamics do not converge in general under uncertainty.
In strongly monotone games, trajectories return near equilibrium infinitely often.
Long-run distributions are sharply concentrated around equilibrium neighborhoods.
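The findings above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a toy two-player quadratic game whose unique equilibrium is the origin, unconstrained actions with Euclidean regularization (so each update is a plain gradient step), a constant step size `gamma`, and Gaussian gradient noise of scale `sigma`. The names `payoff_gradients`, `simulate`, and `radius` are hypothetical.

```python
import math
import random


def payoff_gradients(x1, x2, c=0.5):
    """Individual payoff gradients of a toy strongly monotone game (assumed).

    u1 = -x1**2 / 2 - c * x1 * x2 and u2 = -x2**2 / 2 + c * x1 * x2,
    so the unique equilibrium is (0, 0).
    """
    return -x1 - c * x2, -x2 + c * x1


def simulate(steps=20000, gamma=0.05, sigma=1.0, radius=0.5, seed=0):
    """Run noisy gradient play with a constant step size.

    Tracks how often play lies within `radius` of the equilibrium and how
    many times it leaves that neighborhood after having been inside it.
    """
    rng = random.Random(seed)
    x1, x2 = 2.0, -2.0  # arbitrary starting point away from equilibrium
    inside_count = 0
    exits = 0
    was_inside = False
    dist = math.hypot(x1, x2)
    for _ in range(steps):
        g1, g2 = payoff_gradients(x1, x2)
        # Gradient step perturbed by zero-mean Gaussian noise (assumed model).
        x1 += gamma * (g1 + sigma * rng.gauss(0.0, 1.0))
        x2 += gamma * (g2 + sigma * rng.gauss(0.0, 1.0))
        dist = math.hypot(x1, x2)
        now_inside = dist <= radius
        if now_inside:
            inside_count += 1
        if was_inside and not now_inside:
            exits += 1
        was_inside = now_inside
    return {
        "fraction_inside": inside_count / steps,
        "exits": exits,
        "final_distance": dist,
    }


if __name__ == "__main__":
    print("noisy:        ", simulate(sigma=1.0))
    print("deterministic:", simulate(sigma=0.0))
```

In this toy setting, the run with `sigma=0.0` settles at the equilibrium and never leaves the neighborhood, while the noisy run keeps leaving and re-entering it yet spends most of its time inside. This mirrors the recurrence and concentration behavior described above without reproducing the paper's models or estimates.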
Abstract
In this paper, we examine the convergence landscape of multi-agent learning under uncertainty. Specifically, we analyze two stochastic models of regularized learning in continuous games -- one in continuous and one in discrete time -- with the aim of characterizing the long-run behavior of the induced sequence of play. In stark contrast to deterministic, full-information models of learning (or models with a vanishing learning rate), we show that the resulting dynamics do not converge in general. In lieu of this, we ask instead which actions are played more often in the long run, and by how much. We show that, in strongly monotone games, the dynamics of regularized learning may wander away from equilibrium infinitely often, but they always return to its vicinity in finite time (which we estimate), and their long-run distribution is sharply concentrated around a neighborhood thereof. We…
Taxonomy
Topics: Game Theory and Applications · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
