Decentralized Q-Learning for Stochastic Teams and Games

Gürdal Arslan; Serdar Yüksel

arXiv:1506.07924·math.OC·May 3, 2016

TL;DR

This paper introduces decentralized Q-learning algorithms for stochastic games and shows that they converge to equilibrium policies in the weakly acyclic case, even though each decision maker learns from local information only while the other decision makers are learning at the same time.

Contribution

The paper presents novel decentralized Q-learning algorithms for stochastic games and proves their convergence in the weakly acyclic case, which includes team problems, under minimal information requirements.

Findings

01. Algorithms converge to equilibrium policies almost surely

02. Decentralized approach requires only local information

03. Applicable to a broad class of stochastic games

Abstract

There are only a few learning algorithms applicable to stochastic dynamic teams and games which generalize Markov decision processes to decentralized stochastic control problems involving possibly self-interested decision makers. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of the other decision makers who are also learning. In stochastic dynamic games, learning is more challenging because, while learning, the decision makers alter the state of the system and hence the future cost. In this paper, we present decentralized Q-learning algorithms for stochastic games, and study their convergence for the weakly acyclic case which includes team problems as an important special case. The algorithm is decentralized in that each decision maker has access…
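
As a rough illustration of what "decentralized" means here, the following is a minimal sketch of a generic independent Q-learning update, not the algorithm from the paper. It assumes each decision maker observes only the state, its own action, and its own cost, and keeps a Q-table indexed by (state, own action). The function names, the step size `alpha`, the discount factor `beta`, the exploration probability, and the two-agent coordination game are all hypothetical choices made for this example.

```python
import random


def q_update(q, state, action, cost, next_state, n_actions, alpha=0.1, beta=0.9):
    """One local Q-learning step for a cost-minimizing decision maker.

    q maps (state, own_action) to an estimated discounted cost. Other
    decision makers' actions never appear in the key (assumption of
    this sketch: they are not observed).
    """
    best_next = min(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = (1 - alpha) * old + alpha * (cost + beta * best_next)
    return q[(state, action)]


def choose_action(q, state, n_actions, explore_prob, rng):
    """Pick the lowest-cost action from the agent's own table, or explore."""
    if rng.random() < explore_prob:
        return rng.randrange(n_actions)
    return min(range(n_actions), key=lambda a: q.get((state, a), 0.0))


def run_team(steps=5000, seed=0):
    """Hypothetical team problem: one state, two agents, shared cost.

    The common cost is 0 when both actions match and 1 otherwise.
    Each agent updates its own table from its own action and the cost.
    """
    rng = random.Random(seed)
    tables = [{}, {}]
    state = 0
    for _ in range(steps):
        actions = [choose_action(t, state, 2, 0.1, rng) for t in tables]
        cost = 0.0 if actions[0] == actions[1] else 1.0
        for table, action in zip(tables, actions):
            q_update(table, state, action, cost, state, 2)
    return tables
```

Calling `run_team()` returns the two Q-tables; because the other agent's behavior changes while each table is being updated, the environment each agent faces is non-stationary, which is the difficulty the abstract describes.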
