Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming

Alec Koppel; Amrit Singh Bedi; Bhargav Ganguly; Vaneet Aggarwal

arXiv:2110.12929 · math.OC · October 26, 2021 · CDC

Open Access

TL;DR

This paper analyzes the convergence rates of multi-agent reinforcement learning under the average-reward criterion, proposes a linear programming approach with optimal sample complexity guarantees, and validates the theory with experiments.

Contribution

It introduces a multi-agent linear programming framework with stochastic primal-dual methods, achieving near-optimal sample complexity for average-reward MARL.
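
For orientation, the linear program behind this kind of approach can be written out for a single (centralized) average-reward MDP. This is the standard textbook form, not a formula quoted from the paper; the symbols $\mu$ (state-action occupancy measure), $v$ (dual variable, a differential value function), $P$ (transition kernel), and $r$ (reward) are notation assumed here for illustration.

```latex
% Average-reward LP over occupancy measures (assumed notation)
\begin{aligned}
\max_{\mu \ge 0} \quad & \sum_{s,a} \mu(s,a)\, r(s,a) \\
\text{s.t.} \quad & \sum_{a} \mu(s',a) = \sum_{s,a} P(s' \mid s,a)\, \mu(s,a) \quad \forall s', \\
& \sum_{s,a} \mu(s,a) = 1 .
\end{aligned}

% Saddle-point form obtained by dualizing the flow constraint with v
\min_{v} \; \max_{\mu \in \Delta} \; \sum_{s,a} \mu(s,a)
\Big( r(s,a) + \sum_{s'} P(s' \mid s,a)\, v(s') - v(s) \Big)
```

Here $\Delta$ is the probability simplex over state-action pairs. Stochastic primal-dual methods take sampled gradient steps on this saddle-point objective, which is what makes a model-free treatment possible.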

Findings

01. Sample complexity scales optimally with state and action space sizes.

02. Multi-agent LP approach converges to near-globally optimal solutions.

03. Experimental results support theoretical convergence guarantees.

Abstract

In tabular multi-agent reinforcement learning with the average-cost criterion, a team of agents sequentially interacts with the environment and observes local incentives. We focus on the case where the global reward is a sum of local rewards, the joint policy factorizes into agents' marginals, and the state is fully observable. To date, few global optimality guarantees exist even for this simple setting, as most results yield convergence to stationarity for parameterized policies in large/possibly continuous spaces. To solidify the foundations of MARL, we build upon linear programming (LP) reformulations, for which stochastic primal-dual methods yield a model-free approach to achieve optimal sample complexity in the centralized case. We develop multi-agent extensions, whereby agents solve their local saddle point problems and then perform local weighted averaging. We establish that the…
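
The "local weighted averaging" step mentioned in the abstract can be illustrated with a minimal Python sketch. It shows only the averaging round and omits the local saddle-point update entirely. The function name `weighted_average_step`, the three-agent line graph, and the mixing matrix `W` are hypothetical choices made for this illustration; the abstract does not specify the weights the paper uses.

```python
def weighted_average_step(local_iterates, weights):
    """One round of local weighted averaging (illustrative sketch).

    local_iterates: list of n vectors (lists of floats), one per agent.
    weights: n x n mixing matrix (an assumption of this sketch);
             weights[i][j] is the weight agent i places on agent j's iterate.
    Returns the list of mixed iterates, one per agent.
    """
    n = len(local_iterates)
    dim = len(local_iterates[0])
    return [
        [sum(weights[i][j] * local_iterates[j][k] for j in range(n))
         for k in range(dim)]
        for i in range(n)
    ]


# Hypothetical example: three agents on a line graph (0 - 1 - 2).
# W is symmetric and doubly stochastic, so averaging preserves the mean.
W = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.5],
]

# Each agent starts from a different scalar iterate; their mean is 3.0.
x = [[1.0], [2.0], [6.0]]

# Repeated mixing drives every agent's iterate toward the network mean.
for _ in range(60):
    x = weighted_average_step(x, W)
```

In a full method of this kind, each agent would first take a local primal-dual step on its own saddle-point problem and only then mix with its neighbors; the zero entries in `W` reflect that agents 0 and 2 do not communicate directly.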

Taxonomy

Topics: Reinforcement Learning in Robotics