Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus

Qiwen Cui; Simon S. Du

arXiv:2206.00159·cs.LG·October 17, 2022

TL;DR

This paper introduces a strategy-wise bonus approach for offline multi-agent reinforcement learning, achieving improved sample complexity bounds and computational efficiency over prior point-wise methods, especially in large action spaces.

Contribution

It proposes the strategy-wise concentration principle, leading to algorithms with better sample complexity and computational efficiency for multi-agent Markov games.

Findings

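1. For multi-agent general-sum Markov games, sample complexity scales with the sum of the players' action-set sizes, $\sum_{i = 1}^{m} A_{i}$, not with the size of the joint action space.

2. The algorithms can take a pre-specified strategy class as input, with sample complexity that scales logarithmically in the size of that class.

3. For two-player zero-sum Markov games, the algorithm achieves a better dependence on the number of actions than prior methods based on the point-wise bonus.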
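
To make the gap in the first finding concrete, the short sketch below compares the sum of per-player action counts with the size of the joint action space. The function name and the example sizes (5 players, 10 actions each) are illustrative assumptions, not values from the paper.

```python
from math import prod


def action_space_sizes(action_counts):
    """Return (sum of per-player action counts, size of the joint action space)."""
    return sum(action_counts), prod(action_counts)


# Hypothetical example: m = 5 players with A_i = 10 actions each.
total, joint = action_space_sizes([10] * 5)
print(total, joint)  # 50 100000
```

The sum grows linearly in the number of players while the product grows exponentially, which is why a bound in terms of the sum matters when there are many players.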

Abstract

This paper considers offline multi-agent reinforcement learning. We propose the strategy-wise concentration principle, which directly builds a confidence interval for the joint strategy, in contrast to the point-wise concentration principle, which builds a confidence interval for each point in the joint action space. For two-player zero-sum Markov games, by exploiting the convexity of the strategy-wise bonus, we propose a computationally efficient algorithm whose sample complexity enjoys a better dependency on the number of actions than prior methods based on the point-wise bonus. Furthermore, for offline multi-agent general-sum Markov games, based on the strategy-wise bonus and a novel surrogate function, we give the first algorithm whose sample complexity scales only with $\sum_{i = 1}^{m} A_{i}$, where $A_{i}$ is the action size of the $i$-th player and $m$ is the number of players. In sharp…
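
The contrast between the two concentration principles can be illustrated with a minimal sketch for a single state in a two-player game. It assumes rewards in [0, 1], fixed positive visit counts `counts[a][b]` for every joint action, and plain Hoeffding bounds; the function names, the confidence level `delta`, and the exact form of both bonuses are assumptions for illustration and may differ from the definitions in the paper. The point-wise variant averages one interval width per joint action under the strategy pair, while the strategy-wise variant applies a single Hoeffding bound to the weighted reward estimate of the joint strategy.

```python
import math


def pointwise_bonus(mu, nu, counts, delta=0.05):
    """Average of per-cell Hoeffding widths under the product strategy mu x nu.

    Illustrative only: no union bound over cells is applied.
    """
    c = math.sqrt(math.log(2 / delta) / 2)
    return sum(
        mu[a] * nu[b] * c / math.sqrt(counts[a][b])
        for a in range(len(mu))
        for b in range(len(nu))
    )


def strategywise_bonus(mu, nu, counts, delta=0.05):
    """Single Hoeffding width for the estimate sum_{a,b} mu[a] * nu[b] * r_hat[a][b]."""
    c = math.sqrt(math.log(2 / delta) / 2)
    weighted = sum(
        (mu[a] * nu[b]) ** 2 / counts[a][b]
        for a in range(len(mu))
        for b in range(len(nu))
    )
    return c * math.sqrt(weighted)


# Hypothetical data: uniform strategies over 2 actions each, 100 samples per cell.
mu, nu = [0.5, 0.5], [0.5, 0.5]
counts = [[100, 100], [100, 100]]
pw = pointwise_bonus(mu, nu, counts)
sw = strategywise_bonus(mu, nu, counts)
print(sw <= pw)  # True
```

In this sketch the strategy-wise width is a weighted Euclidean norm of the strategy weights, so it never exceeds the point-wise average (an l2 norm is at most the corresponding l1 norm) and it is convex in each player's strategy when the other is held fixed.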

Videos

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus (SlidesLive)

Taxonomy

Topics: Reinforcement Learning in Robotics · Game Theory and Applications · Auction Theory and Applications