Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning

Lingxiao Wang; Zhuoran Yang; Zhaoran Wang

arXiv:2006.11917 · cs.LG · June 23, 2020 · 6 cites

TL;DR

This paper introduces MF-FQI, a mean-field reinforcement learning algorithm that exploits agent symmetry and mean embeddings of state distributions. It comes with a non-asymptotic analysis showing that performance improves as more agents are observed.

Contribution

It proposes the MF-FQI algorithm for mean-field MARL, offering a non-asymptotic analysis and revealing a 'blessing of many agents' effect.

Findings

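01

MF-FQI comes with a non-asymptotic performance analysis.

02

Observing more agents improves the performance of MF-FQI (the "blessing of many agents").

03

The method handles mean-field states that are distributions supported on a continuous space.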

Abstract

Multi-agent reinforcement learning (MARL) has achieved significant empirical success. However, MARL suffers from the curse of many agents. In this paper, we exploit the symmetry of agents in MARL. In its most generic form, we study a mean-field MARL problem. Such a mean-field MARL problem is defined on mean-field states, which are distributions supported on a continuous space. Based on the mean embedding of these distributions, we propose the MF-FQI algorithm, which solves the mean-field MARL problem, and we establish a non-asymptotic analysis for it. We highlight that MF-FQI enjoys a "blessing of many agents" property, in the sense that a larger number of observed agents improves its performance.
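To make the mean-embedding idea concrete, the following is a minimal sketch of an empirical mean embedding of a population of agent states. It is not the paper's implementation: the Gaussian kernel, the random Fourier feature approximation, the standard normal state distribution, and all function names are assumptions chosen for illustration, and the Q-iteration step of MF-FQI is not shown. The sketch illustrates two points from the abstract: the embedding does not depend on the order of the agents, and its estimate gets closer to a large-population reference as more agents are observed.

```python
import numpy as np


def make_random_features(dim, num_features, bandwidth, rng):
    """Draw random Fourier features approximating a Gaussian kernel (an assumed choice)."""
    weights = rng.normal(scale=1.0 / bandwidth, size=(dim, num_features))
    offsets = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return weights, offsets


def mean_embedding(agent_states, weights, offsets):
    """Empirical mean embedding: the feature vector averaged over all agents."""
    num_features = weights.shape[1]
    features = np.sqrt(2.0 / num_features) * np.cos(agent_states @ weights + offsets)
    return features.mean(axis=0)


rng = np.random.default_rng(0)
weights, offsets = make_random_features(dim=2, num_features=64, bandwidth=1.0, rng=rng)

# Symmetry: shuffling the agents leaves the embedding unchanged.
states = rng.normal(size=(50, 2))
shuffled = states[rng.permutation(len(states))]
embedding = mean_embedding(states, weights, offsets)
embedding_shuffled = mean_embedding(shuffled, weights, offsets)

# Reference embedding computed from a very large population (hypothetical mean-field state).
reference = mean_embedding(rng.normal(size=(200_000, 2)), weights, offsets)


def average_error(num_agents, trials=200):
    """Average distance between the N-agent embedding and the reference."""
    errors = []
    for _ in range(trials):
        sample = rng.normal(size=(num_agents, 2))
        errors.append(np.linalg.norm(mean_embedding(sample, weights, offsets) - reference))
    return float(np.mean(errors))


errors = {n: average_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"agents={n:5d}  average embedding error={err:.4f}")
```

In this toy setting the printed error shrinks as the number of sampled agents grows, which is one simple way to read the "blessing of many agents": a fixed-size vector summarizes the population, and that summary becomes more accurate with more observations.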

Videos

Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning · SlidesLive

Taxonomy

Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics · Metaheuristic Optimization Algorithms Research