Decentralized Natural Policy Gradient with Variance Reduction for Collaborative Multi-Agent Reinforcement Learning

Jinchi Chen; Jie Feng; Weiguo Gao; Ke Wei

arXiv:2209.02179 · math.OC · September 7, 2022 · J. Mach. Learn. Res.

TL;DR

This paper introduces MDNPG, a decentralized natural policy gradient algorithm with variance reduction, achieving optimal convergence and superior empirical results in multi-agent reinforcement learning with communication constraints.

Contribution

The paper proposes MDNPG, a novel decentralized policy gradient method combining natural gradient, momentum-based variance reduction, and gradient tracking, with proven optimal convergence rates.
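
Two of the ingredients named above, momentum-based variance reduction and gradient tracking, can be illustrated on a toy problem. The sketch below is not the authors' implementation (see the linked repository for that) and makes several simplifying assumptions: each agent holds a scalar parameter, the local objective is a concave quadratic standing in for the cumulative reward, the agents sit on a ring graph with uniform mixing weights, and the natural-gradient (Fisher) preconditioning and the importance-sampling correction used in policy-gradient settings are omitted. All function and parameter names (`ring_mixing_matrix`, `run`, `eta`, `beta`, `sigma`) are hypothetical.

```python
import random


def ring_mixing_matrix(n):
    """Doubly stochastic weights for a ring (n >= 3): each agent averages
    itself and its two neighbours with weight 1/3 each."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def mix(W, x):
    """One round of neighbour communication: x <- W x."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


def stochastic_grad(theta, target, noise):
    """Noisy gradient of the toy local objective J_i(theta) = -0.5 * (theta - target)^2."""
    return -(theta - target) + noise


def run(targets, steps=2000, eta=0.05, beta=0.1, sigma=0.5, seed=0):
    rng = random.Random(seed)
    n = len(targets)
    W = ring_mixing_matrix(n)
    theta = [0.0] * n
    # v: local variance-reduced gradient estimate; y: tracker of the network average of v.
    v = [stochastic_grad(theta[i], targets[i], rng.gauss(0.0, sigma)) for i in range(n)]
    y = list(v)
    for _ in range(steps):
        # Ascent step along the tracked direction, followed by neighbour averaging.
        theta_new = mix(W, [theta[i] + eta * y[i] for i in range(n)])
        v_new = []
        for i in range(n):
            # The same sample is evaluated at the new and the old iterate.
            noise = rng.gauss(0.0, sigma)
            g_new = stochastic_grad(theta_new[i], targets[i], noise)
            g_old = stochastic_grad(theta[i], targets[i], noise)
            # Momentum-based (STORM-style) recursive estimator.
            v_new.append(beta * g_new + (1.0 - beta) * (v[i] + g_new - g_old))
        # Gradient tracking: mix the trackers after adding the local change in v.
        y = mix(W, [y[i] + v_new[i] - v[i] for i in range(n)])
        theta, v = theta_new, v_new
    return theta


if __name__ == "__main__":
    # The sum of the local objectives is maximized at the mean of the targets.
    print(run([1.0, 2.0, 3.0, 4.0, 5.0]))
```

In this toy setting every agent's parameter should settle near the mean of the targets, even though each agent only sees its own target and its two neighbours' messages. The order of the mixing and update steps here is one common arrangement and may differ from the one analysed in the paper.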

Findings

01

MDNPG achieves $\tilde{O}(n^{-1}\,\text{poly}(\frac{1}{\epsilon}))$ sample complexity.

02

MDNPG achieves a linear speedup in the number of agents $n$ over centralized methods.

03

Empirical results show MDNPG outperforms existing algorithms in multi-agent tasks.

Abstract

This paper studies a policy optimization problem arising from collaborative multi-agent reinforcement learning in a decentralized setting where agents communicate with their neighbors over an undirected graph to maximize the sum of their cumulative rewards. A novel decentralized natural policy gradient method, dubbed Momentum-based Decentralized Natural Policy Gradient (MDNPG), is proposed, which incorporates natural gradient, momentum-based variance reduction, and gradient tracking into the decentralized stochastic gradient ascent framework. The $O(n^{-1}\epsilon^{-3})$ sample complexity for MDNPG to converge to an $\epsilon$-stationary point has been established under standard assumptions, where $n$ is the number of agents. It indicates that MDNPG can achieve the optimal convergence rate for decentralized policy gradient methods and possesses a linear speedup in contrast to…

Code & Models

Repositories

fccc0417/mdnpg
PyTorch · Official

Taxonomy

Topics: Distributed Control Multi-Agent Systems · Advanced MIMO Systems Optimization