Decentralized Natural Policy Gradient with Variance Reduction for Collaborative Multi-Agent Reinforcement Learning
Jinchi Chen, Jie Feng, Weiguo Gao, Ke Wei

TL;DR
This paper introduces MDNPG, a decentralized natural policy gradient algorithm with variance reduction. It attains the optimal convergence rate for decentralized policy gradient methods and outperforms existing algorithms empirically in collaborative multi-agent reinforcement learning where agents communicate only with their neighbors.
Contribution
The paper proposes MDNPG, a novel decentralized policy gradient method combining natural gradient, momentum-based variance reduction, and gradient tracking, with proven optimal convergence rates.
Findings
MDNPG achieves $\tilde{O}(n^{-1}\text{poly}(\frac{1}{\epsilon}))$ sample complexity.
MDNPG possesses a linear speedup in the number of agents $n$ relative to centralized methods.
Empirical results show MDNPG outperforms existing algorithms in multi-agent tasks.
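One common intuition for a linear speedup is that averaging independent gradient estimates from $n$ agents shrinks the variance of the combined estimate by a factor of roughly $n$. The sketch below illustrates only that intuition on a toy scalar problem; it is not the paper's analysis, and the function name, the noise model, and the constants are assumptions made for illustration.

```python
import random
import statistics


def averaged_gradient_variance(n_agents, n_trials=2000, noise_std=1.0, seed=0):
    """Empirical variance of the average of n_agents noisy gradient estimates.

    Hypothetical setup: every agent observes the same true scalar gradient
    corrupted by independent Gaussian noise.
    """
    rng = random.Random(seed)
    true_grad = 1.0  # assumed value, chosen only for illustration
    averages = []
    for _ in range(n_trials):
        samples = [true_grad + rng.gauss(0.0, noise_std) for _ in range(n_agents)]
        averages.append(sum(samples) / n_agents)
    return statistics.pvariance(averages)


var_1 = averaged_gradient_variance(1)
var_16 = averaged_gradient_variance(16)
# The ratio should be close to 16, up to sampling error.
print(var_1, var_16, var_1 / var_16)
```

In a decentralized setting the agents cannot average exactly in one step, which is where neighbor communication and gradient tracking come in.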
Abstract
This paper studies a policy optimization problem arising from collaborative multi-agent reinforcement learning in a decentralized setting where agents communicate with their neighbors over an undirected graph to maximize the sum of their cumulative rewards. A novel decentralized natural policy gradient method, dubbed Momentum-based Decentralized Natural Policy Gradient (MDNPG), is proposed, which incorporates natural gradient, momentum-based variance reduction, and gradient tracking into the decentralized stochastic gradient ascent framework. The $\tilde{O}(n^{-1}\text{poly}(\frac{1}{\epsilon}))$ sample complexity for MDNPG to converge to an $\epsilon$-stationary point has been established under standard assumptions, where $n$ is the number of agents. It indicates that MDNPG can achieve the optimal convergence rate for decentralized policy gradient methods and possesses a linear speedup in contrast to…
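To make the combination of ingredients concrete, here is a minimal sketch of decentralized stochastic gradient ascent with gradient tracking and a momentum-based (recursive) variance-reduced gradient estimator. It is not the authors' MDNPG implementation. Assumptions: each agent holds a toy scalar objective $-\frac{1}{2}(x - c_i)^2$ instead of a policy return, the agents sit on a ring with uniform mixing weights, the natural gradient preconditioner is replaced by the identity, and the noise is additive so no importance-sampling correction is needed. All names (`ring_mix`, `stochastic_grad`, `run`) and step sizes are hypothetical.

```python
import random


def ring_mix(values):
    """One gossip round on a ring: average each value with its two neighbors."""
    n = len(values)
    return [(values[i - 1] + values[i] + values[(i + 1) % n]) / 3.0
            for i in range(n)]


def stochastic_grad(x, target, noise):
    """Noisy gradient of the toy local objective -0.5 * (x - target) ** 2."""
    return -(x - target) + noise


def run(targets, steps=300, eta=0.1, beta=0.2, noise_std=0.5, seed=0):
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n  # local parameters, one per agent
    # v: variance-reduced gradient estimates; y: gradient trackers.
    v = [stochastic_grad(x[i], targets[i], rng.gauss(0.0, noise_std))
         for i in range(n)]
    y = list(v)
    for _ in range(steps):
        # Ascent step along the tracked direction, then mix with neighbors.
        x_new = ring_mix([x[i] + eta * y[i] for i in range(n)])
        # Momentum-based variance reduction: the same noise sample is used
        # at the new and the old iterate.
        noise = [rng.gauss(0.0, noise_std) for _ in range(n)]
        v_new = [
            stochastic_grad(x_new[i], targets[i], noise[i])
            + (1.0 - beta)
            * (v[i] - stochastic_grad(x[i], targets[i], noise[i]))
            for i in range(n)
        ]
        # Gradient tracking: mix trackers and add the change in local estimates.
        y = ring_mix([y[i] + v_new[i] - v[i] for i in range(n)])
        x, v = x_new, v_new
    return x


targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
final = run(targets)
# The sum of the local objectives is maximized at the mean of the targets (3.5).
print(final)
```

Each agent uses only its own samples and its neighbors' values, yet all local parameters should end up near the maximizer of the summed objective. In a reinforcement learning setting the sampling distribution depends on the policy, so the correction term would additionally need importance weights, and the update direction would be preconditioned by an estimate of the Fisher information.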
Taxonomy
Topics: Distributed Control Multi-Agent Systems · Advanced MIMO Systems Optimization
