Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning

Zhiyao Zhang; Myeung Suk Oh; FNU Hairi; Ziyue Luo; Alvaro Velasquez; Jia Liu

arXiv:2505.18433·cs.LG·August 14, 2025

TL;DR

This paper introduces a deep neural actor-critic method for decentralized multi-agent reinforcement learning, providing the first finite-time global convergence guarantee in this setting.

Contribution

It develops the first deep neural actor-critic algorithm for decentralized MARL with a proven global optimality guarantee and a finite-time convergence rate.

Findings

01. Achieves a finite-time convergence rate of O(1/T), where T is the total number of iterations.

02. Provides the first global optimality guarantee for deep neural actor-critic methods in decentralized MARL.

03. Numerical experiments corroborate the theoretical results.

Abstract

Actor-critic methods for decentralized multi-agent reinforcement learning (MARL) facilitate collaborative optimal decision making without centralized coordination, thus enabling a wide range of applications in practice. To date, however, most theoretical convergence studies for existing actor-critic decentralized MARL methods are limited to the guarantee of a stationary solution under the linear function approximation. This leaves a significant gap between the highly successful use of deep neural actor-critic for decentralized MARL in practice and the current theoretical understanding. To bridge this gap, in this paper, we make the first attempt to develop a deep neural actor-critic method for decentralized MARL, where both the actor and critic components are inherently non-linear. We show that our proposed method enjoys a global optimality guarantee with a finite-time convergence rate…
