Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo, Xiao Li

TL;DR
This paper provides the first finite-time convergence analysis of a decentralized single-timescale Actor-Critic algorithm in multi-agent reinforcement learning, showing optimal sample complexity and practical advantages.
Contribution
The paper gives a new theoretical analysis of decentralized single-timescale AC that reveals hidden smoothness properties and establishes optimal sample complexity results.
Findings
Achieves a sample complexity of Õ(ε^{-2}) for decentralized AC.
Yields new sample complexity results for centralized AC with single-timescale updates.
Demonstrates superior performance of the proposed algorithm over existing methods.
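As a reading aid for the ε^{-2} figure above, the short derivation below shows how a convergence rate in the number of samples T translates into a sample complexity. It assumes the usual stationarity criterion for nonconvex policy optimization (expected squared gradient norm of the objective J at most ε). The constant C is hypothetical and may hide logarithmic factors; the paper's exact criterion and constants are not reproduced here.

```latex
% Assumed rate: the best iterate after T samples satisfies
\[
  \min_{0 \le t < T} \mathbb{E}\,\bigl\| \nabla J(\theta_t) \bigr\|^2
  \;\le\; \frac{C}{\sqrt{T}} .
\]
% Requiring the right-hand side to be at most \varepsilon gives
\[
  \frac{C}{\sqrt{T}} \le \varepsilon
  \quad\Longleftrightarrow\quad
  T \;\ge\; \frac{C^2}{\varepsilon^{2}} ,
\]
% i.e. on the order of \varepsilon^{-2} samples.
% Example with C = 1 and \varepsilon = 0.01: T \ge 10^{4}.
```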
Abstract
Decentralized Actor-Critic (AC) algorithms have been widely used for multi-agent reinforcement learning (MARL) and have achieved remarkable success. Despite this empirical success, the theoretical convergence properties of decentralized AC algorithms remain largely unexplored. Most existing finite-time convergence results rely on either a double-loop update or a two-timescale step-size rule, and this is the case even for centralized AC algorithms in the single-agent setting. In practice, the single-timescale update is widely used, where the actor and critic are updated in an alternating manner with step sizes of the same order. In this work, we study a decentralized single-timescale AC algorithm. Theoretically, using linear approximation for value and reward estimation, we show that the algorithm has sample complexity of…
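The single-timescale scheme described in the abstract (actor and critic updated alternately, step sizes of the same order, linear value approximation) can be illustrated with a minimal sketch. This is not the paper's algorithm: the discounted TD(0) error, the mixing matrix `W`, the function name `single_timescale_step`, the use of each agent's local TD error in its actor step, and the random toy data are all assumptions made for illustration, and the paper's reward-estimation step is omitted.

```python
import numpy as np

def single_timescale_step(omega, theta, W, phi_s, phi_next, rewards, score,
                          alpha, beta, gamma=0.95):
    """One alternating critic/actor update for N agents (illustrative only).

    omega:    (N, d) linear critic weights, one row per agent
    theta:    (N, p) actor parameters, one row per agent
    W:        (N, N) doubly stochastic mixing (consensus) matrix, assumed
    phi_s:    (d,) feature vector of the current state
    phi_next: (d,) feature vector of the next state
    rewards:  (N,) local rewards observed by each agent
    score:    (N, p) each agent's score function, grad log pi_i(a_i | s)
    """
    # Local TD(0) errors under the linear value model V_i(s) = omega_i . phi(s).
    delta = rewards + gamma * (omega @ phi_next) - omega @ phi_s
    # Critic: average with neighbours, then take a local TD step of size beta.
    omega_new = W @ omega + beta * delta[:, None] * phi_s[None, :]
    # Actor: policy-gradient step of size alpha, TD error as advantage estimate.
    theta_new = theta + alpha * delta[:, None] * score
    return omega_new, theta_new, delta

# Toy run with hypothetical random data, only to show the calling pattern.
rng = np.random.default_rng(0)
N, d, p = 3, 4, 2
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
omega = np.zeros((N, d))
theta = np.zeros((N, p))
beta = 0.05          # critic step size
alpha = 0.5 * beta   # actor step size: constant ratio, i.e. same order

for _ in range(100):
    phi_s = rng.normal(size=d)
    phi_next = rng.normal(size=d)
    rewards = rng.normal(size=N)
    score = rng.normal(size=(N, p))
    omega, theta, delta = single_timescale_step(
        omega, theta, W, phi_s, phi_next, rewards, score, alpha, beta)
```

The point of the sketch is the step-size choice: `alpha` and `beta` differ only by a constant factor, in contrast to two-timescale rules where their ratio is driven to zero, and both updates happen once per sample rather than in a nested loop.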
Taxonomy
Topics: Reinforcement Learning in Robotics · Game Theory and Applications · Auction Theory and Applications
