Finite-Time Analysis of Asynchronous Multi-Agent TD Learning
Nicolò Dal Fabbro, Arman Adibi, Aritra Mitra, George J. Pappas

TL;DR
This paper proves that, in a policy evaluation setting, asynchronous multi-agent TD learning achieves a linear convergence speedup comparable to that of synchronous algorithms, even when agents' updates are delayed.
Contribution
First analysis of finite-time convergence for asynchronous multi-agent TD learning with delays, demonstrating linear speedup in convergence rate.
Findings
Asynchronous delays do not prevent linear convergence speedup.
Finite-time analysis accounts for time-varying delays.
AsyncMATD achieves convergence rates similar to those of synchronous algorithms.
Abstract
Recent research endeavours have theoretically shown the beneficial effect of cooperation in multi-agent reinforcement learning (MARL). In a setting involving N agents, this beneficial effect usually comes in the form of an N-fold linear convergence speedup, i.e., a reduction, proportional to N, in the number of iterations required to reach a certain convergence precision. In this paper, we show for the first time that this speedup property also holds for a MARL framework subject to asynchronous delays in the local agents' updates. In particular, we consider a policy evaluation problem in which multiple agents cooperate to evaluate a common policy by communicating with a central aggregator. In this setting, we study the finite-time convergence of AsyncMATD, an asynchronous multi-agent temporal difference (TD) learning algorithm in which agents' local TD update directions…
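To make the setting in the abstract concrete, here is a minimal Python sketch of delayed multi-agent TD(0) policy evaluation with a central aggregator. It is an illustration under stated assumptions, not the paper's AsyncMATD algorithm: the function names (`td_direction`, `async_td_sketch`), the tabular value representation, the uniformly random bounded delays, and the choice to apply the delay only to the parameter vector are all hypothetical and are not taken from the paper.

```python
import random


def td_direction(theta, s, r, s_next, gamma):
    """TD(0) update direction for a tabular value estimate theta."""
    g = [0.0] * len(theta)
    # Temporal-difference error at state s, placed in coordinate s.
    g[s] = r + gamma * theta[s_next] - theta[s]
    return g


def async_td_sketch(P, R, gamma, n_agents, max_delay, alpha, steps, seed=0):
    """Aggregator averages TD directions computed at stale iterates.

    P: row-stochastic transition matrix of the chain induced by the policy.
    R: expected reward per state. Both are toy inputs for illustration.
    """
    rng = random.Random(seed)
    n_states = len(P)
    theta = [0.0] * n_states
    history = [theta[:]]  # history[t] is the iterate at time t
    # Each agent follows its own independent trajectory of the same chain.
    states = [rng.randrange(n_states) for _ in range(n_agents)]
    for t in range(steps):
        avg = [0.0] * n_states
        for i in range(n_agents):
            # Assumption: delay drawn uniformly from {0, ..., max_delay}.
            tau = rng.randint(0, min(max_delay, t))
            stale = history[t - tau]
            s = states[i]
            s_next = rng.choices(range(n_states), weights=P[s])[0]
            g = td_direction(stale, s, R[s], s_next, gamma)
            for k in range(n_states):
                avg[k] += g[k] / n_agents
            states[i] = s_next
        # Aggregator step: move along the averaged (delayed) direction.
        theta = [theta[k] + alpha * avg[k] for k in range(n_states)]
        history.append(theta[:])
    return theta
```

Calling `async_td_sketch` on a small two-state chain with several agents and a modest step size should return an estimate close to the true value function, which solves V = R + gamma P V. Averaging over more agents reduces the variance of each aggregated direction, which is the intuition behind the N-fold speedup; the sketch keeps every past iterate only for simplicity, whereas a bounded delay would need just the last `max_delay + 1` of them.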
Taxonomy
Topics: Neural Networks and Applications
