On the Convergence of Single-Timescale Actor-Critic

Navdeep Kumar; Priyank Agrawal; Giorgia Ramponi; Kfir Yehuda Levy; Shie Mannor

arXiv:2410.08868 · cs.LG · June 5, 2025

Open Access

TL;DR

This paper proves that the single-timescale actor-critic algorithm for infinite-horizon discounted MDPs with finite state spaces converges to an ε-close globally optimal policy with sample complexity O(ε^{-3}), provided the actor and critic step sizes both decay as O(k^{-2/3}).

Contribution

It introduces a new analytical framework for the coupled actor and critic recursions and uses it to establish global convergence of single-timescale AC with an improved sample complexity of O(ε^{-3}).

Findings

01 Converges to an ε-close globally optimal policy with sample complexity O(ε^{-3})

02 Requires the actor and critic step sizes to both decay as O(k^{-2/3})

03 Improves on the previous O(ε^{-4}) sample complexity for reaching an ε-close globally optimal policy with actor-critic
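To make the step-size finding concrete, here is a minimal sketch of a single-timescale actor-critic loop in which the actor and the critic share one schedule proportional to k^{-2/3}. It is an illustration under stated assumptions, not the paper's algorithm: the tabular softmax actor, the SARSA-style TD(0) critic, the advantage baseline, single-trajectory sampling, and every name (`step_size`, `softmax`, `single_timescale_ac`, the constant `c`) are hypothetical choices made here.

```python
import math
import random


def step_size(k, c=1.0):
    """Schedule c * k^(-2/3), shared by actor and critic (k starts at 1).

    The constant c is an assumption; the source only states the decay rate.
    """
    return c * k ** (-2.0 / 3.0)


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def single_timescale_ac(P, R, gamma=0.9, iters=2000, seed=0):
    """Run the sketch on a finite MDP.

    P[s][a] is a list of next-state probabilities, R[s][a] is a reward.
    Returns the softmax logits (actor) and the tabular Q estimate (critic).
    """
    rng = random.Random(seed)
    n_s, n_a = len(P), len(P[0])
    theta = [[0.0] * n_a for _ in range(n_s)]  # actor parameters
    q = [[0.0] * n_a for _ in range(n_s)]      # critic estimate
    s = rng.randrange(n_s)
    for k in range(1, iters + 1):
        eta = step_size(k)  # same order for both updates: single timescale
        pi_s = softmax(theta[s])
        a = rng.choices(range(n_a), weights=pi_s)[0]
        s_next = rng.choices(range(n_s), weights=P[s][a])[0]
        a_next = rng.choices(range(n_a), weights=softmax(theta[s_next]))[0]

        # Critic: one TD(0) step on the sampled transition.
        td_error = R[s][a] + gamma * q[s_next][a_next] - q[s][a]
        q[s][a] += eta * td_error

        # Actor: one policy-gradient step using the critic's advantage.
        baseline = sum(p * qv for p, qv in zip(pi_s, q[s]))
        advantage = q[s][a] - baseline
        for b in range(n_a):
            grad_log = (1.0 if b == a else 0.0) - pi_s[b]
            theta[s][b] += eta * grad_log * advantage

        s = s_next
    return theta, q
```

Calling `single_timescale_ac` on a one-state MDP with two actions, for example `P = [[[1.0], [1.0]]]` and `R = [[1.0, 0.0]]`, and then applying `softmax` to `theta[0]` gives the learned action probabilities. The point of the sketch is structural: each sample triggers exactly one critic update and one actor update, and neither step size decays faster than the other.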

Abstract

We analyze the global convergence of the single-timescale actor-critic (AC) algorithm for infinite-horizon discounted Markov Decision Processes (MDPs) with finite state spaces. To this end, we introduce an elegant analytical framework for handling the complex, coupled recursions inherent in the algorithm. Leveraging this framework, we establish that the algorithm converges to an $\epsilon$-close globally optimal policy with a sample complexity of $O(\epsilon^{-3})$. This significantly improves upon the existing complexity of $O(\epsilon^{-2})$ to achieve an $\epsilon$-close stationary policy, which is equivalent to a complexity of $O(\epsilon^{-4})$ to achieve an $\epsilon$-close globally optimal policy using the gradient domination lemma. Furthermore, we demonstrate that to achieve this improvement, the step sizes for both the actor and critic must decay as \(…
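The conversion from the stationary-policy rate to the $O(\epsilon^{-4})$ global rate can be sketched as follows. This is a reconstruction under assumptions not spelled out in the abstract: that stationarity is measured by the squared gradient norm, and that the gradient domination lemma takes the form below, with a problem-dependent constant $C$ and return $J$ (both symbols are introduced here for illustration).

$$
\|\nabla J(\pi_\theta)\|^2 \le \delta \ \text{ after } O(\delta^{-2}) \text{ samples}, \qquad J^* - J(\pi_\theta) \le C\,\|\nabla J(\pi_\theta)\|.
$$

Combining the two gives $J^* - J(\pi_\theta) \le C\sqrt{\delta}$. Requiring the right-hand side to be at most $\epsilon$ means choosing $\delta = (\epsilon/C)^2$, so the sample count becomes

$$
O(\delta^{-2}) = O\!\left(C^4 \epsilon^{-4}\right) = O(\epsilon^{-4}).
$$

Under these assumptions, the $O(\epsilon^{-3})$ result saves a factor of $\epsilon^{-1}$ over the route through stationarity.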

Taxonomy

Topics: Evolutionary Algorithms and Applications · Neural Networks and Reservoir Computing · Computability, Logic, AI Algorithms