Finite-time analysis of single-timescale actor-critic

Xuyang Chen; Lin Zhao

arXiv:2210.09921 · cs.LG · January 29, 2024 · 1 citation

TL;DR

This paper provides the first finite-time convergence analysis of online single-timescale actor-critic algorithms in continuous state spaces with linear function approximation, showing they find approximate stationary points efficiently.

Contribution

It introduces a novel framework for analyzing error propagation in single-timescale actor-critic methods, establishing convergence guarantees under practical Markovian sampling.

Findings

01

Achieves $\tilde{O}(\frac{1}{\epsilon^2})$ sample complexity for convergence.

02

Improves to $O(\frac{1}{\epsilon^2})$ under i.i.d. sampling.

03

Provides systematic error control between actor and critic.

Abstract

Actor-critic methods have achieved significant success in many challenging applications. However, their finite-time convergence is still poorly understood in the most practical single-timescale form. Existing analyses of single-timescale actor-critic have been limited to i.i.d. sampling or the tabular setting for simplicity. We investigate the more practical online single-timescale actor-critic algorithm on a continuous state space, where the critic assumes linear function approximation and updates with a single Markovian sample per actor step. Previous analyses have been unable to establish convergence for such a challenging scenario. We demonstrate that the online single-timescale actor-critic method provably finds an $\epsilon$-approximate stationary point with $\tilde{O}(\epsilon^{-2})$ sample complexity under standard assumptions, which can be further improved to…
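The setting described in the abstract (a linear critic and an actor that each take one update per Markovian sample, with step sizes of the same order) can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the toy 1-D environment, the `features` map, the two-action softmax policy, the discounted TD(0) critic, and all hyperparameter values are assumptions chosen only for illustration.

```python
import numpy as np

def features(s):
    # Hypothetical feature map for a scalar state in [-1, 1].
    return np.array([1.0, s, s * s])

def policy_probs(theta, s):
    # Softmax policy over two actions with linear preferences.
    logits = theta @ features(s)              # theta has shape (2, 3)
    logits = logits - logits.max()            # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def step(s, a, rng):
    # Toy dynamics (assumed): action 0 pushes left, action 1 pushes right.
    move = 0.1 if a == 1 else -0.1
    s_next = float(np.clip(s + move + 0.05 * rng.standard_normal(), -1.0, 1.0))
    reward = -s_next * s_next                 # highest reward near the origin
    return s_next, reward

def run_actor_critic(num_steps=5000, alpha=0.05, beta=0.05, gamma=0.95, seed=0):
    # alpha (actor) and beta (critic) are of the same order: single timescale.
    rng = np.random.default_rng(seed)
    theta = np.zeros((2, 3))                  # actor parameters
    w = np.zeros(3)                           # linear critic parameters
    s = 0.5
    rewards = []
    for _ in range(num_steps):
        probs = policy_probs(theta, s)
        a = int(rng.choice(2, p=probs))
        s_next, r = step(s, a, rng)           # one Markovian sample per iteration
        phi, phi_next = features(s), features(s_next)
        td_error = r + gamma * (w @ phi_next) - (w @ phi)
        # Critic: one TD(0) step on the linear value estimate.
        w = w + beta * td_error * phi
        # Actor: one policy-gradient step using the same sample.
        grad_log = -np.outer(probs, phi)      # rows: -pi(b|s) * phi(s)
        grad_log[a] += phi                    # add phi(s) for the taken action
        theta = theta + alpha * td_error * grad_log
        s = s_next                            # continue along the same trajectory
        rewards.append(r)
    return theta, w, np.array(rewards)
```

Calling `run_actor_critic()` returns the final actor parameters, the critic weights, and the per-step rewards. The point of the sketch is structural: the critic is not run to convergence between actor updates, which is what distinguishes the single-timescale form from two-timescale or nested-loop variants.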

Videos

Finite-Time Analysis of Single-Timescale Actor-Critic · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Advancements in Semiconductor Devices and Circuit Design