Global Optimality of Single-Timescale Actor-Critic under Continuous State-Action Space: A Study on Linear Quadratic Regulator

Xuyang Chen; Jingliang Duan; Lin Zhao

arXiv:2505.01041·cs.LG·May 9, 2025

TL;DR

This paper proves that the classic single-sample, single-timescale actor-critic algorithm attains an epsilon-optimal solution of the linear quadratic regulator (LQR) on a continuous state-action space with sample complexity of order epsilon^-2, narrowing the gap between theory and practical success.

Contribution

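It establishes global optimality of the single-sample single-timescale actor-critic algorithm on a continuous (infinite) state-action space, using LQR as the case study. According to the abstract, existing studies mostly analyze double-loop or two-timescale variants and certify only local convergence on finite state or action spaces.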

Findings

01

Single-timescale actor-critic attains an epsilon-optimal solution for LQR with sample complexity of order epsilon^-2.

02

Extends theoretical understanding of single-timescale actor-critic to continuous state-action spaces.

03

Bridges the gap between practical success and theoretical analysis of actor-critic methods.

Abstract

Actor-critic methods have achieved state-of-the-art performance in various challenging tasks. However, theoretical understandings of their performance remain elusive and challenging. Existing studies mostly focus on practically uncommon variants such as double-loop or two-timescale stepsize actor-critic algorithms for simplicity. These results certify local convergence on finite state- or action-space only. We push the boundary to investigate the classic single-sample single-timescale actor-critic on continuous (infinite) state-action space, where we employ the canonical linear quadratic regulator (LQR) problem as a case study. We show that the popular single-timescale actor-critic can attain an epsilon-optimal solution with an order of epsilon^-2 sample complexity for solving LQR on the demanding continuous state-action space. Our work provides new insights into the performance of…
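To make "epsilon-optimal" concrete, the following is a minimal sketch of the LQR setting only. It is not the paper's actor-critic algorithm. It assumes a discrete-time system x_{t+1} = A x_t + B u_t with a linear policy u_t = -K x_t, a cost x'Qx + u'Ru summed over time, and an initial state with identity covariance, so the expected cost of a stabilizing gain K is trace(P_K). The matrices A, B, Q, R, the function names, and the tolerance epsilon are illustrative choices that do not come from the paper, and the paper's exact cost formulation may differ.

```python
import numpy as np


def riccati_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati recursion; return gain K and cost matrix P."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update written as P = Q + A' P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P


def policy_cost(A, B, Q, R, K, iters=2000):
    """Expected cost trace(P_K) of the linear policy u = -K x (identity initial covariance)."""
    A_cl = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(A_cl))) >= 1.0:
        raise ValueError("K does not stabilize the system; the cost is infinite")
    P = np.zeros_like(Q)
    for _ in range(iters):
        # Fixed-point iteration for P_K = Q + K' R K + A_cl' P_K A_cl
        P = Q + K.T @ R @ K + A_cl.T @ P @ A_cl
    return float(np.trace(P))


# Illustrative double-integrator system (hypothetical, not from the paper)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

K_star, P_star = riccati_gain(A, B, Q, R)
j_star = policy_cost(A, B, Q, R, K_star)

# A nearby, still stabilizing gain, standing in for a learned policy
K_near = K_star + 0.1
gap = policy_cost(A, B, Q, R, K_near) - j_star

epsilon = 1.0  # illustrative tolerance
print("optimal cost:", j_star)
print("optimality gap:", gap)
print("epsilon-optimal:", gap <= epsilon)
```

Under these assumptions, a gain K counts as epsilon-optimal when its gap to the Riccati solution is at most epsilon. The abstract's result concerns how many samples the single-timescale actor-critic needs to reach such a gain, which this sketch does not simulate.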
