A Small Gain Analysis of Single Timescale Actor Critic
Alex Olshevsky, Bahman Gharesifard

TL;DR
This paper analyzes a simplified single timescale actor-critic algorithm using small-gain theory, demonstrating improved sample complexity for finding approximate stationary points in reinforcement learning.
Contribution
It provides the first small-gain based analysis of a single timescale actor-critic method with proportional step-sizes and one critic update per actor step.
Findings
Proves convergence to a stationary point using the small-gain theorem.
Achieves improved sample complexity of $O(\mu^{-2} \epsilon^{-2})$ for $\epsilon$-approximate stationary points, where $\mu$ is the condition number associated with the critic.
Establishes conditions under which the method is effective.
Abstract
We consider a version of actor-critic which uses proportional step-sizes and only one critic update with a single sample from the stationary distribution per actor step. We provide an analysis of this method using the small-gain theorem. Specifically, we prove that this method can be used to find a stationary point, and that the resulting sample complexity improves the state of the art for actor-critic methods to $O(\mu^{-2} \epsilon^{-2})$ to find an $\epsilon$-approximate stationary point, where $\mu$ is the condition number associated with the critic.
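The update pattern described in the abstract (actor and critic step sizes kept at a fixed ratio, and exactly one critic update from a single sample per actor step) can be illustrated with a minimal sketch. It is not the authors' algorithm or code. The two-state MDP, its transition probabilities and rewards, the tabular critic, the softmax policy, and the names `run`, `stationary_dist`, `beta`, and `ratio` are all hypothetical choices made for illustration. The stationary distribution is computed in closed form here only because the chain has two states.

```python
import math
import random

# Hypothetical 2-state, 2-action MDP (all numbers are illustrative assumptions).
# P[s][a] = probability of moving to state 1 from state s under action a.
# R[s][a] = reward received in state s under action a.
P = [[0.1, 0.8], [0.3, 0.9]]
R = [[0.0, 0.0], [1.0, 1.0]]
GAMMA = 0.9


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def stationary_dist(theta):
    """Stationary distribution of the 2-state chain induced by the softmax policy."""
    pi0, pi1 = softmax(theta[0]), softmax(theta[1])
    p01 = sum(pi0[a] * P[0][a] for a in range(2))          # P(0 -> 1)
    p10 = sum(pi1[a] * (1.0 - P[1][a]) for a in range(2))  # P(1 -> 0)
    d1 = p01 / (p01 + p10)
    return [1.0 - d1, d1]


def run(num_steps=20000, beta=0.05, ratio=0.1, seed=0):
    """Single-timescale actor-critic sketch: one critic step per actor step."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0], [0.0, 0.0]]  # actor: softmax preferences per state
    w = [0.0, 0.0]                    # critic: tabular value estimates
    alpha = ratio * beta              # actor step size proportional to critic's
    for _ in range(num_steps):
        # Draw a single state from the current policy's stationary distribution.
        d = stationary_dist(theta)
        s = 0 if rng.random() < d[0] else 1
        pi = softmax(theta[s])
        a = 0 if rng.random() < pi[0] else 1
        s_next = 1 if rng.random() < P[s][a] else 0
        # Temporal-difference error from this one sample.
        delta = R[s][a] + GAMMA * w[s_next] - w[s]
        # One critic update (TD(0)) ...
        w[s] += beta * delta
        # ... followed by one actor update using the same TD error.
        for b in range(2):
            grad_log = (1.0 if b == a else 0.0) - pi[b]
            theta[s][b] += alpha * delta * grad_log
    return theta, w
```

Calling `theta, w = run()` returns the learned policy preferences and value estimates. Keeping `ratio` fixed while varying `beta` changes both step sizes together, which is the sense in which the sketch uses a single timescale rather than letting the critic run on a faster schedule than the actor.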
Taxonomy
Topics: Model Reduction and Neural Networks · Reinforcement Learning in Robotics · Lattice Boltzmann Simulation Studies
