Heavy-Ball Momentum Accelerated Actor-Critic With Function Approximation

Yanjie Dong; Haijun Zhang; Gang Wang; Shisheng Cui; Xiping Hu

arXiv:2408.06945 · cs.LG · August 19, 2024

Open Access

TL;DR

This paper introduces a heavy-ball momentum-based advantage actor-critic (HB-A2C) algorithm that accelerates convergence in reinforcement learning with Markovian noise by integrating heavy-ball momentum into the update recursion of a linearly parameterized critic.

Contribution

The paper provides the first theoretical analysis of momentum's impact on actor-critic algorithms, demonstrating accelerated convergence and optimal iteration complexity.

Findings

01

HB-A2C achieves an $\tilde{O}(\frac{1}{\epsilon^2})$ convergence rate.

02

Theoretical certification of acceleration under Markovian noise.

03

Learning rates depend on sample trajectory length.

Abstract

By using a parametric value function in place of Monte-Carlo rollouts for value estimation, actor-critic (AC) algorithms can reduce the variance of the stochastic policy gradient and thereby improve the convergence rate. While existing works mainly focus on analyzing the convergence rate of AC algorithms under Markovian noise, the impact of momentum on AC algorithms remains largely unexplored. In this work, we first propose a heavy-ball momentum based advantage actor-critic (HB-A2C) algorithm by integrating heavy-ball momentum into the critic recursion, where the critic is parameterized by a linear function. When the sample trajectory follows a Markov decision process, we quantitatively certify the acceleration capability of the proposed HB-A2C algorithm. Our theoretical results demonstrate that the proposed HB-A2C finds an $\epsilon$-approximate stationary point with $\mathcal{O}(\epsilon^{-2})$ …
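To make the idea of a heavy-ball critic recursion concrete, the following is a minimal sketch and not the paper's exact algorithm. It assumes the textbook heavy-ball form applied to a TD(0) update of a linear critic, $\omega_{t+1} = \omega_t + \alpha \delta_t \phi(s_t) + \beta(\omega_t - \omega_{t-1})$ with TD error $\delta_t = r_t + \gamma \phi(s_{t+1})^\top \omega_t - \phi(s_t)^\top \omega_t$. The symbols $\alpha$, $\beta$, $\gamma$, $\omega$, $\phi$, the random toy MDP, the tabular softmax actor, the use of the TD error as the advantage estimate, and the function names `hb_td_step` and `run_hb_a2c` are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def hb_td_step(omega, omega_prev, phi_s, phi_next, reward,
               gamma=0.95, alpha=0.05, beta=0.5):
    """One heavy-ball TD(0) step for a linear critic V(s) = phi(s) @ omega.

    Assumed generic form (not necessarily the paper's exact recursion):
    omega_new = omega + alpha * delta * phi(s) + beta * (omega - omega_prev).
    """
    delta = reward + gamma * (phi_next @ omega) - (phi_s @ omega)  # TD error
    omega_new = omega + alpha * delta * phi_s + beta * (omega - omega_prev)
    return omega_new, delta


def softmax(x):
    z = np.exp(x - x.max())  # shift by the max for numerical stability
    return z / z.sum()


def run_hb_a2c(n_steps=2000, n_states=5, n_actions=2, n_features=3,
               gamma=0.95, alpha=0.05, beta=0.5, actor_lr=0.01, seed=0):
    """Run a toy actor-critic loop along one Markovian trajectory."""
    rng = np.random.default_rng(seed)
    # Hypothetical random MDP: P[s, a] is a distribution over next states.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    # Random unit-norm state features for the linear critic.
    Phi = rng.normal(size=(n_states, n_features))
    Phi /= np.linalg.norm(Phi, axis=1, keepdims=True)

    theta = np.zeros((n_states, n_actions))  # tabular softmax actor logits
    omega = np.zeros(n_features)             # critic weights
    omega_prev = omega.copy()                # previous iterate for momentum
    s = 0
    for _ in range(n_steps):
        pi = softmax(theta[s])
        a = rng.choice(n_actions, p=pi)
        s_next = rng.choice(n_states, p=P[s, a])

        # Critic: heavy-ball TD(0) update on the linear weights.
        omega_new, delta = hb_td_step(omega, omega_prev, Phi[s], Phi[s_next],
                                      R[s, a], gamma, alpha, beta)
        omega_prev, omega = omega, omega_new

        # Actor: policy-gradient step using the TD error as advantage estimate.
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0
        theta[s] += actor_lr * delta * grad_log_pi

        s = s_next  # continue the same trajectory (Markovian samples)
    return theta, omega


if __name__ == "__main__":
    theta, omega = run_hb_a2c()
    print("critic weights:", omega)
```

Setting `beta=0.0` in this sketch recovers a plain TD(0) critic, which gives a simple baseline for comparing the effect of the momentum term. The sketch updates on every sample with fixed step sizes; it does not reproduce the paper's learning-rate choices, which the findings above say depend on the sample trajectory length.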

Taxonomy

Topics: Quantum chaos and dynamical systems · Chaos-based Image/Signal Encryption · Sports Dynamics and Biomechanics