Error Controlled Actor-Critic

Xingen Gao, Fei Chao, Changle Zhou, Zhen Ge, Chih-Min Lin, Longzhi Yang, Xiang Chang, and Changjing Shang

arXiv:2109.02517·cs.LG·September 8, 2021

TL;DR

This paper introduces Error Controlled Actor-Critic, a novel reinforcement learning algorithm that constrains value function approximation errors to improve convergence and performance in continuous control tasks.

Contribution

It provides a theoretical analysis of how approximation errors affect actor-critic methods and proposes a new approach to limit these errors via KL-divergence constraints.
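One familiar way approximation error can bias value-based updates is that taking a maximum over noisy estimates is biased upward. The sketch below illustrates that general effect only; it is not the paper's analysis. The function name `max_bias`, the zero-mean Gaussian noise model, and the trial count are all assumptions made for illustration.

```python
import numpy as np

def max_bias(true_q, noise_std, n_trials=20000, seed=0):
    """Estimate E[max_a (Q(a) + noise)] - max_a Q(a) by Monte Carlo.

    Hypothetical helper: the noise is assumed i.i.d. zero-mean Gaussian,
    which is a simplification of real function-approximation error.
    """
    true_q = np.asarray(true_q, dtype=float)
    rng = np.random.default_rng(seed)
    # One row per trial, one column per action.
    noise = rng.normal(0.0, noise_std, size=(n_trials, true_q.size))
    noisy_max = (true_q + noise).max(axis=1)
    return float(noisy_max.mean() - true_q.max())

if __name__ == "__main__":
    # Four actions with identical true values: any positive gap is pure bias.
    print(max_bias([0.0, 0.0, 0.0, 0.0], noise_std=1.0))
    print(max_bias([0.0, 0.0, 0.0, 0.0], noise_std=0.0))
```

Although each estimate is unbiased on its own, the maximum is not, and the gap grows with the noise scale. This is the kind of effect that motivates controlling the error of the value approximator.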

Findings

01. Reduces approximation error in value functions

02. Significantly outperforms existing model-free RL algorithms

03. Improves convergence stability in continuous control tasks

Abstract

Approximation error in the value function inevitably causes overestimation and harms the convergence of the algorithms. To mitigate these negative effects, we propose Error Controlled Actor-Critic, which confines the approximation error in the value function. We present an analysis of how the approximation error can hinder the optimization process of actor-critic methods. Then, we derive an upper bound on the approximation error of the Q-function approximator and find that the error can be lowered by restricting the KL-divergence between every two consecutive policies when training the policy. Experiments on a range of continuous control tasks demonstrate that the proposed actor-critic algorithm clearly reduces the approximation error and significantly outperforms other model-free RL algorithms.
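The idea of restricting the KL-divergence between consecutive policies can be sketched as a penalty term in the actor objective. This is a minimal illustration, not the authors' implementation: the diagonal-Gaussian policy form, the penalty (rather than hard-constraint) formulation, the coefficient `beta`, and the function names `gaussian_kl` and `kl_penalized_actor_loss` are all assumptions.

```python
import numpy as np

def gaussian_kl(mu_old, std_old, mu_new, std_new):
    """KL(pi_old || pi_new) for diagonal Gaussians, summed over action dims."""
    var_old, var_new = std_old ** 2, std_new ** 2
    per_dim = (np.log(std_new / std_old)
               + (var_old + (mu_old - mu_new) ** 2) / (2.0 * var_new)
               - 0.5)
    return per_dim.sum(axis=-1)

def kl_penalized_actor_loss(q_values, mu_old, std_old, mu_new, std_new, beta=1.0):
    """Hypothetical actor loss: maximize Q while penalizing policy change.

    q_values: shape (batch,), critic estimates for actions of the new policy.
    beta:     assumed penalty weight on the KL term (not from the paper).
    """
    kl = gaussian_kl(mu_old, std_old, mu_new, std_new)
    return float(np.mean(-q_values + beta * kl))

if __name__ == "__main__":
    mu_old, std_old = np.zeros((2, 2)), np.ones((2, 2))
    mu_new, std_new = np.ones((2, 2)), np.ones((2, 2))
    q = np.array([1.0, 3.0])
    print(gaussian_kl(mu_old, std_old, mu_new, std_new))
    print(kl_penalized_actor_loss(q, mu_old, std_old, mu_new, std_new, beta=0.5))
```

A larger `beta` keeps each new policy closer to the previous one, trading slower policy improvement for a smaller change between consecutive policies.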

Code & Models

Repositories

SingerGao/ECAC (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Model Reduction and Neural Networks