Error Controlled Actor-Critic
Xingen Gao, Fei Chao, Changle Zhou, Zhen Ge, Chih-Min Lin, Longzhi Yang, Xiang Chang, and Changjing Shang

TL;DR
This paper introduces Error Controlled Actor-Critic, a novel reinforcement learning algorithm that constrains value function approximation errors to improve convergence and performance in continuous control tasks.
Contribution
It provides a theoretical analysis of how approximation errors affect actor-critic methods and proposes a new approach to limit these errors via KL-divergence constraints.
Findings
Reduces approximation error in value functions
Significantly outperforms existing model-free RL algorithms
Improves convergence stability in continuous control tasks
Abstract
The approximation error of the value function inevitably causes an overestimation phenomenon and has a negative impact on the convergence of the algorithms. To mitigate the negative effects of the approximation error, we propose Error Controlled Actor-Critic, which confines the approximation error in the value function. We present an analysis of how the approximation error can hinder the optimization process of actor-critic methods. Then, we derive an upper bound on the approximation error of the Q function approximator and find that the error can be lowered by restricting the KL divergence between every two consecutive policies when training the policy. The results of experiments on a range of continuous control tasks demonstrate that the proposed actor-critic algorithm clearly reduces the approximation error and significantly outperforms other model-free RL algorithms.
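To make the idea of restricting the KL divergence between consecutive policies concrete, the following is a minimal sketch, not the authors' implementation. It assumes diagonal Gaussian policies (common in continuous control), computes KL(old || new) in closed form, and adds it as a penalty to a generic actor loss. The function names, the direction of the KL term, and the `kl_weight` parameter are illustrative assumptions; the abstract does not specify the exact objective.

```python
import math


def gaussian_kl(mu_old, std_old, mu_new, std_new):
    """Closed-form KL(old || new) between two diagonal Gaussian policies.

    Each argument is a sequence with one entry per action dimension.
    The per-dimension terms are summed because the dimensions are independent.
    """
    kl = 0.0
    for m0, s0, m1, s1 in zip(mu_old, std_old, mu_new, std_new):
        kl += math.log(s1 / s0) + (s0 ** 2 + (m0 - m1) ** 2) / (2.0 * s1 ** 2) - 0.5
    return kl


def penalized_actor_loss(q_value, kl, kl_weight=1.0):
    """Hypothetical actor loss: maximize Q while penalizing policy change.

    kl_weight is an assumed hyperparameter, not a value from the paper.
    """
    return -q_value + kl_weight * kl


# Identical policies have zero divergence; shifting the mean by one
# standard deviation (std = 1) costs 0.5 nats.
same = gaussian_kl([0.0], [1.0], [0.0], [1.0])
shifted = gaussian_kl([0.0], [1.0], [1.0], [1.0])
loss = penalized_actor_loss(q_value=2.0, kl=shifted, kl_weight=2.0)
```

A larger `kl_weight` keeps each new policy closer to the previous one, which, according to the abstract's analysis, is what lowers the bound on the Q function approximation error.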
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Model Reduction and Neural Networks
