Value Activation for Bias Alleviation: Generalized-activated Deep Double Deterministic Policy Gradients
Jiafei Lyu, Yu Yang, Jiangpeng Yan, Xiu Li

TL;DR
This paper introduces GD3, a reinforcement learning algorithm that uses a generalized-activated weighting operator in value estimation, reducing estimation bias and improving performance on continuous control tasks.
Contribution
The paper proposes a bias-correction method that uses non-decreasing activation functions as weights in value estimation, and builds a deep reinforcement learning algorithm, GD3, around it.
Findings
GD3 alleviates estimation bias in value functions.
Using activation functions as weights in value estimation improves convergence speed and performance.
GD3 with task-specific activation functions outperforms standard baselines.
Abstract
It is vital to accurately estimate the value function in Deep Reinforcement Learning (DRL) such that the agent could execute proper actions instead of suboptimal ones. However, existing actor-critic methods suffer more or less from underestimation bias or overestimation bias, which negatively affect their performance. In this paper, we reveal a simple but effective principle: proper value correction benefits bias alleviation, where we propose the generalized-activated weighting operator that uses any non-decreasing function, namely activation function, as weights for better value estimation. Particularly, we integrate the generalized-activated weighting operator into value estimation and introduce a novel algorithm, Generalized-activated Deep Double Deterministic Policy Gradients (GD3). We theoretically show that GD3 is capable of alleviating the potential estimation bias. We…
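The weighting idea in the abstract can be illustrated with a small sketch. It assumes the operator is a normalized weighted average of Q-value estimates over a finite set of sampled actions, with weights given by a non-negative, non-decreasing function of each estimate. The function name `activated_weighted_value`, the sample values, and the finite-sample form are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Callable, Sequence


def activated_weighted_value(
    q_values: Sequence[float],
    activation: Callable[[float], float],
) -> float:
    """Weighted average of Q-value estimates, weighted by activation(Q).

    Assumption: `activation` is non-negative and non-decreasing, so larger
    estimates never receive smaller weights than smaller estimates.
    """
    weights = [activation(q) for q in q_values]
    total = sum(weights)
    if total <= 0:
        raise ValueError("activation must give a positive total weight")
    return sum(w * q for w, q in zip(weights, q_values)) / total


# Hypothetical Q-value estimates for a handful of sampled actions.
q_samples = [1.0, 2.0, 3.0]

# A constant activation reduces to the plain mean of the estimates.
uniform_estimate = activated_weighted_value(q_samples, lambda q: 1.0)

# An exponential activation gives softmax-style weights.
# Note: math.exp can overflow for very large Q-values; a practical version
# would subtract max(q_values) before exponentiating.
exp_estimate = activated_weighted_value(q_samples, math.exp)
```

Under these assumptions the estimate always lies between the plain mean and the maximum of the sampled values, so the choice of activation controls how far the target leans toward the max (which tends to overestimate) versus the mean.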
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
