GRAC: Self-Guided and Self-Regularized Actor-Critic
Lin Shao, Yifan You, Mengyuan Yan, Qingyun Sun, Jeannette Bohg

TL;DR
GRAC introduces a self-regularized TD-learning method and a self-guided policy improvement method, making deep reinforcement learning more robust and efficient without target networks. It matches or outperforms the state of the art on OpenAI Gym tasks.
Contribution
It proposes a novel actor-critic algorithm that removes the need for target networks through self-regularized TD learning and strengthens policy improvement by combining policy gradients with zero-order optimization.
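This summary does not spell out the self-regularized loss, so the following is only a minimal sketch under stated assumptions. It assumes the regularizer penalizes how far the next-state Q-value drifts from its value under a frozen snapshot of the same parameters, which stands in for a separate target network. The names `self_regularized_td_loss`, `linear_q`, and the linear Q-function are hypothetical and chosen for illustration.

```python
import numpy as np

def linear_q(params, s, a):
    # Hypothetical linear Q-function over concatenated state-action features.
    return np.concatenate([s, a], axis=1) @ params

def self_regularized_td_loss(q_fn, params, frozen_params, batch, gamma=0.99):
    """TD loss plus a penalty on next-state Q drift (assumed form, not the authors' code)."""
    s, a, r, s_next, a_next, done = batch
    # Targets come from a frozen copy of the *same* network, not a lagged target network.
    q_next_frozen = q_fn(frozen_params, s_next, a_next)
    y = r + gamma * (1.0 - done) * q_next_frozen
    td_term = np.mean((q_fn(params, s, a) - y) ** 2)
    # Self-regularization: keep Q(s', a') close to the value the target was built from.
    reg_term = np.mean((q_fn(params, s_next, a_next) - q_next_frozen) ** 2)
    return td_term + reg_term, td_term, reg_term

rng = np.random.default_rng(0)
n, state_dim, action_dim = 32, 3, 2
batch = (
    rng.standard_normal((n, state_dim)),   # s
    rng.standard_normal((n, action_dim)),  # a
    rng.standard_normal(n),                # r
    rng.standard_normal((n, state_dim)),   # s'
    rng.standard_normal((n, action_dim)),  # a'
    np.zeros(n),                           # done flags
)
frozen = rng.standard_normal(state_dim + action_dim)

# Before any update the parameters equal the snapshot, so the penalty vanishes.
_, _, reg_before = self_regularized_td_loss(linear_q, frozen, frozen, batch)
# After a (hypothetical) update step the penalty becomes active.
_, _, reg_after = self_regularized_td_loss(linear_q, frozen + 0.1, frozen, batch)
```

In this reading, the penalty term is what discourages the moving-target effect that a target network would otherwise dampen.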
Findings
Matches or outperforms the state of the art on OpenAI Gym tasks
Demonstrates robustness to local noise in Q-function approximation
Eliminates the need for target networks in deep RL
Abstract
Deep reinforcement learning (DRL) algorithms have successfully been demonstrated on a range of challenging decision making and control tasks. One dominant component of recent deep reinforcement learning algorithms is the target network which mitigates the divergence when learning the Q function. However, target networks can slow down the learning process due to delayed function updates. Our main contribution in this work is a self-regularized TD-learning method to address divergence without requiring a target network. Additionally, we propose a self-guided policy improvement method by combining policy-gradient with zero-order optimization to search for actions associated with higher Q-values in a broad neighborhood. This makes learning more robust to local noise in the Q function approximation and guides the updates of our actor network. Taken together, these components define GRAC, a…
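The zero-order search described in the abstract can be illustrated with a small sketch. It assumes the cross-entropy method (CEM) as the zero-order optimizer, which is one common choice and is not confirmed by this summary. The function `cem_action_search`, the toy Q-function `toy_q`, and all hyperparameters are hypothetical.

```python
import numpy as np

def cem_action_search(q_fn, mean, std, n_samples=64, n_elite=8, n_iters=3, rng=None):
    """Search a broad neighborhood of `mean` for an action with a higher Q-value."""
    rng = np.random.default_rng(0) if rng is None else rng
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    # Start from the policy's own action so the result is never worse than it.
    best_action, best_q = mean.copy(), q_fn(mean[None, :])[0]
    for _ in range(n_iters):
        actions = mean + std * rng.standard_normal((n_samples, mean.size))
        q_values = q_fn(actions)
        # Refit the sampling distribution to the highest-scoring samples.
        elites = actions[np.argsort(q_values)[-n_elite:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
        top = int(np.argmax(q_values))
        if q_values[top] > best_q:
            best_action, best_q = actions[top].copy(), q_values[top]
    return best_action, best_q

def toy_q(actions):
    # Toy stand-in for a learned critic: a smooth peak plus a small high-frequency ripple
    # that mimics local noise in the Q-function approximation.
    peak = np.array([0.5, -0.5])
    smooth = -np.sum((actions - peak) ** 2, axis=1)
    ripple = 0.05 * np.sin(40.0 * actions).sum(axis=1)
    return smooth + ripple

policy_action = np.zeros(2)  # action proposed by a hypothetical actor network
policy_q = toy_q(policy_action[None, :])[0]
improved_action, improved_q = cem_action_search(toy_q, policy_action, std=np.ones(2))
```

Because the search scores sampled actions directly rather than following the local gradient of the critic, small ripples in the Q-function have less influence on where it ends up. The improved action could then serve as a guide for the actor update, as the abstract describes.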
Taxonomy
Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Artificial Intelligence in Games
