CGAR: Critic Guided Action Redistribution in Reinforcement Learning
Tairan Huang, Xu Li, Hao Li, Mingming Sun, Ping Li

TL;DR
This paper introduces CGAR, a reinforcement learning algorithm that uses the Q values predicted by the critic to redistribute actions sampled from the actor's policy, improving sample efficiency and reaching state-of-the-art results on MuJoCo tasks.
Contribution
The paper proposes the CGAR algorithm, leveraging critic signals for action redistribution in off-policy actor-critic methods, enhancing learning efficiency.
Findings
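Improved sample efficiency on the OpenAI MuJoCo tasks.
Achieved state-of-the-art performance on those tasks.
Showed that, in off-policy actor-critic algorithms, the critic yields expected discounted rewards at least as high as the actor's.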
Abstract
Training a game-playing reinforcement learning agent requires many interactions with the environment. Uninformed random exploration wastes time and resources, so reducing that waste is essential. In this paper, we show that, in the setting of off-policy actor-critic algorithms, the critic can yield expected discounted rewards greater than or at least equal to those of the actor. Thus, the Q value predicted by the critic is a better signal for redistributing the action originally sampled from the policy distribution predicted by the actor. This paper introduces the novel Critic Guided Action Redistribution (CGAR) algorithm and tests it on the OpenAI MuJoCo tasks. The experimental results demonstrate that our method improves sample efficiency and achieves state-of-the-art performance. Our code can be found at https://github.com/tairanhuang/CGAR.
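The abstract does not spell out the redistribution rule, so the following is only a minimal sketch of one plausible reading: draw several candidate actions from the actor, score them with the critic, and resample one candidate with probability proportional to a softmax of its Q value. The function names (redistribute_action, toy_critic), the softmax weighting, the temperature parameter, and the toy Gaussian policy are all assumptions made for illustration and are not taken from the paper or its repository.

```python
import numpy as np


def redistribute_action(candidates, q_values, temperature=1.0, rng=None):
    """Resample one candidate action with probability softmax(Q / temperature).

    Hypothetical helper for illustration; not taken from the CGAR code base.
    """
    rng = np.random.default_rng() if rng is None else rng
    q = np.asarray(q_values, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    logits = (q - q.max()) / temperature
    probs = np.exp(logits)
    probs /= probs.sum()
    # Pick the index of one candidate according to the Q-based weights.
    idx = rng.choice(len(candidates), p=probs)
    return candidates[idx], probs


def toy_critic(actions):
    """Toy stand-in for a learned critic: Q is highest at action = 0.5."""
    return -(actions[:, 0] - 0.5) ** 2


rng = np.random.default_rng(0)
# Toy stand-in for the actor: a unit Gaussian policy over a 1-D action.
candidates = rng.normal(loc=0.0, scale=1.0, size=(8, 1))
action, probs = redistribute_action(candidates, toy_critic(candidates), rng=rng)
print("chosen action:", action, "resampling probabilities:", probs.round(3))
```

In this sketch a lower temperature concentrates the resampling on the candidates the critic rates highest, while a high temperature approaches uniform resampling over the actor's own samples.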
Taxonomy
Topics: Reinforcement Learning in Robotics
