CGAR: Critic Guided Action Redistribution in Reinforcement Leaning

Tairan Huang; Xu Li; Hao Li; Mingming Sun; Ping Li

arXiv:2206.11494 · cs.LG · June 24, 2022

TL;DR

This paper introduces CGAR, a reinforcement learning algorithm that uses the critic's Q values to redistribute actions sampled from the actor's policy, improving sample efficiency and achieving state-of-the-art results on MuJoCo tasks.

Contribution

The paper proposes the CGAR algorithm, leveraging critic signals for action redistribution in off-policy actor-critic methods, enhancing learning efficiency.

Findings

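01. Improved sample efficiency on the OpenAI MuJoCo tasks.

02. Achieved state-of-the-art performance on those tasks.

03. Showed that, in off-policy actor-critic algorithms, the critic yields expected discounted rewards at least as high as the actor's.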

Abstract

Training a game-playing reinforcement learning agent requires many interactions with the environment. Uninformed random exploration wastes time and resources, so it is essential to reduce that waste. In this paper we show that, in the setting of off-policy actor-critic algorithms, the critic can yield expected discounted rewards greater than or at least equal to those of the actor. The Q value predicted by the critic is therefore a better signal for redistributing the action originally sampled from the policy distribution predicted by the actor. This paper introduces the novel Critic Guided Action Redistribution (CGAR) algorithm and tests it on the OpenAI MuJoCo tasks. The experimental results demonstrate that our method improves sample efficiency and achieves state-of-the-art performance. Our code can be found at https://github.com/tairanhuang/CGAR.
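
The abstract does not spell out how the redistribution is carried out, so the following is only a minimal sketch of one plausible reading, not the authors' implementation: draw several candidate actions from the actor's policy, score them with the critic, and resample one candidate with probability proportional to a softmax over the Q values. The softmax scheme, the `temperature` parameter, and the names `redistribute_action`, `toy_policy_sample`, and `toy_critic` are all assumptions introduced for illustration; the official repository linked above is the reference for the actual method.

```python
import math
import random


def redistribute_action(candidates, q_values, temperature=1.0, rng=random):
    """Resample one candidate with probability proportional to exp(Q / temperature).

    Assumed scheme for illustration only; not taken from the paper.
    Returns the chosen action and the resampling probabilities.
    """
    if not candidates or len(candidates) != len(q_values):
        raise ValueError("candidates and q_values must be non-empty and equal in length")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum Q value before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    probs = [w / total for w in weights]
    index = rng.choices(range(len(candidates)), weights=probs, k=1)[0]
    return candidates[index], probs


def toy_policy_sample(rng):
    """Hypothetical actor: a one-dimensional Gaussian policy."""
    return rng.gauss(0.0, 1.0)


def toy_critic(action):
    """Hypothetical critic: Q peaks at action = 0.5."""
    return -(action - 0.5) ** 2


rng = random.Random(0)
candidates = [toy_policy_sample(rng) for _ in range(8)]
q_values = [toy_critic(a) for a in candidates]
action, probs = redistribute_action(candidates, q_values, temperature=0.1, rng=rng)
```

Under this assumed scheme, candidates with higher Q values are chosen more often, so the expected Q value of the resampled action is never lower than the average Q value of the original candidates. A large `temperature` recovers near-uniform sampling over the candidates, while a small one approaches picking the highest-scoring candidate.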

Code & Models

Repositories

tairanhuang/cgar
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics