Exploration-efficient Deep Reinforcement Learning with Demonstration Guidance for Robot Control

Ke Lin; Liang Gong; Xudong Li; Te Sun; Binhao Chen; Chengliang Liu; Zhengfeng Zhang; Jian Pu; Junping Zhang

arXiv:2002.12089·cs.RO·February 28, 2020·6 cites

Open Access

TL;DR

This paper introduces a sample-efficient deep reinforcement learning method that leverages demonstration guidance to improve training stability and exploration in continuous control tasks, reducing the need for extensive demonstrations.

Contribution

The paper proposes DRL-EG, an SAC-based algorithm that combines a discriminator and a guider, both modeled from a small number of expert demonstrations, to improve exploration and sample efficiency in DRL.

Findings

01 Outperforms other RL and RLfD methods in continuous control tasks.

02 Helps agents escape local optima during training.

03 Requires fewer demonstrations than existing RLfD approaches.

Abstract

Although deep reinforcement learning (DRL) algorithms have achieved important results in many control tasks, they still suffer from sample inefficiency and unstable training, which are usually caused by sparse rewards. Recently, some reinforcement learning from demonstration (RLfD) methods have shown promise in overcoming these problems. However, they usually require a considerable number of demonstrations. To tackle these challenges, we propose DRL-EG (DRL with efficient guidance), a sample-efficient algorithm built on SAC, in which a discriminator D(s) and a guider G(s) are modeled from a small number of expert demonstrations. The discriminator determines the appropriate guidance states, and the guider guides agents toward better exploration during training. Empirical evaluation results from several continuous control tasks…
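
The division of labor between D(s) and G(s) described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 1-D demonstrations, the nearest-neighbour stand-ins for the discriminator and guider, the random placeholder for the SAC policy, and the 0.5 threshold are all hypothetical. It only shows the general idea of letting a discriminator decide when a guider's action replaces the policy's action during exploration.

```python
import math
import random

# Hypothetical toy demonstrations: (state, expert_action) pairs in 1-D.
DEMOS = [(0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]


def nearest_demo(state):
    """Return the demonstration pair whose state is closest to `state`."""
    return min(DEMOS, key=lambda pair: abs(pair[0] - state))


def discriminator(state, scale=0.5):
    """Toy D(s): score in (0, 1] that decays with distance to the nearest demo state."""
    demo_state, _ = nearest_demo(state)
    return math.exp(-abs(demo_state - state) / scale)


def guider(state):
    """Toy G(s): copy the expert action of the nearest demonstration."""
    _, action = nearest_demo(state)
    return action


def policy(state, rng):
    """Placeholder for the SAC actor: a uniformly random action in [-1, 1]."""
    return rng.uniform(-1.0, 1.0)


def select_action(state, rng, threshold=0.5):
    """Use the guider when D(s) clears the (assumed) threshold, else the policy."""
    if discriminator(state) >= threshold:
        return guider(state), True
    return policy(state, rng), False


rng = random.Random(0)
for s in (0.1, 1.5, 5.0):
    action, guided = select_action(s, rng)
    print(f"state={s:.1f} D(s)={discriminator(s):.3f} guided={guided} action={action:.3f}")
```

In this sketch, states close to a demonstration are handed to the guider, while states far from any demonstration fall back to the policy. How the actual DRL-EG algorithm trains D(s) and G(s), and how their outputs enter the SAC update, is not specified in the text above.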

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adaptive Dynamic Programming Control