Self-Imitation Learning for Robot Tasks with Sparse and Delayed Rewards

Zhixin Chen; Mengxiang Lin

arXiv:2010.06962 · cs.LG · May 26, 2021 · 1 citation

TL;DR

This paper introduces Self-Imitation Learning with Constant Reward (SILCR), a method that guides robotic control in environments with sparse and delayed rewards by assigning constant immediate rewards based on final episodic rewards, improving performance and stability.

Contribution

The paper presents a novel self-imitation learning approach that does not require environment-provided immediate rewards, enabling effective learning in sparse and delayed reward settings.

Findings

01 Significantly outperforms alternative methods in MuJoCo tasks with sparse rewards.

02 Achieves competitive performance even when compared with alternatives that have dense rewards available.

03 Demonstrates stability and reproducibility through ablation experiments.

Abstract

The application of reinforcement learning (RL) to robotic control remains limited in environments with sparse and delayed rewards. In this paper, we propose a practical self-imitation learning method named Self-Imitation Learning with Constant Reward (SILCR). Instead of requiring hand-defined immediate rewards from the environment, our method assigns a constant immediate reward to every timestep of an episode according to that episode's final episodic reward. In this way, every action taken by the agent is guided properly even when dense rewards from the environment are unavailable. We demonstrate the effectiveness of our method on several challenging continuous robotic control tasks in MuJoCo simulation, and the results show that our method significantly outperforms the alternative methods in tasks with sparse and delayed rewards. Even compared with alternatives with dense rewards…
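
The reward assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold rule for deciding which episodes count as good, the constant values 1.0 and 0.0, and the names Transition and relabel_episode are all assumptions made for illustration. The abstract only states that each timestep receives a constant value determined by the final episodic reward.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Transition:
    """One stored step; `reward` is the assigned constant, not an environment reward."""
    state: Sequence[float]
    action: Sequence[float]
    reward: float


def relabel_episode(
    steps: List[Tuple[Sequence[float], Sequence[float]]],
    episodic_reward: float,
    threshold: float,           # assumption: cut-off separating good from bad episodes
    good_reward: float = 1.0,   # assumption: constant given to steps of good episodes
    bad_reward: float = 0.0,    # assumption: constant given to steps of other episodes
) -> List[Transition]:
    """Give every step of an episode the same reward, chosen from its final episodic reward."""
    constant = good_reward if episodic_reward >= threshold else bad_reward
    return [Transition(state, action, constant) for state, action in steps]


# A three-step episode whose only feedback is the final episodic reward.
episode = [([0.0], [0.1]), ([0.5], [-0.2]), ([1.0], [0.0])]
good = relabel_episode(episode, episodic_reward=10.0, threshold=5.0)
bad = relabel_episode(episode, episodic_reward=1.0, threshold=5.0)
print([t.reward for t in good])  # [1.0, 1.0, 1.0]
print([t.reward for t in bad])   # [0.0, 0.0, 0.0]
```

The relabelled transitions could then be fed to any off-policy learner in place of environment rewards; how the paper selects and stores episodes is not specified in the text above.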

Code & Models

Repositories

gouxiangchen/SILCR (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Locomotion and Control