Towards Improving Exploration in Self-Imitation Learning using Intrinsic Motivation
Alain Andres, Esther Villar-Rodriguez, Javier Del Ser

TL;DR
This paper combines intrinsic motivation and self-imitation learning to enhance exploration and learning efficiency in environments with sparse rewards, leading to improved performance and generalization.
Contribution
It introduces a novel approach that integrates intrinsic motivation with self-imitation learning to better explore environments with sparse rewards.
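To make the combination concrete, here is a minimal sketch of how an intrinsic exploration bonus could feed a self-imitation buffer. It is a generic illustration, not the paper's algorithm: the count-based bonus, the weighting factor beta, and the names CountBonus, EpisodeBuffer and episode_score are all assumptions introduced here.

```python
import heapq
import math
from collections import defaultdict


class CountBonus:
    """Hypothetical count-based intrinsic reward: scale / sqrt(N(s))."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.counts = defaultdict(int)

    def __call__(self, state):
        # Rarely visited states yield a larger bonus.
        self.counts[state] += 1
        return self.scale / math.sqrt(self.counts[state])


class EpisodeBuffer:
    """Keeps the top-`capacity` episodes ranked by score (self-imitation store)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []  # min-heap of (score, insertion_id, episode)
        self._n = 0

    def add(self, episode, score):
        self._n += 1  # unique tiebreaker so episodes are never compared
        item = (score, self._n, episode)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif score > self._heap[0][0]:
            # Evict the current worst episode in favour of the new one.
            heapq.heapreplace(self._heap, item)

    def best(self):
        # Episodes ordered from highest to lowest score.
        return [ep for _, _, ep in sorted(self._heap, reverse=True)]


def episode_score(states, extrinsic_rewards, bonus, beta=0.1):
    """Assumed ranking rule: extrinsic return plus beta times the summed bonus."""
    intrinsic = sum(bonus(s) for s in states)
    return sum(extrinsic_rewards) + beta * intrinsic
```

With a score of this kind, episodes that reach novel states can still enter the buffer when the extrinsic return is zero, which is the situation sparse-reward environments create. How the paper actually ranks and replays experiences may differ.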
Findings
Outperforms previous self-imitation methods in procedurally-generated environments.
Achieves comparable or better sample efficiency than using intrinsic motivation alone.
Enhances generalization in complex exploration tasks.
Abstract
Reinforcement Learning has emerged as a strong alternative for solving optimization tasks efficiently. The use of these algorithms depends heavily on the feedback signals provided by the environment, which indicate how good (or bad) the decisions made by the learned agent are. Unfortunately, in a broad range of problems the design of a good reward function is not trivial, so in such cases sparse reward signals are adopted instead. The lack of a dense reward function poses new challenges, mostly related to exploration. Imitation Learning has addressed those problems by leveraging demonstrations from experts. In the absence of an expert (and its subsequent demonstrations), an option is to prioritize well-suited exploration experiences collected by the agent in order to bootstrap its learning process with good exploration behaviors. However, this solution highly depends on the…
Taxonomy
Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Advanced Multi-Objective Optimization Algorithms
