"Give Me an Example Like This": Episodic Active Reinforcement Learning from Demonstrations
Muhan Hou, Koen Hindriks, A.E. Eiben, Kim Baraka

TL;DR
This paper introduces EARLY, an active learning algorithm for reinforcement learning from demonstrations, which optimizes the timing and content of queries to improve sample efficiency and human teaching experience.
Contribution
EARLY is a novel episodic active learning method that selectively queries demonstrations in a trajectory-based feature space, enhancing learning efficiency and user experience.
Findings
Achieves expert-level performance in navigation tasks with 30% faster convergence.
Outperforms baseline methods in simulated environments.
Reduces human demonstration time while maintaining learning quality.
Abstract
Reinforcement Learning (RL) has achieved great success in sequential decision-making problems, but often at the cost of a large number of agent-environment interactions. To improve sample efficiency, methods like Reinforcement Learning from Expert Demonstrations (RLED) introduce external expert demonstrations to facilitate agent exploration during the learning process. In practice, these demonstrations, which are often collected from human users, are costly and hence often constrained to a limited amount. How to select the best set of human demonstrations that is most beneficial for learning therefore becomes a major concern. This paper presents EARLY (Episodic Active Learning from demonstration querY), an algorithm that enables a learning agent to generate optimized queries of expert demonstrations in a trajectory-based feature space. Based on a trajectory-level estimate of uncertainty…
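The abstract is cut off before it says how the trajectory-level uncertainty estimate is computed or used, so the following is only a minimal sketch of the general idea of an episodic query rule, not the authors' algorithm. Everything in it is an assumption made for illustration: the per-step uncertainty scores are taken as given, the trajectory-level value is their mean, and a query is issued when an episode is more uncertain than the average of recent episodes. The names `trajectory_uncertainty`, `EpisodicQueryRule`, `window`, and `ratio` are hypothetical.

```python
from collections import deque
from statistics import mean


def trajectory_uncertainty(step_scores):
    """Aggregate per-step uncertainty scores into one trajectory-level value.

    Assumption: the aggregate is a plain mean; the source does not specify it.
    """
    if not step_scores:
        raise ValueError("trajectory has no steps")
    return mean(step_scores)


class EpisodicQueryRule:
    """Hypothetical rule: query when an episode is more uncertain than recent ones."""

    def __init__(self, window=5, ratio=1.0):
        # Trajectory-level uncertainties of the most recent `window` episodes.
        self.history = deque(maxlen=window)
        # Multiplier on the recent average that an episode must exceed.
        self.ratio = ratio

    def should_query(self, step_scores):
        u = trajectory_uncertainty(step_scores)
        # With no history there is no baseline to compare against, so do not query.
        query = bool(self.history) and u > self.ratio * mean(self.history)
        self.history.append(u)
        return query


rule = EpisodicQueryRule(window=3)
decisions = [
    rule.should_query([0.1, 0.2, 0.3]),  # first episode, no baseline yet
    rule.should_query([0.5, 0.7]),       # clearly above the recent average
    rule.should_query([0.1, 0.1]),       # below the recent average
]
print(decisions)
```

Making the decision once per episode, rather than at every step, means the human would be asked for a whole demonstration at a time. In a full system the query would also need to say what to demonstrate, which this sketch leaves out.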
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications
Methods: Sparse Evolutionary Training
