Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning
Trevor Ablett, Bryan Chan, Jonathan Kelly

TL;DR
This paper introduces Learning from Guided Play (LfGP), a hierarchical framework that enhances exploration in adversarial imitation learning by leveraging multiple auxiliary tasks and expert demonstrations, improving efficiency and transferability.
Contribution
The paper proposes a novel hierarchical approach with task scheduling for adversarial imitation learning (AIL), enabling better exploration, reuse of expert data, and transfer learning across tasks.
Findings
LfGP outperforms supervised imitation learning.
LfGP surpasses state-of-the-art AIL methods.
LfGP improves learning efficiency in robotic manipulation.
Abstract
Effective exploration continues to be a significant challenge that prevents the deployment of reinforcement learning for many physical systems. This is particularly true for systems with continuous and high-dimensional state and action spaces, such as robotic manipulators. The challenge is accentuated in the sparse rewards setting, where the low-level state information required for the design of dense rewards is unavailable. Adversarial imitation learning (AIL) can partially overcome this barrier by leveraging expert-generated demonstrations of optimal behaviour and providing, essentially, a replacement for dense reward information. Unfortunately, the availability of expert demonstrations does not necessarily improve an agent's capability to explore effectively and, as we empirically show, can lead to inefficient or stagnated learning. We present Learning from Guided Play (LfGP), a…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Robot Manipulation and Learning
