Hierarchical Imitation and Reinforcement Learning
Hoang M. Le, Nan Jiang, Alekh Agarwal, Miroslav Dudík, Yisong Yue, Hal Daumé III

TL;DR
This paper introduces a hierarchical guidance framework that combines imitation learning and reinforcement learning to improve efficiency in long-horizon, sparse-reward decision-making tasks, reducing both expert effort and exploration cost.
Contribution
The paper presents a hierarchical guidance framework that integrates imitation learning (IL) and reinforcement learning (RL) at different levels of the problem hierarchy, improving learning speed and label efficiency on long-horizon, sparse-reward tasks.
Findings
Significantly faster learning than hierarchical RL on long-horizon benchmarks, including Montezuma's Revenge
Significant reduction in expert labeling effort
More label-efficient than standard imitation learning
Abstract
We study how to effectively leverage expert feedback to learn sequential decision-making policies. We focus on problems with sparse rewards and long time horizons, which typically pose significant challenges in reinforcement learning. We propose an algorithmic framework, called hierarchical guidance, that leverages the hierarchical structure of the underlying problem to integrate different modes of expert interaction. Our framework can incorporate different combinations of imitation learning (IL) and reinforcement learning (RL) at different levels, leading to dramatic reductions in both expert effort and cost of exploration. Using long-horizon benchmarks, including Montezuma's Revenge, we demonstrate that our approach can learn significantly faster than hierarchical RL, and be significantly more label-efficient than standard IL. We also theoretically analyze labeling cost for certain…
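The two-level structure the abstract refers to can be illustrated with a minimal sketch. This is not the authors' algorithm: the corridor environment, the function names (`run_hierarchical_episode`, `waypoint_policy`, `step_toward`), and the fixed hand-written policies are all hypothetical and chosen only to show how a high-level policy that picks subgoals and a low-level policy that takes primitive steps fit together.

```python
from typing import Callable, Tuple


def run_hierarchical_episode(
    start: int,
    goal: int,
    high_policy: Callable[[int], int],
    low_policy: Callable[[int, int], int],
    max_subgoals: int = 10,
    max_low_steps: int = 20,
) -> Tuple[int, int, int]:
    """Run one episode; return (final_state, subgoals_chosen, primitive_steps)."""
    state = start
    subgoals_chosen = 0
    primitive_steps = 0
    for _ in range(max_subgoals):
        if state == goal:
            break
        # High level: choose the next subgoal from the current state.
        subgoal = high_policy(state)
        subgoals_chosen += 1
        # Low level: take primitive steps until the subgoal is reached
        # or the low-level step budget runs out.
        for _ in range(max_low_steps):
            if state == subgoal:
                break
            state += low_policy(state, subgoal)
            primitive_steps += 1
    return state, subgoals_chosen, primitive_steps


# Hypothetical toy task: a 1-D corridor with waypoints every 5 cells.
def waypoint_policy(state: int) -> int:
    return min((state // 5 + 1) * 5, 20)


def step_toward(state: int, subgoal: int) -> int:
    return 1 if subgoal > state else -1


final_state, n_subgoals, n_steps = run_hierarchical_episode(
    start=0, goal=20, high_policy=waypoint_policy, low_policy=step_toward
)
print(final_state, n_subgoals, n_steps)
```

In this toy run the high level makes 4 decisions while the low level takes 20 primitive steps. If expert feedback were requested only for high-level decisions, the number of requests would scale with the former rather than the latter, which gives some intuition for why a hierarchy might reduce labeling effort. The paper's actual labeling-cost analysis is not reproduced here.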
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing
