HIRL: Hierarchical Inverse Reinforcement Learning for Long-Horizon Tasks with Delayed Rewards
Sanjay Krishnan, Animesh Garg, Richard Liaw, Lauren Miller, Florian T. Pokorny, Ken Goldberg

TL;DR
HIRL introduces a hierarchical inverse reinforcement learning framework that learns sub-task structures from demonstrations, enabling more efficient and robust policy learning in long-horizon tasks with delayed rewards.
Contribution
The paper presents a novel hierarchical IRL method that decomposes tasks into sub-tasks based on demonstration consistency, improving reward learning and policy convergence in complex environments.
Findings
HIRL achieves an 80% success rate in fewer steps than MaxEnt IRL.
HIRL rewards are robust to environment noise and perturbations.
HIRL converges up to 6 times faster than traditional IRL methods.
Abstract
Reinforcement Learning (RL) struggles in problems with delayed rewards, and one approach is to segment the task into sub-tasks with incremental rewards. We propose a framework called Hierarchical Inverse Reinforcement Learning (HIRL), which is a model for learning sub-task structure from demonstrations. HIRL decomposes the task into sub-tasks based on transitions that are consistent across demonstrations. These transitions are defined as changes in local linearity w.r.t. a kernel function. Then, HIRL uses the inferred structure to learn reward functions local to the sub-tasks but also handle any global dependencies such as sequentiality. We have evaluated HIRL on several standard RL benchmarks: Parallel Parking with noisy dynamics, Two-Link Pendulum, 2D Noisy Motion Planning, and a Pinball environment. In the parallel parking task, we find that rewards constructed with HIRL converge…
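To make the idea of a "change in local linearity" concrete, here is a minimal sketch, not the authors' implementation. It fits an affine one-step model to a trailing window of a single demonstration and flags a transition wherever that model fails to predict the next state. The function name `transition_indices`, the `window` and `threshold` parameters, and the use of raw states in place of kernel features are all assumptions made for illustration. The abstract also requires transitions to be consistent across demonstrations, which this single-trajectory sketch does not attempt.

```python
import numpy as np

def transition_indices(states, window=8, threshold=0.5):
    """Flag time steps where a locally fitted affine model stops predicting well.

    Hypothetical helper: `window` and `threshold` are illustrative choices,
    and raw states stand in for the kernel features mentioned in the abstract.
    """
    states = np.asarray(states, dtype=float)
    T = len(states)
    hits = []
    t = window
    while t < T - 1:
        X = states[t - window:t]            # inputs x_s over the trailing window
        Y = states[t - window + 1:t + 1]    # targets x_{s+1}
        X1 = np.hstack([X, np.ones((window, 1))])   # affine term
        W, *_ = np.linalg.lstsq(X1, Y, rcond=None)  # least-squares local model
        pred = np.append(states[t], 1.0) @ W        # predicted x_{t+1}
        err = np.linalg.norm(pred - states[t + 1])
        if err > threshold:
            hits.append(t + 1)   # first state that breaks the local model
            t += window          # skip ahead so the new regime gets its own fit
        else:
            t += 1
    return hits

# Toy demonstration: move right for 20 steps, then move up for 20 steps.
right = [(i, 0) for i in range(20)]
up = [(19, j) for j in range(1, 21)]
demo = np.array(right + up)
corners = transition_indices(demo)
```

On this toy path the only flagged index is the first state after the corner. In a multi-demonstration setting, one plausible next step would be to cluster the flagged states across demonstrations and keep only clusters that most demonstrations pass through, but that step is left out here.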
Taxonomy
Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · EEG and Brain-Computer Interfaces
