HIRL: Hierarchical Inverse Reinforcement Learning for Long-Horizon Tasks with Delayed Rewards

Sanjay Krishnan; Animesh Garg; Richard Liaw; Lauren Miller; Florian T. Pokorny; Ken Goldberg

arXiv:1604.06508 · cs.RO · April 25, 2016 · 27 cites

TL;DR

HIRL introduces a hierarchical inverse reinforcement learning framework that learns sub-task structures from demonstrations, enabling more efficient and robust policy learning in long-horizon tasks with delayed rewards.

Contribution

The paper presents a hierarchical IRL method that decomposes a task into sub-tasks at transitions that are consistent across demonstrations, improving reward learning and policy convergence in complex environments.

Findings

01. HIRL reaches an 80% success rate in fewer steps than MaxEnt IRL.

02. Rewards learned with HIRL are robust to environment noise and perturbations.

03. HIRL converges up to 6 times faster than traditional IRL methods.

Abstract

Reinforcement Learning (RL) struggles in problems with delayed rewards, and one approach is to segment the task into sub-tasks with incremental rewards. We propose a framework called Hierarchical Inverse Reinforcement Learning (HIRL), which is a model for learning sub-task structure from demonstrations. HIRL decomposes the task into sub-tasks based on transitions that are consistent across demonstrations. These transitions are defined as changes in local linearity with respect to a kernel function. HIRL then uses the inferred structure to learn reward functions that are local to the sub-tasks while also handling global dependencies such as sequentiality. We have evaluated HIRL on several standard RL benchmarks: Parallel Parking with noisy dynamics, Two-Link Pendulum, 2D Noisy Motion Planning, and a Pinball environment. In the parallel parking task, we find that rewards constructed with HIRL converge…
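
The idea of splitting a demonstration where its local linear trend changes can be illustrated with a small sketch. This is not the authors' implementation: it handles a single demonstration, uses no kernel function, and does not check consistency across demonstrations. The function names, the `window` and `threshold` parameters, and the toy L-shaped trajectory are all assumptions made for illustration.

```python
import math


def mean_step(trajectory, start, end):
    """Average per-step displacement between two indices of a trajectory."""
    span = end - start
    return tuple(
        (b - a) / span for a, b in zip(trajectory[start], trajectory[end])
    )


def linearity_change_scores(trajectory, window=3):
    """Score each step by how much the local linear trend changes there.

    Hypothetical stand-in for a change in local linearity: compare the
    average displacement over the `window` steps before t with the
    average displacement over the `window` steps after t.
    """
    scores = {}
    for t in range(window, len(trajectory) - window):
        before = mean_step(trajectory, t - window, t)
        after = mean_step(trajectory, t, t + window)
        scores[t] = math.dist(before, after)
    return scores


def detect_transitions(trajectory, window=3, threshold=0.5):
    """Return indices that are local peaks of the change score."""
    scores = linearity_change_scores(trajectory, window)
    transitions = []
    for t, score in scores.items():
        prev_score = scores.get(t - 1, 0.0)
        next_score = scores.get(t + 1, 0.0)
        if score > threshold and score > prev_score and score >= next_score:
            transitions.append(t)
    return transitions


def split_into_subtasks(trajectory, transitions):
    """Cut the trajectory at each transition; segments share the boundary state."""
    bounds = [0] + list(transitions) + [len(trajectory) - 1]
    return [trajectory[a:b + 1] for a, b in zip(bounds, bounds[1:])]


# Toy demonstration (assumed): move right for 10 steps, then up for 10 steps.
demo = [(float(i), 0.0) for i in range(11)]
demo += [(10.0, float(j)) for j in range(1, 11)]

transitions = detect_transitions(demo)
subtasks = split_into_subtasks(demo, transitions)
print(transitions)  # [10]
```

On this toy trajectory the only detected transition is the corner at index 10, which yields two sub-tasks that share the corner state. In a fuller treatment, each sub-task would then get its own locally learned reward, with sequencing enforced so that a later sub-task is rewarded only after earlier ones are completed.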

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · EEG and Brain-Computer Interfaces