Learning Multi-Level Hierarchies with Hindsight
Andrew Levy, George Konidaris, Robert Platt, Kate Saenko

TL;DR
This paper introduces Hierarchical Actor-Critic (HAC), a new framework for hierarchical reinforcement learning that enables stable, parallel learning of multiple policy levels, significantly improving learning speed and scalability in complex tasks.
Contribution
HAC trains the levels of a hierarchy independently and in parallel by training each level as if the levels below it were already optimal, which addresses the instability that normally affects hierarchical RL.
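To make the "assume lower levels are optimal" idea concrete, here is a minimal sketch of how a higher level could record a transition in hindsight. It is an illustration under stated assumptions, not the authors' implementation: the names `Transition` and `hindsight_action_transition`, the tuple-based state type, and the sparse reward of 0 on reaching the goal and -1 otherwise are all choices made for this example.

```python
from dataclasses import dataclass
from typing import Tuple

State = Tuple[int, int]  # assumed toy state type (grid coordinates)


@dataclass(frozen=True)
class Transition:
    state: State
    action: State       # at a higher level, an action is a subgoal state
    reward: float
    next_state: State


def hindsight_action_transition(state: State, reached_state: State, goal: State) -> Transition:
    """Record the higher-level transition as if the lower level were optimal.

    Whatever subgoal was originally proposed, the stored action is the state
    the lower level actually reached. An optimal lower level would have
    reached that state had it been the subgoal, so the stored transition no
    longer depends on how good the current lower-level policy is.
    """
    # Assumed sparse reward: 0 when the goal is reached, -1 otherwise.
    reward = 0.0 if reached_state == goal else -1.0
    return Transition(state=state, action=reached_state,
                      reward=reward, next_state=reached_state)


# The higher level proposed (5, 5), but the lower level only got to (3, 4).
proposed_subgoal = (5, 5)
reached = (3, 4)
t = hindsight_action_transition(state=(0, 0), reached_state=reached, goal=(9, 9))
print(t.action, t.reward)
```

In this sketch the proposed subgoal never enters the stored transition. That is what lets the higher level learn from data that looks as though it was generated by a lower level that always succeeds.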
Findings
HAC learns faster than the other methods it is compared against.
HAC is the first framework to learn 3-level hierarchies in parallel in tasks with continuous state and action spaces.
HAC is effective in both grid world and simulated robotics domains.
Abstract
Hierarchical agents have the potential to solve sequential decision making tasks with greater sample efficiency than their non-hierarchical counterparts because hierarchical agents can break down tasks into sets of subtasks that only require short sequences of decisions. In order to realize this potential of faster learning, hierarchical agents need to be able to learn their multiple levels of policies in parallel so these simpler subproblems can be solved simultaneously. Yet, learning multiple levels of policies in parallel is hard because it is inherently unstable: changes in a policy at one level of the hierarchy may cause changes in the transition and reward functions at higher levels in the hierarchy, making it difficult to jointly learn multiple levels of policies. In this paper, we introduce a new Hierarchical Reinforcement Learning (HRL) framework, Hierarchical Actor-Critic…
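The instability described in the abstract can be seen in a toy setting. The sketch below is an assumption-laden illustration rather than anything taken from the paper: a 1-D line world, a hypothetical `run_lower_level` function, and a `step_size` parameter that stands in for how capable the lower-level policy currently is.

```python
def run_lower_level(start: int, subgoal: int, step_size: int, horizon: int) -> int:
    """Move along a 1-D line toward `subgoal` for at most `horizon` steps.

    `step_size` is a stand-in for lower-level policy quality: a larger
    value models a policy that has learned to make more progress per step.
    """
    pos = start
    for _ in range(horizon):
        if pos == subgoal:
            break
        delta = subgoal - pos
        # Clamp the move so the agent never overshoots the subgoal.
        pos += max(-step_size, min(step_size, delta))
    return pos


# Same higher-level state (0) and same higher-level action (subgoal 10),
# executed by the lower level at two different stages of its training.
early = run_lower_level(start=0, subgoal=10, step_size=1, horizon=4)
late = run_lower_level(start=0, subgoal=10, step_size=3, horizon=4)
print(early, late)  # 4 10
```

From the higher level's point of view, the same state and action lead to different next states as the lower level improves, so the transition function it is trying to learn keeps shifting underneath it.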
Taxonomy
Topics: Reinforcement Learning in Robotics · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
