Latent Space Policies for Hierarchical Reinforcement Learning
Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine

TL;DR
This paper introduces a hierarchical reinforcement learning framework in which each layer is trained with a maximum entropy objective and augmented with latent variables, so that it acquires a range of diverse strategies. Complex tasks can then be solved without artificially restricting the lower layers.
Contribution
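The paper proposes a hierarchical policy in which each layer maps latent variables to actions through an invertible mapping. A higher layer controls the layer below it directly through that layer's latent space while retaining full expressivity, which is reported to improve performance on complex tasks.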
Findings
Hierarchical policies outperform single-layer policies on benchmarks.
Adding layers enables solving more complex sparse-reward tasks.
Latent space control enhances policy expressivity and flexibility.
Abstract
We address the problem of learning hierarchical deep neural network policies for reinforcement learning. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective. Each layer is also augmented with latent random variables, which are sampled from a prior distribution during the training of that layer. The maximum entropy objective causes these latent variables to be incorporated into the layer's policy, and the higher level layer can directly control the behavior of the lower layer through this latent space. Furthermore, by constraining the mapping from latent variables to actions to be invertible, higher layers retain full expressivity:…
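The idea of controlling a lower layer through an invertible latent-to-action mapping can be illustrated with a small sketch. It is not the authors' implementation: a state-conditioned affine map stands in for whatever invertible network the paper uses, the weights are random rather than trained, and the names `make_layer`, `forward`, and `inverse` are hypothetical.

```python
import numpy as np

def make_layer(rng, state_dim, action_dim):
    """Random weights for one hypothetical layer (a stand-in for a trained network)."""
    return {
        "W_mu": rng.normal(size=(action_dim, state_dim)),
        "W_log_sigma": 0.1 * rng.normal(size=(action_dim, state_dim)),
    }

def forward(layer, state, latent):
    """Map a latent vector to the layer's output with a state-conditioned affine bijection."""
    mu = layer["W_mu"] @ state
    sigma = np.exp(layer["W_log_sigma"] @ state)  # strictly positive, so the map is invertible
    return mu + sigma * latent

def inverse(layer, state, output):
    """Recover the latent vector that produces `output` in the given state."""
    mu = layer["W_mu"] @ state
    sigma = np.exp(layer["W_log_sigma"] @ state)
    return (output - mu) / sigma

rng = np.random.default_rng(0)
state_dim, action_dim = 4, 2
low = make_layer(rng, state_dim, action_dim)    # lower layer: latent -> action
high = make_layer(rng, state_dim, action_dim)   # higher layer: latent -> lower layer's latent
state = rng.normal(size=state_dim)

# Lower layer on its own: its latent is sampled from a prior (here a standard normal).
h_low = rng.standard_normal(action_dim)
action_from_prior = forward(low, state, h_low)

# Stacked: the higher layer's output replaces the prior sample as the lower layer's latent.
h_high = rng.standard_normal(action_dim)
action = forward(low, state, forward(high, state, h_high))

# Because every layer is invertible, any target action is reachable from the top latent.
target_action = np.array([0.5, -1.0])
needed_h_low = inverse(low, state, target_action)
needed_h_high = inverse(high, state, needed_h_low)
recovered = forward(low, state, forward(high, state, needed_h_high))
```

In this sketch, `recovered` matches `target_action` up to floating-point error, which is the sense in which the stacked policy loses no expressivity: the higher layer can always find a latent that undoes whatever the lower layer does.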
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
