Hierarchy through Composition with Linearly Solvable Markov Decision Processes

Andrew M. Saxe; Adam Earle; Benjamin Rosman

arXiv:1612.02757·cs.AI·December 9, 2016

Open Access

TL;DR

This paper introduces a hierarchical reinforcement learning framework built on linearly solvable Markov decision processes (LMDPs), in which an agent draws on several macro-actions concurrently instead of executing them one after another, with the aim of improving scalability.

Contribution

The paper proposes a control hierarchy based on parallel macro-actions in the LMDP framework, and introduces the Multitask LMDP module, which can be stacked to form deep, compositional task representations abstracted in space and time.

Findings

01. Enables concurrent execution of macro-actions in RL.

02. Supports deep hierarchical task representations.

03. Improves scalability of reinforcement learning architectures.

Abstract

Hierarchical architectures are critical to the scalability of reinforcement learning methods. Current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme uses the concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
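The concurrent compositionality mentioned in the abstract can be illustrated with a small numerical sketch. It relies on a standard property of first-exit LMDPs: the desirability function z = exp(-v) satisfies a linear equation, so if several tasks share the same passive dynamics and interior costs and differ only in their boundary desirabilities, the solution for any weighted blend of those boundary conditions is the same weighted blend of the individual solutions. The code below is not the paper's Multitask LMDP implementation. The cost-based convention, the seven-state corridor, the numerical values, and the name `solve_first_exit_lmdp` are all assumptions chosen for illustration.

```python
import numpy as np

def solve_first_exit_lmdp(P, q_interior, z_boundary, interior, boundary):
    """Return the desirability z = exp(-v) on the interior states.

    Hypothetical helper. Solves (I - M P_II) z_I = M P_IB z_B,
    where M = diag(exp(-q_interior)) and P is the passive dynamics.
    """
    M = np.diag(np.exp(-q_interior))
    P_ii = P[np.ix_(interior, interior)]   # interior -> interior transitions
    P_ib = P[np.ix_(interior, boundary)]   # interior -> boundary transitions
    A = np.eye(len(interior)) - M @ P_ii
    b = M @ P_ib @ z_boundary
    return np.linalg.solve(A, b)

# Toy corridor (assumed): states 0..6, with 0 and 6 as absorbing exits.
n = 7
interior = np.arange(1, n - 1)
boundary = np.array([0, n - 1])
P = np.zeros((n, n))
for s in interior:
    P[s, s - 1] = 0.5                      # unbiased random walk as passive dynamics
    P[s, s + 1] = 0.5
P[0, 0] = 1.0
P[n - 1, n - 1] = 1.0
q_interior = np.full(len(interior), 0.1)   # small per-step state cost

# Two base tasks that differ only in boundary desirability.
z_b_left = np.array([1.0, 1e-3])           # exiting on the left is preferred
z_b_right = np.array([1e-3, 1.0])          # exiting on the right is preferred
z_left = solve_first_exit_lmdp(P, q_interior, z_b_left, interior, boundary)
z_right = solve_first_exit_lmdp(P, q_interior, z_b_right, interior, boundary)

# A new task whose boundary desirability is a weighted blend of the base tasks.
w = np.array([0.3, 0.7])
z_b_blend = w[0] * z_b_left + w[1] * z_b_right
z_blend_direct = solve_first_exit_lmdp(P, q_interior, z_b_blend, interior, boundary)

# Composition: blend the already-solved base tasks with the same weights.
z_blend_composed = w[0] * z_left + w[1] * z_right

composition_error = np.max(np.abs(z_blend_direct - z_blend_composed))
print("max abs difference:", composition_error)
```

Under these assumptions the directly solved and composed desirabilities agree up to floating-point error, which is the sense in which previously solved tasks can be combined in parallel without solving the new task from scratch.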

Taxonomy

Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · Scheduling and Optimization Algorithms