Flexible Option Learning

Martin Klissarov; Doina Precup

arXiv:2112.03097 · cs.LG · December 7, 2021

TL;DR

This paper extends intra-option learning to deep reinforcement learning so that every option consistent with the primitive action taken is updated simultaneously, rather than only the option currently executing, improving efficiency and performance in hierarchical RL frameworks.

Contribution

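The paper introduces a method for deep RL that updates all options consistent with the current primitive action choice at once, without introducing any additional estimates, so it can be adopted in most hierarchical RL frameworks.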

Findings

01. Significant performance improvements across various domains.

02. Enhanced data efficiency in hierarchical RL.

03. Compatibility with existing option-critic algorithms.

Abstract

Temporal abstraction in reinforcement learning (RL) offers the promise of improving generalization and knowledge transfer in complex environments by propagating information more efficiently over time. Although option learning was initially formulated in a way that allows updating many options simultaneously, using off-policy, intra-option learning (Sutton, Precup & Singh, 1999), many recent hierarchical reinforcement learning approaches update only a single option at a time: the option currently executing. We revisit and extend intra-option learning in the context of deep reinforcement learning, in order to enable updating all options consistent with current primitive action choices, without introducing any additional estimates. Our method can therefore be naturally adopted in most hierarchical RL frameworks. When we combine our approach with the option-critic algorithm for…
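
To make "updating all options consistent with current primitive action choices" concrete, here is a minimal tabular sketch of the classic intra-option Q-learning update for deterministic option policies, in the spirit of Sutton, Precup & Singh (1999). It is not the paper's deep RL method, which this summary does not detail. The function name `intra_option_q_update`, the array layout, and the hyperparameters `alpha` and `gamma` are illustrative assumptions.

```python
import numpy as np

def intra_option_q_update(Q, option_actions, beta, s, a, r, s_next,
                          alpha=0.1, gamma=0.99):
    """One tabular intra-option Q-learning step (illustrative sketch).

    Q:              float array [n_states, n_options], option values
    option_actions: int array [n_options, n_states], the action each
                    deterministic option policy picks in each state
    beta:           float array [n_options, n_states], termination probabilities
    Returns the indices of the options that were updated.
    """
    # An option is "consistent" if its policy would also have chosen a in s.
    consistent = np.flatnonzero(option_actions[:, s] == a)

    # Snapshot next-state values so all options use the same targets.
    q_next = Q[s_next].copy()
    best_next = q_next.max()

    for o in consistent:
        # Value on arrival in s_next: keep following o with prob. 1 - beta,
        # or terminate with prob. beta and switch to the best option.
        u = (1.0 - beta[o, s_next]) * q_next[o] + beta[o, s_next] * best_next
        Q[s, o] += alpha * (r + gamma * u - Q[s, o])
    return consistent

# Hypothetical toy setup: 2 states, 3 options, 2 primitive actions.
Q = np.zeros((2, 3))
option_actions = np.array([[0, 1],   # option 0: action 0 in s0, action 1 in s1
                           [0, 0],   # option 1: always action 0
                           [1, 1]])  # option 2: always action 1
beta = np.full((3, 2), 0.5)

# Action 0 was taken in state 0, so options 0 and 1 are both updated,
# regardless of which option was actually executing.
updated = intra_option_q_update(Q, option_actions, beta, s=0, a=0, r=1.0, s_next=1)
```

In this toy transition a single experience updates two of the three options, whereas a scheme that only updates the executing option would update one. Stochastic option policies, as used in deep RL, require a different way of weighting each option's update, which this sketch does not attempt.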

Code & Models

Repositories

mklissa/moc · tf · Official

Videos

Flexible Option Learning · slideslive

Taxonomy

Topics: Reservoir Engineering and Simulation Methods · Advanced Bandit Algorithms Research