The Option Keyboard: Combining Skills in Reinforcement Learning

André Barreto; Diana Borsa; Shaobo Hou; Gheorghe Comanici; Eser Aygün; Philippe Hamel; Daniel Toyama; Jonathan Hunt; Shibl Mourad; David Silver; Doina Precup

arXiv:2106.13105·cs.AI·June 25, 2021

TL;DR

This paper introduces a framework for combining known skills in reinforcement learning by manipulating their associated pseudo-rewards (cumulants), enabling instant synthesis of new skills from linear combinations of the cumulants of existing options.

Contribution

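The paper shows that every deterministic option can be unambiguously represented as a cumulant defined in an extended domain, and shows how to approximate options whose cumulants are linear combinations of the cumulants of known options, without additional learning.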

Findings

1. Enables instant synthesis of new skills from existing options.

2. Provides a hierarchical interface for combining skills in complex tasks.

3. Demonstrates practical benefits in resource management and robot navigation tasks.

Abstract

The ability to combine known skills to create new ones may be crucial in the solution of complex reinforcement learning problems that unfold over extended periods. We argue that a robust way of combining skills is to define and manipulate them in the space of pseudo-rewards (or "cumulants"). Based on this premise, we propose a framework for combining skills using the formalism of options. We show that every deterministic option can be unambiguously represented as a cumulant defined in an extended domain. Building on this insight and on previous results on transfer learning, we show how to approximate options whose cumulants are linear combinations of the cumulants of known options. This means that, once we have learned options associated with a set of cumulants, we can instantaneously synthesise options induced by any linear combination of them, without any learning involved. We…
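
The combination step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes that the "previous results on transfer learning" refer to generalised policy evaluation and generalised policy improvement, which the excerpt above does not name, and it omits the termination action of the extended domain. The table `q_values`, the weight vector `w`, and the function `combine_options` are hypothetical names introduced for this example.

```python
from typing import List, Tuple

# Hypothetical layout: q_values[i][j][a] is an estimate, in the current state,
# of the value of taking action a and then following known option i,
# measured under cumulant j.
QTable = List[List[List[float]]]


def combine_options(q_values: QTable, w: List[float]) -> Tuple[int, float]:
    """Pick an action for the combined cumulant sum_j w[j] * c_j.

    Assumption: values are linear in the cumulant, so the value of option i
    under the combined cumulant is sum_j w[j] * q_values[i][j][a]. The action
    is then chosen greedily with respect to the best known option.
    """
    n_actions = len(q_values[0][0])
    best_action, best_value = 0, float("-inf")
    for a in range(n_actions):
        # Evaluate every known option under the combined cumulant,
        # then keep the most favourable one for this action.
        value = max(
            sum(w_j * q_ij[a] for w_j, q_ij in zip(w, q_i))
            for q_i in q_values
        )
        if value > best_value:
            best_action, best_value = a, value
    return best_action, best_value


# Toy numbers (made up for illustration): 2 options, 2 cumulants, 3 actions.
q_values = [
    [[1.0, 0.0, 0.8], [0.0, 0.1, 0.5]],  # option 0 under cumulants 0 and 1
    [[0.1, 0.0, 0.6], [0.0, 1.0, 0.9]],  # option 1 under cumulants 0 and 1
]

both = combine_options(q_values, [1.0, 1.0])         # pursue both cumulants
first_only = combine_options(q_values, [1.0, -1.0])  # seek 0, avoid 1
```

Changing `w` changes the behaviour without any further learning, because only the stored value estimates are recombined. In this toy table, weighting both cumulants positively favours an action that neither option would pick for its own cumulant alone.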
