The Option Keyboard: Combining Skills in Reinforcement Learning
André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan Hunt, Shibl Mourad, David Silver, Doina Precup

TL;DR
This paper introduces a framework for combining known skills in reinforcement learning by manipulating their associated pseudo-rewards, enabling instant synthesis of new skills through linear combinations of existing options.
Contribution
It shows that every deterministic option can be unambiguously represented as a cumulant defined in an extended domain, and that options whose cumulants are linear combinations of the cumulants of known options can be approximated without additional learning.
Findings
Enables instant synthesis of new skills from existing options.
Provides a hierarchical interface for combining skills in complex tasks.
Demonstrates practical benefits in resource management and robot navigation tasks.
Abstract
The ability to combine known skills to create new ones may be crucial in the solution of complex reinforcement learning problems that unfold over extended periods. We argue that a robust way of combining skills is to define and manipulate them in the space of pseudo-rewards (or "cumulants"). Based on this premise, we propose a framework for combining skills using the formalism of options. We show that every deterministic option can be unambiguously represented as a cumulant defined in an extended domain. Building on this insight and on previous results on transfer learning, we show how to approximate options whose cumulants are linear combinations of the cumulants of known options. This means that, once we have learned options associated with a set of cumulants, we can instantaneously synthesise options induced by any linear combination of them, without any learning involved. We…
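The linear-combination idea in the abstract can be illustrated with a minimal sketch. It assumes that the "previous results on transfer learning" amount to two steps: action values are linear in the cumulant, so values for a weighted cumulant are weighted sums of the values already learned, and the new behaviour acts greedily with respect to the best base policy for each action. The function names, the nested-list layout, and the toy numbers are all hypothetical, and the sketch covers a single state without the termination handling that the extended domain provides.

```python
# Hypothetical layout: q_values[j][i][a] is the value of action a in one fixed
# state, for cumulant i, when following base option policy j afterwards.

def combine_q(q_values, w):
    """Values of each base policy under the cumulant sum_i w[i] * c_i."""
    combined = []
    for q_j in q_values:                      # one entry per base policy
        n_actions = len(q_j[0])
        combined.append([
            sum(w_i * q_i[a] for w_i, q_i in zip(w, q_j))
            for a in range(n_actions)
        ])
    return combined


def greedy_action(q_values, w):
    """Pick the action whose best combined value over base policies is largest."""
    combined = combine_q(q_values, w)
    n_actions = len(combined[0])
    best = [max(q_j[a] for q_j in combined) for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: best[a])


# Toy numbers (made up): 2 base policies, 2 cumulants, 3 actions.
Q_EXAMPLE = [
    [[1.0, 0.0, 0.7], [0.0, 0.2, 0.5]],   # policy 0: values for cumulants 0 and 1
    [[0.1, 0.0, 0.3], [0.0, 1.0, 0.6]],   # policy 1: values for cumulants 0 and 1
]

if __name__ == "__main__":
    for w in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]):
        print(w, "->", greedy_action(Q_EXAMPLE, w))
```

Changing `w` changes the selected action without any further training, which is the sense in which new skills are synthesised instantaneously. Negative weights, as in the last case, express avoiding the outcome a cumulant rewards.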
