Autonomous Option Invention for Continual Hierarchical Reinforcement Learning and Planning
Rashmeet Kaur Nayyar, Siddharth Srivastava

TL;DR
This paper introduces a novel method for autonomously inventing and utilizing symbolic options in continual reinforcement learning, enabling better transfer, generalization, and planning in complex, long-horizon tasks.
Contribution
It presents a new approach for continual learning of interpretable, reusable, and independent options with symbolic representations, integrating search with RL for improved planning.
Findings
Effective transfer of abstract knowledge across tasks
Superior sample efficiency over state-of-the-art methods
Options meet key desiderata: composability, reusability, independence
Abstract
Abstraction is key to scaling up reinforcement learning (RL). However, autonomously learning abstract state and action representations to enable transfer and generalization remains a challenging open problem. This paper presents a novel approach for inventing, representing, and utilizing options, which represent temporally extended behaviors, in continual RL settings. Our approach addresses streams of stochastic problems characterized by long horizons, sparse rewards, and unknown transition and reward functions. Our approach continually learns and maintains an interpretable state abstraction, and uses it to invent high-level options with abstract symbolic representations. These options meet three key desiderata: (1) composability for solving tasks effectively with lookahead planning, (2) reusability across problem instances for minimizing the need for relearning, and (3) mutual…
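The composability desideratum can be illustrated with a minimal sketch. This is not the paper's representation or algorithm: it assumes a deterministic, STRIPS-style simplification in which each option is summarized by symbolic preconditions and effects over an abstract state, whereas the paper targets stochastic problems. The names `SymbolicOption`, `plan`, `DEMO_OPTIONS`, and the key-and-door facts are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical abstract state: the set of symbolic facts that currently hold.
AbstractState = FrozenSet[str]


@dataclass(frozen=True)
class SymbolicOption:
    """Assumed symbolic summary of a temporally extended behavior."""
    name: str
    precondition: FrozenSet[str]               # facts required to start the option
    add_effects: FrozenSet[str]                # facts true once the option terminates
    del_effects: FrozenSet[str] = frozenset()  # facts false once the option terminates

    def applicable(self, state: AbstractState) -> bool:
        return self.precondition <= state

    def apply(self, state: AbstractState) -> AbstractState:
        return frozenset((state - self.del_effects) | self.add_effects)


def plan(start: AbstractState, goal: FrozenSet[str],
         options: List[SymbolicOption], max_depth: int = 10) -> Optional[List[str]]:
    """Breadth-first lookahead over abstract states; returns option names or None."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        if len(path) >= max_depth:
            continue
        for opt in options:
            if opt.applicable(state):
                nxt = opt.apply(state)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, path + [opt.name]))
    return None


# Illustrative options for a made-up key-and-door task.
DEMO_OPTIONS = [
    SymbolicOption("go_to_key", frozenset({"at_start"}),
                   frozenset({"at_key"}), frozenset({"at_start"})),
    SymbolicOption("pick_key", frozenset({"at_key"}), frozenset({"has_key"})),
    SymbolicOption("go_to_door", frozenset({"at_key", "has_key"}),
                   frozenset({"at_door"}), frozenset({"at_key"})),
]

if __name__ == "__main__":
    print(plan(frozenset({"at_start"}), frozenset({"at_door", "has_key"}), DEMO_OPTIONS))
```

Because each option is described only by symbols, the same planner can chain options for any start and goal expressed over those symbols, which is the intuition behind composing and reusing options across problem instances.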
Taxonomy
Topics: Reinforcement Learning in Robotics · Transportation and Mobility Innovations
