Option Discovery in Hierarchical Reinforcement Learning using Spatio-Temporal Clustering
Aravind Srinivas, Ramnandan Krishnamurthy, Peeyush Kumar, Balaraman Ravindran

TL;DR
This paper presents a hierarchical reinforcement learning framework that automatically discovers skills by identifying abstract states through spatio-temporal clustering, enabling efficient and reusable policy learning across multiple tasks.
Contribution
It introduces a method that combines ideas from dynamical systems with spectral clustering to identify abstract states and skills, and that scales to complex tasks with large state spaces.
Findings
Effective skill discovery using metastable regions and spectral clustering.
Skills are reusable across different tasks without relearning.
Scalable to large state spaces with learned state representations.
Abstract
This paper introduces an automated skill acquisition framework in reinforcement learning which involves identifying a hierarchical description of the given task in terms of abstract states and extended actions between abstract states. Identifying such structures present in the task provides ways to simplify and speed up reinforcement learning algorithms. These structures also help to generalize such algorithms over multiple tasks without relearning policies from scratch. We use ideas from dynamical systems to find metastable regions in the state space and associate them with abstract states. The spectral clustering algorithm PCCA+ is used to identify suitable abstractions aligned to the underlying structure. Skills are defined in terms of the sequence of actions that lead to transitions between such abstract states. The connectivity information from PCCA+ is used to generate these…
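The link between metastable regions and abstract states can be illustrated on a toy problem. The following is a minimal sketch, not the paper's PCCA+ pipeline: it replaces PCCA+ with a plain sign split of the second eigenvector of a random-walk transition matrix, which only handles two clusters. The two-room chain, the function names `two_room_weights` and `spectral_bipartition`, and the door weight of 0.02 are all hypothetical choices made for illustration.

```python
import numpy as np


def two_room_weights(n_per_room=5, door_weight=0.02):
    """Symmetric edge weights for two 'rooms' of states laid out on a line.

    Neighbouring states inside a room are strongly connected (weight 1).
    The single edge joining the rooms (the 'door') is weak, so a random
    walk stays inside one room for a long time: each room is metastable.
    Self-loops (weight 1) make the walk lazy.
    """
    n = 2 * n_per_room
    W = np.eye(n)
    for i in range(n - 1):
        w = door_weight if i == n_per_room - 1 else 1.0
        W[i, i + 1] = W[i + 1, i] = w
    return W


def spectral_bipartition(W):
    """Split states into two groups using the second eigenvector.

    Simplified stand-in for PCCA+ (assumption: exactly two clusters).
    Returns the row-stochastic matrix T, its eigenvalues in descending
    order, and a 0/1 label per state.
    """
    d = W.sum(axis=1)
    T = W / d[:, None]                    # random-walk transition matrix
    S = W / np.sqrt(np.outer(d, d))       # symmetric, same eigenvalues as T
    evals, evecs = np.linalg.eigh(S)      # eigenvalues in ascending order
    v2 = evecs[:, -2] / np.sqrt(d)        # right eigenvector of T, 2nd largest
    labels = (v2 > 0).astype(int)         # which label is 0 or 1 is arbitrary
    return T, evals[::-1], labels


W = two_room_weights()
T, evals, labels = spectral_bipartition(W)
print("leading eigenvalues:", np.round(evals[:3], 3))
print("abstract-state label per state:", labels)
```

In this sketch the second eigenvalue sits close to 1 while the third is clearly smaller, and that gap is what signals two slowly mixing regions. Each label plays the role of an abstract state, and a skill in the abstract's sense would be a policy that moves the agent from the states with one label to the states with the other.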
Taxonomy
Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Evolutionary Algorithms and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Spectral Clustering
