Granger Causal Interaction Skill Chains
Caleb Chuck, Kevin Black, Aditya Arjun, Yuke Zhu, Scott Niekum

TL;DR
The paper introduces COInS, a hierarchical RL method that learns a small set of controllable, task-agnostic skills through interaction detection, improving sample efficiency and transferability in complex, high-dimensional tasks.
Contribution
COInS presents a novel approach to skill discovery focusing on controllability in factored domains, enabling effective transfer and reducing the need for initial task success.
Findings
COInS outperforms standard RL in a robotic pushing task with obstacles.
Skills learned by COInS transfer effectively to variants of Breakout.
COInS achieves 2-3x better sample efficiency and final performance than standard RL baselines.
Abstract
Reinforcement Learning (RL) has demonstrated promising results in learning policies for complex tasks, but it often suffers from low sample efficiency and limited transferability. Hierarchical RL (HRL) methods aim to address the difficulty of learning long-horizon tasks by decomposing policies into skills, abstracting states, and reusing skills in new tasks. However, many HRL methods require some initial task success to discover useful skills, which paradoxically may be very unlikely without access to useful skills. On the other hand, reward-free HRL methods often need to learn far too many skills to achieve proper coverage in high-dimensional domains. In contrast, we introduce the Chain of Interaction Skills (COInS) algorithm, which focuses on controllability in factored domains to identify a small number of task-agnostic skills that still permit a high degree of control. COInS uses…
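The title refers to Granger-causal interactions between state factors. As a rough illustration of that general idea only, the sketch below runs a generic linear Granger-style test: it checks whether adding a source factor's history improves prediction of a target factor beyond the target's own history. This is not the paper's implementation. The function names `fit_mse` and `granger_score`, the linear models, and the synthetic `paddle`, `ball`, and `unrelated` signals are all assumptions made for this example.

```python
import numpy as np

def fit_mse(X, y):
    """Least-squares fit of y on X (plus a bias term); returns the mean squared residual."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return float(np.mean((X1 @ w - y) ** 2))

def granger_score(source, target):
    """Ratio of passive to active prediction error for target[t+1].

    passive: predict the target from its own previous value only.
    active:  predict the target from its previous value and the source's.
    A ratio well above 1 suggests the source helps predict the target.
    """
    y = target[1:]
    passive = fit_mse(target[:-1, None], y)
    active = fit_mse(np.column_stack([target[:-1], source[:-1]]), y)
    return passive / max(active, 1e-12)

# Synthetic data (hypothetical): `ball` is driven by `paddle`, not by `unrelated`.
rng = np.random.default_rng(0)
T = 500
paddle = rng.normal(size=T)
unrelated = rng.normal(size=T)
noise = 0.1 * rng.normal(size=T)
ball = np.zeros(T)
for t in range(T - 1):
    ball[t + 1] = 0.5 * ball[t] + 0.8 * paddle[t] + noise[t]

score_paddle = granger_score(paddle, ball)        # expected to be large
score_unrelated = granger_score(unrelated, ball)  # expected to stay near 1
print(score_paddle, score_unrelated)
```

In this toy setting, a high score for `paddle` and a score near 1 for `unrelated` would flag the paddle as the factor worth learning to control before attempting to control the ball. A learned, nonlinear detector would replace the linear fits in any realistic domain.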
Taxonomy
Topics: Educational Technology and Assessment · Intelligent Tutoring Systems and Adaptive Learning
