Causal-Paced Deep Reinforcement Learning
Geonwoo Cho, Jaegyun Im, Doyoon Kim, Sundong Kim

TL;DR
This paper introduces CP-DRL, a curriculum reinforcement learning framework that approximates causal differences between tasks from interaction data to improve exploration, transfer, and sample efficiency.
Contribution
CP-DRL is the first method to approximate SCM differences from interaction data for curriculum learning in RL, enhancing transfer and exploration without ground-truth causal structures.
Findings
Outperforms existing curriculum methods on the Point Mass benchmark
Achieves faster convergence and higher returns
Reduces variance in the Bipedal Walker-Trivial setting
Abstract
Designing effective task sequences is crucial for curriculum reinforcement learning (CRL), where agents must gradually acquire skills by training on intermediate tasks. A key challenge in CRL is to identify tasks that promote exploration, yet are similar enough to support effective transfer. While a recent approach suggests comparing tasks via their Structural Causal Models (SCMs), the method requires access to ground-truth causal structures, an unrealistic assumption in most RL settings. In this work, we propose Causal-Paced Deep Reinforcement Learning (CP-DRL), a curriculum learning framework that accounts for SCM differences between tasks by approximating them from interaction data. This signal captures task novelty, which we combine with the agent's learnability, measured by reward gain, to form a unified objective. Empirically, CP-DRL outperforms existing curriculum methods on the Point Mass…
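The abstract describes combining a task-novelty signal with learnability (reward gain) into one objective, but does not give the formula here. The following is only a minimal sketch of that idea under stated assumptions: the weighted sum, the min-max normalization, the weight `alpha`, and the names `min_max` and `select_next_task` are all hypothetical illustrations, not the authors' method. The novelty scores are taken as given inputs; how CP-DRL approximates SCM differences is not reproduced.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1]; all zeros if they are constant."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_next_task(novelty, reward_before, reward_after, alpha=0.5):
    """Pick the candidate task with the highest combined score.

    novelty:        per-task novelty scores (assumed precomputed elsewhere).
    reward_before:  per-task mean return before a training phase.
    reward_after:   per-task mean return after that phase.
    alpha:          hypothetical trade-off weight between the two signals.
    """
    # Learnability is measured as reward gain, as the abstract states.
    learnability = [after - before
                    for before, after in zip(reward_before, reward_after)]
    # Normalizing both signals is an assumption made so they are comparable.
    n = min_max(novelty)
    g = min_max(learnability)
    scores = [alpha * a + (1 - alpha) * b for a, b in zip(n, g)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Three hypothetical candidate tasks.
novelty = [0.1, 0.6, 0.9]
reward_before = [5.0, 2.0, 0.0]
reward_after = [5.5, 4.0, 0.2]
best, scores = select_next_task(novelty, reward_before, reward_after)
```

In this toy example the second task is selected: it is moderately novel and shows the largest reward gain, whereas the most novel task yields almost no gain and the most familiar task offers little novelty.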
