Causal-Paced Deep Reinforcement Learning

Geonwoo Cho; Jaegyun Im; Doyoon Kim; Sundong Kim

arXiv:2507.02910 · cs.LG · July 8, 2025

TL;DR

This paper introduces CP-DRL, a curriculum reinforcement learning framework that approximates causal differences between tasks from interaction data and uses them to sequence training tasks, with the aim of improving exploration, transfer, and sample efficiency.

Contribution

CP-DRL is the first method to approximate differences between structural causal models (SCMs) from interaction data for curriculum learning in RL, enhancing transfer and exploration without requiring ground-truth causal structures.

Findings

01. Outperforms existing curriculum methods on the Point Mass benchmark

02. Achieves faster convergence and higher returns

03. Reduces variance in the Bipedal Walker-Trivial setting

Abstract

Designing effective task sequences is crucial for curriculum reinforcement learning (CRL), where agents must gradually acquire skills by training on intermediate tasks. A key challenge in CRL is to identify tasks that promote exploration yet are similar enough to support effective transfer. While a recent approach suggests comparing tasks via their Structural Causal Models (SCMs), that method requires access to ground-truth causal structures, an unrealistic assumption in most RL settings. In this work, we propose Causal-Paced Deep Reinforcement Learning (CP-DRL), a curriculum learning framework that accounts for SCM differences between tasks by approximating them from interaction data. This signal captures task novelty, which we combine with the agent's learnability, measured by reward gain, to form a unified objective. Empirically, CP-DRL outperforms existing curriculum methods on the Point Mass…
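
The abstract does not state how the novelty and learnability signals are combined into one objective. The Python sketch below is only an illustration of the general idea, assuming a weighted sum of min-max-normalized scores. The function names, the weight `alpha`, and the normalization are hypothetical choices, not the paper's formulation, and the novelty values are assumed to come from some externally computed approximation of SCM differences.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; return all zeros if every value is identical."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def score_tasks(
    novelty: Sequence[float],
    reward_gain: Sequence[float],
    alpha: float = 0.5,
) -> List[float]:
    """Combine per-task novelty and learnability into a single score.

    ASSUMPTION: a convex combination with weight `alpha` on novelty.
    The paper's actual objective is not given in the abstract.
    """
    if len(novelty) != len(reward_gain):
        raise ValueError("novelty and reward_gain must have the same length")
    n = min_max(novelty)
    g = min_max(reward_gain)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(n, g)]


def pick_next_task(
    novelty: Sequence[float],
    reward_gain: Sequence[float],
    alpha: float = 0.5,
) -> int:
    """Return the index of the candidate task with the highest combined score."""
    scores = score_tasks(novelty, reward_gain, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical values for three candidate tasks.
    novelty = [0.1, 0.9, 0.5]      # stand-in for approximated SCM difference
    reward_gain = [0.0, 0.2, 1.0]  # stand-in for recent improvement in return
    print(pick_next_task(novelty, reward_gain, alpha=0.5))
```

In this sketch, setting `alpha` close to 1 favors tasks that differ most from what the agent has seen, while values close to 0 favor tasks on which the agent is currently improving fastest.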
