Integrating Causal DAGs in Deep RL: Activating Minimal Markovian States with Multi-Order Exposure
Jiamin Xu, Jacqueline Maasch, Kyra Gan

TL;DR
This paper introduces MOSE, a method that constructs multi-order historical states to improve deep RL performance by leveraging causal graphs, demonstrating consistent empirical gains over minimal state representations.
Contribution
The paper gives a procedure for constructing a provably minimal Markovian state from a longitudinal causal graph, and proposes MOSE, which feeds multi-order histories to the policy to improve deep RL performance.
Findings
MOSE outperforms minimal state and single-window policies on benchmarks.
Including the minimal representation alongside MOSE further improves results.
Controlled redundancy in state representations is crucial for causal deep RL.
Abstract
Online reinforcement learning (RL) relies on the Markov property for guaranteed performance, but real-world applications often lack well-defined states given raw observed variables. While causal RL has attracted growing interest, existing work typically assumes Markovian states are provided and focuses on using causality to accelerate learning, leaving a fundamental gap: given a longitudinal causal graph over observed variables, how does one construct MDP states that provably satisfy the Markov property? We address this by providing a procedure that constructs a provably minimal state representation. In deep RL, we observe that the minimal representation alone empirically fails to improve performance, indicating that neural networks cannot directly exploit Markovian minimality. To address this, we propose MOSE (Multi-Order State Exposure), which feeds multi-order…
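As a rough illustration of what "multi-order" exposure might look like in practice, the sketch below concatenates history windows of several lengths into one policy input. This is not the authors' implementation: the function name `multi_order_state`, the choice of orders, and the zero-padding of short histories are all assumptions made for illustration, since the abstract as shown here is truncated before describing the construction.

```python
import numpy as np


def multi_order_state(history, orders=(1, 2, 4)):
    """Concatenate the last-k observation windows for each k in `orders`.

    Hypothetical helper (not from the paper).
    history: array-like of shape (T, d), oldest observation first.
    orders:  window lengths, each >= 1.
    Returns a flat vector of length d * sum(orders).
    """
    history = np.asarray(history, dtype=float)
    if history.ndim != 2:
        raise ValueError("history must have shape (T, d)")
    if any(k < 1 for k in orders):
        raise ValueError("each order must be >= 1")

    _, d = history.shape
    blocks = []
    for k in orders:
        window = history[-k:]  # at most k rows; fewer if T < k
        # Assumption: left-pad with zeros when fewer than k steps exist.
        pad = np.zeros((k - window.shape[0], d))
        blocks.append(np.concatenate([pad, window], axis=0).ravel())
    return np.concatenate(blocks)


# Three time steps of a two-dimensional observation.
obs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
state = multi_order_state(obs, orders=(1, 2))
```

Here `state` holds the order-1 window followed by the order-2 window, so the most recent observation appears in both blocks. That overlap is one simple form of the controlled redundancy mentioned in the findings; whether MOSE uses overlapping windows of this kind is not stated in the text above.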
