Causal Reinforcement Learning for Complex Card Games: A Magic The Gathering Benchmark
Cristiano da Costa Cunha, Ajmal Mian, Tim French, Wei Liu

TL;DR
This paper introduces MTG-Causal-RL, a Gymnasium benchmark built on Magic: The Gathering for evaluating causal reinforcement learning methods in a complex, partially observed environment with explicit causal structure.
Contribution
The paper contributes a benchmark with a hand-specified Structural Causal Model (SCM), three reward schemes, and reference baselines, including a novel causal graph-factored PPO variant, to support causal RL research.
Findings
Masked PPO and CGFA-PPO achieve competitive win rates.
Per-factor calibration reveals diagnostic structure beyond overall win rate.
The benchmark enables evaluation of causal credit assignment and structural transfer.
Abstract
Causal reinforcement learning (RL) lacks benchmarks for complex systems that combine sequential decision making, hidden information, large masked action spaces, and explicit causal structure. We introduce MTG-Causal-RL, a Gymnasium benchmark built on Magic: The Gathering with a 3,077-dimensional partial observation, a 478-action masked discrete action space, five competitive Standard archetypes, three reward schemes, and a hand-specified Structural Causal Model (SCM) over strategic variables. Every episode exposes causal variables, SCM-predicted intervention effects, and per-factor credit traces, making causal credit assignment, leave-one-out cross-archetype transfer, and policy auditability first-class metrics. We adapt a panel of reference baselines: random, heuristic, masked PPO, a causal-world-model PPO variant, and an architecture-matched scalar control. We propose Causal…
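To make the "478-action masked discrete action space" concrete, here is a minimal sketch of how an agent might sample only legal actions from policy logits. It is not the benchmark's API: the function names, the random logits, and the choice of legal action indices are all hypothetical, and only the action count of 478 comes from the abstract.

```python
import math
import random

NUM_ACTIONS = 478  # action-space size stated in the abstract


def masked_softmax(logits, mask):
    """Return a probability per action, exactly zero wherever mask is False."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    legal_logits = [l for l, m in zip(logits, mask) if m]
    if not legal_logits:
        raise ValueError("mask must allow at least one action")
    peak = max(legal_logits)  # subtract the largest legal logit for stability
    weights = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(logits, mask, rng):
    """Sample an action index, drawing only from indices the mask allows."""
    probs = masked_softmax(logits, mask)
    legal_idx = [i for i, m in enumerate(mask) if m]
    legal_probs = [probs[i] for i in legal_idx]
    return rng.choices(legal_idx, weights=legal_probs, k=1)[0]


# Hypothetical example: random logits and three arbitrarily chosen legal actions.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_ACTIONS)]
mask = [False] * NUM_ACTIONS
for idx in (3, 17, 42):
    mask[idx] = True

probs = masked_softmax(logits, mask)
action = sample_action(logits, mask, rng)
```

Masking before normalization, rather than rejecting illegal samples afterwards, keeps the distribution well defined even when only a handful of the 478 actions are legal at a given decision point.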
