Improved Monte Carlo Planning via Causal Disentanglement for Structurally-Decomposed Markov Decision Processes
Larkin Liu, Shiqi Liu, Yinruo Hua, Matej Jusup

TL;DR
This paper introduces Structurally Decomposed MDPs that leverage causal disentanglement to improve computational efficiency and policy performance in resource allocation problems, especially when integrated with Monte Carlo Tree Search.
Contribution
It proposes a novel SD-MDP framework that exploits causal structure for dimensionality reduction and efficiency, and demonstrates its integration with MCTS for better decision-making.
Findings
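Reduces the sequential optimization problem to a fractional knapsack problem with $O(T \, \log T)$ complexity, outperforming traditional stochastic programming methods whose complexity is polynomial in the time horizon $T$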
Achieves higher rewards with limited simulations
Demonstrates superior results in logistics and finance domains
Abstract
Markov Decision Processes (MDPs), as a general-purpose framework, often overlook the benefits of incorporating the causal structure of the transition and reward dynamics. For a subclass of resource allocation problems, we introduce the Structurally Decomposed MDP (SD-MDP), which leverages causal disentanglement to partition an MDP's temporal causal graph into independent components. By exploiting this disentanglement, SD-MDP enables dimensionality reduction and computational efficiency gains in optimal value function estimation. We reduce the sequential optimization problem to a fractional knapsack problem with log-linear complexity $O(T \, \log T)$, outperforming traditional stochastic programming methods that exhibit polynomial complexity with respect to the time horizon $T$. Additionally, SD-MDP's computational advantages are independent of state-action space size, making it viable for…
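To make the log-linear claim concrete, here is a minimal sketch of the textbook greedy solution to a generic fractional knapsack problem. It is not the authors' reduction: the function name `fractional_knapsack` and the `(value, weight)` item encoding are assumptions made for illustration. The sort by value density dominates the cost, so $n$ items take $O(n \log n)$ time, which corresponds to the $O(T \, \log T)$ bound when there is one item per time step.

```python
from typing import List, Tuple


def fractional_knapsack(
    items: List[Tuple[float, float]], capacity: float
) -> Tuple[float, List[float]]:
    """Greedy fractional knapsack (illustrative, hypothetical helper).

    items: (value, weight) pairs with weight > 0.
    Returns the total value collected and the fraction taken of each item.
    """
    # Sort item indices by value per unit weight, densest first: O(n log n).
    order = sorted(
        range(len(items)),
        key=lambda i: items[i][0] / items[i][1],
        reverse=True,
    )

    fractions = [0.0] * len(items)
    total = 0.0
    remaining = capacity

    # Single linear pass: take as much of each item as still fits.
    for i in order:
        if remaining <= 0:
            break
        value, weight = items[i]
        take = min(weight, remaining)
        fractions[i] = take / weight
        total += (value / weight) * take
        remaining -= take

    return total, fractions
```

Because items may be split, the greedy choice is optimal here, unlike in the 0/1 knapsack problem, where dynamic programming or search is generally needed.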
Taxonomy
Topics: Simulation Techniques and Applications · Fault Detection and Control Systems
