Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction
Yunhyeok Kwak, Inwoo Hwang, Dooyoung Kim, Sanghack Lee, Byoung-Tak Zhang

TL;DR
This paper introduces a novel on-the-fly state-conditioned action abstraction for Monte Carlo Tree Search, significantly improving efficiency in large, factored action spaces by dynamically reducing the search space without prior environment models.
Contribution
It proposes a new compositional action abstraction method that learns a latent dynamics model and infers, from high-dimensional observations, which sub-actions are relevant to the transition at each state during search.
Findings
Outperforms vanilla MuZero in sample efficiency
Reduces search space by discarding redundant sub-actions
Effective in large, factored action spaces
Abstract
Monte Carlo Tree Search (MCTS) has showcased its efficacy across a broad spectrum of decision-making problems. However, its performance often degrades under a vast combinatorial action space, especially where an action is composed of multiple sub-actions. In this work, we propose an action abstraction based on the compositional structure between a state and sub-actions to improve the efficiency of MCTS under a factored action space. Our method learns a latent dynamics model with an auxiliary network that captures the sub-actions relevant to the transition at the current state, which we call state-conditioned action abstraction. Notably, it infers such compositional relationships from high-dimensional observations without a known environment model. During tree traversal, our method constructs the state-conditioned action abstraction for each node on the fly, reducing the search space…
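To make the search-space reduction concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a relevance mask over sub-actions is already available for a node; in the paper that mask would come from the learned auxiliary network, whereas here it is written by hand. The function name `abstract_actions` and its parameters are hypothetical. Sub-actions marked irrelevant are pinned to a default value, so joint actions that differ only in those sub-actions collapse into one representative.

```python
from itertools import product
from typing import List, Sequence, Tuple


def abstract_actions(
    sub_action_sizes: Sequence[int],
    relevant: Sequence[bool],
    default: int = 0,
) -> List[Tuple[int, ...]]:
    """Enumerate one representative joint action per abstract action.

    sub_action_sizes[i] is the number of values sub-action i can take.
    relevant[i] is a hand-written stand-in for a learned, state-conditioned
    mask: False means sub-action i is assumed not to affect the transition
    at this node, so it is pinned to `default` instead of being enumerated.
    """
    if len(sub_action_sizes) != len(relevant):
        raise ValueError("sub_action_sizes and relevant must have equal length")
    choices = [
        range(size) if keep else [default]
        for size, keep in zip(sub_action_sizes, relevant)
    ]
    return [tuple(joint) for joint in product(*choices)]


# Three sub-actions with three values each give 3 * 3 * 3 = 27 joint actions.
full = abstract_actions([3, 3, 3], [True, True, True])
# If the second sub-action is irrelevant at this node, only 3 * 3 = 9 remain.
reduced = abstract_actions([3, 3, 3], [True, False, True])
print(len(full), len(reduced))
```

Because the mask is conditioned on the state, a different node could mark a different subset of sub-actions as relevant and so expand a different set of representatives.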
Taxonomy
Topics: Human Pose and Action Recognition · Time Series Analysis and Forecasting · Generative Adversarial Networks and Image Synthesis
Methods: Batch Normalization · Residual Connection · Average Pooling · Residual Block · Prioritized Experience Replay · Monte-Carlo Tree Search · Convolution · MuZero
