M2-PALE: A Framework for Explaining Multi-Agent MCTS–Minimax Hybrids via Process Mining and LLMs

Yiyu Qian; Liyuan Zhao; Tim Miller

arXiv:2604.14687·cs.AI·April 17, 2026

TL;DR

M2-PALE introduces a framework combining process mining and LLMs to explain hybrid MCTS-Minimax agents, improving interpretability of complex decision-making in strategic AI.

Contribution

It integrates process mining with LLMs to generate human-readable explanations of hybrid MCTS-Minimax agents' behavior, enhancing transparency.

Findings

1. Effective in a small-scale checkers environment
2. Scalable approach for interpreting hybrid agents
3. Combines process mining with LLMs for explanations

Abstract

Monte-Carlo Tree Search (MCTS) is a fundamental sampling-based search algorithm widely used for online planning in sequential decision-making domains. Despite its success in driving recent advances in artificial intelligence, understanding the behavior of MCTS agents remains a challenge for both developers and users. This difficulty stems from the complex search trees produced through the simulation of numerous future states and their intricate relationships. A known weakness of standard MCTS is its reliance on highly selective tree construction, which may lead to the omission of crucial moves and a vulnerability to tactical traps. To resolve this, we incorporate shallow, full-width Minimax search into the rollout phase of multi-agent MCTS to enhance strategic depth. Furthermore, to demystify the resulting decision-making logic, we introduce M2-PALE (MCTS–Minimax Process-Aided…
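To make the idea of a minimax-informed rollout concrete, here is a minimal sketch rather than the authors' implementation. It assumes a hypothetical two-player toy game (a pile of stones, remove 1 to 3 per turn, whoever takes the last stone wins) in place of checkers, and an uninformative heuristic of 0 at the search horizon. The names `legal_moves`, `negamax`, `informed_move`, and `rollout` are illustrative only. The surrounding MCTS loop (selection, expansion, backpropagation) is omitted. The sketch shows only how a shallow, full-width search can replace uniformly random move choice inside a rollout.

```python
import random

# Hypothetical toy game (not from the paper): players alternate removing
# 1-3 stones from a pile, and whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def negamax(stones, depth):
    """Full-width search to a fixed depth; value is from the mover's view."""
    if stones == 0:
        return -1.0  # the previous player took the last stone: mover has lost
    if depth == 0:
        return 0.0   # horizon reached: assumed uninformative heuristic
    return max(-negamax(stones - m, depth - 1) for m in legal_moves(stones))

def informed_move(stones, depth, rng):
    """Pick a move by shallow minimax, breaking ties uniformly at random."""
    scored = [(-negamax(stones - m, depth - 1), m) for m in legal_moves(stones)]
    best = max(score for score, _ in scored)
    return rng.choice([m for score, m in scored if score == best])

def rollout(stones, depth=2, rng=None):
    """Play to the end (requires stones >= 1 and depth >= 1).

    Returns +1 if the player to move at the start wins, otherwise -1.
    """
    rng = rng or random.Random()
    mover = 0
    while True:
        stones -= informed_move(stones, depth, rng)
        if stones == 0:
            return 1 if mover == 0 else -1
        mover = 1 - mover
```

With three stones left, the shallow search finds the immediate win, so `rollout(3)` returns 1. A purely random rollout would miss that win two times out of three. When every move scores the same, as at four stones, the choice falls back to a random pick among the tied moves, so the rollout keeps its sampling character outside tactically sharp positions.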
