Mxplainer: Explain and Learn Insights by Imitating Mahjong Agents
Lingfeng Li, Yunlong Lu, Yongyi Wang, Qifan Zheng, Wenxin Li

TL;DR
Mxplainer is a parameterized search algorithm for explaining black-box Mahjong agents. It can be converted into an equivalent neural network, which lets its parameters be learned from an agent's decisions. It predicts agent actions with high accuracy and provides interpretable insights into both overall strategies and individual moves.
Contribution
Introduces Mxplainer, a parameterized search algorithm that can be converted into an equivalent neural network, so that its parameters can be learned from black-box Mahjong agents and used to interpret their behavior.
Findings
Achieves top-three action prediction accuracy of over 92% and 90% for human and AI agents, respectively.
Outperforms decision-tree methods with 34.8% accuracy.
Provides faithful and interpretable explanations of agent strategies.
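The top-three accuracy figure above counts a prediction as correct when the action the agent actually took is among the three highest-scored candidates. The sketch below illustrates that metric only; the function name `top_k_accuracy`, the toy scores, and the five-action layout are assumptions made for illustration and are not taken from the paper's code or data.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    targets: Sequence[int],
    k: int = 3,
) -> float:
    """Fraction of samples whose target action is among the k highest-scored actions."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, target in zip(scores, targets):
        # Rank action indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda a: row[a], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Hypothetical toy data: 4 decisions, each scored over 5 candidate actions.
example_scores = [
    [0.10, 0.50, 0.20, 0.15, 0.05],
    [0.40, 0.05, 0.05, 0.30, 0.20],
    [0.05, 0.05, 0.10, 0.20, 0.60],
    [0.30, 0.25, 0.20, 0.15, 0.10],
]
# Hypothetical actions the imitated agent actually chose.
example_targets = [2, 1, 4, 0]

top3 = top_k_accuracy(example_scores, example_targets, k=3)
top1 = top_k_accuracy(example_scores, example_targets, k=1)
```

On this toy data the top-three score is higher than the top-one score, because a prediction only has to place the chosen action somewhere in its three best candidates.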
Abstract
People need to internalize the skills of AI agents to improve their own capabilities. Our paper focuses on Mahjong, a multiplayer game involving imperfect information and requiring effective long-term decision-making amidst randomness and hidden information. Through the efforts of AI researchers, several impressive Mahjong AI agents have already achieved performance levels comparable to those of professional human players; however, these agents are often treated as black boxes from which few insights can be gleaned. This paper introduces Mxplainer, a parameterized search algorithm that can be converted into an equivalent neural network to learn the parameters of black-box agents. Experiments on both human and AI agents demonstrate that Mxplainer achieves a top-three action prediction accuracy of over 92% and 90%, respectively, while providing faithful and interpretable approximations…
Taxonomy
Topics: Topic Modeling · Multi-Agent Systems and Negotiation · Natural Language Processing Techniques
