MALinZero: Efficient Low-Dimensional Search for Mastering Complex Multi-Agent Planning
Sizhe Tang, Jiayu Chen, Tian Lan

TL;DR
MALinZero uses a low-dimensional representation of joint-action returns to make Monte Carlo Tree Search efficient in complex multi-agent planning, where the joint action space grows exponentially with the number of agents. The aim is better exploration and exploitation during tree search and faster learning.
Contribution
The paper presents MALinZero, a method that projects joint-action returns into a low-dimensional space using a contextual linear bandit formulation, enabling efficient multi-agent MCTS.
Findings
MALinZero outperforms existing multi-agent RL baselines on the evaluated benchmarks.
It learns faster and reaches better final performance than those baselines.
It reports state-of-the-art results on matrix games and SMAC.
Abstract
Monte Carlo Tree Search (MCTS), which leverages Upper Confidence Bound for Trees (UCTs) to balance exploration and exploitation through randomized sampling, is instrumental to solving complex planning problems. However, for multi-agent planning, MCTS is confronted with a large combinatorial action space that often grows exponentially with the number of agents. As a result, the branching factor of MCTS during tree expansion also increases exponentially, making it very difficult to efficiently explore and exploit during tree search. To this end, we propose MALinZero, a new approach to leverage low-dimensional representational structures on joint-action returns and enable efficient MCTS in complex multi-agent planning. Our solution can be viewed as projecting the joint-action returns into the low-dimensional space representable using a contextual linear bandit problem formulation. We solve…
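To make the low-dimensional idea concrete, here is a minimal sketch of a LinUCB-style linear bandit over joint actions. It is not the paper's algorithm (the abstract is truncated before the solution is described). It assumes, purely for illustration, that a joint action is featurized by concatenating one-hot encodings of each agent's action and that the return is roughly linear in those features. The names `joint_action_features`, `LinUCBJointActions`, `alpha`, and `reg` are hypothetical.

```python
import itertools
import numpy as np


def joint_action_features(joint_action, n_actions):
    """Assumed feature map: concatenated one-hot vectors, one per agent."""
    feats = np.zeros(len(joint_action) * n_actions)
    for agent, action in enumerate(joint_action):
        feats[agent * n_actions + action] = 1.0
    return feats


class LinUCBJointActions:
    """LinUCB-style selection over all joint actions (illustrative only)."""

    def __init__(self, n_agents, n_actions, alpha=1.0, reg=1.0):
        self.n_actions = n_actions
        self.alpha = alpha  # weight of the exploration bonus
        d = n_agents * n_actions  # feature dimension grows linearly in agents
        self.A = reg * np.eye(d)  # regularized Gram matrix of seen features
        self.b = np.zeros(d)  # return-weighted sum of seen features
        # The joint action set itself grows exponentially in the number of agents.
        self.joint_actions = list(
            itertools.product(range(n_actions), repeat=n_agents)
        )
        self.X = np.array(
            [joint_action_features(a, n_actions) for a in self.joint_actions]
        )

    def estimated_returns(self):
        """Ridge-regression estimate of the return of every joint action."""
        theta = np.linalg.solve(self.A, self.b)
        return self.X @ theta

    def select(self):
        """Pick the joint action with the highest upper confidence bound."""
        A_inv = np.linalg.inv(self.A)
        # Per-action bonus sqrt(x^T A^{-1} x), computed for all rows of X.
        bonus = np.sqrt(np.einsum("ij,jk,ik->i", self.X, A_inv, self.X))
        scores = self.estimated_returns() + self.alpha * bonus
        return self.joint_actions[int(np.argmax(scores))]

    def update(self, joint_action, ret):
        """Fold one observed (joint action, return) pair into the statistics."""
        x = joint_action_features(joint_action, self.n_actions)
        self.A += np.outer(x, x)
        self.b += ret * x
```

With 3 agents and 4 actions each, this sketch enumerates 4^3 = 64 joint actions but estimates only 3 × 4 = 12 parameters, so a single observed return informs the estimate of every joint action that shares a per-agent action with it. Enumerating all joint actions in `select` is itself exponential; the sketch does so only to keep the example short.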
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Advanced Bandit Algorithms Research
