Boltzmann-based Exploration for Robust Decentralized Multi-Agent Planning (Extended Version)
Nhat D. A. Nguyen, Duong D. Nguyen, Gianluca Rizzo, Hung X. Nguyen

TL;DR
This paper introduces CB-MCTS, a novel Boltzmann-based exploration method for decentralized multi-agent planning, which improves exploration in sparse reward environments and outperforms existing methods in deceptive scenarios.
Contribution
The paper presents the first application of Boltzmann exploration in multi-agent MCTS, enhancing robustness and exploration effectiveness in cooperative planning.
Findings
CB-MCTS outperforms Dec-MCTS in deceptive environments.
CB-MCTS remains competitive on standard benchmarks.
The method provides a robust solution for multi-agent planning.
Abstract
Decentralized Monte Carlo Tree Search (Dec-MCTS) is widely used for cooperative multi-agent planning but struggles in sparse or skewed reward environments. We introduce Coordinated Boltzmann MCTS (CB-MCTS), which replaces deterministic UCT with a stochastic Boltzmann policy and a decaying entropy bonus for sustained yet focused exploration. While Boltzmann exploration has been studied in single-agent MCTS, applying it in multi-agent systems poses unique challenges. CB-MCTS is the first to address this. We analyze CB-MCTS in the simple-regret setting and show in simulations that it outperforms Dec-MCTS in deceptive scenarios and remains competitive on standard benchmarks, providing a robust solution for multi-agent planning.
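As a rough illustration of the selection rule the abstract describes, the sketch below contrasts deterministic UCT with sampling from a Boltzmann (softmax) distribution whose temperature decays with the node's visit count. Maximizing expected value plus a temperature-weighted entropy term yields exactly such a softmax, so a decaying temperature is one way to read a "decaying entropy bonus". The function names, the logarithmic decay schedule, and the constants c, tau0, and tau_min are assumptions made for illustration, not the paper's definitions, and the multi-agent coordination part of CB-MCTS is not modeled here.

```python
import math
import random


def uct_select(q_values, visit_counts, c=1.4):
    """Deterministic UCT: pick the action with the highest upper confidence bound."""
    total = sum(visit_counts)

    def score(a):
        if visit_counts[a] == 0:
            return math.inf  # unvisited actions are always tried first
        return q_values[a] + c * math.sqrt(math.log(total) / visit_counts[a])

    return max(range(len(q_values)), key=score)


def boltzmann_probs(q_values, temperature):
    """Softmax over value estimates; subtracting the max keeps exp() stable."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]


def decayed_temperature(node_visits, tau0=1.0, tau_min=0.05):
    """Hypothetical schedule: temperature shrinks logarithmically with visits."""
    return max(tau_min, tau0 / math.log(math.e + node_visits))


def boltzmann_select(q_values, node_visits, rng=random):
    """Stochastic selection: sample an action from the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, decayed_temperature(node_visits))
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

Under these assumptions, a rarely visited node samples almost uniformly, so low-value actions that might hide a sparse reward still get tried. A heavily visited node concentrates probability on the action with the best estimate, which is the "sustained yet focused" behavior the abstract refers to.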
Taxonomy
Topics: Reinforcement Learning in Robotics · AI-based Problem Solving and Planning · Artificial Intelligence in Games
