Combinatorial Multi-armed Bandits for Real-Time Strategy Games
Santiago Ontañón

TL;DR
This paper introduces naïve sampling, a sampling strategy based on combinatorial multi-armed bandits, to improve Monte Carlo Tree Search in real-time strategy games with large branching factors.
Contribution
It provides a theoretical analysis of naïve sampling variants and demonstrates their effectiveness in RTS games with large branching factors.
Findings
Naïve sampling outperforms the other sampling strategies as the branching factor increases.
Theoretical properties of several naïve sampling variants are analyzed.
Empirical results show improved performance in RTS game scenarios.
Abstract
Games with large branching factors pose a significant challenge for game tree search algorithms. In this paper, we address this problem with a sampling strategy for Monte Carlo Tree Search (MCTS) algorithms called naïve sampling, based on a variant of the Multi-armed Bandit problem called Combinatorial Multi-armed Bandits (CMAB). We analyze the theoretical properties of several variants of naïve sampling, and empirically compare it against the other existing strategies in the literature for CMABs. We then evaluate these strategies in the context of real-time strategy (RTS) games, a genre of computer games characterized by their very large branching factors. Our results show that as the branching factor grows, naïve sampling outperforms the other sampling strategies.
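The abstract does not spell out how naïve sampling works, so the following is only a rough sketch under stated assumptions, not the paper's implementation. It assumes the usual reading of the idea: a combinatorial arm is a tuple of values, one per variable; the reward is treated as if it were a sum of per-variable contributions (the "naïve" assumption); exploration picks each variable's value independently from a local bandit; exploitation picks among the combinations tried so far. The class name `NaiveSampler`, the parameters `eps0`, `eps_local`, `eps_global`, the ε-greedy policies, and the toy reward in `demo` are all hypothetical choices made for illustration.

```python
import random


class NaiveSampler:
    """Illustrative naive-sampling-style bandit (hypothetical names and policies)."""

    def __init__(self, domains, eps0=0.25, eps_local=0.25, eps_global=0.0, rng=None):
        self.domains = [list(d) for d in domains]  # legal values of each variable
        self.eps0 = eps0                # probability of exploring
        self.eps_local = eps_local      # epsilon of each per-variable bandit
        self.eps_global = eps_global    # epsilon of the bandit over tried combinations
        self.rng = rng or random.Random()
        self.local = [{} for _ in self.domains]  # per variable: value -> [total, count]
        self.global_arms = {}                    # combination -> [total, count]

    @staticmethod
    def _mean(stat):
        total, count = stat
        return total / count  # entries are only created with count >= 1

    def _eps_greedy(self, candidates, stats, eps):
        tried = [c for c in candidates if c in stats]
        if not tried or self.rng.random() < eps:
            return self.rng.choice(candidates)
        return max(tried, key=lambda c: self._mean(stats[c]))

    def select(self):
        """Return the next combination (a tuple with one value per variable)."""
        if not self.global_arms or self.rng.random() < self.eps0:
            # Explore: each variable chooses independently from its local bandit.
            return tuple(
                self._eps_greedy(dom, self.local[i], self.eps_local)
                for i, dom in enumerate(self.domains)
            )
        # Exploit: choose among combinations that have already been sampled.
        return self._eps_greedy(list(self.global_arms), self.global_arms, self.eps_global)

    def update(self, arm, reward):
        """Credit the reward to every local value and to the combination itself."""
        for i, value in enumerate(arm):
            stat = self.local[i].setdefault(value, [0.0, 0])
            stat[0] += reward
            stat[1] += 1
        stat = self.global_arms.setdefault(arm, [0.0, 0])
        stat[0] += reward
        stat[1] += 1

    def best_arm(self):
        return max(self.global_arms, key=lambda a: self._mean(self.global_arms[a]))


def demo(seed=0, iterations=5000):
    """Toy problem (assumed, not from the paper): 3 variables with 3 values each."""
    target = (2, 0, 1)
    sampler = NaiveSampler([range(3)] * 3, rng=random.Random(seed))
    for _ in range(iterations):
        arm = sampler.select()
        # Additive, deterministic reward: one point per variable matching the target.
        sampler.update(arm, sum(int(a == t) for a, t in zip(arm, target)))
    return sampler.best_arm()
```

The point of the sketch is the scaling behaviour the abstract refers to: the sampler keeps statistics per variable value (9 entries in the toy problem) plus only the combinations it has actually tried, rather than one entry for each of the 27 possible combinations up front.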
Taxonomy
Topics: Advanced Bandit Algorithms Research · Artificial Intelligence in Games · Reinforcement Learning in Robotics
