# On Monte-Carlo tree search for deterministic games with alternate moves and complete information

**Authors:** Sylvain Delattre, Nicolas Fournier

arXiv: 1704.04612 · 2018-01-25

## TL;DR

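This paper builds a step-by-step optimal algorithm, in the spirit of Monte-Carlo Tree Search, for deterministic two-player games with alternate moves, complete information, and a win/lose outcome. After each simulated match, the algorithm chooses the next match so as to maximize the information gained about the minimax value of the position under study. The authors prove that it converges in a finite number of steps but is not globally optimal, and they report rather disappointing numerical results against MCTS.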

## Contribution

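The authors treat the game as the realization of a random model with some independence properties. From the statistics of the first $n$ simulated matches and an a priori on the game, their algorithm selects the $(n+1)$-th match so that the increase of information about the minimax value is maximal. They prove finite-step convergence, show that step-by-step optimality does not give global optimality, and compare the algorithm with MCTS on Pearl's game, Connect Four, and some variants.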

## Key findings

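- The algorithm always converges in a finite number of steps, even when the a priori on the game is completely irrelevant
- The algorithm is optimal step by step, but not globally optimal
- Numerical results against MCTS on Pearl's game, Connect Four, and some variants are described by the authors as rather disappointing
- In some situations the algorithm nevertheless seems efficient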

## Abstract

We consider a deterministic game with alternate moves and complete information, of which the issue is always the victory of one of the two opponents. We assume that this game is the realization of a random model enjoying some independence properties. We consider algorithms in the spirit of Monte-Carlo Tree Search, to estimate at best the minimax value of a given position: it consists in simulating, successively, $n$ well-chosen matches, starting from this position. We build an algorithm, which is optimal, step by step, in some sense: once the $n$ first matches are simulated, the algorithm decides from the statistics furnished by the $n$ first matches (and the a priori we have on the game) how to simulate the $(n+1)$-th match in such a way that the increase of information concerning the minimax value of the position under study is maximal. This algorithm is remarkably quick. We prove that our step by step optimal algorithm is not globally optimal and that it always converges in a finite number of steps, even if the a priori we have on the game is completely irrelevant. We finally test our algorithm, against MCTS, on Pearl's game and, with a very simple and universal a priori, on the games Connect Four and some variants. The numerical results are rather disappointing. We however exhibit some situations in which our algorithm seems efficient.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.04612/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1704.04612/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1704.04612/full.md

---
Source: https://tomesphere.com/paper/1704.04612