Monte-Carlo tree search with uncertainty propagation via optimal transport
Tuan Dam, Pascal Stenger, Lukas Schneider, Joni Pajarinen, Carlo D'Eramo, Odalric-Ambrym Maillard

TL;DR
This paper presents a new Monte-Carlo Tree Search method that propagates uncertainty using Wasserstein barycenters, improving decision-making in stochastic and partially observable environments.
Contribution
It introduces a novel backup operator based on Wasserstein barycenters and combines it with sampling strategies, providing theoretical convergence guarantees and empirical improvements.
Findings
Outperforms existing baselines in stochastic environments
Provides theoretical guarantees of convergence
Effectively propagates uncertainty across the search tree
Abstract
This paper introduces a novel backup strategy for Monte-Carlo Tree Search (MCTS) designed for highly stochastic and partially observable Markov decision processes. We adopt a probabilistic approach, modeling both value and action-value nodes as Gaussian distributions. We introduce a novel backup operator that computes value nodes as the Wasserstein barycenter of their action-value children nodes, thus propagating the uncertainty of the estimate across the tree to the root node. We study our novel backup operator when using a novel combination of $L^1$-Wasserstein barycenter with $\alpha$-divergence, by drawing a notable connection to the generalized mean backup operator. We complement our probabilistic backup operator with two sampling strategies, based on optimistic selection and Thompson sampling, obtaining our Wasserstein MCTS algorithm. We provide theoretical guarantees of…
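To make the idea of a barycenter backup concrete, here is a minimal Python sketch. It is not the paper's operator: it uses the standard closed form for the 2-Wasserstein barycenter of one-dimensional Gaussians (weighted average of the means and of the standard deviations), which is a simpler stand-in for the combination described in the abstract. The `Gaussian` class, the function names, and the choice of visit counts as barycenter weights are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Hypothetical node estimate: a 1D Gaussian over returns."""
    mean: float
    std: float


def w2_barycenter(children, weights):
    """2-Wasserstein barycenter of 1D Gaussians (standard closed form).

    The barycenter is Gaussian, with mean and standard deviation equal to
    the weighted averages of the children's means and standard deviations.
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    mean = sum(w * c.mean for w, c in zip(norm, children))
    std = sum(w * c.std for w, c in zip(norm, children))
    return Gaussian(mean, std)


def thompson_select(children, rng):
    """Thompson sampling: draw one sample per child, pick the largest."""
    samples = [rng.gauss(c.mean, c.std) for c in children]
    return max(range(len(children)), key=samples.__getitem__)


# Two action-value children of one value node.
q_nodes = [Gaussian(1.0, 0.5), Gaussian(2.0, 1.5)]
visit_counts = [3, 1]  # assumption: weight children by visit count

# Backup: the value node keeps both a mean and an uncertainty estimate.
v_node = w2_barycenter(q_nodes, visit_counts)

# Selection: index of the child to explore next.
chosen = thompson_select(q_nodes, random.Random(0))
```

Because the backed-up node retains a standard deviation rather than only a point estimate, uncertainty in the children carries upward toward the root, which is the property the abstract emphasizes.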
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Markov Chains and Monte Carlo Methods
Methods: Monte-Carlo Tree Search
