Computing Approximate Nash Equilibria and Robust Best-Responses Using Sampling
Marc Ponsen, Steven de Jong, Marc Lanctot

TL;DR
This paper introduces sampling-based algorithms for approximating Nash equilibria and computing robust best-response strategies in complex imperfect-information games such as Poker.
Contribution
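The paper applies Monte-Carlo Tree Search (MCTS) and Monte-Carlo Counterfactual Regret Minimization (MCCFR) to Poker, and proposes Monte-Carlo Restricted Nash Response (MCRNR), a new sampling algorithm for computing robust best responses that exploit opponents who do not play a Nash equilibrium (NE).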
Findings
MCTS finds reasonably strong strategies quickly in Poker.
MCCFR converges to an NE in Poker, backed by theoretical guarantees.
MCRNR learns robust best-response strategies faster than standard Restricted Nash Response (RNR).
Abstract
This article discusses two contributions to decision-making in complex partially observable stochastic games. First, we apply two state-of-the-art search techniques that use Monte-Carlo sampling to the task of approximating a Nash equilibrium (NE) in such games, namely Monte-Carlo Tree Search (MCTS) and Monte-Carlo Counterfactual Regret Minimization (MCCFR). MCTS has been proven to approximate an NE in perfect-information games. We show that the algorithm quickly finds a reasonably strong strategy (but not an NE) in a complex imperfect-information game, namely Poker. MCCFR, on the other hand, has theoretical NE convergence guarantees in such a game. We apply MCCFR for the first time in Poker. Based on our experiments, we may conclude that MCTS is a valid approach if one wants to learn reasonably strong strategies fast, whereas MCCFR is the better choice if the quality of the strategy is most…
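As an illustration of the regret-minimization idea that counterfactual regret methods build on, the following is a minimal sketch of regret matching in self-play on rock-paper-scissors. It is not the paper's MCCFR implementation and does no sampling: the game, the function names (`regret_matching`, `action_values`, `self_play`, `exploitability`), the iteration count, and the non-zero starting regret are all assumptions chosen for this example. In a two-player zero-sum game, the average strategies produced this way approach an NE, which is uniform play for this game.

```python
# Illustrative sketch only: regret matching in self-play on rock-paper-scissors.
# Not the paper's MCCFR; all names and parameters here are hypothetical.

# Payoff to the row player; the column player receives the negation (zero-sum).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]
N = 3  # number of actions per player


def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(cum_regret)] * len(cum_regret)  # no positive regret: uniform


def action_values(opponent, as_row):
    """Expected payoff of each pure action against the opponent's mixed strategy."""
    if as_row:
        return [sum(PAYOFF[a][b] * opponent[b] for b in range(N)) for a in range(N)]
    return [-sum(PAYOFF[a][b] * opponent[a] for a in range(N)) for b in range(N)]


def self_play(iterations=10000):
    """Run simultaneous regret-matching updates; return both average strategies."""
    # Assumed asymmetric start so the dynamics do not sit at uniform from step one.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        row = regret_matching(regrets[0])
        col = regret_matching(regrets[1])
        for player, (mine, theirs) in enumerate(((row, col), (col, row))):
            values = action_values(theirs, as_row=(player == 0))
            expected = sum(m * v for m, v in zip(mine, values))
            for a in range(N):
                regrets[player][a] += values[a] - expected  # regret for not playing a
                sums[player][a] += mine[a]
    return [[s / iterations for s in player_sum] for player_sum in sums]


def exploitability(avg_row, avg_col):
    """Sum of both players' best-response payoffs; zero exactly at an NE here."""
    return max(action_values(avg_col, True)) + max(action_values(avg_row, False))


if __name__ == "__main__":
    avg_row, avg_col = self_play()
    print("average row strategy:", avg_row)
    print("exploitability:", exploitability(avg_row, avg_col))
```

The exploitability measure used here is one common way to judge how far a strategy pair is from an NE: it shrinks toward zero as the average strategies improve. Counterfactual regret methods apply the same kind of update at each information set of a sequential game, and the Monte-Carlo variants estimate the regrets from sampled parts of the game tree rather than from full expectations.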
Taxonomy
Topics: Sports Analytics and Performance · Artificial Intelligence in Games · Reinforcement Learning in Robotics
