Continual Depth-limited Responses for Computing Counter-strategies in Sequential Games
David Milec, Ondřej Kubíček, Viliam Lisý

TL;DR
This paper introduces methods that combine limited look-ahead game solving with opponent modeling to compute robust responses in real time in large sequential games. The methods are reported to outperform existing state-of-the-art approaches.
Contribution
The paper proposes algorithms that integrate limited look-ahead solving with opponent models, either to approximate a best response or to compute a robust response, and provides theoretical guarantees for them.
Findings
Algorithms outperform baselines in small games
Methods outperform the state of the art when playing against SlumBot
Theoretical guarantees established for proposed methods
Abstract
In zero-sum games, the optimal strategy is well-defined by the Nash equilibrium. However, it is overly conservative when playing against suboptimal opponents, and it cannot exploit their weaknesses. Limited look-ahead game solving in imperfect-information games allows defeating human experts in massive real-world games such as Poker, Liar's Dice, and Scotland Yard. However, since these methods approximate a Nash equilibrium, they tend to win only slightly against weak opponents. We propose methods combining limited look-ahead solving with an opponent model in order to 1) approximate a best response in large games or 2) compute a robust response with control over the robustness of the response. Both methods can compute the response in real time to previously unseen strategies. We present theoretical guarantees of our methods. We show that existing robust response methods do not work combined with…
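The trade-off the abstract describes (a Nash strategy is safe but does not exploit a weak opponent, while a best response exploits the opponent but is itself exploitable) can be illustrated on a tiny matrix game. The following is a minimal sketch, not the paper's method: it uses rock-paper-scissors, a hypothetical opponent model that overplays rock, and a plain interpolation between the Nash strategy and the best response as a stand-in for a "robust response". All names (`PAYOFF`, `opponent_model`, `mix`, and so on) are assumptions made for illustration.

```python
# Our payoffs in rock-paper-scissors (zero-sum).
# Rows: our action, columns: opponent action; order is rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
ACTIONS = range(3)


def expected_value(ours, theirs):
    """Our expected payoff when both players use the given mixed strategies."""
    return sum(ours[i] * theirs[j] * PAYOFF[i][j] for i in ACTIONS for j in ACTIONS)


def best_response(theirs):
    """Pure strategy that maximizes our payoff against a fixed opponent strategy."""
    values = [sum(theirs[j] * PAYOFF[i][j] for j in ACTIONS) for i in ACTIONS]
    best = max(ACTIONS, key=lambda i: values[i])
    return [1.0 if i == best else 0.0 for i in ACTIONS]


def worst_case(ours):
    """Our payoff if the opponent best-responds to us (a measure of exploitability)."""
    return min(sum(ours[i] * PAYOFF[i][j] for i in ACTIONS) for j in ACTIONS)


def mix(p, a, b):
    """Play strategy a with probability p and strategy b otherwise."""
    return [p * x + (1 - p) * y for x, y in zip(a, b)]


nash = [1 / 3, 1 / 3, 1 / 3]         # equilibrium of rock-paper-scissors
opponent_model = [0.5, 0.25, 0.25]   # hypothetical opponent that overplays rock

exploit = best_response(opponent_model)
# Illustrative only: a simple interpolation, not the robust response from the paper.
robust = mix(0.5, exploit, nash)

for name, strategy in [("nash", nash), ("best response", exploit), ("mixed", robust)]:
    gain = expected_value(strategy, opponent_model)
    risk = worst_case(strategy)
    print(f"{name}: value vs model = {gain:.3f}, worst case = {risk:.3f}")
```

In this toy setting, moving weight from the Nash strategy toward the best response raises the value against the modeled opponent and lowers the worst-case value, which is the kind of robustness control the abstract refers to. The paper's contribution concerns doing this in large sequential games with depth-limited solving, which this sketch does not attempt.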
Taxonomy
Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Reinforcement Learning in Robotics
