Survey of Artificial Intelligence for Card Games and Its Application to the Swiss Game Jass

Joel Niklaus; Michele Alberti; Vinaychandran Pondenkandath; Rolf Ingold; Marcus Liwicki

arXiv:1906.04439·cs.AI·June 12, 2019


TL;DR

This paper surveys AI techniques for card games, focusing on their application to Swiss Jass, highlighting current methods, challenges, and potential for future research in this culturally significant game.

Contribution

It provides the first comprehensive overview of AI methods for Jass and discusses their adaptation, serving as a starting point for researchers interested in this specific game.

Findings

01 AI agents currently do not outperform top human players in Jass

02 Overview of state-of-the-art AI methods for card games

03 Discussion of challenges and future directions for Jass AI


Joel Niklaus (1, 2), Michele Alberti (1, 2), Vinaychandran Pondenkandath (2), Rolf Ingold (2), and Marcus Liwicki (2, 4)

(1) Both authors contributed equally to this work.

(2) Document Image and Voice Analysis Group (DIVA)

University of Fribourg, Switzerland

{firstname}.{lastname}@unifr.ch

(4) Machine Learning Group

Luleå University of Technology, Sweden

[email protected]

Abstract

In the last decades we have witnessed the success of applications of Artificial Intelligence to playing games. In this work we address the challenging field of games with hidden information and card games in particular. Jass is a very popular card game in Switzerland and is closely connected with Swiss culture. To the best of our knowledge, performances of Artificial Intelligence agents in the game of Jass do not outperform top players yet. Our contribution to the community is two-fold. First, we provide an overview of the current state-of-the-art of Artificial Intelligence methods for card games in general. Second, we discuss their application to the use-case of the Swiss card game Jass. This paper aims to be an entry point for both seasoned researchers and new practitioners who want to join in the Jass challenge.

Acronyms

aka: also known as
NE: Nash Equilibrium
API: Application Programming Interface
AI: Artificial Intelligence
AGI: Artificial General Intelligence
RL: Reinforcement Learning
Dota: Defense of the Ancients
PIG: Perfect Information Game
IIG: Imperfect Information Game
MC: Monte Carlo
MCTS: Monte Carlo Tree Search
IS-MCTS: Information Set Monte Carlo Tree Search
UCT: Upper Confidence Bound for Trees
UCB: Upper Confidence Bound
NFSP: Neural Fictitious Self-Play
FOM: First Order Methods
EGT: Excessive Gap Technique
CFR: Counterfactual Regret Minimization
MCCFR: Monte Carlo Sampling for Regret Minimization
OOS: Online Outcome Sampling
PG: Policy Gradient
PPO: Proximal Policy Optimization
A2C: Advantage Actor Critic
TDL: Temporal Difference Learning
KNN: K-Nearest Neighbour
EA: Evolutionary Algorithm
MLP: Multilayer Perceptron
ANN: Artificial Neural Network

I Introduction

The research field of Artificial Intelligence (AI) applied to playing games has seen several breakthroughs in recent years. In particular, in the branch of Perfect Information Games, where the entire game state is known to all players at all points in time, machines have triumphed over human professional players on several occasions, such as in Chess, the Atari games or Go. When it comes to Imperfect Information Games, where part of the information is unknown to the players, such as in card games, there is only a thin line separating AI from humans, who still have the upper hand against state-of-the-art agents. However, recent work shows that in constrained situations the gap between humans and AI is narrowing. This is particularly visible when considering advances in no-limit Texas hold’em poker [1] and the computer games Defense of the Ancients (Dota) 2 and StarCraft II.

Hidden information is also present in many real-world scenarios, like negotiations, surgical operations, business, physics and others. Many of these situations can be formalized as games, which in turn can be solved using the methods refined in the test bed of card games. Most card games involve hidden information, which makes them both a suitable and interesting domain for further research on AI. There is a large variety of card games, many of which use different cards and rules and thus pose different challenges to the players. To tackle these different issues, several methods have been proposed. Unfortunately, these methods are often either very complex, or introduce only minor modifications to address a particular issue for a particular game. Despite producing good empirical results, this practice leads to a more complex landscape of literature which is at times hard to navigate, especially for new practitioners in the field. To combat this unwanted side effect, overviews of current trends and methods are very helpful. In this work, we aim to provide such an overview of AI methods applied to card games. The appendix contains a short description of the games we mention in this work.

To complement the overview, we chose to use the card game “Jass” as a use-case for a discussion of the methods presented above. Jass is a very popular card game in Switzerland and tightly linked to Swiss culture. From a research point of view, Jass is a challenging game because a) it is played by more than two players (specifically, four divided into two teams of two), b) it involves hidden information (the cards of the other players), c) it is difficult for humans to master and d) the number of information sets is much bigger than that of other popular card games such as Poker. However, to the best of our knowledge, Jass has not yet been addressed in a formal, scientific manner. The Swiss Intercantonal Lottery and some Jass applications have deployed some AI agents, but these programs are not yet able to beat top human players.

Main Contribution

In this work, we aim to address a gap in the literature regarding AI approaches towards card games with a particular emphasis on the popular Swiss card game Jass. To the best of our knowledge, there has not been a formal scientific approach to Jass outlined in the literature. To this end, we discuss the potential merits and demerits of the different methods outlined in the paper towards Jass.

II Related Work

In this section we review the relevant related work. In their book, Yannakakis et al. [2] gave a general overview of AI development in games, while Rubin et al. [3] provided a more specific review on the methods used in computer Poker. In his thesis, Burch [4] reviewed the state-of-the-art in Counterfactual Regret Minimization (CFR), a family of methods very heavily used in computer Poker. Finally, Browne et al. [5] surveyed the different variants of Monte Carlo Tree Search (MCTS), a family of methods used for AIs in many games of both perfect and imperfect information. We are not aware of any work that specifically addresses the domain of card games.

III Theoretical Foundation

In this section we introduce terms necessary to understand AI for card games.

III-A Game Types

Games can be classified in many dimensions. In this section we outline the ones most important for classifying card games.

III-A1 Extensive-form Games

Sequential games are normally formalized as extensive-form games. These games are played on a decision tree, where a node represents a decision point for a player and an edge describes a possible action leading from one state to another. For each node in this tree it is possible to define an information set. An information set includes all the states a player could be in, given the information the player has observed so far. In PIGs, each information set comprises exactly one state, because all information is known. In an IIG like Poker, the information set contains all the card combinations the opponents could have, given the information the player has, i.e. the cards on the table and the cards in the hand.
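As a minimal sketch of what an information set looks like in code, the following enumerates every opponent hand that is consistent with a player's observations. The six-card deck, the hand sizes and the function name `information_set` are assumptions made for illustration only; they do not correspond to Poker, Jass or any system cited here.

```python
from itertools import combinations

def information_set(deck, my_hand, table_cards, opponent_hand_size):
    """Enumerate every opponent hand consistent with what the player has seen.

    Each returned tuple is one candidate state in the player's information set.
    """
    seen = set(my_hand) | set(table_cards)
    unseen = [card for card in deck if card not in seen]
    return list(combinations(unseen, opponent_hand_size))

# Hypothetical toy deck of six cards (not a real Poker or Jass deck).
deck = ["A", "K", "Q", "J", "10", "9"]
states = information_set(deck, my_hand=["A", "9"], table_cards=["K"],
                         opponent_hand_size=2)
print(len(states))  # 3 unseen cards, choose 2 -> 3 candidate states
```

With perfect information the list would shrink to a single state, which is the PIG case described above.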

III-A2 Coordination Games

Unlike many strategic situations, collaboration is central in a coordination game, not conflict. In a coordination game, the highest payoffs can only be achieved through team work. Choosing the side of the road to drive on is a simple example of a coordination game. It does not matter which side of the road you agree on, but to avoid crashes, an agreement is essential. In card games, like Bridge or Jass, where there are two teams playing against each other, the interactions within the team can be seen as a coordination game.
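The road example can be written down as a small payoff table. The sketch below uses assumed payoffs (+1 for each driver when they agree, -1 for each when they crash) and a hypothetical helper `pure_nash_equilibria` to list the action pairs from which neither driver gains by deviating alone.

```python
ACTIONS = ["left", "right"]

# Assumed illustrative payoffs: +1 each if the drivers agree, -1 each if they crash.
PAYOFFS = {
    ("left", "left"): (1, 1),
    ("right", "right"): (1, 1),
    ("left", "right"): (-1, -1),
    ("right", "left"): (-1, -1),
}

def pure_nash_equilibria(payoffs, actions):
    """Return action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for a in actions:
        for b in actions:
            u_a, u_b = payoffs[(a, b)]
            # Player 1 keeps b fixed and tries every alternative to a.
            a_ok = all(payoffs[(alt, b)][0] <= u_a for alt in actions)
            # Player 2 keeps a fixed and tries every alternative to b.
            b_ok = all(payoffs[(a, alt)][1] <= u_b for alt in actions)
            if a_ok and b_ok:
                equilibria.append((a, b))
    return equilibria

equilibria = pure_nash_equilibria(PAYOFFS, ACTIONS)
print(equilibria)
```

Both agreements come out as stable, which matches the observation that the side itself does not matter as long as the players coordinate.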

III-B AI Performance Evaluation

When developing an AI, it is important to accurately measure its strength in comparison to other AIs and humans. The ultimate goal is to achieve optimal play. When a player is playing optimally, s/he does not make any mistakes but plays the best possible move in every situation. When an optimal strategy in a game is known, this game is considered solved.

III-B1 Nash Equilibrium

A Nash Equilibrium (NE) describes a combination of strategies in non-cooperative games. When two or more players are each playing their respective part of a NE, no player can improve their outcome by unilaterally deviating from their strategy [6]. So when programming players for games, the goal is to get as close as possible to a NE. When one is playing a NE strategy in a symmetric two-player zero-sum game, the worst expected outcome is a draw. This means that a NE player does not lose in expectation against any opponent, and gains whenever the opponent makes mistakes that cost payoff. In games involving chance (the cards dealt at the beginning in the case of Poker), the player may not win every single game. Thus, many games may have to be played to evaluate the strategies. A NE strategy is particularly beneficial against strong players, because it does not make any mistakes the opponent could possibly exploit. On the other hand, a NE strategy might not win over a sub-optimal player by a large margin because it does not actively try to exploit the opponent but rather tries not to commit any mistakes at all. There exists a NE for every finite game [6].

III-B2 Exploitability

Exploitability measures how far a strategy deviates from a NE [7]. The higher the exploitability, the greater the distance to a NE, and therefore, the weaker the player. A NE strategy constitutes optimal play, since there is no possible better strategy. However, there are different NE strategies which differ in how effectively they exploit non-NE strategies [8]. If it is not possible to calculate a NE strategy (for example, because the state space is too large), we want to estimate a strategy which minimizes the deviation from a NE.
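To make the notion concrete, here is a small sketch for rock-paper-scissors. It assumes a simplified definition that is common for two-player zero-sum games: the exploitability of a strategy is what a best-responding opponent gains against it beyond the game value (which is 0 here). The definition used in [7] may differ in detail, and the function names are hypothetical.

```python
# Payoff for the player choosing the row action against the column action;
# rows and columns are ordered rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
GAME_VALUE = 0.0  # rock-paper-scissors is symmetric, so its value is 0

def best_response_value(strategy):
    """Expected payoff of the best pure reply against a mixed strategy."""
    return max(
        sum(PAYOFF[a][b] * strategy[b] for b in range(3))
        for a in range(3)
    )

def exploitability(strategy):
    """What a best responder gains beyond the game value (assumed definition)."""
    return best_response_value(strategy) - GAME_VALUE

uniform = [1 / 3, 1 / 3, 1 / 3]      # the NE strategy of rock-paper-scissors
rock_heavy = [0.5, 0.25, 0.25]       # an assumed biased strategy
print(exploitability(uniform))
print(exploitability(rock_heavy))
```

The uniform strategy cannot be exploited at all, whereas the rock-heavy strategy gives a best responder (who always plays paper) a positive expected gain.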

III-B3 Comparison to Humans

When designing AIs it is always interesting to evaluate how well they perform in comparison to humans. Here we distinguish four categories: sub-human, par-human, high-human and super-human AI which respectively mean worse than, similar to, better than most and better than all humans. The current best AI agents in Jass achieve par-human standards. In Bridge, current computer programs achieve expert level, which constitutes high-human proficiency. In many PIGs like Go or Chess, current AIs achieve super-human level.

IV Rule-Based Systems

Rule-based systems leverage human knowledge to build an AI player [2]. Many simple AIs for card games are rule-based and then used as baseline players. This mostly entails a number of if-then-else statements which can be viewed as a man-made decision tree.
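A hand-written decision tree of this kind might look like the following sketch for a generic trick-taking game. The rules encoded here (follow suit if possible, otherwise play the weakest trump, otherwise discard the weakest card) are invented for illustration; they are not the rules of Jass and not taken from any of the cited players.

```python
def rule_based_move(hand, led_suit, trump):
    """Pick a card with a small hand-written decision tree.

    Cards are (rank, suit) tuples; a higher rank means a stronger card.
    """
    if led_suit is None:
        # Leading the trick: open with the strongest card.
        return max(hand)
    follow = [card for card in hand if card[1] == led_suit]
    if follow:
        # Able to follow suit: play the strongest card of that suit.
        return max(follow)
    trumps = [card for card in hand if card[1] == trump]
    if trumps:
        # Cannot follow suit: play the weakest trump.
        return min(trumps)
    # No trump either: discard the weakest card.
    return min(hand)

hand = [(14, "hearts"), (6, "spades"), (10, "clubs")]
print(rule_based_move(hand, led_suit="spades", trump="clubs"))
print(rule_based_move(hand, led_suit="diamonds", trump="clubs"))
```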

Ward et al. [9] created a rule-based AI for Magic: The Gathering which was used as a baseline player. Robilliard et al. [10] developed a rule-based AI for 7 Wonders which was used as a baseline player. Watanabe et al. [11] implemented three rule-based players for Scopone. The greedy player behaves like a beginner. The other two follow more advanced strategies taken from strategy books and behave like expert players. Osawa [12] presented several par-human rule-based strategies for Hanabi. His results indicated that feedback-based strategies achieve higher scores than purely rational ones. Van den Bergh et al. [13] developed a strong par-human rule-based AI for Hanabi. Whitehouse et al. [14] evaluated the rule-based Spades player developed by AI Factory. Based on player reviews, they found that it makes weak decisions in certain situations but is a strong par-human player overall.

V Reinforcement Learning Methods

Reinforcement Learning (RL) is a machine learning method which is frequently used to play games. It consists of an agent performing actions in a given environment. Based on its actions, the agent receives positive rewards which reinforce desirable behaviour and negative rewards which discourage unwanted behaviour. Using a value function, the agent tries to find out which action is the most desirable in a given state.

V-A Temporal Difference Learning

Temporal Difference Learning (TDL) updates the value function continuously after every iteration, as opposed to earlier strategies which waited until the episode’s end [15].
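The following is a minimal sketch of the simplest variant, TD(0), on a five-state random walk. The environment, the step size and the function name `td0_random_walk` are assumptions chosen for illustration and are unrelated to the card-game experiments cited below. The point is the update inside the loop: the value of a state is adjusted right after each step, using the current estimate of the next state.

```python
import random

def td0_random_walk(episodes=10000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate state values of a 5-state random walk with TD(0).

    States 1..5 are non-terminal; 0 and 6 are terminal. Stepping into
    state 6 yields reward 1, every other transition yields reward 0.
    """
    rng = random.Random(seed)
    values = [0.0] + [0.5] * 5 + [0.0]  # terminal values stay 0
    for _ in range(episodes):
        state = 3  # every episode starts in the middle
        while state not in (0, 6):
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == 6 else 0.0
            # TD(0): move V(s) toward the one-step bootstrapped target
            # immediately, without waiting for the episode to finish.
            target = reward + gamma * values[next_state]
            values[state] += alpha * (target - values[state])
            state = next_state
    return values[1:6]

values = td0_random_walk()
print([round(v, 2) for v in values])
```

For this walk the true value of state i is i/6, so the estimates should come out close to 1/6, 2/6, and so on up to 5/6.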

Sturtevant et al. [16] developed a sub-human AI for Hearts using Stochastic Linear Regression and TDL which outperforms players based on minimax search.

V-B Policy Gradient

Policy Gradient (PG) is an algorithm which directly learns a policy function mapping a state to an action [15]. Proximal Policy Optimization (PPO) is an extension to the PG algorithm improving its stability and reducing the convergence time [17].
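As a hedged illustration of the basic idea, the sketch below applies the plain REINFORCE update (not PPO) to a two-armed bandit with a softmax policy. The bandit, its win probabilities, the learning rate and the function names are all assumptions made for this example. The policy parameters are pushed in the direction of the gradient of the log-probability of the chosen action, scaled by the reward received.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(win_probs=(0.2, 0.8), steps=5000, lr=0.1, seed=0):
    """Learn a softmax policy over bandit arms with the REINFORCE update."""
    rng = random.Random(seed)
    prefs = [0.0] * len(win_probs)
    for _ in range(steps):
        policy = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=policy)[0]
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        for k in range(len(prefs)):
            # Gradient of log pi(action) with respect to preference k.
            grad_log = (1.0 if k == action else 0.0) - policy[k]
            prefs[k] += lr * reward * grad_log
    return softmax(prefs)

policy = reinforce_bandit()
print([round(p, 3) for p in policy])
```

The learned policy should put most of its probability on the second arm, which pays off more often.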

Charlesworth [18] applied PPO to Big 2, reaching par-human level.

V-C Counterfactual Regret Minimization

CFR [19] is a self-play method that works very well for IIGs and has been used by the most successful poker AIs [1, 20]. “Counterfactual” denotes looking back and thinking “had I only known then…”. “Regret” says how much better one would have done, if one had chosen a different action. And “minimization” refers to minimizing the total regret over all actions, so that the future regret is as small as possible. Note that the memory CFR requires is linear in the number of information sets and not in the number of states [3]. Additionally, CFR has been able to exploit non-NE strategies computed by Upper Confidence Bound for Trees (UCT) agents in simultaneous games [21].
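The regret idea can be sketched on a one-shot game. The code below is not full CFR on a game tree; it shows only regret matching in self-play on rock-paper-scissors, which is the kind of per-decision update CFR builds on. The asymmetric initial regrets and all names are assumptions made for illustration. Each player plays actions in proportion to their positive accumulated regret, and the average strategy over all iterations is the quantity that approaches a NE.

```python
# Payoff for the player choosing the row action against the column action
# (rock, paper, scissors). The game is symmetric, so both players use it.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def regret_matching(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)

def self_play(iterations=50000):
    """Return both players' average strategies after regret-matching self-play."""
    # Assumed asymmetric start so that the dynamics are not trivially uniform.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3))
                for a in range(3)
            ]
            expected = sum(strategies[player][a] * action_values[a]
                           for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]

average = self_play()
print([[round(p, 3) for p in strat] for strat in average])
```

Both average strategies should end up close to the uniform mix, which is the NE of rock-paper-scissors.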

V-C1 Counterfactual Regret Minimization+

CFR+ is a re-engineered version of CFR, which drastically reduces convergence time. It always iterates over the entire tree and only allows non-negative regrets [22]. Bowling et al. [22] used CFR+ to essentially solve heads-up limit Texas hold’em Poker in 2015.

Moravčík et al. [20] developed a general algorithm for imperfect information settings, called DeepStack. With statistical significance, it defeated professional poker players in a study over 44000 hands.

V-C2 Deep Counterfactual Regret Minimization

Deep CFR [23] combines CFR with deep Artificial Neural Networks. Brown et al. [1] leverage deep CFR to decisively beat four top human poker players in 2017 with their program called Libratus.

V-C3 Discounted Counterfactual Regret Minimization

Discounted CFR [24] matches or outperforms the previous state-of-the-art variant CFR+ depending on the application by discounting prior iterations.

V-D Neural Fictitious Self-Play

In Neural Fictitious Self-Play (NFSP), two players start with random strategies encoded in an ANN. They play against each other, each knowing the other player’s strategy and improving its own strategy in response. With an increasing number of iterations, the strategies typically approach a NE. Since NFSP [25] has a slower convergence rate than CFR, it is not widely used.

Heinrich et al. [25] applied NFSP to Texas hold’em Poker and reported similar performance to the state-of-the-art super-human programs. In Leduc Poker, a simplification of the former, they approached a NE. Kawamura et al. [26] calculated approximate NE strategies with NFSP in multiplayer IIGs.

V-E First Order Methods

First Order Methods (FOM) like Excessive Gap Technique (EGT) are, like CFR, methods which approximate NE strategies in IIGs. They have a better theoretical convergence rate than CFR because of lower computational and memory costs. Note that, like CFR, EGT is only able to approach a NE in two-player games [27].

Kroer et al. [27] applied a variant of EGT to Poker, reporting faster convergence than some CFR variants. They argued that, given more hyperparameter tuning, the performance of CFR+ can be reached.

VI Monte Carlo Methods

Monte Carlo (MC) methods use randomness to solve problems that might be deterministic in principle.

VI-A Monte Carlo Simulation

MC Simulation uses a large number of random experiments to numerically solve large problems involving many random variables.
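A small sketch of the idea: estimating, by repeated random dealing, the probability that a hand contains at least one ace. The deck composition (36 cards with 4 aces, 9 cards per hand) is an assumption chosen to resemble a Jass deal, and the function name is hypothetical. Because this toy problem also has a closed-form answer, the estimate can be checked against it.

```python
import math
import random

def estimate_ace_probability(trials=20000, seed=0):
    """Estimate P(at least one ace in a 9-card hand) by random dealing."""
    rng = random.Random(seed)
    deck = ["ace"] * 4 + ["other"] * 32  # assumed 36-card deck
    hits = 0
    for _ in range(trials):
        hand = rng.sample(deck, 9)  # deal 9 cards without replacement
        if "ace" in hand:
            hits += 1
    return hits / trials

estimate = estimate_ace_probability()
# Exact value for comparison: 1 - C(32, 9) / C(36, 9), about 0.70.
exact = 1 - math.comb(32, 9) / math.comb(36, 9)
print(round(estimate, 3), round(exact, 3))
```

In real applications the simulated quantity has no closed form, which is what makes sampling attractive.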

Mitsukami et al. [28] developed a par-human AI for Japanese Mahjong using MC Simulation. Kupferschmid et al. [29] applied MC Simulation to Skat to obtain the game-theoretical value of a Skat hand. Note that they converted the game to a PIG by making all the cards known. Yan et al. [30] report a 70% win rate using MC Simulation in a Klondike version, which has all cards revealed to the player. Note that this converts the game to a PIG.

VI-B Flat Monte Carlo

Flat MC uses MC Simulation, with the actions in a given state being uniformly sampled [5].
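As an illustration on a deliberately tiny game, the sketch below applies Flat MC to a subtraction game: players alternately take one to three stones, and whoever takes the last stone wins. The game, the equal playout budget per action and the function names are assumptions for this example. Every candidate move is evaluated purely by uniformly random playouts, with no tree built below the root.

```python
import random

def random_playout(stones, rng):
    """Play uniformly random moves from `stones` (at least 1).

    Returns 0 if the player to move first takes the last stone, else 1.
    """
    player = 0
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player  # this player took the last stone and wins
        player = 1 - player

def flat_monte_carlo_move(stones, playouts=2000, seed=0):
    """Choose the move with the best win rate over random playouts."""
    rng = random.Random(seed)
    best_move, best_rate = None, -1.0
    for move in range(1, min(3, stones) + 1):
        remaining = stones - move
        wins = 0
        for _ in range(playouts):
            if remaining == 0:
                wins += 1  # we took the last stone ourselves
            elif random_playout(remaining, rng) == 1:
                wins += 1  # the opponent moved first in the playout and lost
        rate = wins / playouts
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move

print(flat_monte_carlo_move(5))
```

From five stones, taking one stone leaves the opponent with four, which is a lost position under perfect play; the random playouts also rate this move highest.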

Ginsberg [31] achieves world champion level play in Bridge using Flat MC in 2001.

VI-C Monte Carlo Tree Search

MCTS consists of four stages: Selection, Expansion, Simulation and Backpropagation [5].

Selection: Starting from the root node, an expandable child node is selected. A node is expandable if it is non-terminal (i.e. it does have children) and has unvisited children.

Expansion: The tree is expanded by adding one or more child nodes to the previously selected node.

Simulation: From these new child nodes a simulation is run to acquire a reward at a terminal node.

Backpropagation: The simulation’s result is used to update the information in the affected nodes (nodes in the selection path).

A tree policy is used for selecting and expanding a node and the simulation is run according to the default policy.

Browne et al. [5] give a detailed overview of the MCTS family. In this section we outline the variants used on card games.

VI-C1 Upper Confidence Bound for Trees

UCT is the most common MCTS method, using upper confidence bounds as a tree policy, which is a formula that tries to balance the exploration/exploitation problem [32]. When the search explores too much, the optimal moves are not played frequently enough and therefore it may find a sub-optimal move. When the search exploits too much, it may not find a path which promises much greater payoffs and it therefore also may find a sub-optimal move. Minimax is a basic algorithm used for two-player zero-sum games, operating on the game tree. When the entire tree is visited, minimax is optimal [2]. UCT converges to minimax given enough time and memory [32].
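The four MCTS stages with a UCB1 tree policy can be sketched as follows. The example uses an assumed toy game of perfect information (players alternately take one to three stones; whoever takes the last stone wins), an assumed exploration constant of 1.4, and hypothetical names throughout. It is a minimal sketch, not the implementation of any cited system.

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player to move here
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = list(range(1, min(3, stones) + 1))
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

def ucb1(child, parent_visits, c=1.4):
    """Average reward (exploitation) plus a bonus for rarely tried children."""
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))

def rollout(stones, rng):
    """Random playout; True if the player to move at `stones` wins."""
    if stones == 0:
        return False  # the previous player already took the last stone
    mover_turn = True
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return mover_turn
        mover_turn = not mover_turn

def uct_move(stones, iterations=3000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and non-terminal.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # Expansion: add one child for an untried move.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # Simulation: random playout (default policy) from the new node.
        mover_wins = rollout(node.stones, rng)
        # Backpropagation: flip the perspective at every level on the way up.
        result = 0.0 if mover_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

print(uct_move(5), uct_move(6), uct_move(7))
```

In this game the winning plan is to leave the opponent a multiple of four stones, so the search should settle on taking one, two and three stones respectively.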

Sievers et al. [33] applied UCT to Doppelkopf, reaching par-human performance. Schäfer [34] used UCT to build an AI for Skat, which is still sub-human but comparable to the MC Simulation based player proposed by Kupferschmid et al. [29]. Swiechowski et al. [35] combined an MCTS player with supervised learning on the logs of sample games, achieving par-human performance. Santos et al. [36] outperformed basic MCTS-based AIs by combining MCTS with domain-specific knowledge. Heinrich et al. [37] combined UCT with self-play and applied it to Poker. They reported convergence to a NE in a small Poker game and argued that, given enough training, convergence can also be reached in large limit Texas Hold’em Poker.

VI-C2 Determinization

Determinization is a technique which allows solving an IIG with methods used for PIGs. Determinization samples many states from the information set and plays the game to a terminal state based on these states of perfect information.
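A minimal sketch of the sampling step, under stated assumptions: a toy deck of numbered cards, two opponents, and a hypothetical perfect-information evaluator `evaluate(card, opponent_hands)` supplied by the caller. In a real system the evaluator would be a search such as minimax or UCT on the fully revealed deal; here it is a one-line stand-in.

```python
import random

def sample_determinization(deck, my_hand, played, hand_sizes, rng):
    """Deal the unseen cards at random to the other players."""
    seen = set(my_hand) | set(played)
    unseen = [card for card in deck if card not in seen]
    rng.shuffle(unseen)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(unseen[start:start + size])
        start += size
    return hands

def determinized_choice(deck, my_hand, played, hand_sizes, evaluate,
                        samples=100, seed=0):
    """Average a perfect-information evaluation over sampled deals."""
    rng = random.Random(seed)
    totals = {card: 0.0 for card in my_hand}
    for _ in range(samples):
        opponents = sample_determinization(deck, my_hand, played,
                                           hand_sizes, rng)
        for card in my_hand:
            totals[card] += evaluate(card, opponents)
    return max(my_hand, key=lambda card: totals[card])

# Assumed toy setting: cards are numbers, a card "wins" if it beats every
# card held by the opponents in the sampled deal.
def beats_everyone(card, opponent_hands):
    return all(card > other for hand in opponent_hands for other in hand)

deck = list(range(1, 13))
choice = determinized_choice(deck, my_hand=[3, 11], played=[12],
                             hand_sizes=[3, 3], evaluate=beats_everyone)
print(choice)
```

Each sampled deal is a state of perfect information drawn from the information set, so any PIG method can be plugged in as the evaluator.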

Bjarnason et al. [38] studied Klondike using UCT, hindsight optimization and sparse sampling. Hindsight optimization uses determinization and hindsight knowledge to improve the strategy. They developed a policy which wins at least 35% of games, which is a lower bound for an optimal Klondike policy. Sturtevant [39] applied UCT with determinization to the multiplayer games Spades and Hearts. He reported similar performance to the state-of-the-art at that time in Spades and slightly better performance in Hearts. Cowling et al. [40] applied MCTS with determinization approaches to the card game Magic: The Gathering achieving high-human performance and outperforming an expert-level rule-based player. Robilliard et al. [10] applied UCT with determinization to 7 Wonders outperforming rule-based AIs. The experiments against human players were promising but not statistically significant. Solinas et al. [41] used UCT and supervised learning to infer the cards of the other players, improving over the state-of-the-art in Skat card-play. Edelkamp [42] combined distilled expert rules, winning probabilities aggregations and a fast tree exploration into an AI for the Misère variant of Skat significantly outperforming human experts.

VI-C3 Information Set Monte Carlo Tree Search

Information Set Monte Carlo Tree Search (IS-MCTS) tackles the problem of strategy fusion, which arises from the false assumption that different moves can be taken from different states in the same information set [43]. Because the player cannot tell the states in an information set apart, it cannot decide differently based on them. IS-MCTS therefore operates directly on a tree of information sets.

Whitehouse et al. [44] used MCTS with determinization and information sets on Dou Di Zhu. They did not report any significant differences in performance between the two proposed algorithms. Watanabe et al. [11] presented a high-human AI using IS-MCTS for the Italian card game Scopone which consistently beat strong rule-based players. Walton-Rivers et al. [45] applied IS-MCTS to Hanabi, but they measured inferior performance to rule-based players. Whitehouse et al. [14] found an MCTS player to be stronger than rule-based players in the card game Spades. They integrated IS-MCTS with knowledge-based methods to create more engaging play. Cowling et al. [46] performed a statistical analysis over 27592 played games on a mobile platform to evaluate the player’s difficulty for humans. Devlin et al. [47] combined insights from game play data with IS-MCTS to emulate human play.

VI-D Monte Carlo Sampling for Regret Minimization

Monte Carlo Sampling for Regret Minimization (MCCFR) drastically reduces the convergence time of CFR by using MC Sampling [48]. MCCFR samples blocks of paths from the root to a terminal node and then computes the immediate counterfactual regrets over these blocks.

Lanctot et al. [48] showed this faster convergence rate in experiments on Goofspiel and One-Card-Poker. Ponsen et al. [49] provided evidence that MCCFR approaches a NE in Poker.

VI-D1 Online Outcome Sampling

Online Outcome Sampling (OOS) is an online variant of MCCFR whose exploitability decreases with increasing search time [50].

Lisý et al. [50] demonstrated that OOS can exploit IS-MCTS in Poker knowing the opponent’s strategy and given enough computation time.

VII Evolutionary Algorithms

Evolutionary Algorithms are inspired by evolutionary theory. Strong individuals — strategies in the case of game AIs — can survive and reproduce, whereas weaker ones eventually become extinct [2].

Mahlmann et al. [51] compared three EA agents with different fitness functions in Dominion. They argued that their method can be used for automatic game design and game balancing. Noble [52] applied an EA evolving ANNs to Poker in 2002, improving over the state-of-the-art at the time.
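The survival-and-reproduction loop can be sketched on a toy problem. Below, individuals are mixed strategies for rock-paper-scissors, and fitness is the expected payoff against an assumed fixed opponent that plays rock 60% of the time. The opponent, the truncation selection scheme, the Gaussian mutation and all names are assumptions made for illustration and are unrelated to the agents in [51, 52].

```python
import random

# Expected payoff of rock, paper, scissors against an assumed fixed opponent
# that plays rock 60% of the time and paper and scissors 20% each.
ACTION_VALUES = [0.0, 0.4, -0.4]

def fitness(strategy):
    """Expected payoff of a mixed strategy against the fixed opponent."""
    return sum(p * v for p, v in zip(strategy, ACTION_VALUES))

def mutate(strategy, rng, sigma=0.1):
    """Perturb the probabilities, clip at zero and renormalize."""
    raw = [max(p + rng.gauss(0.0, sigma), 0.0) for p in strategy]
    total = sum(raw)
    if total == 0.0:
        return list(strategy)
    return [p / total for p in raw]

def evolve(generations=200, population_size=20, survivors=5, seed=0):
    rng = random.Random(seed)
    population = [mutate([1 / 3, 1 / 3, 1 / 3], rng)
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: only the fittest individuals survive.
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        # Reproduction: survivors produce mutated offspring.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - survivors)]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print([round(p, 3) for p in best], round(fitness(best), 3))
```

Against this opponent the best reply is to always play paper (expected payoff 0.4), and the population should drift toward that strategy.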

VIII Use-Case: Swiss Card Game Jass

Jass is a traditional Swiss trick-taking card game often played at social events. It involves hidden information and is sequential, non-cooperative, finite and constant-sum, as there are always 157 points available in each game. The Swiss Intercantonal Lottery provides a guide to the general Jass rules (www.swisslos.ch/en/jass/informations/jass-rules/principles-of-jass.html) and to the variant Schieber in particular (www.swisslos.ch/en/jass/informations/jass-rules/schieber-jass.html).

VIII-A Coordination Game Within Jass

Schieber is a non-cooperative game, since the two teams are opposing each other. However, additionally, the activity within a team can be formulated as a coordination game. This adds another dimension to the game as it enables cooperation between the players within the game to maximize the team’s benefit. Although the rules of the game forbid any communication during a game within the team, by playing specific cards in certain situations, the two players can convey information about the cards they have. For this to work, of course they must have the same understanding of this communication by card play. Humans have some existing “agreements” like a “discarding policy”. Discarding tells the partner which suits the player is bad at. It is interesting to investigate, whether AIs are able to pick up these “agreements” or even come up with new ones.

VIII-B Suitable Methods for AIs in Card Games by the Example of Jass

MCTS and CFR are the two families of algorithms that have most successfully been applied to card games. In this section we are comparing these two methods’ advantages and disadvantages in detail by the example of the trick-taking card game Jass.

To the best of our knowledge, CFR has almost exclusively been applied to Poker so far, although the authors claim that it can be applied to any IIG [23]. CFR provides theoretical guarantees for approaching a NE in two player IIGs [19]. On the other hand, as we discussed in section III-B1, pure NE strategies may not be able to specifically exploit weak opponents. Additionally, CFR needs a lot of time to converge, compared to MCTS [49].

MCTS has been applied to a plethora of complex card games including Bridge, Skat, Doppelkopf or Spades, as we have illustrated in the previous sections. It finds good strategies fast but only converges to a NE in PIGs and not necessarily in IIGs [49]. As opposed to CFR, MCTS does not find the moves with lowest exploitability, but the ones with highest chance of winning [53]. MCTS eventually converges to minimax, but total convergence is infeasible for large problems [5].

So, if the goal is to find a good strategy relatively fast, MCTS should be chosen, whereas CFR should be selected, if the goal is to be minimally exploitable [49]. To put it simply, CFR is great at not losing, but not very good at destroying an opponent and MCTS is great at finding good strategies fast, but not very good at resisting against very strong opponents.

VIII-C Preliminary Results

Preliminary experiments not presented in this paper show that MCTS is a promising approach for a strong AI playing Jass.

IX Conclusion

In this paper we first provided an overview of the methods used in AI development for card games. Then we discussed the advantages and disadvantages of the two most promising families of algorithms (MCTS and CFR) in more detail. Finally, we presented an analysis for how to apply these methods to the Swiss card game Jass.

Appendix A Game Descriptions

In this section we give the gist of the less well-known games discussed in the paper (in order of appearance).

Magic: The Gathering is a trading and digital collectible card game played by two or more players.

7 Wonders is a board game with strong elements of card games, including hidden information, for two to seven players.

Scopone is a variant of the Italian card game Scopa.

Hanabi is a French cooperative card game for two to five players.

Spades is a four-player trick-taking card game mainly played in North America.

Big 2 is a Chinese card game for two to four players mainly played in East and South East Asia. The goal is to get rid of all of one’s cards first.

Mahjong is a traditional Chinese tile-based game for four (or seldom three) players similar to the Western game Rummy.

Skat is a three-player trick-taking card game mainly played in Germany.

Klondike is a single-player variant of the French card game Patience and has shipped with Windows since version 3.

Bridge is a trick-taking card game for four players played world-wide in clubs, tournaments, online and socially at home. It has often been used as a test bed for AI research and is still an active area of research, since super-human performance has not been achieved yet.

Doppelkopf is a trick-taking card game for four people, mainly played in Germany.

Hearthstone is an online collectible card video game developed by Blizzard Entertainment.

Hearts is a four-player trick-taking card game, mainly played in North America.

Dou Di Zhu is a Chinese card game for three players.

Goofspiel is a simple bidding card game for two or more players.

One-Card-Poker generalizes the minimal variant Kuhn-Poker.

Dominion is a modern deck-building card game similar to Magic: The Gathering.

Bibliography

[1] N. Brown and T. Sandholm, “Superhuman AI for heads-up no-limit poker: Libratus beats top professionals,” Science, 2017.
[2] G. N. Yannakakis and J. Togelius, Artificial Intelligence and Games. Springer, 2018, http://gameaibook.org.
[3] J. Rubin and I. Watson, “Computer poker: A review,” Artificial Intelligence, vol. 175, no. 5, pp. 958–987, 2011, special review issue.
[4] N. Burch, “Time and space: Why imperfect information games are hard,” Ph.D. dissertation, University of Alberta, 2018.
[5] C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, “A survey of Monte Carlo tree search methods,” IEEE Transactions on Computational Intelligence and AI in Games, vol. 4, no. 1, pp. 1–43, March 2012.
[6] J. F. Nash, “Non-cooperative games,” Annals of Mathematics, vol. 54, no. 2, pp. 286–295, 1951.
[7] T. Davis, N. Burch, and M. Bowling, “Using response functions to measure strategy strength,” in Proceedings of the Twenty-Eighth Conference on Artificial Intelligence (AAAI), 2014, pp. 630–636.
[8] J. Cermak, B. Bosansky, and V. Lisý, “Practical performance of refinements of Nash equilibria in extensive-form zero-sum games,” Frontiers in Artificial Intelligence and Applications, vol. 263, August 2014.