# Learning Policies from Human Data for Skat

**Authors:** Douglas Rebstock, Christopher Solinas, Michael Buro

arXiv: 1905.10907 · 2019-05-28

## TL;DR

This paper uses deep neural networks to learn policies for the card game Skat from human game data. The resulting system sets a new state of the art in bidding and game declaration. Its imitation-learned cardplay policies are slightly weaker than the best search-based method but run orders of magnitude faster.

## Contribution

The paper introduces methods to directly vary the aggressiveness of the bidder and to declare games based on expected value while mitigating issues with rarely observed state-action pairs, advancing model-free policy learning for Skat from human data.
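
The two ideas named above, expected-value declaration and adjustable bidding aggressiveness, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_prob` filter, the `threshold` parameter, and the example numbers are all hypothetical, and the summary does not specify how the paper actually handles rarely observed state-action pairs. The sketch assumes that one network supplies an imitation policy over game types and another supplies an expected value per game type.

```python
from typing import Dict


def declare_by_expected_value(
    policy_probs: Dict[str, float],
    expected_values: Dict[str, float],
    min_prob: float = 0.05,  # hypothetical cutoff, not taken from the paper
) -> str:
    """Pick the highest-value game type among those the imitation policy finds plausible."""
    # Drop game types that humans rarely choose in similar states, since their
    # value estimates rest on few observations (an assumed mitigation strategy).
    plausible = [g for g, p in policy_probs.items() if p >= min_prob]
    if not plausible:
        # Fall back to the most likely action under the imitation policy.
        return max(policy_probs, key=lambda g: policy_probs[g])
    return max(plausible, key=lambda g: expected_values[g])


def should_bid(best_expected_value: float, threshold: float) -> bool:
    """Bid only if the best achievable expected value clears the threshold.

    Lowering the (hypothetical) threshold makes the bidder more aggressive.
    """
    return best_expected_value >= threshold


# Made-up numbers for illustration only.
probs = {"grand": 0.01, "clubs": 0.60, "null": 0.39}
values = {"grand": 50.0, "clubs": 20.0, "null": 25.0}
choice = declare_by_expected_value(probs, values)
```

In this example `grand` has the highest estimated value but is filtered out because the imitation policy almost never selects it, so the choice falls to `null`. Whether a probability filter of this kind matches the paper's approach should be checked against the full text.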

## Key findings

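- Achieves state-of-the-art performance in bidding and game declaration.
- Cardplay policies learned through imitation are slightly weaker than the current best search-based method but run orders of magnitude faster.
- Explores how these policies could be learned directly from experience in a reinforcement learning setting, and discusses the value of incorporating human data for this task.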

## Abstract

Decision-making in large imperfect information games is difficult. Thanks to recent success in Poker, Counterfactual Regret Minimization (CFR) methods have been at the forefront of research in these games. However, most of the success in large games comes with the use of a forward model and powerful state abstractions. In trick-taking card games like Bridge or Skat, large information sets and an inability to advance the simulation without fully determinizing the state make forward search problematic. Furthermore, state abstractions can be especially difficult to construct because the precise holdings of each player directly impact move values.

In this paper we explore learning model-free policies for Skat from human game data using deep neural networks (DNN). We produce a new state-of-the-art system for bidding and game declaration by introducing methods to a) directly vary the aggressiveness of the bidder and b) declare games based on expected value while mitigating issues with rarely observed state-action pairs. Although cardplay policies learned through imitation are slightly weaker than the current best search-based method, they run orders of magnitude faster. We also explore how these policies could be learned directly from experience in a reinforcement learning setting and discuss the value of incorporating human data for this task.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.10907/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1905.10907/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1905.10907/full.md

---
Source: https://tomesphere.com/paper/1905.10907