Improving Bidding and Playing Strategies in the Trick-Taking game Wizard using Deep Q-Networks
Jonas Schumacher, Marco Pleines

TL;DR
This paper develops deep reinforcement learning agents for the trick-taking game Wizard, modeling its bidding and playing phases as two interleaved POMDPs. It also evaluates LSTM integration and tree search. The DQN agents reach 66% to 87% accuracy in self-play, outperforming both a random baseline and a rule-based heuristic.
Contribution
The paper applies Deep Q-Networks to Wizard, adds an LSTM and tree search to handle imperfect information, and analyzes how effective each addition is.
Findings
DQN agents achieve 66-87% accuracy in self-play.
LSTM and tree search methods do not outperform basic DQN.
A strong information asymmetry between player positions is observed during bidding.
Abstract
In this work, the trick-taking game Wizard with a separate bidding and playing phase is modeled by two interleaved partially observable Markov decision processes (POMDPs). Deep Q-Networks (DQN) are used to empower self-improving agents, which are capable of tackling the challenges of a highly non-stationary environment. To compare algorithms with one another, the accuracy between bid and trick count is monitored, which strongly correlates with the actual rewards and provides a well-defined upper and lower performance bound. The trained DQN agents achieve accuracies between 66% and 87% in self-play, outperforming both a random baseline and a rule-based heuristic. The conducted analysis also reveals a strong information asymmetry concerning player positions during bidding. To overcome the missing Markov property of imperfect-information games, a long short-term memory (LSTM) network is…
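The accuracy metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes accuracy means the fraction of rounds in which a player's trick count equals the bid, which bounds the metric between 0 and 1. The function names bid_accuracy and wizard_score are hypothetical, and the scoring rule shown is the standard tabletop Wizard rule (20 points plus 10 per trick for an exact bid, minus 10 per trick of deviation otherwise), which is an assumption here and may differ from the reward the authors actually use.

```python
from typing import Sequence


def bid_accuracy(bids: Sequence[int], tricks: Sequence[int]) -> float:
    """Fraction of rounds where the tricks won exactly match the bid.

    Assumed definition of "accuracy between bid and trick count";
    the result always lies in [0.0, 1.0].
    """
    if len(bids) != len(tricks):
        raise ValueError("bids and tricks must have the same length")
    if len(bids) == 0:
        raise ValueError("at least one round is required")
    hits = sum(1 for b, t in zip(bids, tricks) if b == t)
    return hits / len(bids)


def wizard_score(bid: int, tricks: int) -> int:
    """Standard tabletop Wizard scoring for one round (assumption:
    the paper's reward signal may be defined differently)."""
    if bid == tricks:
        return 20 + 10 * tricks
    return -10 * abs(bid - tricks)


# Four illustrative rounds for one player (made-up numbers).
bids = [2, 0, 3, 1]
tricks = [2, 1, 3, 1]

accuracy = bid_accuracy(bids, tricks)
total_score = sum(wizard_score(b, t) for b, t in zip(bids, tricks))
print(accuracy)     # 0.75
print(total_score)  # 110
```

Under this scoring rule every exact bid yields a positive score and every miss a negative one, which is consistent with the abstract's statement that accuracy correlates strongly with the rewards.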
Taxonomy
Topics: Artificial Intelligence in Games · Gambling Behavior and Treatments · Digital Games and Media
Methods: Wizard: Unsupervised goats tracking algorithm · Convolution · Dense Connections · Q-Learning · Deep Q-Network
