The Chess Transformer: Mastering Play using Generative Language Models

David Noever; Matt Ciolino; Josh Kalin

arXiv:2008.04057 · cs.AI · September 21, 2020 · 23 cites

TL;DR

This paper fine-tunes GPT-2 on 2.8 million chess games in Portable Game Notation. The resulting model generates plausible moves, plays recognizable classic openings, and supports live play against a human through an interface that filters illegal moves.

Contribution

The work demonstrates that language transformers can be adapted to learn and generate complex chess strategies from large game datasets, extending their application beyond natural language.

Findings

01 The fine-tuned transformer generates plausible chess strategies.

02 The model's games display formations identifiable as classic openings.

03 A human-to-transformer interface supports live play and filters illegal moves.
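
The opening-recognition finding can be illustrated with a minimal sketch, assuming that a generated game is available as PGN movetext and that recognition means matching its first moves against a known line. The `OPENING_BOOK` table and the function names are hypothetical and are not taken from the paper; the two entries use the standard move orders for the openings named in the abstract.

```python
import re

# Hypothetical, hand-written opening book for illustration only.
# Each entry maps an opening name to its defining moves in SAN.
OPENING_BOOK = {
    "English Opening": ["c4"],
    "Slav Defense, Exchange Variation": ["d4", "d5", "c4", "c6", "cxd5", "cxd5"],
}

RESULT_TOKENS = {"1-0", "0-1", "1/2-1/2", "*"}


def san_moves(movetext):
    """Strip move numbers and the game result from PGN movetext."""
    moves = []
    for token in movetext.split():
        # "1." becomes "", "1.c4" becomes "c4"; plain moves are unchanged.
        token = re.sub(r"^\d+\.+", "", token)
        if token and token not in RESULT_TOKENS:
            moves.append(token)
    return moves


def classify_opening(movetext):
    """Return the longest book line that prefixes the game, or None."""
    moves = san_moves(movetext)
    best_name, best_len = None, 0
    for name, line in OPENING_BOOK.items():
        if moves[:len(line)] == line and len(line) > best_len:
            best_name, best_len = name, len(line)
    return best_name


if __name__ == "__main__":
    print(classify_opening("1. c4 e5 2. Nc3 Nf6"))
    print(classify_opening("1. d4 d5 2. c4 c6 3. cxd5 cxd5 4. Nc3 Nf6"))
```

Prefix matching is the simplest possible choice: it misses openings reached by transposition, which a fuller approach would detect by comparing board positions rather than move sequences.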

Abstract

This work demonstrates that natural language transformers can support more generic strategic modeling, particularly for text-archived games. In addition to learning natural language skills, the abstract transformer architecture can generate meaningful moves on a chessboard. With further fine-tuning, the transformer learns complex gameplay by training on 2.8 million chess games in Portable Game Notation. After 30,000 training steps, OpenAI's Generative Pre-trained Transformer (GPT-2) optimizes weights for 774 million parameters. This fine-tuned Chess Transformer generates plausible strategies and displays game formations identifiable as classic openings, such as English or the Slav Exchange. Finally, in live play, the novel model demonstrates a human-to-transformer interface that correctly filters illegal moves and provides a novel method to challenge the transformer's chess strategies.…
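
The abstract does not say how the interface filters illegal moves. One plausible approach, sketched here as an assumption rather than as the authors' method, is to sample several candidate moves from the model and keep the first one that a rules engine accepts. The function name is hypothetical, and the legal-move set is passed in by hand; in practice it would come from a chess rules library.

```python
def first_legal_move(candidates, legal_moves):
    """Return the first candidate move that is legal, or None if none is.

    candidates:  moves proposed by the language model, in preference order.
    legal_moves: moves allowed in the current position (assumed to be
                 supplied by a separate rules engine).
    """
    legal = set(legal_moves)
    for move in candidates:
        if move in legal:
            return move
    return None


if __name__ == "__main__":
    # A few of White's legal first moves, listed by hand for the example.
    legal = {"e4", "d4", "c4", "Nf3"}
    # "Ke2" is illegal in the starting position, so it is skipped.
    print(first_legal_move(["Ke2", "Nf3", "e4"], legal))
```

Returning `None` when every candidate is illegal leaves the fallback to the caller, for example resampling from the model or asking the human to choose again.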

Taxonomy

Topics: Artificial Intelligence in Games · Digital Games and Media · Sports Analytics and Performance

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Attention Is All You Need · Multi-Head Attention · Byte Pair Encoding · Label Smoothing · Dropout · Adam