Learning to Imitate with Less: Efficient Individual Behavior Modeling in Chess

Zhenwei Tang; Difan Jiao; Eric Xue; Reid McIlroy-Young; Jon Kleinberg; Siddhartha Sen; Ashton Anderson

arXiv:2507.21488·cs.AI·July 30, 2025

TL;DR

This paper introduces Maia4All, a novel framework that efficiently models individual human decision-making in chess with minimal data, significantly reducing data requirements from thousands to just twenty games.

Contribution

Maia4All presents a two-stage optimization approach that leverages prototypes and ability levels to personalize AI behavior modeling with limited data, advancing human-AI alignment in chess.
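The idea of starting a low-data player from a prototype can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes (hypothetically) that each prototype player has a learned embedding, and that we already know the probability each prototype-conditioned model assigned to the moves the new player actually made. The names `prob_by_prototype`, `prototype_embeddings`, and `pick_prototype`, the selection rule (lowest average negative log-likelihood), and all numbers are illustrative assumptions.

```python
import math

def mean_nll(probs):
    """Average negative log-likelihood of the moves the player actually made."""
    return -sum(math.log(p) for p in probs) / len(probs)

def pick_prototype(prob_by_prototype):
    """Return the prototype id whose predictions best fit the observed moves."""
    return min(prob_by_prototype, key=lambda pid: mean_nll(prob_by_prototype[pid]))

# Hypothetical data: probability each prototype-conditioned model gave to the
# new player's observed moves (four positions here, purely for illustration).
prob_by_prototype = {
    "proto_a": [0.50, 0.40, 0.60, 0.30],
    "proto_b": [0.20, 0.10, 0.30, 0.25],
}

# Hypothetical prototype embeddings (3-dimensional for readability).
prototype_embeddings = {
    "proto_a": [0.1, -0.3, 0.7],
    "proto_b": [0.9, 0.2, -0.4],
}

best = pick_prototype(prob_by_prototype)

# Copy the best-fitting prototype's embedding as the starting point; any
# subsequent fine-tuning on the player's few games is outside this sketch.
new_player_embedding = list(prototype_embeddings[best])

# Perplexity of the chosen prototype on the observed moves (lower is better).
perplexity = math.exp(mean_nll(prob_by_prototype[best]))
```

Under these assumptions, the sketch covers only the initialization step. How Maia4All actually matches players to prototypes and uses ability levels is described in the paper itself, not here.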

Findings

01. Accurately predicts individual chess moves with only 20 games

02. Reduces data requirement from 5,000 to 20 games

03. Demonstrates applicability to personalized LLMs

Abstract

As humans seek to collaborate with, learn from, and better understand artificial intelligence systems, developing AIs that can accurately emulate individual decision-making becomes increasingly important. Chess, a long-standing AI benchmark with precise skill measurement, offers an ideal testbed for human-AI alignment. However, existing approaches to modeling human behavior require prohibitively large amounts of data from each individual, making them impractical for new or sparsely represented users. In this work, we introduce Maia4All, a framework designed to learn and adapt to individual decision-making styles efficiently, even with limited data. Maia4All achieves this through a two-stage optimization process: (1) an enrichment step, which bridges population and individual-level human behavior modeling with a prototype-enriched model, and (2) a democratization step, which leverages…

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

- I have not seen the idea of matching embeddings of (low-resource) players with other prototypical players, which is quite interesting.
- I like the ablation of strength and prototype-initialization without fine-tuning.

Weaknesses

- The paper is not sufficiently well-written. There are many things unclear to me as a reader. For example:
  - How does the model use the population/individual embeddings?
  - Since you want to predict next moves, why do you need a value head and auxiliary losses?
  - It is difficult for me to understand what’s going on in Figure 1.
- A more fundamental issue with the work is its generality. This paper proposes an initialization technique to solve a problem that is very specific (low-reso

Reviewer 02 · Rating 3 · Confidence 3

Strengths

This paper addresses the challenge of behavior modeling in a realistic setting with limited player game records. Existing methods for modeling human behavior require extensive data from each individual, limiting their practical use. The motivation and areas for improvements are clearly outlined.

Weaknesses

1. Overall, I find certain aspects of the methodology presentation unclear. The study builds on Maia-2, but several settings from Maia-2 need further explanation. For example, in Section 3.1, population embeddings are introduced, but there is no explanation of how Maia-2's predictions are adjusted solely by varying these embeddings. A brief overview of how Maia-2 works with population embeddings would be helpful. Additionally, in Section 3.2, more detail is needed on the population embeddings an

Reviewer 03 · Rating 5 · Confidence 3

Strengths

- The paper introduces a novel application of few-shot learning in chess. The two-stage fine-tuning approach is particularly innovative, allowing the model to leverage rich data from prototype players while efficiently adapting to low-data individuals.
- The experimental design is rigorous, with clear evaluation metrics such as move prediction accuracy and perplexity across various data scarcity settings.
- Very well written and easy to follow: The paper includes step-by-step explanations and vi

Weaknesses

- This paper exclusively studies behavior modeling in the context of chess, raising concerns about whether the proposed methods can generalize effectively to other domains. While the results show that the authors have clearly improved behavior modeling in chess, especially when player data is scarce, the authors haven't provided thorough evidence that these methods can improve behavior modeling in other domains, which limits the impact of this work. Concretely, the authors say "Our work prov
