Tracking vs. Deciding: The Dual-Capability Bottleneck in Searchless Chess Transformers
Quanhao Li, Wei Jiang

TL;DR
This paper studies two capabilities that a searchless chess transformer must learn from move sequences alone, state tracking and decision-making, and improves them with model scaling and Elo-weighted training respectively.
Contribution
The paper formalizes the dual-capability bottleneck in chess models, then uses model scaling to improve state tracking and Elo-weighted training to improve decision quality.
Findings
Scaling the model improves state tracking.
Elo-weighted training improves decision quality.
The combined approach yields superadditive performance gains.
Abstract
A human-like chess engine should mimic the style, errors, and consistency of a strong human player rather than maximize playing strength. We show that training from move sequences alone forces a model to learn two capabilities: state tracking, which reconstructs the board from move history, and decision quality, which selects good moves from that reconstructed state. These impose contradictory data requirements: low-rated games provide the diversity needed for tracking, while high-rated games provide the quality signal for decision learning. Removing low-rated data degrades performance. We formalize this tension as a dual-capability bottleneck, P <= min(T,Q), where overall performance is limited by the weaker capability. Guided by this view, we scale the model from 28M to 120M parameters to improve tracking, then introduce Elo-weighted training to improve decisions while preserving…
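The bottleneck bound and the idea of Elo-weighted training can be illustrated with a minimal sketch. The abstract does not specify the weighting function, so the logistic form below and its `pivot`, `scale`, and `floor` parameters are hypothetical choices made only for illustration; the function names are likewise assumptions, not the authors' code. The floor reflects the abstract's point that low-rated games should stay in the training mix rather than be removed.

```python
import math


def bottleneck_bound(tracking: float, quality: float) -> float:
    """Upper bound on overall performance under P <= min(T, Q)."""
    return min(tracking, quality)


def elo_weight(elo: float, pivot: float = 1800.0,
               scale: float = 200.0, floor: float = 0.2) -> float:
    """Hypothetical per-game weight (not from the paper).

    A logistic curve in Elo, floored so that low-rated games keep a
    nonzero weight and still contribute diversity for state tracking.
    """
    s = 1.0 / (1.0 + math.exp(-(elo - pivot) / scale))
    return floor + (1.0 - floor) * s


def weighted_nll(move_probs, elos):
    """Elo-weighted mean negative log-likelihood of the played moves.

    move_probs[i] is the model's probability for the move actually
    played in sample i; elos[i] is the rating attached to that sample.
    """
    weights = [elo_weight(e) for e in elos]
    total = sum(w * -math.log(p) for w, p in zip(weights, move_probs))
    return total / sum(weights)
```

In this sketch a model with strong tracking but weak decisions, for example `bottleneck_bound(0.9, 0.6)`, is capped by the weaker score, and higher-rated samples pull harder on the loss without low-rated samples dropping to zero weight.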
