Learning to Plan via Supervised Contrastive Learning and Strategic Interpolation: A Chess Case Study
Andrew Hamara, Greg Hamerly, Pablo Rivas, Andrew C. Freeman

TL;DR
This paper introduces a novel approach to chess move planning using supervised contrastive learning to embed game states into a structured latent space, enabling effective move selection without deep search.
Contribution
It presents a new embedding-based planning method trained with supervised contrastive learning, demonstrating competitive performance with shallow search in chess.
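To make the training objective concrete, here is a minimal NumPy sketch of a supervised contrastive loss over one batch. It is not the authors' implementation: the function name `supcon_loss`, the temperature value, and the idea of deriving integer labels by bucketing positional evaluations are all assumptions made for illustration.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss for one batch (illustrative sketch).

    embeddings: (n, d) array of position embeddings, n >= 2.
    labels: (n,) integer class per position. ASSUMPTION: classes come from
            bucketing engine evaluations; the source does not specify this.
    """
    labels = np.asarray(labels)
    # L2-normalise so the dot product is cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)

    # Log of the softmax denominator over every other sample in the batch.
    sim_masked = np.where(not_self, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(
        np.exp(sim_masked - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = sim - log_denom

    # Positives: other samples that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0  # anchors without positives are skipped

    mean_log_prob = (
        np.where(positives, log_prob, 0.0).sum(axis=1)[has_pos]
        / pos_count[has_pos]
    )
    return float(-mean_log_prob.mean())

# Toy batch: two tight clusters in 2-D.
z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
aligned = supcon_loss(z, np.array([0, 0, 1, 1]))     # labels match clusters
misaligned = supcon_loss(z, np.array([0, 1, 0, 1]))  # labels cut across clusters
```

The loss is lower when positions with the same label already sit close together, which is the pressure that would organise the latent space by evaluation.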
Findings
The model achieves an estimated Elo rating of 2593 using only a 6-ply beam search.
Performance improves with larger models and higher embedding dimensions.
Embedding space allows interpretable visualization of game state transitions.
Abstract
Modern chess engines achieve superhuman performance through deep tree search and regressive evaluation, while human players rely on intuition to select candidate moves followed by a shallow search to validate them. To model this intuition-driven planning process, we train a transformer encoder using supervised contrastive learning to embed board states into a latent space structured by positional evaluation. In this space, distance reflects evaluative similarity, and visualized trajectories display interpretable transitions between game states. We demonstrate that move selection can occur entirely within this embedding space by advancing toward favorable regions, without relying on deep search. Despite using only a 6-ply beam search, our model achieves an estimated Elo rating of 2593. Performance improves with both model size and embedding dimensionality, suggesting that latent planning…
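The idea of selecting moves by advancing toward favorable regions of the embedding space can be sketched as a greedy, single-ply choice. This is a toy illustration under stated assumptions, not the paper's procedure: `pick_move`, the use of cosine similarity, the hand-written 2-D vectors, and the `favorable` target embedding are all hypothetical. The paper's 6-ply beam search is not reproduced here.

```python
import numpy as np

def pick_move(candidate_embeddings, target_embedding):
    """Pick the move whose resulting position embeds closest to the target.

    candidate_embeddings: dict mapping a move string to the embedding of the
        position reached after that move (ASSUMPTION: produced by some encoder).
    target_embedding: vector standing in for a "favorable" region of the space.
    Closeness is measured with cosine similarity (an assumption).
    """
    target = target_embedding / np.linalg.norm(target_embedding)
    best_move, best_score = None, -np.inf
    for move, emb in candidate_embeddings.items():
        score = float(emb @ target / np.linalg.norm(emb))
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score

# Hypothetical 2-D embeddings for three candidate moves.
candidates = {
    "e2e4": np.array([0.8, 0.6]),
    "g1f3": np.array([0.6, 0.8]),
    "a2a3": np.array([1.0, 0.0]),
}
favorable = np.array([0.0, 1.0])

move, score = pick_move(candidates, favorable)
print(move, round(score, 3))  # g1f3 0.8
```

A beam search would extend this by keeping the top few candidates at each ply and scoring the positions they lead to, rather than committing to the single best move immediately.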
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Robot Manipulation and Learning
Methods: Contrastive Learning · Focus
