Can Large Language Models Develop Strategic Reasoning? Post-training Insights from Learning Chess

Dongyoon Hwang; Hojoon Lee; Jaegul Choo; Dongmin Park; Jongho Park

arXiv:2507.00726 · cs.AI · August 29, 2025

Open Access

TL;DR

This paper investigates whether large language models can develop strategic reasoning in chess through reinforcement learning. Even with dense reward signals, all models plateau far below expert level, which the authors attribute to a deficit in the pretrained models' internal understanding of chess.

Contribution

It uses a chess-pretrained action-value network to provide dense rewards on move quality during RL training of LLMs, and provides SFT and RL ablations that examine why strategic reasoning development remains limited.

Findings

01

Distillation-based dense rewards often outperform sparse binary rewards

02

All models plateau far below expert chess levels

03

Pretraining deficits limit strategic reasoning development

Abstract

While reinforcement learning (RL) for large language models (LLMs) has shown promise in mathematical reasoning, strategic reasoning for LLMs using RL remains largely unexplored. We investigate whether LLMs can develop strategic reasoning capabilities through RL in chess. To this end, we leverage a chess-pretrained action-value network to provide dense reward on the LLM's output move quality, which can be seen as a form of knowledge distillation. Our experiments show that our distillation-based dense rewards often outperform sparse binary rewards. However, surprisingly, all models plateau far below expert levels. We provide SFT and RL ablations on chess reasoning training and find evidence that this limitation stems from a deficit in the pretrained models' internal understanding of chess, a deficit that RL alone may not be able to fully overcome. The code is available at…
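
To make the contrast between the two reward types in the abstract concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the dictionary of per-move values standing in for the action-value network's output, and the choice to give illegal or unparseable moves a reward of 0 are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical stand-in for the chess-pretrained action-value network:
# a mapping from each legal move (UCI string) to an estimated value in [0, 1].
QValues = Dict[str, float]


def sparse_reward(q_values: QValues, move: str) -> float:
    """Binary reward: 1.0 only if the move equals the critic's top-ranked move."""
    if move not in q_values:  # illegal or unparseable move (assumed to score 0)
        return 0.0
    best_move = max(q_values, key=q_values.get)
    return 1.0 if move == best_move else 0.0


def dense_reward(q_values: QValues, move: str) -> float:
    """Dense reward: the critic's value estimate for the chosen move itself."""
    return q_values.get(move, 0.0)  # illegal moves again assumed to score 0


# Made-up values for a single position, for illustration only.
q = {"e2e4": 0.55, "d2d4": 0.54, "g1f3": 0.52, "a2a3": 0.40}

for candidate in ["e2e4", "d2d4", "a2a3", "e2e5"]:
    print(candidate, sparse_reward(q, candidate), dense_reward(q, candidate))
```

In this sketch a near-best move such as `d2d4` gets no credit under the sparse reward but almost full credit under the dense one, which is the sense in which a dense signal carries more information about move quality per sample.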

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Explainable Artificial Intelligence (XAI)