Xiangqi-R1: Enhancing Spatial Strategic Reasoning in LLMs for Chinese Chess via Reinforcement Learning
Yuhao Chen, Shuochen Liu, Yuanjie Lyu, Chao Zhang, Jiayao Shi, Tong Xu

TL;DR
This paper introduces Xiangqi-R1, a large language model trained with a specialized dataset and reinforcement learning to improve spatial strategic reasoning in Chinese Chess, demonstrating significant performance gains over general-purpose LLMs.
Contribution
The paper presents a novel training framework and a 7B-parameter model specifically designed for Chinese Chess, enhancing strategic reasoning capabilities in LLMs.
Findings
Xiangqi-R1 achieves 18% higher move legality than general-purpose LLMs
Xiangqi-R1 improves analysis accuracy by 22% over general-purpose LLMs
Specialized training boosts LLM performance in complex spatial games
Abstract
Game playing has long served as a fundamental benchmark for evaluating Artificial General Intelligence. While Large Language Models (LLMs) have demonstrated impressive capabilities in general reasoning, their effectiveness in spatial strategic reasoning, which is critical for complex and fully observable board games, remains insufficiently explored. In this work, we adopt Chinese Chess (Xiangqi) as a challenging and rich testbed due to its intricate rules and spatial complexity. To advance LLMs' strategic competence in such environments, we propose a training framework tailored to Xiangqi, built upon a large-scale dataset of five million board-move pairs enhanced with expert annotations and engine evaluations. Building on this foundation, we introduce Xiangqi-R1, a 7B-parameter model trained in a multi-stage manner. Our experimental results indicate that, despite their size and power,…
Taxonomy
Topics: Sports Analytics and Performance · Artificial Intelligence in Games · Reinforcement Learning in Robotics
