Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go
Yichuan Ma, Linyang Li, Yongkang Chen, Peiji Li, Jiasheng Ye, Qipeng Guo, Dahua Lin, Kai Chen

TL;DR
This paper introduces LoGos, a large language model fine-tuned with Go expertise and reasoning data, achieving human professional-level Go gameplay and demonstrating the integration of expert domain knowledge with general reasoning in LLMs.
Contribution
The paper presents a novel fine-tuning and reinforcement learning approach to incorporate expert Go knowledge into LLMs, enabling high-level Go gameplay performance.
Findings
LoGos matches human professional Go players in performance.
LoGos surpasses all existing LLMs in Go gameplay.
LoGos is the first LLM to reach human professional level in Go.
Abstract
Large language models (LLMs) have demonstrated exceptional performance in reasoning tasks such as mathematics and coding, matching or surpassing human capabilities. However, these impressive reasoning abilities face significant challenges in specialized domains. Taking Go as an example, although AlphaGo has established the high performance ceiling of AI systems in Go, mainstream LLMs still struggle to reach even beginner-level proficiency, let alone perform natural language reasoning. This performance gap between general-purpose LLMs and domain experts is significantly limiting the application of LLMs on a wider range of domain-specific tasks. In this work, we aim to bridge the divide between LLMs' general reasoning capabilities and expert knowledge in domain-specific tasks. We perform mixed fine-tuning with structured Go expertise and general long Chain-of-Thought (CoT) reasoning data…
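The "mixed fine-tuning" mentioned in the abstract can be pictured as building one training set out of two sources. The following is only a minimal sketch under stated assumptions, not the authors' pipeline: the function name `mix_datasets`, the `go_fraction` knob, the record layout, and the 50/50 ratio are all hypothetical, since the abstract does not give the actual mixing recipe.

```python
import random


def mix_datasets(go_examples, cot_examples, go_fraction=0.5, seed=0):
    """Combine two example lists into one shuffled fine-tuning set.

    Hypothetical helper: `go_fraction` is an assumed knob for the target
    share of Go examples in the mix; the paper's real ratio is not stated
    in the abstract.
    """
    if not 0.0 < go_fraction < 1.0:
        raise ValueError("go_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    # Largest total size both sources can supply at the requested ratio.
    total = min(len(go_examples) / go_fraction,
                len(cot_examples) / (1.0 - go_fraction))
    n_go = min(int(total * go_fraction), len(go_examples))
    n_cot = min(int(total * (1.0 - go_fraction)), len(cot_examples))
    # Sample without replacement from each source, then shuffle together
    # so that batches contain both kinds of example.
    mixed = rng.sample(go_examples, n_go) + rng.sample(cot_examples, n_cot)
    rng.shuffle(mixed)
    return mixed


# Toy records standing in for structured Go data and long CoT data.
go_data = [{"source": "go", "text": f"go-{i}"} for i in range(100)]
cot_data = [{"source": "cot", "text": f"cot-{i}"} for i in range(50)]
mixed_data = mix_datasets(go_data, cot_data, go_fraction=0.5)
```

Shuffling both sources into a single stream, rather than training on them one after the other, is a common way to keep general reasoning ability from degrading while domain data is added; whether LoGos does exactly this is an assumption here.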
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)
