PGD: A Large-scale Professional Go Dataset for Data-driven Analytics
Yifan Gao

TL;DR
The paper introduces PGD, a comprehensive large-scale dataset of professional Go games with detailed meta-information and AI-evaluated move analysis, enabling advanced data-driven analytics and benchmarking in Go.
Contribution
It provides the first extensive professional Go dataset with detailed annotations and AI analysis, facilitating research and benchmarking in data-driven Go analytics.
Findings
Achieved 75.30% accuracy in game outcome prediction.
Created a dataset with 98,043 games and detailed meta-information.
Outperformed state-of-the-art approaches in predictive accuracy.
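As a minimal sketch of what a prediction accuracy figure like the one reported above measures: assuming each game record stores its winner and a model emits one predicted winner per game, accuracy is the fraction of games where the two agree. The `GameRecord` fields and the `prediction_accuracy` helper are hypothetical names chosen for illustration; they are not PGD's actual schema or the paper's prediction method.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GameRecord:
    """Hypothetical game record; field names are illustrative, not PGD's schema."""
    black_player: str
    white_player: str
    year: int
    winner: str  # "B" if Black won, "W" if White won


def prediction_accuracy(games: Sequence[GameRecord], predicted: Sequence[str]) -> float:
    """Return the fraction of games whose predicted winner matches the recorded one."""
    if len(games) != len(predicted):
        raise ValueError("games and predicted must have the same length")
    if not games:
        raise ValueError("at least one game is required")
    correct = sum(1 for game, guess in zip(games, predicted) if game.winner == guess)
    return correct / len(games)


# Toy data with placeholder players; these are not real game results.
toy_games = [
    GameRecord("Player A", "Player B", 2001, "B"),
    GameRecord("Player C", "Player A", 2005, "W"),
    GameRecord("Player B", "Player C", 2010, "B"),
    GameRecord("Player A", "Player C", 2015, "W"),
]
toy_predictions = ["B", "W", "W", "W"]  # the third prediction is wrong
toy_accuracy = prediction_accuracy(toy_games, toy_predictions)
```

Here three of the four toy predictions match, so `toy_accuracy` is 0.75. A real evaluation would compute the same ratio on a held-out split of the dataset.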
Abstract
Lee Sedol is on a winning streak--does this legend rise again after the competition with AlphaGo? Ke Jie is invincible in the world championship--can he still win the title this time? Go is one of the most popular board games in East Asia, with a stable professional sports system that has lasted for decades in China, Japan, and Korea. There are mature data-driven analysis technologies for many sports, such as soccer, basketball, and esports. However, developing such technology for Go remains nontrivial and challenging due to the lack of datasets, meta-information, and in-game statistics. This paper creates the Professional Go Dataset (PGD), containing 98,043 games played by 2,148 professional players from 1950 to 2021. After manual cleaning and labeling, we provide detailed meta-information for each player, game, and tournament. Moreover, the dataset includes analysis results for each…
Taxonomy
Topics: Sports Analytics and Performance · Video Analysis and Summarization · Educational Games and Gamification
