SAI, a Sensible Artificial Intelligence that plays Go

Francesco Morandin, Gianluca Amato, Rosa Gini, Carlo Metta, Maurizio Parton, and Gian-Carlo Pascutto

arXiv:1809.03928 · cs.AI · November 28, 2019

TL;DR

This paper introduces SAI, a Go AI that models winrates across different komi values using a sigmoid function, enhancing self-play training and score estimation on 7x7 Go.

Contribution

It presents a novel multiple-komi approach with sigmoid modeling, improving reinforcement learning efficiency and score estimation in Go AI.

Findings

01 Achieved strong playing agents on 7x7 Go.

02 Successfully modeled winrate as a function of komi with a sigmoid.

03 Enabled score difference estimation and game decisiveness evaluation.

Abstract

We propose a multiple-komi modification of the AlphaGo Zero/Leela Zero paradigm. The winrate as a function of the komi is modeled with a two-parameter sigmoid function, so that the neural network needs to predict just one more variable to assess the winrate for all komi values. A second novel feature is that training is based on self-play games that occasionally branch, with changed komi, when the position is uneven. With this setting, reinforcement learning is shown to work on 7x7 Go, obtaining very strong playing agents. As a useful byproduct, the sigmoid parameters given by the network make it possible to estimate the score difference on the board and to evaluate how decided the game is.
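To make the two-parameter sigmoid concrete, here is a minimal sketch in Python. The abstract does not give the exact parameterization, so the functional form, the sign convention (winrate from Black's point of view, decreasing in komi), and the names `alpha`, `beta`, and `winrate` are assumptions made for illustration, not the authors' code. Under this assumed form, `alpha` is the komi at which the game is even (a score-difference estimate) and `beta` controls how sharply the winrate changes around that point (a measure of how decided the game is).

```python
import math


def winrate(komi: float, alpha: float, beta: float) -> float:
    """Assumed Black winrate for a given komi (hypothetical parameterization).

    alpha: komi at which the winrate is 50% (score-difference estimate).
    beta:  steepness of the curve; larger means the outcome is more decided.
    """
    x = beta * (alpha - komi)
    # Numerically stable logistic function: avoid exp() of a large positive number.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Illustrative values only: Black is assumed to be ahead by about 2 points.
    alpha, beta = 2.0, 1.5
    for komi in (0.0, 2.0, 4.0):
        print(f"komi={komi:4.1f}  winrate={winrate(komi, alpha, beta):.3f}")
```

With a single pair `(alpha, beta)` per position, the winrate can be evaluated for any komi without querying the network again, which is the point of predicting the two sigmoid parameters rather than a single winrate value.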

Code & Models

Repositories

sai-dev/sai (Official)
