Brick Tic-Tac-Toe: Exploring the Generalizability of AlphaZero to Novel Test Environments

John Tan Chong Min; Mehul Motani

arXiv:2207.05991·cs.LG·July 15, 2022

TL;DR

This paper introduces the Brick Tic-Tac-Toe (BTTT) test bed for evaluating how well reinforcement learning algorithms generalize. It finds that traditional state-search methods such as MCTS and Minimax generalize better than AlphaZero to novel test environments, and that training on more diverse environments improves AlphaZero's generalization.

Contribution

The paper presents a new test environment for assessing RL generalization and compares AlphaZero with traditional methods, highlighting the importance of training environment diversity.

Findings

01. MCTS generalizes to novel test environments better than AlphaZero does.

02. Increasing training environment diversity improves AlphaZero's generalization.

03. AlphaZero's performance degrades when the test environment differs from the training environment, despite its superhuman results in other games.

Abstract

Traditional reinforcement learning (RL) environments typically are the same for both the training and testing phases. Hence, current RL methods are largely not generalizable to a test environment which is conceptually similar but different from what the method has been trained on, which we term the novel test environment. As an effort to push RL research towards algorithms which can generalize to novel test environments, we introduce the Brick Tic-Tac-Toe (BTTT) test bed, where the brick position in the test environment is different from that in the training environment. Using a round-robin tournament on the BTTT environment, we show that traditional RL state-search approaches such as Monte Carlo Tree Search (MCTS) and Minimax are more generalizable to novel test environments than AlphaZero is. This is surprising because AlphaZero has been shown to achieve superhuman performance in…
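The evaluation setup described in the abstract (a board with a brick whose position changes between training and testing, and a round-robin tournament between agents) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the 3x3 board, the 3-in-a-row win condition, the rule that the brick simply blocks one cell for both players, the scoring (1 for a win, 0.5 for a draw), and the names `play_game`, `round_robin`, `random_agent`, and `first_free_agent` are all assumptions made for this example. The two placeholder agents stand in for MCTS, Minimax, or AlphaZero players.

```python
import itertools
import random


def winner(board, n, k):
    """Return 'X' or 'O' if that mark has k in a row, else None."""
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(n):
        for c in range(n):
            mark = board[r][c]
            if mark not in ("X", "O"):
                continue  # empty cells and the brick never form a line
            for dr, dc in directions:
                end_r, end_c = r + (k - 1) * dr, c + (k - 1) * dc
                if not (0 <= end_r < n and 0 <= end_c < n):
                    continue
                if all(board[r + i * dr][c + i * dc] == mark for i in range(k)):
                    return mark
    return None


def play_game(first, second, brick, n=3, k=3):
    """Play one game. Return 0 if `first` wins, 1 if `second` wins, None for a draw."""
    board = [["." for _ in range(n)] for _ in range(n)]
    board[brick[0]][brick[1]] = "#"  # assumed rule: the brick blocks this cell
    agents, marks = (first, second), ("X", "O")
    for turn in range(n * n - 1):  # one cell is taken by the brick
        idx = turn % 2
        legal = [(r, c) for r in range(n) for c in range(n) if board[r][c] == "."]
        r, c = agents[idx](board, legal, marks[idx])
        board[r][c] = marks[idx]
        if winner(board, n, k) is not None:
            return idx  # only the player who just moved can have completed a line
    return None


def round_robin(agents, brick, games_per_pairing=10):
    """Every agent plays every other agent, as both first and second player."""
    scores = {name: 0.0 for name in agents}
    for a, b in itertools.permutations(agents, 2):
        for _ in range(games_per_pairing):
            result = play_game(agents[a], agents[b], brick)
            if result is None:
                scores[a] += 0.5
                scores[b] += 0.5
            else:
                scores[(a, b)[result]] += 1.0
    return scores


def random_agent(board, legal, mark):
    """Placeholder agent: picks a uniformly random legal cell."""
    return random.choice(legal)


def first_free_agent(board, legal, mark):
    """Placeholder agent: picks the first legal cell in row-major order."""
    return legal[0]


if __name__ == "__main__":
    players = {"random": random_agent, "first_free": first_free_agent}
    # Evaluating with a brick position the agents were not tuned for mimics
    # the "novel test environment" idea.
    print(round_robin(players, brick=(1, 1)))
    print(round_robin(players, brick=(0, 2)))
```

Passing a different `brick` coordinate at evaluation time than the one used during training is what turns the same game into a novel test environment in this sketch; a learned agent would be trained with one set of brick positions and scored here with another.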

Code & Models

Repositories

tanchongmin/bttt · tf · Official

Taxonomy

Topics: Sports Analytics and Performance · Artificial Intelligence in Games · Reinforcement Learning in Robotics

Methods: Test · AlphaZero