Brick Tic-Tac-Toe: Exploring the Generalizability of AlphaZero to Novel Test Environments
John Tan Chong Min, Mehul Motani

TL;DR
This paper introduces the Brick Tic-Tac-Toe (BTTT) test bed for evaluating how well reinforcement learning algorithms generalize. In a round-robin tournament, state-search methods such as MCTS and Minimax generalize better to novel test environments than AlphaZero, and more diverse training environments improve AlphaZero's generalization.
Contribution
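The paper presents a new test environment for assessing RL generalization, in which the brick position at test time differs from the one seen in training. It compares AlphaZero with state-search methods (MCTS and Minimax) and highlights the importance of training environment diversity.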
Findings
MCTS and Minimax generalize better than AlphaZero to novel test environments.
Increasing training environment diversity improves AlphaZero's generalization.
AlphaZero generalizes poorly when the test environment differs from its training environment, despite its success in other games.
Abstract
Traditional reinforcement learning (RL) environments typically are the same for both the training and testing phases. Hence, current RL methods are largely not generalizable to a test environment which is conceptually similar but different from what the method has been trained on, which we term the novel test environment. As an effort to push RL research towards algorithms which can generalize to novel test environments, we introduce the Brick Tic-Tac-Toe (BTTT) test bed, where the brick position in the test environment is different from that in the training environment. Using a round-robin tournament on the BTTT environment, we show that traditional RL state-search approaches such as Monte Carlo Tree Search (MCTS) and Minimax are more generalizable to novel test environments than AlphaZero is. This is surprising because AlphaZero has been shown to achieve superhuman performance in…
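The evaluation setup described in the abstract, a board with a brick whose position can change and a round-robin tournament between agents, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the 3x3 board, the three-in-a-row win rule, the scoring (1 for a win, 0.5 for a draw), and the two stand-in agents are all hypothetical and are not taken from the paper, which may use different board sizes, rules, and agents.

```python
import itertools
import random

# Assumed rules (not from the paper): 3x3 board, three in a row wins,
# and one cell is occupied by a brick that neither player can use.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def new_board(brick):
    """Return an empty board with a brick ('#') at the given cell index."""
    board = [None] * 9
    board[brick] = "#"
    return board


def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell is None]


def winner(board):
    """Return 'X' or 'O' if that player has a full line, else None."""
    for a, b, c in LINES:
        if board[a] in ("X", "O") and board[a] == board[b] == board[c]:
            return board[a]
    return None


def random_agent(board, mark, rng):
    """Stand-in agent: picks a uniformly random legal move."""
    return rng.choice(legal_moves(board))


def greedy_agent(board, mark, rng):
    """Stand-in agent: takes an immediate win if one exists, else random."""
    for move in legal_moves(board):
        board[move] = mark
        won = winner(board) == mark
        board[move] = None
        if won:
            return move
    return rng.choice(legal_moves(board))


def play(first, second, brick, rng):
    """Play one game; return 0 or 1 for the winning seat, None for a draw."""
    board = new_board(brick)
    players = [("X", first), ("O", second)]
    for turn in range(8):  # 8 free cells once the brick is placed
        mark, agent = players[turn % 2]
        board[agent(board, mark, rng)] = mark
        if winner(board):
            return turn % 2
    return None


def round_robin(agents, brick, games, seed=0):
    """Every ordered pair of agents plays `games` games on the same board."""
    rng = random.Random(seed)
    scores = {name: 0.0 for name in agents}
    for a, b in itertools.permutations(agents, 2):
        for _ in range(games):
            result = play(agents[a], agents[b], brick, rng)
            if result is None:
                scores[a] += 0.5
                scores[b] += 0.5
            else:
                scores[(a, b)[result]] += 1.0
    return scores


if __name__ == "__main__":
    agents = {"random": random_agent, "greedy": greedy_agent}
    # Changing `brick` mimics moving from a training to a novel test layout.
    for brick in (0, 4):
        print(brick, round_robin(agents, brick=brick, games=50))
```

Running the tournament with different `brick` values gives a simple way to compare how the same agents score on layouts they were or were not tuned for; a learned agent would replace the stand-ins here.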
Taxonomy
Topics: Sports Analytics and Performance · Artificial Intelligence in Games · Reinforcement Learning in Robotics
Methods: Test · AlphaZero
