# BoxesZero: An Efficient and Computationally Frugal Dots-and-Boxes Agent

**Authors:** Xuefen Niu, Qirui Liu, Wei Chen, Yujiao Zheng, Zhanggen Jin

PMC · DOI: 10.3390/e27030285 · Entropy · 2025-03-09

## TL;DR

BoxesZero is a computationally frugal Dots-and-Boxes agent that surpasses the strongest open-source agents, PRsboxes and DabbleBoxes, while training with limited GPU resources.

## Contribution

BoxesZero introduces backward training and extended endgame theorems to improve learning efficiency in Dots-and-Boxes.

## Key findings

- BoxesZero achieves an Elo rating comparable to AlphaZero's in significantly less time.
- BoxesZero won the 2024 Chinese Computer Game Competition in Dots-and-Boxes.
- Extended endgame theorems, which cover 1-chains and 2-chains in addition to long chains, accelerate Monte Carlo Tree Search.
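
For readers unfamiliar with rating comparisons, the sketch below shows how a rating gap translates into an expected score. It assumes the standard logistic Elo model with a 400-point scale; this summary does not say which rating variant the paper uses, and `expected_score` is a hypothetical helper name, not something taken from the paper.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B.

    Assumption: standard logistic Elo model with a 400-point scale.
    A score of 1.0 is a certain win, 0.5 is an even match.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Equal ratings give an even match; a 400-point lead gives roughly 10:1 odds.
    print(expected_score(1500, 1500))
    print(expected_score(1900, 1500))
```

Under this model, two agents with "comparable" ratings are expected to score close to 0.5 against each other.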

## Abstract

In recent years, deep reinforcement learning (DRL) has made significant progress in games. A prime example is AlphaZero, whose formidable capabilities come with demands for substantial computational resources that deter many from exploring its potential. In this paper, we introduce BoxesZero, a computationally frugal Dots-and-Boxes agent that achieves a high level of performance with relatively few computational resources. BoxesZero uses a novel training approach called “backward training”, which first trains on high-reward states near the end of the game and then gradually moves to earlier stages. It also incorporates Dots-and-Boxes domain knowledge, such as endgame theorems, to accelerate Monte Carlo Tree Search (MCTS). Furthermore, we extend the existing endgame theorems, which cover only long chains, to positions with 1-chains and 2-chains and provide the corresponding proofs; we refer to these as the extended endgame theorems. BoxesZero reaches a high level of playing strength much faster than AlphaZero, substantially improving learning efficiency. With carefully tuned parameters and limited GPU resources, BoxesZero surpasses the strongest open-source Dots-and-Boxes agents, PRsboxes and DabbleBoxes. Experimental results show that BoxesZero achieves an Elo rating comparable to AlphaZero's in significantly less time. Furthermore, BoxesZero won the championship in the Dots-and-Boxes category of the 2024 Chinese Computer Game Competition.
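
The backward-training idea in the abstract can be illustrated with a small curriculum sketch. This is not the authors' implementation: the stage schedule, the `window` and `step` parameters, the random sampling of partially filled boards, and all function names are assumptions made for illustration. The sketch only shows how the starting positions for self-play could move from late in the game toward the empty board.

```python
import random


def total_edges(rows: int, cols: int) -> int:
    """Number of drawable edges on a board of rows x cols boxes."""
    return rows * (cols + 1) + cols * (rows + 1)


def backward_schedule(n_edges: int, window: int, step: int) -> list[int]:
    """Pre-drawn edge counts for each training stage, latest stage first.

    Hypothetical schedule: the first stage starts `window` moves before the
    end of the game, and each later stage starts `step` moves earlier, until
    training finally starts from the empty board (0 pre-drawn edges).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    starts = list(range(n_edges - window, 0, -step))
    starts.append(0)  # the last stage always covers the full game
    return starts


def random_position(n_edges: int, drawn: int, rng: random.Random) -> set[int]:
    """Sample a set of `drawn` distinct edge indices as a starting position.

    Assumption: uniform sampling stands in for however the real agent
    generates late-game states; box ownership and scores are ignored here.
    """
    return set(rng.sample(range(n_edges), drawn))


if __name__ == "__main__":
    rng = random.Random(0)
    n = total_edges(5, 5)  # a board of 5 x 5 boxes
    for stage, drawn in enumerate(backward_schedule(n, window=10, step=20)):
        start = random_position(n, drawn, rng)
        # A real training loop would run self-play and network updates here.
        print(f"stage {stage}: start with {len(start)} of {n} edges drawn")
```

Starting near the end keeps early self-play games short and their outcomes close to the states being trained on, which is the intuition the abstract gives for the efficiency gain.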

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11941521/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11941521/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/PMC11941521/full.md

---
Source: https://tomesphere.com/paper/PMC11941521