An AlphaZero-Inspired Approach to Solving Search Problems

Evgeny Dantsin, Vladik Kreinovich, and Alexander Wolpert

arXiv:2207.00919·cs.AI·July 5, 2022·Decision Making Under Uncertainty and Constraints

Open Access

TL;DR

This paper explores adapting AlphaZero's reinforcement-learning approach to search problems such as SAT, proposing representations of such problems and a variant of Monte Carlo tree search tailored to them.

Contribution

It introduces methods to represent search problems for AlphaZero-inspired solvers and adapts Monte Carlo tree search for these problems.

Findings

01. Proposes representations of search problems, in terms of easy-instance solvers and self-reductions, for AlphaZero-inspired solvers

02. Describes a version of Monte Carlo tree search adapted for search problems

03. Gives examples of such representations for the satisfiability problem

Abstract

AlphaZero and its extension MuZero are computer programs that use machine-learning techniques to play at a superhuman level in chess, Go, and a few other games. They achieved this level of play solely with reinforcement learning from self-play, without any domain knowledge except the game rules. It is a natural idea to adapt the methods and techniques used in AlphaZero for solving search problems such as the Boolean satisfiability problem (in its search version). Given a search problem, how should it be represented for an AlphaZero-inspired solver? What are the "rules of solving" for this search problem? We describe possible representations in terms of easy-instance solvers and self-reductions, and we give examples of such representations for the satisfiability problem. We also describe a version of Monte Carlo tree search adapted for search problems.
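To make the phrase "easy-instance solvers and self-reductions" more concrete, here is a minimal Python sketch for satisfiability. It is an illustration only and is not taken from the paper: the names easy_solve, self_reduce, and solve are hypothetical, the choice of "easy" instances (no clauses left, or an empty clause present) is an assumption, and plain backtracking stands in where a learned policy and Monte Carlo tree search would choose the next reduction.

```python
from typing import Dict, List, Optional

Clause = List[int]            # DIMACS-style literals: 3 means x3, -3 means NOT x3
Formula = List[Clause]
Assignment = Dict[int, bool]


def easy_solve(formula: Formula) -> Optional[str]:
    """Hypothetical easy-instance solver: decides only trivial formulas."""
    if not formula:
        return "sat"          # no clauses left: any assignment satisfies it
    if any(len(clause) == 0 for clause in formula):
        return "unsat"        # an empty clause can never be satisfied
    return None               # not an easy instance


def self_reduce(formula: Formula, var: int, value: bool) -> Formula:
    """Self-reduction step: fix var=value and simplify the formula."""
    true_lit = var if value else -var
    reduced = []
    for clause in formula:
        if true_lit in clause:
            continue          # clause already satisfied, drop it
        # remove the literal that became false under this choice
        reduced.append([lit for lit in clause if lit != -true_lit])
    return reduced


def solve(formula: Formula) -> Optional[Assignment]:
    """Backtracking over self-reductions (assumed stand-in for guided search)."""
    verdict = easy_solve(formula)
    if verdict == "sat":
        return {}
    if verdict == "unsat":
        return None
    var = abs(formula[0][0])  # naive choice of the next variable to fix
    for value in (True, False):
        partial = solve(self_reduce(formula, var, value))
        if partial is not None:
            partial[var] = value
            return partial
    return None               # both branches failed


if __name__ == "__main__":
    print(solve([[1, 2], [-1, 3], [-2, -3]]))
```

In this sketch each call to self_reduce plays the role of a "move", and easy_solve decides when the "game" is over. The returned assignment may be partial: variables that never had to be fixed can take either value.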

Taxonomy

Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Reinforcement Learning in Robotics

Methods: Batch Normalization · Residual Connection · Residual Block · Prioritized Experience Replay · Monte-Carlo Tree Search · Convolution · Average Pooling · MuZero · AlphaZero