Left Heavy Tails and the Effectiveness of the Policy and Value Networks in DNN-based best-first search for Sokoban Planning

Dieqiao Feng; Carla Gomes; Bart Selman

arXiv:2206.14298·cs.AI·June 30, 2022

TL;DR

This paper investigates how deep neural network heuristics influence best-first search in Sokoban, revealing the surprising effectiveness of policy networks, the existence of heavy-tailed runtime distributions, and the benefits of random restarts.

Contribution

The paper demonstrates the critical role of policy and value networks in guiding search, introduces the concept of left heavy tails in runtime distributions, and proposes an abstract model that explains these phenomena.

Findings

01 Policy networks significantly improve search efficiency.

02 Heavy-tailed runtime distributions, including left heavy tails, are observed in Sokoban.

03 Random restarts help mitigate heavy-tailed search times.
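
The last finding above can be illustrated with a small sketch of a restart strategy: abandon any randomized run that exceeds a fixed cutoff and start over with a fresh random seed. Everything here is an assumption made for illustration and is not taken from the paper: the names `run_with_restarts`, `solve_once`, and `toy_run_cost`, the fixed-cutoff policy, and the Pareto-distributed toy cost model standing in for a real Sokoban search.

```python
import random


def run_with_restarts(solve_once, cutoff, max_restarts, rng):
    """Restart a randomized search whenever a run exceeds `cutoff` expansions.

    `solve_once(rng)` is a hypothetical callback that returns the number of
    node expansions one randomized run would need to finish.
    Returns (total_expansions, restarts_used), or (None, max_restarts) if
    every run hit the cutoff.
    """
    total = 0
    for restarts_used in range(max_restarts):
        cost = solve_once(rng)
        if cost <= cutoff:
            return total + cost, restarts_used
        total += cutoff  # the abandoned run still consumed `cutoff` expansions
    return None, max_restarts


def toy_run_cost(rng, alpha=0.8, scale=10):
    """Toy heavy-tailed cost (an assumption): Pareto with shape alpha < 1."""
    return int(scale * rng.paretovariate(alpha))


if __name__ == "__main__":
    rng = random.Random(0)
    print(run_with_restarts(toy_run_cost, cutoff=100, max_restarts=50, rng=rng))
```

In this toy model most runs finish quickly but a few are extremely long, so capping each run bounds the damage of an unlucky run at the price of occasionally discarding partial work. The right cutoff for a real solver would have to be chosen empirically.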

Abstract

Despite the success of practical solvers in various NP-complete domains such as SAT and CSP, and of deep reinforcement learning in two-player games such as Go, certain classes of PSPACE-hard planning problems have remained out of reach. Even carefully designed domain-specialized solvers can fail quickly due to the exponential search space on hard instances. Recent works that combine traditional search methods, such as best-first search and Monte Carlo tree search, with deep neural network (DNN) heuristics have shown promising progress and can solve a significant number of hard planning instances beyond the reach of specialized solvers. To better understand why these approaches work, we study the interplay of the policy and value networks of DNN-based best-first search on Sokoban and show the surprising effectiveness of the policy network, further enhanced by the value network, as…

Taxonomy

Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Reinforcement Learning in Robotics