Policy-Guided Heuristic Search with Guarantees

Laurent Orseau; Levi H. S. Lelis

arXiv:2103.11505 · cs.AI · March 23, 2021 · 1 citation

TL;DR

This paper introduces Policy-guided Heuristic Search (PHS), a search algorithm for single-agent deterministic problems that is guided by both a policy and a heuristic function and comes with guarantees on its search loss (e.g., the number of search steps).

Contribution

PHS uses a policy and a heuristic function together, provides theoretical guarantees on the search loss, and in the reported experiments outperforms the baseline search algorithms it is compared against.

Findings

1. PHS outperforms A*, Weighted A*, Greedy Best-First Search, LevinTS, and PUCT.

2. PHS enables rapid learning of policies and heuristics.

3. PHS solves more problems faster across multiple domains.

Abstract

The use of a policy and a heuristic function for guiding search can be quite effective in adversarial problems, as demonstrated by AlphaGo and its successors, which are based on the PUCT search algorithm. While PUCT can also be used to solve single-agent deterministic problems, it lacks guarantees on its search effort and it can be computationally inefficient in practice. Combining the A* algorithm with a learned heuristic function tends to work better in these domains, but A* and its variants do not use a policy. Moreover, the purpose of using A* is to find solutions of minimum cost, while we seek instead to minimize the search loss (e.g., the number of search steps). LevinTS is guided by a policy and provides guarantees on the number of search steps that relate to the quality of the policy, but it does not make use of a heuristic function. In this work we introduce Policy-guided…
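To make the idea of policy-guided search with a step-count objective concrete, here is a minimal Python sketch of a best-first tree search whose priority depends only on a policy. It is not PHS and it is not the authors' code: the priority used here, (depth + 1) divided by the probability the policy assigns to the path, is an assumption modeled on LevinTS and is not stated in the abstract. The names `policy_guided_search`, `toy_successors`, and `toy_policy` are hypothetical, and no heuristic function is used.

```python
import heapq
from itertools import count


def policy_guided_search(start, is_goal, successors, policy):
    """Best-first tree search ordered by (depth + 1) / path_probability.

    successors(state) -> list of (action, next_state) pairs
    policy(state)     -> dict mapping action -> probability
    Returns (path, expansions); path is None if the frontier empties.
    The priority is an assumed LevinTS-style cost, not the PHS formula.
    """
    tie = count()  # tie-breaker so the heap never compares states
    # Entry layout: (cost, tie, state, depth, path_probability, path)
    frontier = [(1.0, next(tie), start, 0, 1.0, [start])]
    expansions = 0
    while frontier:
        _, _, state, depth, prob, path = heapq.heappop(frontier)
        expansions += 1
        if is_goal(state):
            return path, expansions
        action_probs = policy(state)
        for action, nxt in successors(state):
            p = prob * action_probs.get(action, 0.0)
            if p <= 0.0:
                continue  # the policy rules this action out entirely
            cost = (depth + 2) / p  # the child sits at depth + 1
            heapq.heappush(
                frontier, (cost, next(tie), nxt, depth + 1, p, path + [nxt])
            )
    return None, expansions


# Toy problem (hypothetical): walk along the integers from 0 to 3.
def toy_successors(state):
    return [("right", state + 1), ("left", state - 1)]


def toy_policy(state):
    # A policy that mostly prefers moving toward the goal.
    return {"right": 0.8, "left": 0.2}


path, expansions = policy_guided_search(
    0, lambda s: s == 3, toy_successors, toy_policy
)
print(path, expansions)
```

In this toy run the policy favors the correct action, so the search pops only the nodes on the direct path. A worse policy would lower the path probability, raise the priority cost, and let more off-path nodes be expanded first. The sketch does no duplicate detection, so it is a tree search and terminates only when a goal is reachable with nonzero probability.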

Code & Models

Repositories

levilelis/h-levin (official)

Videos

Policy-Guided Heuristic Search with Guarantees · underline

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms