Subgoal-Guided Policy Heuristic Search with Learned Subgoals
Jake Tuero, Michael Buro, Levi H. S. Lelis

TL;DR
This paper introduces a subgoal-guided policy learning method that improves the efficiency of policy tree search algorithms by utilizing search trees from both successful and failed attempts to learn better subgoal policies.
Contribution
It proposes a novel approach to learn subgoal-based policies from search trees, including failed attempts, enhancing sample efficiency in policy training.
Findings
Improved sample efficiency in learning policies and heuristics.
Effective use of search trees from failed attempts.
Enhanced performance in policy-guided search tasks.
Abstract
Policy tree search is a family of tree search algorithms that use a policy to guide the search. These algorithms provide guarantees on the number of expansions required to solve a given problem that are based on the quality of the policy. While these algorithms have shown promising results, the process in which they are trained requires complete solution trajectories to train the policy. Search trajectories are obtained during a trial-and-error search process. When the training problem instances are hard, learning can be prohibitively costly, especially when starting from a randomly initialized policy. As a result, search samples are wasted in failed attempts to solve these hard instances. This paper introduces a novel method for learning subgoal-based policies for policy tree search algorithms. The subgoals and policies conditioned on subgoals are learned from the trees that the search…
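To make the idea of a policy-guided search concrete, here is a minimal sketch of one well-known member of the family, Levin tree search (LevinTS). It expands nodes in best-first order of (depth + 1) / π(path), where π(path) is the product of the policy's action probabilities along the path. The abstract does not say which algorithm the paper builds on, so this choice is an assumption; the toy number-line domain and the names `levin_tree_search`, `successors`, `uniform_policy`, and `good_policy` are hypothetical and not taken from the paper. The sketch covers only the search, not the subgoal learning the paper proposes.

```python
import heapq
import itertools


def levin_tree_search(start, is_goal, successors, policy, budget=10_000):
    """Best-first tree search ordered by (depth + 1) / pi(path).

    successors(state) -> list of (action, next_state)
    policy(state)     -> dict mapping action -> probability
    Returns (actions, expansions); actions is None if the budget runs out.
    """
    tie = itertools.count()  # unique tie-breaker so the heap never compares states
    # Heap entries: (cost, tie, depth, path_probability, state, actions)
    frontier = [(1.0, next(tie), 0, 1.0, start, [])]
    expansions = 0
    while frontier and expansions < budget:
        _, _, depth, prob, state, actions = heapq.heappop(frontier)
        if is_goal(state):
            return actions, expansions
        expansions += 1
        probs = policy(state)
        for action, nxt in successors(state):
            p = probs.get(action, 0.0)
            if p <= 0.0:
                continue  # the policy rules this action out entirely
            child_prob = prob * p
            child_depth = depth + 1
            cost = (child_depth + 1) / child_prob
            heapq.heappush(
                frontier,
                (cost, next(tie), child_depth, child_prob, nxt, actions + [action]),
            )
    return None, expansions


# Hypothetical toy domain: walk along the integers from 0 to 3.
def successors(state):
    return [("+1", state + 1), ("-1", state - 1)]


def is_goal(state):
    return state == 3


def uniform_policy(state):
    return {"+1": 0.5, "-1": 0.5}  # stands in for an untrained policy


def good_policy(state):
    return {"+1": 0.9, "-1": 0.1}  # stands in for a well-trained policy


uniform_path, uniform_expansions = levin_tree_search(0, is_goal, successors, uniform_policy)
good_path, good_expansions = levin_tree_search(0, is_goal, successors, good_policy)
print(uniform_path, uniform_expansions)
print(good_path, good_expansions)
```

Both runs find the same three-step solution, but the sharper policy needs fewer expansions. This illustrates the point made in the abstract: the search cost depends on the quality of the policy, so a randomly initialized policy can make hard training instances prohibitively expensive to solve.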
Taxonomy
Topics: Machine Learning and Data Classification · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning
