A Training Data Recipe to Accelerate A* Search with Language Models
Devaansh Gupta, Boyang Li

TL;DR
This paper proposes a data-selection method for training language models as A* heuristics, yielding up to 15x fewer search iterations and up to 5x faster wall-clock search on classical planning problems.
Contribution
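It introduces a data-selection distribution for training LLM-based heuristics that favours search nodes near the goal, the nodes on which A* most needs accurate predictions and from which the LLM generalises best, reducing both training data and search cost.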
Findings
Up to 15x reduction in search iterations.
Up to 5x speed-up in wall-clock search time.
Effective heuristic learning across multiple classical planning domains.
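To make the "search iterations" metric concrete, here is a minimal sketch of A* that counts node expansions under two heuristics. The 8x8 grid, the `astar` and `grid_neighbors` helpers, and the Manhattan-distance heuristic are illustrative assumptions: the Manhattan heuristic merely stands in for a learned LLM heuristic, and the grid is a toy problem, not one of the paper's planning domains.

```python
import heapq

def astar(start, goal, neighbors, heuristic):
    """Return (path_cost, iterations); iterations counts node expansions."""
    h0 = heuristic(start)
    # Heap entries are (f, h, g, node); ties on f prefer nodes closer to the goal.
    open_heap = [(h0, h0, 0, start)]
    best_g = {start: 0}
    iterations = 0
    while open_heap:
        _, _, g, node = heapq.heappop(open_heap)
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found later
        iterations += 1
        if node == goal:
            return g, iterations
        for nxt, step_cost in neighbors(node):
            new_g = g + step_cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                h = heuristic(nxt)
                heapq.heappush(open_heap, (new_g + h, h, new_g, nxt))
    return None, iterations

def grid_neighbors(size):
    """4-connected moves with unit cost on an empty size x size grid (toy domain)."""
    def neighbors(node):
        x, y = node
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                yield (nx, ny), 1
    return neighbors

start, goal = (0, 0), (7, 7)

def blind(node):
    return 0  # no guidance: A* degenerates to uniform-cost search

def manhattan(node):
    # Stand-in for a learned heuristic; admissible on this grid.
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

cost_blind, iters_blind = astar(start, goal, grid_neighbors(8), blind)
cost_informed, iters_informed = astar(start, goal, grid_neighbors(8), manhattan)
print(f"blind: cost={cost_blind}, iterations={iters_blind}")
print(f"informed: cost={cost_informed}, iterations={iters_informed}")
```

Both runs return the same optimal cost, but the informed heuristic expands fewer nodes. A reduction in this expansion count is the kind of quantity the "fewer iterations" finding refers to.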
Abstract
Combining Large Language Models (LLMs) with heuristic search algorithms like A* holds the promise of enhanced LLM reasoning and scalable inference. To accelerate training and reduce computational demands, we investigate the coreset selection problem for the training data of LLM heuristic learning. Few methods for learning heuristic functions consider the interaction between the search algorithm and the machine learning model. In this work, we empirically disentangle the requirements of the A* search algorithm from the requirements of the LLM to generalise on this task. Surprisingly, we find an overlap between their requirements: A* requires more accurate predictions on search nodes near the goal, and LLMs need the same set of nodes for effective generalisation. With these insights, we derive a data-selection distribution for learning LLM-based heuristics. On three classical planning…
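The abstract says the selected training data should favour search nodes near the goal but, as excerpted here, does not give the distribution itself. The sketch below shows one way such a goal-biased selection could look. The exponential weighting, the `temperature` parameter, and the function names are all hypothetical choices made for illustration and should not be read as the authors' method.

```python
import math
import random

def goal_biased_weights(path_length, temperature=2.0):
    """Normalised weights indexed by distance-to-goal d = 0..path_length.

    Hypothetical choice: w(d) proportional to exp(-d / temperature),
    so nodes nearer the goal are more likely to be selected.
    """
    raw = [math.exp(-d / temperature) for d in range(path_length + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def sample_training_nodes(path, k, temperature=2.0, seed=0):
    """Pick k (node, distance_to_goal) training pairs from a solution path.

    path[0] is the start state and path[-1] is the goal state.
    Sampling is with replacement, so a node can be picked more than once.
    """
    rng = random.Random(seed)
    weights = goal_biased_weights(len(path) - 1, temperature)
    distances = list(range(len(path) - 1, -1, -1))  # distance of path[i] to the goal
    node_weights = [weights[d] for d in distances]
    picks = rng.choices(range(len(path)), weights=node_weights, k=k)
    return [(path[i], distances[i]) for i in picks]

# Toy solution path with 21 states, labelled s0 (start) .. s20 (goal).
path = [f"s{i}" for i in range(21)]
samples = sample_training_nodes(path, k=1000)
mean_distance = sum(d for _, d in samples) / len(samples)
print(f"mean distance-to-goal of sampled nodes: {mean_distance:.2f}")
```

Under uniform sampling the mean distance-to-goal on this path would be 10; the biased sampler concentrates the training set much closer to the goal. Raising `temperature` flattens the distribution toward uniform.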
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Information Retrieval and Search Behavior
Methods: Sparse Evolutionary Training
