Learning for Adaptive Real-time Search
Vadim Bulitko

TL;DR
This paper introduces an adaptive real-time search algorithm that learns heuristic functions, dynamically adjusts lookahead depth, and balances exploration and exploitation, significantly improving convergence speed in puzzle-solving tasks.
Contribution
It presents a novel algorithm that combines learning, adaptive lookahead, and user-controlled trade-offs, bridging the gap between simple greedy policies and complex planning.
Findings
Converges 5 to 30 times faster than classical methods.
Gives the user control over the trade-off between exploration and exploitation during search.
Demonstrates these improvements on sliding tile puzzle tests.
Abstract
Real-time heuristic search is a popular model of acting and learning in intelligent autonomous agents. Learning real-time search agents improve their performance over time by acquiring and refining a value function that guides the application of their actions. As computing the perfect value function is typically intractable, a heuristic approximation is acquired instead. Most studies of learning in real-time search (and reinforcement learning) assume that a simple value-function-greedy policy is used to select actions. This is in contrast to practice, where high performance is usually attained by interleaving planning and acting via a lookahead search of a non-trivial depth. In this paper, we take a step toward bridging this gap and propose a novel algorithm that (i) learns a heuristic function to be used specifically with a lookahead-based policy, (ii) selects the lookahead depth…
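To make the setting in the abstract concrete, the following is a minimal sketch of a classical LRTA*-style learning real-time search agent with a fixed lookahead depth. It is not the algorithm proposed in the paper, which among other things selects the lookahead depth rather than fixing it. The function names (`lookahead_value`, `lrta_trial`, `run_until_converged`), the five-state corridor domain, and the all-zero initial heuristic are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

State = Hashable
Neighbors = Callable[[State], Iterable[Tuple[State, float]]]


def lookahead_value(state: State, depth: int, h: Dict[State, float],
                    neighbors: Neighbors, goal: State) -> float:
    """Cheapest (path cost + frontier heuristic) reachable within `depth` moves."""
    if state == goal:
        return 0.0
    if depth == 0:
        # Unseen states default to 0, which is admissible for non-negative costs.
        return h.get(state, 0.0)
    # Assumes every non-goal state has at least one successor.
    return min(cost + lookahead_value(nxt, depth - 1, h, neighbors, goal)
               for nxt, cost in neighbors(state))


def lrta_trial(start: State, goal: State, h: Dict[State, float],
               neighbors: Neighbors, depth: int = 1,
               max_steps: int = 10_000) -> List[State]:
    """Run one trial from start toward goal, updating h in place."""
    state = start
    path = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        # Score each immediate successor with a depth-limited lookahead.
        scored = [(cost + lookahead_value(nxt, depth - 1, h, neighbors, goal), nxt)
                  for nxt, cost in neighbors(state)]
        best_f, best_next = min(scored, key=lambda pair: pair[0])
        # Learning step: raise h(state) to the best lookahead estimate.
        h[state] = max(h.get(state, 0.0), best_f)
        # Acting step: move greedily with respect to the lookahead scores.
        state = best_next
        path.append(state)
    return path


def corridor_neighbors(state: int) -> List[Tuple[int, float]]:
    """Hypothetical toy domain: states 0..4 in a line, unit-cost moves."""
    return [(s, 1.0) for s in (state - 1, state + 1) if 0 <= s <= 4]


def run_until_converged(start: State, goal: State, neighbors: Neighbors,
                        depth: int = 1, max_trials: int = 100):
    """Repeat trials until one of them leaves the heuristic unchanged."""
    h: Dict[State, float] = {}
    path: List[State] = []
    for _ in range(max_trials):
        before = dict(h)
        path = lrta_trial(start, goal, h, neighbors, depth)
        if h == before:
            break
    return path, h
```

Calling `run_until_converged(0, 4, corridor_neighbors)` repeats trials on the corridor until the learned heuristic stops changing. The `depth` parameter is held constant here only to keep the sketch short. With `depth=1` the policy reduces to the simple value-function-greedy case that the abstract contrasts with deeper lookahead.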
Taxonomy
Topics: Artificial Intelligence in Games · AI-based Problem Solving and Planning · Robotic Path Planning Algorithms
