Graph Value Iteration
Dieqiao Feng, Carla P. Gomes, Bart Selman

TL;DR
This paper introduces a domain-independent method combining graph value iteration with search to improve reinforcement learning in complex planning tasks, learning from both successes and failures to handle large search spaces.
Contribution
It presents a novel, domain-agnostic approach that enhances graph search with value iteration, enabling learning from failed attempts and scaling to large, complex planning problems.
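As a rough illustration of what running value iteration over a search graph can look like, the sketch below applies a generic shortest-path Bellman backup, V(s) = min over successors s' of (1 + V(s')), to a partially explored graph. It is not the authors' algorithm, whose details the truncated abstract does not give. The function name `graph_value_iteration`, the unit step cost, the `frontier_estimate` table (standing in for a learned value function), and the toy graph are all assumptions made for this example.

```python
import math

def graph_value_iteration(successors, goals, frontier_estimate, default_estimate=10.0):
    """Cost-to-go backup over an explored search graph (illustrative sketch only).

    successors: dict mapping each expanded node to a list of its child nodes.
    goals: iterable of goal nodes (cost-to-go 0).
    frontier_estimate: assumed cost-to-go guesses for unexpanded nodes.
    """
    goal_set = set(goals)
    nodes = set(successors)
    for children in successors.values():
        nodes.update(children)

    values = {}
    for node in nodes:
        if node in goal_set:
            values[node] = 0.0
        elif node in successors:
            values[node] = math.inf  # expanded, cost not yet known
        else:
            values[node] = frontier_estimate.get(node, default_estimate)

    # Relax until nothing changes; at most len(nodes) sweeps are needed.
    for _ in range(len(nodes)):
        changed = False
        for node, children in successors.items():
            if node in goal_set:
                continue
            # Unit step cost plus the best child value; inf if there are no children.
            best = min((1.0 + values[child] for child in children), default=math.inf)
            if best < values[node]:
                values[node] = best
                changed = True
        if not changed:
            break
    return values

# Toy graph from a search that found no goal: "dead" is an expanded dead end,
# "c" is an unexpanded frontier node with an assumed estimate of 5 steps.
explored = {"s": ["a", "b"], "a": ["dead"], "dead": [], "b": ["c", "s"]}
values = graph_value_iteration(explored, goals=[], frontier_estimate={"c": 5.0})
```

Even though this toy search never reaches a goal, the backup still yields usable information: the branch through `a` ends up with infinite cost, while `s` and `b` inherit finite estimates from the frontier node `c`. This is one simple way a failed search attempt can still produce a training signal.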
Findings
Solves hard planning instances beyond the reach of domain-specific solvers
Learns from both successful and failed search attempts
Scales well with increasing problem complexity
Abstract
In recent years, deep Reinforcement Learning (RL) has been successful in various combinatorial search domains, such as two-player games and scientific discovery. However, directly applying deep RL in planning domains is still challenging. One major difficulty is that, without a human-crafted heuristic function, reward signals remain zero until the learning framework discovers a solution plan. The search space becomes exponentially larger as the minimum length of plans grows, which is a serious limitation for planning instances with a minimum plan length of hundreds to thousands of steps. Previous learning frameworks that augment graph search with deep neural networks and extra generated subgoals have achieved success in various challenging planning domains. However, generating useful subgoals requires extensive domain knowledge. We propose a domain-independent method that augments…
Taxonomy
Topics: Software Engineering Research · AI-based Problem Solving and Planning
