Leveraging Experience in Lazy Search
Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots, Siddhartha Srinivasa

TL;DR
This paper introduces a learning-based approach to lazy graph search in motion planning: a policy is trained to choose which edges to evaluate, so that a feasible path is found with fewer expensive edge evaluations, and the method comes with theoretical guarantees.
Contribution
The paper formulates edge selection as a Markov Decision Process (MDP) over the state of the search, trains the selector by imitating oracle policies that can solve the MDP during training, and provides a theoretical analysis with regret guarantees.
Findings
The learned selector outperforms baseline heuristics on 2D and 7D problems.
The approach reduces the number of edge evaluations needed.
The paper provides a theoretical analysis and regret bounds for the method.
Abstract
Lazy graph search algorithms are efficient at solving motion planning problems where edge evaluation is the computational bottleneck. These algorithms work by lazily computing the shortest potentially feasible path, evaluating edges along that path, and repeating until a feasible path is found. The order in which edges are selected is critical to minimizing the total number of edge evaluations: a good edge selector chooses edges that are not only likely to be invalid but also eliminate future paths from consideration. We wish to learn such a selector by leveraging prior experience. We formulate this problem as a Markov Decision Process (MDP) on the state of the search problem. While solving this large MDP is generally intractable, we show that we can compute oracular selectors that can solve the MDP during training. With access to such oracles, we use imitation learning to find…
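The lazy search loop described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the graph, the helper names (`edge`, `shortest_path`, `lazy_search`), and the two hand-written selectors (`select_forward`, `select_backward`) are all assumptions made for the example. The learned selector from the paper would take the place of the `select` argument.

```python
import heapq

def edge(u, v):
    """Canonical key for an undirected edge (hypothetical helper)."""
    return (u, v) if u <= v else (v, u)

def shortest_path(adj, start, goal, invalid):
    """Dijkstra that skips edges already known to be invalid.

    Returns the path as a list of edges from start to goal, or None.
    """
    dist = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            if edge(u, v) in invalid:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None
    path, node = [], goal
    while node != start:
        path.append(edge(parent[node], node))
        node = parent[node]
    return path[::-1]

def lazy_search(adj, start, goal, is_valid, select):
    """Repeat: plan optimistically, evaluate one selected edge, replan.

    Returns (path, number_of_edge_evaluations); path is None on failure.
    """
    valid, invalid = set(), set()
    evaluations = 0
    while True:
        path = shortest_path(adj, start, goal, invalid)
        if path is None:
            return None, evaluations
        unevaluated = [e for e in path if e not in valid]
        if not unevaluated:
            return path, evaluations  # every edge on the path is verified
        e = select(unevaluated)       # the edge selector decides the order
        evaluations += 1              # stands in for an expensive collision check
        (valid if is_valid(e) else invalid).add(e)

def select_forward(candidates):
    """Pick the first unevaluated edge along the path."""
    return candidates[0]

def select_backward(candidates):
    """Pick the last unevaluated edge along the path."""
    return candidates[-1]

# Toy graph (assumed for illustration): two routes from node 0 to node 3.
ADJ = {
    0: {1: 1.0, 2: 1.5},
    1: {0: 1.0, 3: 1.0},
    2: {0: 1.5, 3: 1.5},
    3: {1: 1.0, 2: 1.5},
}
BLOCKED = {edge(1, 3)}  # the shorter route is in collision near the goal

def is_valid(e):
    return e not in BLOCKED

if __name__ == "__main__":
    for name, sel in [("forward", select_forward), ("backward", select_backward)]:
        path, n = lazy_search(ADJ, 0, 3, is_valid, sel)
        print(name, path, n)
```

On this toy graph both selectors return the same feasible path, but the backward selector finds the blocked edge first and needs three evaluations where the forward selector needs four. That gap is the quantity a learned selector aims to reduce.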
