Incremental Sampling-based Motion Planners Using Policy Iteration Methods
Oktay Arslan, Panagiotis Tsiotras

TL;DR
This paper introduces PI-RRT#, a novel sampling-based motion planning algorithm that employs policy iteration instead of value iteration, enabling faster convergence to optimal paths through parallelizable updates on promising vertices.
Contribution
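The paper presents PI-RRT#, described as the first of a new class of dynamic-programming-inspired sampling-based planners that use policy iteration. It replaces the Bellman (value iteration) updates of RRT# with policy iteration, and, unlike the purely local rewiring of RRT*, performs policy improvement on the whole set of promising vertices, which is better suited to parallelization.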
Findings
PI-RRT# converges faster to optimal paths.
Policy iteration enhances parallelization capabilities.
PI-RRT# is reported to outperform existing algorithms in both speed and solution quality.
Abstract
Recent progress in randomized motion planners has led to the development of a new class of sampling-based algorithms that provide asymptotic optimality guarantees, notably the RRT* and the PRM* algorithms. Careful analysis reveals that the so-called "rewiring" step in these algorithms can be interpreted as a local policy iteration (PI) step (i.e., a local policy evaluation step followed by a local policy improvement step) so that asymptotically, as the number of samples tends to infinity, both algorithms converge to the optimal path almost surely (with probability 1). Policy iteration and value iteration (VI) are common methods for solving dynamic programming (DP) problems. Based on this observation, the RRT# algorithm has recently been proposed, which performs, during each iteration, Bellman updates (aka "backups") on those vertices of the graph that have the potential of…
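The policy evaluation and policy improvement steps named in the abstract can be illustrated on a small, fixed graph. The following is a minimal sketch of textbook policy iteration for a deterministic shortest-path problem, not the PI-RRT# algorithm itself: the function name `policy_iteration`, the toy graph, and the initial policy are all hypothetical, and the sketch assumes strictly positive edge costs and an initial policy that reaches the goal from every vertex (as the parent pointers of a tree rooted at the goal would).

```python
def policy_iteration(graph, goal, policy):
    """Textbook policy iteration for a deterministic shortest-path problem.

    graph:  dict mapping each vertex to {successor: edge_cost} (costs > 0).
    policy: dict mapping each non-goal vertex to its chosen successor.
            Assumed to reach the goal from every vertex (no cycles).
    """
    policy = dict(policy)  # work on a copy
    while True:
        # Policy evaluation: cost-to-go of every vertex when following `policy`.
        value = {goal: 0.0}

        def evaluate(v):
            if v not in value:
                nxt = policy[v]
                value[v] = graph[v][nxt] + evaluate(nxt)
            return value[v]

        for v in graph:
            evaluate(v)

        # Policy improvement: each vertex picks its best successor under the
        # current values. Each vertex is handled independently of the others,
        # which is the property that makes this step easy to parallelize.
        changed = False
        for v in graph:
            if v == goal:
                continue
            best = min(graph[v], key=lambda u: graph[v][u] + value[u])
            if graph[v][best] + value[best] < value[v]:
                policy[v] = best
                changed = True

        if not changed:
            return policy, value


# Hypothetical toy graph: start "s", goal "g".
GRAPH = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"b": 1.0, "g": 5.0},
    "b": {"g": 1.0},
    "g": {},
}
INITIAL_POLICY = {"s": "b", "a": "g", "b": "g"}

policy, value = policy_iteration(GRAPH, "g", INITIAL_POLICY)
print(policy)  # {'s': 'a', 'a': 'b', 'b': 'g'}
print(value["s"])  # 3.0
```

In this sketch the improvement loop sweeps over every vertex; the abstract indicates that the proposed planner restricts such updates to the vertices that have the potential to be part of the optimal path.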
