dRRT*: Scalable and Informed Asymptotically-Optimal Multi-Robot Motion Planning

Rahul Shome; Kiril Solovey; Andrew Dobson; Dan Halperin; Kostas E. Bekris

arXiv:1903.00994·cs.RO·March 5, 2019


TL;DR

This paper introduces dRRT*, a scalable, informed, asymptotically-optimal multi-robot motion planner that efficiently finds high-quality paths in complex multi-robot scenarios, outperforming previous methods in scalability and convergence speed.

Contribution

It extends the prior dRRT algorithm to achieve theoretical optimality guarantees and incorporates heuristics for improved search efficiency in multi-robot configuration spaces.

Findings

01. dRRT* converges quickly to high-quality paths.

02. The algorithm scales to more robots than previous methods.

03. It successfully solves real-world multi-robot arm problems.

Abstract

Many exciting robotic applications require multiple robots with many degrees of freedom, such as manipulators, to coordinate their motion in a shared workspace. Discovering high-quality paths in such scenarios can be achieved, in principle, by exploring the composite space of all robots. Sampling-based planners do so by building a roadmap or a tree data structure in the corresponding configuration space and can achieve asymptotic optimality. The hardness of motion planning, however, renders the explicit construction of such structures in the composite space of multiple robots impractical. This work proposes a scalable solution for such coupled multi-robot problems, which provides desirable path-quality guarantees and is also computationally efficient. In particular, the proposed ${\tt dRRT^{*}}$ is an informed, asymptotically-optimal extension of a prior sampling-based multi-robot motion…

Tables

Table 1: Construction and query times (seconds) for 2 disk robots.

Number of nodes $N$	50	100	200
$N^{2}$-PRM* construction	3.427	13.293	69.551
$N^{2}$-PRM* query	0.002	0.005	0.019
2 $N$-size PRM* construction	0.135	0.274	0.558
Implicit A* search over $\hat{\mathbb{G}}$	0.886	4.214	15.468
${\tt ao\text{-}dRRT}$ over $\hat{\mathbb{G}}$ (initial)	1.309	0.999	0.638
${\tt dRRT^{*}}$ over $\hat{\mathbb{G}}$ (initial)	0.003	0.002	0.002

Equations

\|\Sigma(\tau)-X\|\geq\delta.

\lim_{m\to\infty}\Pr\big[\textup{cost}(\Sigma^{(m)})\leq(1+\epsilon)c^{*}\big]=1.

\mathbb{O}_{d}(V^{\textup{near}},Q^{\textup{rand}})=\operatorname*{arg\,min}_{V\in\mathtt{Adj}(V^{\textup{near}},\hat{\mathbb{G}})}\angle_{V^{\textup{near}}}(Q^{\textup{rand}},V).

r(n)\geq r^{*}(n)=\gamma\Big(\frac{\log n}{n}\Big)^{\frac{1}{d}},

\textup{cost}(\Sigma^{(n)})\leq(1+o(1))\cdot\textup{cost}(\Sigma)

\mathbb{C}_{i}^{\textup{o}}(\tau)=\mathbb{C}_{i}^{\textup{o}}\cup\bigcup_{j=1,j\neq i}^{R}I_{i}^{j}(\sigma_{j}(\tau)).

Q=(\sigma_{1}(\tau),\dots,q_{i},\dots,\sigma_{R}(\tau)).

\delta=\bigg(\|\sigma_{i}(\tau)-q_{i}\|^{2}+\sum_{j=1,j\neq i}^{R}\|\sigma_{j}(\tau)-\sigma_{j}(\tau)\|^{2}\bigg)^{\frac{1}{2}}\leq\|\sigma_{i}(\tau)-q_{i}\|.

\textup{cost}(\Sigma^{(n)})=\sum_{i=1}^{R}\|\sigma_{i}^{(n)}\|\leq(1+o(1))\,\textup{cost}(\Sigma).

\tau_{i}^{j}=\operatorname*{arg\,min}_{\tau\in[0,1]}\|v_{i}^{j}-\sigma_{i}(\tau)\|.

\Sigma=(f_{1}(t),\dots,f_{d}(t)),

\|\Sigma\|=\int\sqrt{f_{1}^{\prime}(t)^{2}+\dots+f_{d}^{\prime}(t)^{2}}\,dt.

s(x)=\|\Sigma:[0,x]\|,\quad x\leq 1.

s(x)=\sup_{P}\sum_{j=1}^{m}\sqrt{\bigg(\sum_{i=1}^{d}(f_{i}(t_{j})-f_{i}(t_{j-1}))^{2}\bigg)}.

s(x)_{P}=\sum_{j=1}^{m}\sqrt{\sum_{i=1}^{d}(f_{i}(P(j))-f_{i}(P(j-1)))^{2}}.

\mu_{j-1,j}^{P}=s(P(j))-s(P(j-1)).

\exists L\ \text{s.t.}\ s(1)_{Q^{*}}=s(1)_{P^{*}}=\|\Sigma\|,

Q^{*}\supseteq P^{*},\quad|Q^{*}|=L+1,

\mu_{j-1,j}=\mu_{k-1,k}=\frac{\|\Sigma\|}{L}\in\mathbb{R}_{+}\quad\forall j,k\in[1,\dots,L].

\|\Sigma_{Rd}\|=\sum_{l=1}^{L}\sqrt{\sum_{i=1}^{Rd}(f_{i}(Q^{*}(l))-f_{i}(Q^{*}(l-1)))^{2}}

=\sum_{l=1}^{L}\sqrt{\sum_{i=1}^{d}(\delta f_{i})^{2}+\dots+\sum_{i=(R-1)d+1}^{Rd}(\delta f_{i})^{2}}

=\sum_{l=1}^{L}\sqrt{\|\Sigma_{1}(l)\|^{2}+\dots+\|\Sigma_{R}(l)\|^{2}}

=\sum_{l=1}^{L}\sqrt{\sum_{i=1}^{R}\|\Sigma_{i}(l)\|^{2}},

\|\sigma_{i}^{(n)}\|\leq(1+o(1))\|\sigma_{i}\|

\sum_{i=1}^{R}\|\sigma_{i}^{(n)}(l)\|^{2}\leq(1+o(1))^{2}\sum_{i=1}^{R}\|\sigma_{i}(l)\|^{2}

\lim_{n,m\to\infty}\Pr\big[\textup{cost}(\Sigma^{(n,m)})\leq(1+\epsilon)c^{*}\big]=1.

\liminf_{m\to\infty}\mathbb{P}\big(O^{(m)}_{t}\big)=1.

\delta\leq\|\sigma_{r}(\tau)-q_{r}\|.


Full text

Rahul Shome and Kostas Bekris: Computer Science Dept. of Rutgers Univ., NJ, USA. Email: {rahul.shome, kostas.bekris}@cs.rutgers.com

Kiril Solovey and Dan Halperin: Computer Science Dept. of Tel Aviv Univ., Israel. Email: {kirilsol, danha}@post.tau.ac.il

Andrew Dobson: Electrical Engineering and Computer Science Dept. of Univ. of Michigan, MI, USA. Email: [email protected]

dRRT*: Scalable and Informed Asymptotically-Optimal Multi-Robot Motion Planning

Rahul Shome, Kiril Solovey, Andrew Dobson, Dan Halperin, Kostas E. Bekris

A. Dobson, R. Shome and K. Bekris were supported by NSF IIS 1617744 and CCF 1330789. K. Solovey and D. Halperin’s work has been supported in part by the Israel Science Foundation (grant no. 825/15) and by the Blavatnik Computer Science Research Fund. Kiril Solovey has also been supported by the Clore Israel Foundation.

Abstract

Many exciting robotic applications require multiple robots with many degrees of freedom, such as manipulators, to coordinate their motion in a shared workspace. Discovering high-quality paths in such scenarios can be achieved, in principle, by exploring the composite space of all robots. Sampling-based planners do so by building a roadmap or a tree data structure in the corresponding configuration space and can achieve asymptotic optimality. The hardness of motion planning, however, renders the explicit construction of such structures in the composite space of multiple robots impractical. This work proposes a scalable solution for such coupled multi-robot problems, which provides desirable path-quality guarantees and is also computationally efficient. In particular, the proposed ${\tt dRRT^{*}}$ is an informed, asymptotically-optimal extension of a prior sampling-based multi-robot motion planner, ${\tt dRRT}$. The prior approach introduced the idea of building roadmaps for each robot and implicitly searching the tensor product of these structures in the composite space. This work identifies the conditions for convergence to optimal paths in multi-robot problems, which the prior method did not achieve. Building on this analysis, ${\tt dRRT}$ is first properly adapted so as to achieve the theoretical guarantees and then further extended so as to make use of effective heuristics when searching the composite space of all robots. The case where the various robots share some degrees of freedom is also studied. Evaluation in simulation indicates that the new algorithm, ${\tt dRRT^{*}}$, converges to high-quality paths quickly and scales to a higher number of robots where various alternatives fail. This work also demonstrates the planner’s capability to solve problems involving multiple real-world robotic arms.

Journal: AURO

1 Introduction

A variety of robotic applications, ranging from manufacturing to logistics and service robotics, involve multiple robotic systems operating in the same workspace. In traditional, industrial domains, such as car manufacturing, the environment is fully known and predictable. This allows the robots to operate in a highly scripted manner by repeating the same predefined motions as fast as possible. New types of tasks, however, require robotic manipulators that compute high-quality paths on the fly. For instance, a team of robotic arms can be tasked to pick and sort a variety of objects that are dynamically placed on a common surface. Multiple challenges need to be addressed in the context of such applications, such as detecting the configuration of the objects and grasping. This work deals with the multi-robot motion planning (MMP) problem (Wagner and Choset, 2013; Gravot and Alami, 2003a; Gharbi et al, 2009) in the context of such setups, i.e., computing the paths of multiple, high-dimensional systems, such as robotic arms, that operate in a shared workspace, as shown in Figure 1. The focus is to solve MMP in a computationally efficient way as well as in a coupled manner, which makes it possible to reason about the quality of the resulting paths.

Planning for multiple, high-dimensional robotic systems is quite challenging. The motion planning problem is already computationally hard for a single robot (Canny, 1988) that is a kinematic chain of rigid bodies. Thus, most approaches for multi-robot motion planning either quickly become intractable as the number of robots increases or alternatively sacrifice completeness and path quality guarantees. In particular, problem instances are especially hard when the robots operate in a shared workspace and in close proximity. In this case, it is not easy to reason about the robots in a decoupled manner. Instead, it is necessary to operate in the composite configuration space of all robots. The space requirements, however, for solving motion planning instances increase exponentially with problem dimensionality. The composite space of all robots in MMP instances is typically too high-dimensional to explore in a comprehensive and resolution-complete manner, such as by discretizing it with a grid and searching over it.

Sampling-based planners aim to help with such dimensionality issues by approximating the connectivity of the underlying configuration space. They construct graph-based representations, such as a roadmap or a tree data structure, which store collision-free configurations and paths through a sampling process. Under certain conditions regarding the density of the corresponding graph, sampling-based planners can provide desirable path quality guarantees. Specifically, they achieve asymptotic optimality, i.e., as the sampling process progresses, the best path on the graph converges to the optimal one in the original configuration space. Nevertheless, even sampling-based planners face significant space challenges in the context of MMP problems, such as the one shown in Figure 1, which corresponds to a 28-dimensional space. In particular, it becomes infeasible with standard, asymptotically optimal sampling-based planners to explicitly store a graph in the corresponding space that will allow the discovery of a solution in practice. This is due to the large number of samples required to cover an exponentially larger volume as the dimensionality of the underlying space increases. Asymptotically optimal planners must maintain on the order of $\log n$ edges per sample, where $n$ is the number of samples. Thus, when planning for high-dimensional systems, the space requirements of the corresponding roadmaps surpass the capabilities of standard workstations rather quickly.

A previously proposed sampling-based planner specifically designed for multi-robot problems, called ${\tt dRRT}$ (Solovey et al, 2015a), achieved progress in this area by leveraging an implicit representation of the composite space in order to provide both completeness and efficiency. This implicit representation is a graph, which corresponds to the tensor product of roadmaps explicitly constructed for each robot. This allows finding solutions for relatively high-dimensional multi-robot motion planning problems. Nevertheless, this prior method did not provide any path quality guarantees.

One key contribution of this work is to show that the structure of this implicit representation is guaranteed (asymptotically) to contain the optimal path for a set of robots moving simultaneously. Nevertheless, defining an implicit graph that contains a high-quality solution does not guarantee that the final solution is optimal unless the search process over this graph is appropriate. While a provably optimal search approach, such as ${\tt A^{*}}$, could be implemented to search this graph, the extremely large branching factor of the implicit roadmap makes this prohibitively expensive, especially in the context of anytime planning. Instead, this work leverages the observation that a sampling-based method inspired by ${\tt RRT^{*}}$, which maintains a spanning tree over the underlying implicit graph, will return optimal solutions if it allows rewiring operations during the spanning tree construction. Namely, it must converge to the tree with all of the minimum-cost paths starting from the initial query state to each other node in the graph. Further, this work shows that a broad range of cost functions over paths in this graph can be used while still guaranteeing that the proposed ${\tt dRRT^{*}}$ approach will asymptotically converge towards such a tree.

This paper is an extension of prior work (Dobson et al, 2017), which introduced an initial version of the ${\tt dRRT^{*}}$ and the sufficient conditions for generating an asymptotically optimal planner in this context. The current manuscript provides the following extensions:

• A more thorough analysis that shows that the desirable guarantee can be achieved for an additional distance metric for multi-robot motion planning;

• A more detailed description of the method, which has been further improved for computational efficiency purposes through the appropriate incorporation of heuristics;

• An extension of the method to handle systems with shared degrees of freedom, as shown in related work (Shome and Bekris, 2017);

• An extended experimental section that includes the new methods as well as demonstrations on physical platforms.

The following section summarizes related prior work on the subject before Section 3 introduces the problem setup. Section 4 describes the underlying structure of the implicit tensor-roadmap and the previous method ${\tt dRRT}$ (Solovey et al, 2015a). The changes to ${\tt dRRT}$ necessary to achieve asymptotic optimality and computational efficiency, which result in the proposed algorithm ${\tt dRRT^{*}}$, are presented in Section 5. An analysis of the properties of the method is presented in Section 6. The method is extended in Section 7 to systems with shared degrees of freedom. Section 8 evaluates the methods experimentally and demonstrates their performance.

2 Prior Work

The multi-robot motion planning problem (MMP) is notoriously difficult as it involves many degrees of freedom, and consequently a vast search space, as each additional robot introduces several additional degrees of freedom to the problem. Certain instances of the problem can be solved efficiently, i.e., in polynomial run time, and in a complete manner, at times even with optimality guarantees on the solution costs (Turpin et al, 2013; Adler et al, 2015; Solovey et al, 2015b). However, in general MMP is computationally intractable (Hopcroft et al, 1984; Spirakis and Yap, 1984; Solovey and Halperin, 2016; Johnson, 2018).

Decoupled MMP techniques (Erdmann and Lozano-Perez, 1987; Ghrist et al, 2005; LaValle and Hutchinson, 1998; Peng and Akella, 2004; Van Den Berg and Overmars, 2005; Van Den Berg et al, 2009) reduce search space size by partitioning the problem into several subproblems, which are solved separately. Then, the different solutions are combined. These methods, however, typically lack completeness and optimality guarantees. While some hybrid approaches can take advantage of the inherent decoupling between robots and provide guarantees (Van Den Berg et al, 2009), they are often limited to discrete domains. The problem is more complex when the robots exhibit non-trivial dynamics (Peng and Akella, 2005). Collision avoidance or control methods can scale to many robots, but lack path quality guarantees (Van Den Berg et al, 2011; Tang and Kumar, 2015).

In contrast to that, centralized approaches (Kloder and Hutchinson, 2005; O’Donnell and Lozano-Pérez, 1989; Salzman et al, 2015; Solovey et al, 2015a; Svestka and Overmars, 1998; Wagner and Choset, 2013) usually work in the combined high-dimensional configuration space, and thus tend to be slower than decoupled techniques. However, centralized algorithms often come with stronger theoretical guarantees, such as completeness. Through the rest of this section we will consider centralized methods, with an emphasis on sampling-based approaches.

Sampling-based algorithms for a single robot (Kavraki et al, 1996; LaValle and Kuffner, 1999; Karaman and Frazzoli, 2011) can be extended to the multi-robot case by considering the fleet of robots as one composite robot (Gildardo, 2002). Such an approach suffers from inefficiency as it overlooks aspects of multi-robot planning, and hence can handle only a very small number of robots. Several techniques tailored for instances of MMP involving a small number of robots have been described (Hirsch and Halperin, 2004; Salzman et al, 2015).

In previous work (Solovey and Halperin, 2014), an extension of MMP was introduced, which consists of several groups of interchangeable robots. At the heart of the algorithm is a novel technique where the problem is reduced to several discrete pebble-motion problems (Kornhauser et al, 1984; Luna and Bekris, 2011; Yu and LaValle, 2013). These reductions amplify basic samples into massive collections of free placements and paths for the robots. An improved version (Krontiris et al, 2015) of this algorithm applied it to rearrange multiple objects using a robotic manipulator.

Previous work (Svestka and Overmars, 1998) introduced a different approach, which leverages the following fundamental observation: the structure of the overall high-dimensional multi-robot configuration space can be inferred by first considering independently the free space of every robot, and combining these subspaces in a meaningful manner to account for robot-robot collisions. They suggested an approach which combines roadmaps constructed for individual robots into one tensor-product roadmap $\hat{\mathbb{G}}$, which captures the structure of the joint configuration space (see more information in Section 4).

Due to the exponential nature of the resulting roadmap, this technique is only applicable to problems that involve a modest number of robots. A recent work (Wagner and Choset, 2013) suggests that $\hat{\mathbb{G}}$ does not necessarily have to be explicitly represented. They apply their ${\tt M^{*}}$ algorithm to efficiently retrieve paths over $\hat{\mathbb{G}}$, while minimizing the explored portion of the roadmap. The resulting technique is able to cope with a large number of robots, for certain types of scenarios. However, when the degree of simultaneous coordination between the robots increases, there is a sharp increase in the running time of this algorithm, as it has to consider many neighbors of a visited vertex of $\hat{\mathbb{G}}$. This makes ${\tt M^{*}}$ less effective when the motion of multiple robots needs to be tightly coordinated.

Recently a different sampling-based framework for MMP was introduced, which combines an implicit representation of $\hat{\mathbb{G}}$ with a novel approach for pathfinding in geometrically-embedded graphs tailored for MMP (Solovey et al, 2015a). The discrete-RRT (${\tt dRRT}$) algorithm is an adaptation of the celebrated ${\tt RRT}$ algorithm for the discrete case of a graph, and it enables a rapid exploration of the high-dimensional configuration space by carefully walking through an implicit representation of the tensor product of roadmaps for the individual robots (see extensive description in Section 4). The approach was demonstrated experimentally on scenarios that involve as many as $60$ DoFs and on scenarios that require tight coordination between robots. On most of these scenarios ${\tt dRRT}$ was faster by a factor of at least ten when compared to existing algorithms, including the aforementioned ${\tt M^{*}}$.

Later, ${\tt dRRT}$ was applied to motion planning of a free-flying multi-link robot (Salzman et al, 2016). In that case, ${\tt dRRT}$ made it possible to efficiently decouple costly self-collision checks, which were done offline, from robot-obstacle collision checks, by traversing an implicitly-defined roadmap whose structure resembles that of $\hat{\mathbb{G}}$. ${\tt dRRT}$ has also been used in the study of the effectiveness of metrics for MMP, which are an essential ingredient in sampling-based planners (Atias et al, 2017).

The current work proposes ${\tt dRRT^{*}}$ and shows that it is an efficient asymptotically optimal extension of the previously proposed ${\tt dRRT}$ . The ${\tt dRRT^{*}}$ framework is an anytime algorithm, which quickly finds initial solutions and then refines them, while ensuring asymptotic convergence to optimal solutions. Simulations show that the method practically generates high-quality paths while scaling to complex, high-dimensional problems, where alternatives fail.

3 Problem Setup and Notation

We start with a definition of the problem. Consider a shared workspace with $R\geq 2$ holonomic robots, each operating in a $d$ -dimensional configuration space $\mathbb{C}_{i}\subset\mathbb{R}^{d}$ for $1\leq i\leq R$ . For a given robot $i$ , denote its free space, i.e., the set of all collision free configurations, by $\mathbb{C}^{\textup{f}}_{i}\subset\mathbb{C}_{i}$ , and the obstacle space by $\mathbb{C}^{\textup{o}}_{i}=\mathbb{C}_{i}\setminus\mathbb{C}^{\textup{f}}_{i}$ .

The composite configuration space $\mathbb{C}=\prod^{R}_{i=1}\mathbb{C}_{i}$ is the Cartesian product of each robot’s configuration space. That is, a composite configuration $Q=(q_{1},\ldots,q_{R})\in\mathbb{C}$ is an $R$ -tuple of robot configurations. For two distinct robots $i,j$ , denote by $I_{i}^{j}(q_{j})\subset\mathbb{C}_{i}$ the set of configurations of robot $i$ , which lead into collision with robot $j$ at its configuration $q_{j}$ . Then, the composite free space $\mathbb{C}^{\textup{f}}\subset\mathbb{C}$ consists of configurations $Q=(q_{1},\ldots,q_{R})$ in which robots do not collide with obstacles or pairwise with each other. Formally:

$\bullet$ $q_{i}\in\mathbb{C}^{\textup{f}}_{i}$ for every $1\leq i\leq R$;

$\bullet$ $q_{i}\not\in I_{i}^{j}(q_{j}),q_{j}\not\in I_{j}^{i}(q_{i})$ for every $1\leq i<j\leq R$.

The composite obstacle space is defined as $\mathbb{C}^{\textup{o}}=\mathbb{C}\setminus\mathbb{C}^{\textup{f}}$ .
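The two membership conditions above can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from the paper: planar disk robots of equal radius and circular obstacles. The function names `in_single_free_space` and `in_composite_free_space` are hypothetical.

```python
import math
from itertools import combinations


def in_single_free_space(q, radius, obstacles):
    """True if a disk robot centred at q avoids every circular obstacle (cx, cy, r)."""
    return all(math.dist(q, (cx, cy)) > radius + r for cx, cy, r in obstacles)


def in_composite_free_space(Q, radius, obstacles):
    """True if the composite configuration Q = (q_1, ..., q_R) satisfies both conditions."""
    # Condition 1: every q_i lies in its own free space C^f_i.
    if not all(in_single_free_space(q, radius, obstacles) for q in Q):
        return False
    # Condition 2: no pair of robots overlaps, i.e. q_i is not in I_i^j(q_j).
    # For equal disks the pairwise test is symmetric, so one check per pair suffices.
    return all(math.dist(qi, qj) > 2 * radius for qi, qj in combinations(Q, 2))


obstacles = [(5.0, 5.0, 1.0)]  # illustrative obstacle: centre (5, 5), radius 1
print(in_composite_free_space([(0.0, 0.0), (3.0, 0.0)], 1.0, obstacles))
print(in_composite_free_space([(0.0, 0.0), (1.5, 0.0)], 1.0, obstacles))
```

In the first query the robots are 3 units apart and clear of the obstacle; in the second they overlap, so the configuration belongs to the composite obstacle space even though each robot is individually collision free.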

Multi-robot planning is concerned with finding (collision-free) composite trajectories of the form $\Sigma:[0,1]\rightarrow\mathbb{C}^{\textup{f}}$ . $\Sigma$ is an $R$ -tuple $(\sigma_{1},\ldots,\sigma_{R})$ of single-robot trajectories $\sigma_{i}:[0,1]\rightarrow\mathbb{C}_{i}$ .

This work is concerned with producing high-quality trajectories, which minimize certain cost functions. In particular, we consider three cost functions $\textup{cost}(\cdot)$ , which are presented below. Let $\Sigma=(\sigma_{1},\ldots,\sigma_{R})$ be a composite trajectory. For the following, $\|\cdot\|$ denotes the standard arc length of a curve:

$\bullet$ The sum of path lengths: $\textup{cost}(\Sigma)=\sum_{i=1}^{R}\|\sigma_{i}\|$.

$\bullet$ The maximum path length: $\textup{cost}(\Sigma)=\max_{i=1:R}\|\sigma_{i}\|$.

$\bullet$ The Euclidean arc length of $\Sigma$: $\textup{cost}(\Sigma)=\|\Sigma\|$.
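As a rough illustration of how the three cost functions differ, the sketch below evaluates them on piecewise-linear trajectories. It assumes that each robot's path is given as a list of waypoints and that all robots reach their $k$-th waypoint simultaneously; these assumptions and the function names are illustrative and not part of the paper.

```python
import math


def path_length(waypoints):
    """Arc length of a piecewise-linear path given as a list of points."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


def sum_cost(paths):
    """Sum of the individual path lengths."""
    return sum(path_length(p) for p in paths)


def max_cost(paths):
    """Largest individual path length."""
    return max(path_length(p) for p in paths)


def composite_arc_length(paths):
    """Euclidean arc length of the composite path in the product space."""
    # Concatenate the robots' coordinates at each synchronized waypoint index.
    composite = [sum((tuple(q) for q in step), ()) for step in zip(*paths)]
    return path_length(composite)


paths = [
    [(0.0, 0.0), (3.0, 0.0)],  # robot 1 travels 3 units
    [(0.0, 1.0), (0.0, 5.0)],  # robot 2 travels 4 units
]
print(sum_cost(paths), max_cost(paths), composite_arc_length(paths))
```

For this example the sum is 7, the maximum is 4, and the composite arc length is 5, since the single composite segment has length $\sqrt{3^{2}+4^{2}}$.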

This work presents ${\tt dRRT^{*}}$ as an efficient, anytime solution to the robustly-feasible composite motion planning (RFCMP) problem:

Definition 1 (RFCMP)

Given $R$ robots operating in composite configuration space $\mathbb{C}=\prod^{R}_{i=1}\mathbb{C}_{i}$, and a query $S=(s_{1},\ldots,s_{R}),T=(t_{1},\ldots,t_{R})$, an RFCMP problem is one which admits a robustly-feasible trajectory $\Sigma:[0,1]\rightarrow\mathbb{C}^{\textup{f}}$ with $\Sigma(0)=S,\Sigma(1)=T$. Namely, there exists a fixed constant $\delta>0$ such that for all $\tau\in[0,1]$ and $X\in\mathbb{C}^{\textup{o}}$ it holds that

$$\|\Sigma(\tau)-X\|\geq\delta.$$

One of the primary objectives of this work is to provide asymptotic optimality in the composite configuration space without explicitly constructing a planning structure in this space.

Definition 2 (Asymptotic Optimality)

Let $m$ be the time budget of the algorithm, let $\Sigma^{(m)}$ be the solution returned after time $m$, and let $c^{*}$ be the cost of a robustly optimal solution. Asymptotic optimality is defined as ensuring that the following holds for any $\epsilon>0$:

$$\lim_{m\to\infty}\Pr\big[\textup{cost}(\Sigma^{(m)})\leq(1+\epsilon)c^{*}\big]=1.$$

4 Algorithmic Foundations

This section provides a detailed description of the discrete- ${\tt RRT}$ ( ${\tt dRRT}$ ) method (Solovey et al, 2015a), which is the basis of our method presented in Section 5. ${\tt dRRT}$ was posed as an efficient way to search an implicitly defined tensor-product roadmap, which captures the structure of $\mathbb{C}$ without explicitly sampling this space.

4.1 Tensor-product roadmap

Here we provide a formal definition of the tensor-product roadmap that ${\tt dRRT}$ is designed to explore. For every robot $1\leq i\leq R$ construct a PRM graph (Kavraki et al, 1996), denoted by $\mathbb{G}_{i}=(\mathbb{V}_{i},\mathbb{E}_{i})$, which is embedded in $\mathbb{C}^{\textup{f}}_{i}$. That is, $\mathbb{G}_{i}$ can be viewed as an approximation of $\mathbb{C}^{\textup{f}}_{i}$ and encodes collision-free motions for robot $i$. The construction of $\mathbb{G}_{i}$ is determined by two parameters $n$ and $r_{n}$, which represent the number of samples and the connection radius, respectively. As will be discussed in the following sections, the roadmaps $\mathbb{G}_{1},\ldots,\mathbb{G}_{R}$ must be constructed with parameters in a certain range to guarantee asymptotic optimality of the new planners (Section 5).

Define the tensor-product roadmap, denoted by $\hat{\mathbb{G}}=(\mathbb{\hat{V}},\mathbb{\hat{E}})$, as the tensor product between $\mathbb{G}_{1},\ldots,\mathbb{G}_{R}$ (see Figure 2). Each vertex of $\hat{\mathbb{G}}$ describes a simultaneous placement of the $R$ robots, and similarly an edge of $\hat{\mathbb{G}}$ describes a simultaneous motion of the robots. Formally, $\mathbb{\hat{V}}=\{(v_{1},v_{2},\dots,v_{R}):\forall\ i,\ v_{i}\in\mathbb{V}_{i}\}$ is the Cartesian product of the nodes from each roadmap $\mathbb{G}_{i}$. For two vertices $V=(v_{1},\ldots,v_{R})\in\mathbb{\hat{V}},V^{\prime}=(v^{\prime}_{1},\ldots,v^{\prime}_{R})\in\mathbb{\hat{V}}$, the edge set $\mathbb{\hat{E}}$ contains edge $(V,V^{\prime})$ if $\forall i\in[1,R]:\ v_{i}=v^{\prime}_{i}$ or $(v_{i},v^{\prime}_{i})\in\mathbb{E}_{i}$. (Footnote 1: This differs from the original ${\tt dRRT}$ (Solovey et al, 2015a) so as to allow edges where some robots remain motionless.) Note that by the definition of $\mathbb{G}_{1},\ldots,\mathbb{G}_{R}$, the motion described by each edge $E\in\mathbb{\hat{E}}$ represents a path for the $R$ robots in which the robots do not collide with obstacles. However, collisions between pairs of robots may still be possible.

It is important to note that the tensor-product roadmap has $|\mathbb{\hat{V}}|=\prod_{i=1}^{R}|\mathbb{V}_{i}|$ vertices. Given the neighborhood of a node $v_{i}$ in $\mathbb{G}_{i}$ as $\mathtt{Adj}(v_{i},\mathbb{G}_{i})$, the size of the neighborhood of a node $V=(v_{1},\dots,v_{R})$ in $\hat{\mathbb{G}}$ is $|\mathtt{Adj}(V,\hat{\mathbb{G}})|=\prod_{i=1}^{R}|\mathtt{Adj}(v_{i},\mathbb{G}_{i})|$. Because both quantities grow exponentially with $R$, using the much smaller $\mathbb{G}_{1},\ldots,\mathbb{G}_{R}$ to construct $\hat{\mathbb{G}}$ online is computationally beneficial.
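A small sketch may help make the implicit representation concrete. It stores only the single-robot roadmaps, here as hypothetical adjacency dictionaries with made-up vertex names, and answers questions about the tensor-product roadmap on demand. Excluding the self-loop $V=V^{\prime}$ in the edge test is a choice made for this sketch, not something stated in the text.

```python
from math import prod

# Two tiny single-robot roadmaps as adjacency dictionaries (illustrative data).
G1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
G2 = {"x": {"y"}, "y": {"x"}}
roadmaps = [G1, G2]


def num_composite_vertices(roadmaps):
    """Number of vertices of the tensor-product roadmap: product of |V_i|."""
    return prod(len(g) for g in roadmaps)


def neighborhood_size(V, roadmaps):
    """Product of the single-robot neighborhood sizes, mirroring the formula above."""
    return prod(len(g[v]) for v, g in zip(V, roadmaps))


def is_composite_edge(V, W, roadmaps):
    """Edge test: every robot either stays put or moves along an edge of its roadmap."""
    if V == W:
        return False  # assumption of this sketch: ignore the trivial self-loop
    return all(v == w or w in g[v] for v, w, g in zip(V, W, roadmaps))


print(num_composite_vertices(roadmaps))
print(neighborhood_size(("b", "x"), roadmaps))
print(is_composite_edge(("a", "x"), ("b", "x"), roadmaps))
```

The product formula counts the neighbors in which every robot moves. Under the edge definition that also lets robots remain motionless, a vertex has $\prod_{i}(|\mathtt{Adj}(v_{i},\mathbb{G}_{i})|+1)-1$ distinct neighbors, which is larger still.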

The presented algorithms share a common set of input and output parameters, such as the configuration space decompositions, which are predefined. In practice, the algorithms use pre-computed roadmaps in each constituent space online. The collision volumes that correspond to the robot and obstacles in the scene are also used online for validation. The algorithms output a trajectory in the configuration space of all robots, which is collision free with all obstacles and among robots.

4.2 Discrete RRT

An explicit construction of $\hat{\mathbb{G}}$ is possible in very limited settings that either involve few robots, e.g., $R=2$, or when the underlying single-robot roadmaps have few vertices and edges. However, in general it is prohibitively costly to fully represent it due to its size, which grows exponentially with the number of robots, in terms of the number of vertices. Moreover, in some cases it may even be a challenge to represent all the edges adjacent to a single vertex of $\hat{\mathbb{G}}$, as there may be exponentially many of those.

The ${\tt dRRT}$ algorithm exploits the rich structure that $\hat{\mathbb{G}}$ offers (see Section 6) without explicitly representing it. In particular, it gathers information on $\hat{\mathbb{G}}$ only from the single-robot roadmaps $\mathbb{G}_{1},\ldots,\mathbb{G}_{R}$.

Similarly to the single-robot planner ${\tt RRT}$ (LaValle and Kuffner, 1999), ${\tt dRRT}$ grows a tree rooted at the start state of the given query (Algorithm 1, Line 1). ${\tt dRRT}$ restricts the growth of its tree $\mathbb{T}$ to the tensor-product roadmap $\hat{\mathbb{G}}$, in contrast to ${\tt RRT}$, which explores the entire space $\mathbb{C}$. That is, $\mathbb{T}$ is a subgraph of $\hat{\mathbb{G}}$, and $\mathbb{T}\subset\mathbb{C}^{\textup{f}}$.

The high-level operations of the ${\tt dRRT}$ approach are outlined in Algorithm 1. The approach iterates until a solution is found or the time limit is exceeded (Algorithm 1, Line 2). Each iteration begins by performing a fixed number $n_{\textrm{it}}$ of expansion steps (Lines 3, 4); this expansion process is outlined in Algorithm 2. The approach then checks whether there is a connected path to the target (Line 5), and once a path is found, it is returned (Lines 6, 7).

The expansion procedure begins by drawing a random sample $Q^{\textup{rand}}\in\mathbb{C}$ (Line 1). It then finds the nearest neighbor $V^{\textup{near}}$ in the tree (Line 2) and then selects a neighbor $V^{\textup{new}}$ , such that $(V^{\textup{near}},V^{\textup{new}})\in\mathbb{\hat{E}}$ , according to a direction oracle function $\mathbb{O}_{d}$ (Line 3). Then, if $V^{\textup{new}}$ is not in the tree (Line 4), it is added to the tree (Line 5) and an edge from $V^{\textup{near}}$ to $V^{\textup{new}}$ is also added (Line 6).

We now elaborate on $\mathbb{O}_{d}$ . Given $V^{\textup{near}},Q^{\textup{rand}}$ , the oracle returns a vertex $V^{\textup{new}}\in\mathbb{\hat{V}}$ that is the neighbor of $V^{\textup{near}}$ (in $\hat{}\mathbb{G}$ ) found in the direction of $Q^{\textup{rand}}$ . The crux of the approach is that $\mathbb{O}_{d}$ can come up with such a neighbor efficiently without relying on explicit representation of $\hat{}\mathbb{G}$ . Let $Q,Q^{\prime},Q^{\prime\prime}\in\mathbb{C}$ and define $\rho(Q,Q^{\prime})$ to be the ray through $Q^{\prime}$ starting at $Q$ . Then, denote $\angle_{Q}(Q^{\prime},Q^{\prime\prime})$ as the minimum angle between $\rho(Q,Q^{\prime})$ and $\rho(Q,Q^{\prime\prime})$ . Denote by $\mathtt{Adj}(V^{\textup{near}},\hat{}\mathbb{G})$ the set of neighbor nodes of $V^{\textup{near}}$ in $\hat{}\mathbb{G}$ , i.e., for every $V\in\mathtt{Adj}(V^{\textup{near}},\hat{}\mathbb{G})$ it holds that $(V^{\textup{near}},V)\in\mathbb{\hat{E}}$ . Then

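$$\mathbb{O}_{d}(V^{\textup{near}},Q^{\textup{rand}}):=\underset{V\in\mathtt{Adj}(V^{\textup{near}},\hat{}\mathbb{G})}{\textup{argmin}}\ \angle_{V^{\textup{near}}}(Q^{\textup{rand}},V).$$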

The implementation of $\mathbb{O}_{d}$ (Algorithm 5) proceeds in the following manner (see a two-robot case illustrated in Figure 3). Let $Q^{\textup{rand}}=(q^{\textup{rand}}_{1},\ldots,q^{\textup{rand}}_{R}),V^{\textup{near}}=(v^{\textup{near}}_{1},\ldots,v^{\textup{near}}_{R})$ . For every robot $1\leq i\leq R$ , the oracle extracts from $\mathbb{G}_{i}$ the neighbor $v^{\textup{new}}_{i}$ of $v^{\textup{near}}_{i}$ , which minimizes the expression $\angle_{v^{\textup{near}}_{i}}(q^{\textup{rand}}_{i},v^{\textup{new}}_{i})$ . Notice that such a search can be performed efficiently, as it only requires traversing the neighbors of $v^{\textup{near}}_{i}$ in $\mathbb{G}_{i}$ . The combination of all $v^{\textup{new}}_{i}$ yields $V^{\textup{new}}$ .
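The per-robot search described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each constituent roadmap is stored as a dictionary from a vertex (a coordinate tuple) to a list of neighboring vertices, the names `angle_at` and `direction_oracle` are hypothetical, and collision checking of the resulting composite edge is omitted.

```python
import math

def angle_at(origin, a, b):
    """Angle (radians) at `origin` between the rays origin->a and origin->b."""
    u = [x - o for x, o in zip(a, origin)]
    w = [x - o for x, o in zip(b, origin)]
    nu, nw = math.hypot(*u), math.hypot(*w)
    if nu == 0.0 or nw == 0.0:
        return math.pi  # degenerate ray: treat as the worst possible direction
    c = sum(x * y for x, y in zip(u, w)) / (nu * nw)
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def direction_oracle(roadmaps, v_near, q_rand):
    """For every robot, pick the roadmap neighbor best aligned with q_rand[i].

    roadmaps[i] maps a vertex of G_i to the list of its neighbors (assumed layout).
    Only the neighbors of v_near[i] are traversed; the tensor-product graph is
    never built explicitly.
    """
    v_new = []
    for adj, v_i, q_i in zip(roadmaps, v_near, q_rand):
        v_new.append(min(adj[v_i], key=lambda nb: angle_at(v_i, q_i, nb)))
    return tuple(v_new)
```

The cost of one oracle call in this sketch is the sum of the neighbor counts over the robots, rather than their product, which is the point of avoiding an explicit representation of the tensor-product roadmap.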

As in ${\tt RRT}$ , ${\tt dRRT}$ has a Voronoi-bias property (Lindemann and LaValle, 2004). Showing that ${\tt dRRT}$ exhibits Voronoi bias is slightly more involved compared to the basic ${\tt RRT}$ . This is illustrated in Figure 4. To generate an edge $(V,V^{\textup{new}})$ , random sample $Q^{\textup{rand}}$ must be drawn within the Voronoi cell of $V$ , denoted as $\textup{Vor}(V)$ (Figure 4(A)) and in the general direction of $V^{\textup{new}}$ , denoted as $\textup{Vor}^{\prime}(V)$ (Figure 4(B)). The intersection of these two volumes: $\textup{Vol}(V)=\textup{Vor}(V)\cap\textup{Vor}^{\prime}(V)$ , is the volume to be sampled so as to generate $V^{\textup{new}}$ via $V^{\textup{near}}$ as shown in Figure 4.

The high-level loop of the algorithm remains similar across the method variants. The input parameter $n_{it}$ denotes how many times the tree is expanded before the algorithm checks whether a solution has already been discovered. If $n_{it}=1$ , this check is performed every iteration. If tracing the path is an expensive operation (typically it corresponds to a heuristic search process over the tensor product roadmap), then the implementer can choose to use a higher value.
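The role of $n_{it}$ can be illustrated with a short sketch of the outer loop. The function name `drrt_loop` and the callables `expand` and `trace_path` are hypothetical placeholders for the expansion and path-tracing subroutines; the sketch only shows how the number of path checks is traded against the number of expansions.

```python
import time

def drrt_loop(expand, trace_path, n_it=1, time_limit=1.0):
    """Run n_it expansions between solution checks.

    expand():     performs one tree expansion (placeholder callable).
    trace_path(): returns a path if the tree connects start and target, else None.
    Stops at the first path found or when the time limit is exceeded.
    """
    deadline = time.monotonic() + time_limit
    while time.monotonic() < deadline:
        for _ in range(n_it):
            expand()
        path = trace_path()
        if path is not None:
            return path
    return None
```

With a larger `n_it`, `trace_path` is invoked less often, at the price of possibly performing a few expansions after a solution already exists in the tree.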

5 Asymptotically Optimal Discrete RRT

This section outlines two versions of the proposed asymptotically optimal variant of the ${\tt dRRT}$ method. The first is a simple uninformed approach, which relies on the fact that a simple rewiring scheme is sufficient to provide asymptotic optimality. This simplified version will be called the asymptotically-optimal ${\tt dRRT}$ ( ${\tt ao\mbox{-}dRRT}$ ). For the sake of algorithmic efficiency, however, a second, more advanced version is also proposed, referred to as ${\tt dRRT^{*}}$ . To summarize the algorithmic contributions of the current work over the original ${\tt dRRT}$ :

•

${\tt dRRT^{*}}$ performs a rewiring step to refine paths in the tree, reducing costs to reach particular nodes.

•

${\tt dRRT^{*}}$ is anytime, employing branch and bound pruning after an initial solution is reached.

•

${\tt dRRT^{*}}$ promotes progress towards the goal during tree node selection.

•

${\tt dRRT^{*}}$ employs an informed expansion procedure $\mathbb{I}_{d}$ capable of using heuristic guidance.

5.1 ao- ${\tt dRRT}$

This section outlines ${\tt ao\mbox{-}dRRT}$ , an asymptotically optimal version of the ${\tt dRRT}$ algorithm which has been minimally modified to guarantee asymptotic optimality. At a high level, the approach uses a tree re-wiring technique reminiscent of $\tt RRT^{\text{*}}$ (Karaman and Frazzoli, 2011).

Algorithm 3 outlines ${\tt ao\mbox{-}dRRT}$ which iteratively expands a tree $\mathbb{T}$ over $\hat{}\mathbb{G}$ given a time budget (Algorithm 3, Line 2), performing $n_{\textrm{it}}$ consecutive calls to ${\tt Expand\_{\tt ao\mbox{-}dRRT}}$ (Lines 3, 4). Then, the method attempts to trace the path $\pi$ which connects the start $S$ with the target $T$ (Line 5). If such a path is found and is better than $\pi_{\textup{best}}$ , it replaces $\pi_{\textup{best}}$ (Lines 6, 7). $\pi_{\textup{best}}$ is returned after the time limit is reached (Line 8).

The expansion procedure for ${\tt ao\mbox{-}dRRT}$ is very similar to the original ${\tt dRRT}$ method, and is outlined in Algorithm 4. It begins by drawing a random sample in the composite configuration space (Line 1), and then finds the nearest neighbor $V^{\textup{near}}$ to this sample in the tree (Line 2). It then selects a neighbor $V^{\textup{new}}$ according to the oracle function $\mathbb{O}_{d}$ (Algorithm 5). This is the same oracle used in ${\tt dRRT}$ , which tries to select the neighbor of $V^{\textup{near}}$ most in the direction of $Q^{\textup{rand}}$ . Then, if $V^{\textup{new}}$ is not in the tree (Line 4), it is added to the tree (Line 5) and an edge from $V^{\textup{near}}$ to $V^{\textup{new}}$ is also added (Line 6). The expansion step differs in that if $V^{\textup{new}}$ is already in the tree (Line 7), the method performs a rewiring step (Line 8), which checks whether the new path to $V^{\textup{new}}$ is of lower cost than the existing one.
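The add-or-rewire decision at the end of this expansion step can be sketched as below. The tree is represented, as an assumption of this sketch, by a parent dictionary whose root maps to `None`, and costs are recomputed by walking to the root so that no cached cost can go stale after a rewire. The names `cost_to_root` and `add_or_rewire` are illustrative, and `edge_cost` is assumed to return a positive cost for a collision-free edge.

```python
def cost_to_root(parent, v, edge_cost):
    """Sum edge costs along the tree path from the root to v."""
    total = 0.0
    while parent[v] is not None:
        total += edge_cost(parent[v], v)
        v = parent[v]
    return total

def add_or_rewire(parent, v_near, v_new, edge_cost):
    """Add v_new under v_near, or rewire it if reaching it via v_near is cheaper."""
    if v_new not in parent:
        parent[v_new] = v_near          # new node: add vertex and edge
        return "added"
    via_near = cost_to_root(parent, v_near, edge_cost) + edge_cost(v_near, v_new)
    if via_near < cost_to_root(parent, v_new, edge_cost):
        parent[v_new] = v_near          # existing node: cheaper parent found
        return "rewired"
    return "unchanged"
```

Because the comparison is strict and edge costs are positive, a node is never rewired under one of its own descendants, so the structure remains a tree.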

The method is similar to ${\tt dRRT}$ in terms of the samples that constitute the tree; however, ${\tt ao\mbox{-}dRRT}$ improves the solution cost with iterations and finds better solutions than ${\tt dRRT}$ . It is nevertheless desirable to focus the search in order to find the initial solution quickly, while preserving solution quality improvement over time.

5.2 ${\tt dRRT^{*}}$

The main body of the informed ${\tt dRRT^{*}}$ algorithm is provided in Algorithm 6. The proposed method is an improvement on top of ${\tt ao\mbox{-}dRRT}$ , that preserves the asymptotic optimality while benefiting computationally from branch-and-bound pruning once a solution is found, greedy child propagations during node selection, and heuristic guidance during expansions.

The key insight behind the algorithmic improvements is that, by virtue of the structure of the tensor-product roadmap $\hat{}\mathbb{G}$ , there readily exists a usable heuristic measure $\mathbb{H}$ in the constituent roadmaps $\mathbb{G}_{i}$ . The shortest path on a constituent roadmap to the goal $T$ can be used as a heuristic to guide the tree. If the individual robot shortest paths introduce no robot interaction, the path composed of the individual shortest paths is itself a solution. In cases of interaction between the robots, a shortest path is expected to deviate locally in regions of interaction. The best a robot can do from any constituent roadmap vertex is to follow the shortest path to the goal on the constituent roadmap. Although domain-specific heuristics can also be applied to the algorithm, it should be noted that in the currently proposed method, the purpose of the heuristic is primarily to discover an initial solution as quickly as possible. This in turn helps branch-and-bound kick in and further focuses the search once a bounding cost is ascertained from the initial solution.

At a high level, Algorithm 6 follows the structure of ${\tt ao\mbox{-}dRRT}$ . The only change is that the outer loop keeps track of the tree node being added as $V^{\textup{last}}$ and passes it on to the next call to the $\mathtt{Expand\_{\tt dRRT^{*}}}$ subroutine. The use of this information to apply heuristic guidance is detailed in the description of the function.

Algorithm 7 outlines the expansion step. The default behavior is summarized in Algorithm 7, Lines 1-3, i.e., when no $V^{\textup{last}}$ is passed as argument (Line 1). This operation corresponds to an exploration step similar to ${\tt RRT}$ , i.e., a random sample $Q^{\textup{rand}}$ is generated in $\mathbb{C}$ (Line 2) and its nearest neighbor $V^{\textup{near}}$ in $\mathbb{T}$ is found (Line 3). If the last iteration generated a node $V^{\textup{last}}$ that was closer to the goal relative to its parent, then $V^{\textup{last}}$ is provided to the function. In this case (Lines 4-6) $Q^{\textup{rand}}$ is set as the target configuration $T$ , and $V^{\textup{near}}$ is selected as $V^{\textup{last}}$ . This constitutes the greedy child propagations, which promote progress towards the goal.
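The two selection modes can be sketched as a single function. This is an illustrative sketch only: composite configurations are assumed to be flat coordinate tuples, nearest-neighbor search is done by brute force with the Euclidean distance, and the names `select_sample_and_near` and `sample` are hypothetical.

```python
import math

def select_sample_and_near(tree_nodes, target, sample=None, v_last=None):
    """Return (Q_rand, V_near) for one expansion step.

    If a promoted child v_last is supplied, aim at the target from v_last
    (greedy mode). Otherwise draw a random sample and take its nearest tree
    node (exploration mode).
    """
    if v_last is not None:
        return target, v_last
    q_rand = sample()
    v_near = min(tree_nodes, key=lambda v: math.dist(v, q_rand))
    return q_rand, v_near
```

A practical implementation would replace the linear scan with a nearest-neighbor data structure; the scan is used here only to keep the sketch self-contained.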

Informed Expansion $\mathbb{I}_{d}$ : The expansion procedure in Algorithm 8 replaces the oracle in ${\tt dRRT}$ . It switches between distinct guided and exploratory behaviors according to whether $q^{rand}_{i}$ attempts to drive the expansion towards the target $T$ or not. When the method uses heuristic guidance, among all the neighbors of $v^{\textup{near}}_{i}$ on a constituent roadmap $\mathbb{G}_{i}$ , $\mathtt{Adj}(v^{\textup{near}}_{i},\mathbb{G}_{i})$ , the one with the best heuristic measure $\mathbb{H}$ is selected. During the exploration phase, the method selects a random neighbor out of $\mathtt{Adj}(v^{\textup{near}}_{i},\mathbb{G}_{i})$ .
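A minimal sketch of the two behaviors follows. It assumes, for illustration, that the guided mode for robot $i$ is triggered when $q^{rand}_{i}$ equals that robot's target, that each roadmap is a dictionary from a vertex to its neighbor list, and that `heuristics[i]` maps each vertex of $\mathbb{G}_{i}$ to its $\mathbb{H}$ value (lower is better). The name `informed_expansion` is hypothetical, and the candidate set includes the vertex itself so that a robot may stay in place.

```python
import random

def informed_expansion(roadmaps, heuristics, v_near, q_rand, target, rng=random):
    """Per robot: best-heuristic neighbor when guided, random neighbor otherwise."""
    v_new = []
    for adj, h, v_i, q_i, t_i in zip(roadmaps, heuristics, v_near, q_rand, target):
        candidates = list(adj[v_i]) + [v_i]   # neighbors plus the vertex itself
        if q_i == t_i:
            # guided behavior: neighbor with the smallest cost-to-go estimate
            v_new.append(min(candidates, key=lambda u: h[u]))
        else:
            # exploratory behavior: uniformly random neighbor
            v_new.append(rng.choice(candidates))
    return tuple(v_new)
```

Passing a seeded `random.Random` instance as `rng` makes the exploratory branch reproducible.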

In either case, the oracle function $\mathbb{I}_{d}$ returns to Algorithm 7 the implicit graph node $V^{\textup{new}}$ that is a neighbor of $V^{\textup{near}}$ on the implicit graph (Line 7). Then the method finds the neighbors $N$ , which are adjacent to $V^{\textup{new}}$ in $\hat{}\mathbb{G}$ and have also been added to $\mathbb{T}$ (Line 8). Among $N$ , the best node $V^{\textup{best}}$ according to the cost measure is chosen as the parent to which $V^{\textup{new}}$ is connected. Such an operation might yield no valid parent $V^{\textup{best}}$ due to collisions along $\mathbb{L}(\cdot)$ . In such a case (Line 10) the method fails to add a node during the current iteration. Line 11 implements branch-and-bound pruning based on the cost of the best solution so far.

Lines 12-15 describe the tree addition and parent rewire process. Lines 16-17 perform an additional rewiring step in the neighborhood if $V^{\textup{new}}$ is a better parent of any of the neighboring nodes. Line 19 toggles the child promotion by checking whether $V^{\textup{new}}$ made progress toward the goal according to the heuristic measure. The method ensures that in this case $V^{\textup{new}}$ (or $V^{\textup{last}}$ ) is a child-promoted node, which would be selected during the next iteration. The effect of this behavior is that if the uncoordinated individual shortest paths are collision free, they are greedily attempted first from the child-promoted nodes added to the tree. Evaluations indicate that this proves very effective in practice.

It should be noted that all candidate edges $\mathbb{L}(\cdot)$ in Lines 9 and 17 are collision-free. For the sake of algorithmic clarity, collision checking is assumed to be encoded into the steering function $\mathbb{L}(\cdot)$ , and this is enforced during tree additions and rewires. Any specialized sampling behavior is assumed to be part of the implementation of the subroutine $\mathtt{Random\_Sample}$ .

Notes on Implementation: In the implementation, the heuristic measure $\mathbb{H}$ is efficiently calculated by precomputing all-pair shortest paths on the constituent graphs with Johnson’s algorithm (Johnson, 1977), which runs in $\mathcal{O}(|V|^{2}\log|V|)$ . Precomputing the heuristic measure avoids spending online computation time on it. It is proposed that for large graphs, the vertices can be subsampled and the heuristic estimated for representative nodes that approximate the $\mathbb{H}$ value in their neighborhoods. The neighbor with the best $\mathbb{H}$ value can be computed once for a given target $T$ , and reused during the iterations inside $\mathbb{I}_{d}$ . In Algorithm 7 (Line 19), $\mathbb{H}$ refers to a heuristic estimate in the composite space. This can be deduced from the constituent spaces. The method also includes additional focused random sampling (Gammell et al, 2015) once a solution is found to aid in convergence. For a fraction of the “random samples”, goal biasing samples the target state for a robot.
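For a single fixed target, the cost-to-go values on one constituent roadmap can be precomputed as sketched below. Note that this sketch swaps in a single-source Dijkstra search from the target rather than the all-pairs computation with Johnson's algorithm mentioned above; it also assumes an undirected roadmap stored as a neighbor dictionary, and it combines per-robot values by summation, which is one possible way (an assumption here) of deducing a composite estimate. The names `cost_to_go` and `composite_heuristic` are hypothetical.

```python
import heapq

def cost_to_go(adj, weight, target):
    """Dijkstra from `target`; returns {vertex: shortest roadmap distance to target}."""
    dist = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v in adj[u]:
            nd = d + weight(u, v)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def composite_heuristic(per_robot_h, composite_vertex):
    """Assumed combination rule: sum the individual cost-to-go values."""
    return sum(h[v] for h, v in zip(per_robot_h, composite_vertex))
```

The resulting dictionaries can be computed once per query and looked up in constant time during the expansion steps.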

The function $\mathtt{Adj}(\cdot)$ returns the set of neighbors of a vertex on the graph, in addition to the vertex itself. This ensures that it is possible for a robot to stay static during an edge expansion. This means that the algorithm is also able to discover solutions where a subset of the robots must remain stationary for a period of time.

6 Analysis

This section examines the properties of ${\tt dRRT^{*}}$ starting with the asymptotic convergence of the implicit roadmap $\hat{}\mathbb{G}$ to containing a path in $\mathbb{C}^{\textup{f}}$ with optimum cost. Then, it is shown that the online search eventually discovers the shortest path in $\hat{}\mathbb{G}$ . The combination of these two facts proves the asymptotic optimality of ${\tt dRRT^{*}}$ and ${\tt ao\mbox{-}dRRT}$ .

For simplicity, the analysis considers robots operating in Euclidean space, i.e., $\mathbb{C}_{i}$ is a $d$ -dimensional Euclidean hypercube $[0,1]^{d}$ for fixed $d\geq 2$ . Robots are assumed to have the same number of degrees of freedom $d$ . The results can relate to a large class of systems, which are locally Euclidean (see, Dobson and Bekris (2013)). This is applicable to all the systems under consideration in the paper, including manipulators, with bounded angular degrees of freedom. Analysis of systems, which are not locally Euclidean, requires additional rigor especially regarding the definition of the cost metric. The Discussion section includes a description of a possible extension of the presented analysis to non-holonomic systems. It is acknowledged that the arguments presented in the current section will not readily transfer to such systems.

6.1 Optimal Convergence of $\hat{}\mathbb{G}$

In this section we prove that when the connection radius $r(n)$ used for the construction of the single-robot PRM roadmaps $\mathbb{G}_{1},\ldots,\mathbb{G}_{R}$ is chosen in a certain manner, this yields a tensor-product graph $\hat{}\mathbb{G}$ , which contains asymptotically optimal paths for MMP. (In the graphs considered here, an edge exists between two nodes if the nodes are separated by a distance less than the connection radius $r(n)$ .)

Definition 3

A trajectory $\Sigma:[0,1]\rightarrow\mathbb{C}^{\textup{f}}$ is robust if there exists a fixed $\delta>0$ such that for every $\tau\in[0,1],X\in\mathbb{C}^{\textup{o}}$ it holds that $\|\Sigma(\tau)-X\|\geq\delta$ , where $\|\cdot\|$ denotes the standard Euclidean distance.

Definition 4

Let cost be one of the cost functions defined in Section 3. A value $c>0$ is robust if for every fixed $\epsilon>0$ , there exists a robust path $\Sigma$ , such that $\textup{cost}(\Sigma)\leq(1+\epsilon)c$ . The robust optimum $c^{*}$ , is the infimum over all such values.

For any fixed $n\in\mathbb{N}^{+}$ , and a specific instance of $\hat{}\mathbb{G}$ constructed from $R$ roadmaps, having $n$ samples each, denote by $\Sigma^{(n)}$ the lowest-cost path (with respect to $\textup{cost}(\cdot)$ ) from $S$ to $T$ over $\hat{}\mathbb{G}$ .

Definition 5

$\hat{}\mathbb{G}$ is asymptotically optimal (AO) if for every fixed $\epsilon>0$ it holds that $\textup{cost}(\Sigma^{(n)})\leq(1+\epsilon)c^{*}$ asymptotically almost surely (a.a.s.), where the probability is over all the instantiations of $\hat{}\mathbb{G}$ with $n$ samples for each PRM. Here, for random variables $A_{1},A_{2},\ldots$ in some probability space and an event $B$ depending on $A_{n}$ , $B$ occurs a.a.s. if $\lim\limits_{n\rightarrow\infty}\Pr[B(A_{n})]=1$ .

Using this definition, the following theorem is proven. Recall that $d$ denotes the dimension of a single-robot configuration space.

Theorem 6.1

$\hat{}\mathbb{G}$ is AO when

$$r(n)\geq r^{*}(n)=\gamma\left(\frac{\log n}{n}\right)^{\frac{1}{d}},$$

where $\gamma=(1+\eta)2\left(\frac{1}{d}\right)^{\frac{1}{d}}\left(\frac{\mu(\mathbb{C}^{\textup{f}})}{\zeta_{d}}\right)^{\frac{1}{d}}$ , $\eta$ is any constant larger than [math], $\mu$ is the volume measure, and $\zeta_{d}$ is the volume of a unit hyperball in $\mathbb{R}^{d}$ .
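To get a feel for the magnitude of the radius, the constant $\gamma$ can be evaluated numerically. The sketch below assumes the radius takes the form $r^{*}(n)=\gamma\left(\log n/n\right)^{1/d}$ used in the literature this analysis builds on (Janson et al, 2015), uses the standard closed form $\zeta_{d}=\pi^{d/2}/\Gamma(d/2+1)$ for the unit-ball volume, and picks `eta=0.1` purely as an illustrative value. The function names are hypothetical.

```python
import math

def unit_ball_volume(d):
    """Volume of the unit ball in R^d: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def connection_radius(n, d, free_volume, eta=0.1):
    """Evaluate gamma * (log n / n)^(1/d) under the assumptions stated above.

    n:           number of samples in a single-robot roadmap
    d:           dimension of a single-robot configuration space
    free_volume: mu(C^f), the volume of the free space
    eta:         tuning constant (illustrative value, an assumption of this sketch)
    """
    gamma = (1 + eta) * 2 * (1 / d) ** (1 / d) * (free_volume / unit_ball_volume(d)) ** (1 / d)
    return gamma * (math.log(n) / n) ** (1 / d)
```

Under these assumptions the radius shrinks as $n$ grows, so roadmaps become denser while each vertex keeps on the order of $\log n$ neighbors.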

Since the method deals with solving the problem of finding a robust optimum solution, some $\epsilon>0$ is fixed. By definition of the problem, there exists a robust trajectory $\Sigma:[0,1]\rightarrow\mathbb{C}^{\textup{f}}$ , and a fixed $\delta>0$ , such that $\textup{cost}(\Sigma)\leq(1+\epsilon)c^{*}$ . Additionally for every $X\in\mathbb{C}^{\textup{o}},\ \tau\in[0,1]$ it holds that $\|\Sigma(\tau)-X\|\geq\delta$ .

If it can be shown that $\hat{}\mathbb{G}$ contains a trajectory $\Sigma^{(n)}$ such that the following bound holds (here the small-o notation $o(1)$ indicates a function that becomes smaller than any positive constant and thereby becomes asymptotically negligible; when the relation holds, the positive constant corresponds to $\epsilon$ ):

$$\textup{cost}(\Sigma^{(n)})\leq(1+o(1))\cdot\textup{cost}(\Sigma)\qquad(1)$$

a.a.s., this would imply that $\textup{cost}(\Sigma^{(n)})\leq(1+\epsilon)c^{*}$ , proving Theorem 6.1.

As a first step, it will be shown that the robustness of $\Sigma=(\sigma_{1},\ldots,\sigma_{R})$ in the composite space implies robustness in the single-robot setting, i.e., robustness along $\sigma_{i}$ .

For $\tau\in[0,1]$ define the forbidden space parameterized by $\tau$ as

[TABLE]

Claim 1

For every robot $i$ , $\tau\in[0,1]$ , and $q_{i}\in\mathbb{C}^{\textup{o}}_{i}(\tau)$ , $\|\sigma_{i}(\tau)-q_{i}\|\geq\delta$ , i.e., the robustness of $\Sigma=(\sigma_{1},\ldots,\sigma_{R})$ in the composite space implies robustness over all single-robot paths $\sigma_{i}$ .

Proof

Fix a robot $i$ , and fix some $\tau\in[0,1]$ and a configuration $q_{i}\in\mathbb{C}^{\textup{o}}_{i}(\tau)$ . Next, define the following composite configuration

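$$Q=(\sigma_{1}(\tau),\ldots,\sigma_{i-1}(\tau),q_{i},\sigma_{i+1}(\tau),\ldots,\sigma_{R}(\tau)).$$

Note that it differs from $\Sigma(\tau)$ only in the $i$ -th robot’s configuration, and that $Q\in\mathbb{C}^{\textup{o}}$ because $q_{i}$ is forbidden for robot $i$ at time $\tau$ . By the robustness of $\Sigma$ it follows that

$$\delta\leq\|\Sigma(\tau)-Q\|=\|\sigma_{i}(\tau)-q_{i}\|.$$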

The result of Claim 1 is that the paths $\sigma_{1},\dots,\sigma_{R}$ are robust in their individual spaces w.r.t the parameterized forbidden space $\mathbb{C}^{\textup{o}}_{i}(\tau)$ . This means that there is sufficient clearance for the individual robots to not collide with each other given a fixed location of a single robot.

Next, a Lemma is derived using proof techniques from the literature (Janson et al, 2015, Theorem 4.1), and it implies every $\mathbb{G}_{i}$ contains a single-robot path $\sigma_{i}^{(n)}$ that converges to $\sigma_{i}$ .

Lemma 1

For every robot $i$ , let $\mathbb{G}_{i}$ be constructed with $n$ samples and a connection radius $r(n)\geq r^{*}(n)$ . Then it contains a path $\sigma_{i}^{(n)}$ with the following attributes a.a.s.:

(i)

$\sigma_{i}^{(n)}(0)=s_{i}$ , $\sigma_{i}^{(n)}(1)=t_{i}$ ;

(ii)

$\|\sigma_{i}^{(n)}\|\leq(1+o(1))\|\sigma_{i}\|$ ;

(iii)

$\forall q\in\textup{Im}(\sigma_{i}^{(n)})$ , $\exists\tau\in[0,1]$ s.t. $\|q-\sigma_{i}(\tau)\|\leq r^{*}(n)$ , where Im $(\cdot)$ is the function image.

Proof

Properties (i) and (ii) of Lemma 1 restate (Janson et al, 2015, Theorem 4.1), which is applicable to the setup of this work. The last property (iii) is an immediate corollary of the first two: due to the fact that $\sigma_{i}^{(n)}$ is obtained from $\mathbb{G}_{i}$ , every point along the path is either a vertex of the graph, or lies on a straight-line path (i.e., an edge) between two vertices, whose length is at most $r^{*}(n)$ .

To complete the proof of Theorem 6.1 it remains to be shown that the combination of $\sigma_{1}^{(n)},\ldots,\sigma_{R}^{(n)}$ yields a trajectory $\Sigma^{(n)}$ of the desired cost, i.e., one that conforms to Equation 1. The bound derived in Lemma 1 (ii) has the form required by Theorem 6.1, but it holds only in the individual spaces; it still needs to be shown that Equation 1 holds for a cost function in $\mathbb{C}^{\textup{f}}$ . We proceed to show this individually for the different cost functions.

6.1.1 Optimal Convergence for a Linear combination of Euclidean arc lengths

Lemma 2

Given Lemma 1 (ii), Equation 1 holds for a cost function $\textup{cost}(\cdot)$ that is a linear combination of Euclidean arc lengths.

Proof

Here consider the case that $\textup{cost}(\Sigma)=\sum_{i=1}^{R}\|\sigma_{i}\|$ , which can also be easily modified for $\max_{i=1:R}\|\sigma_{i}\|$ or some arbitrary linear combination of the arc lengths.

In particular, define $\Sigma^{(n)}=(\sigma_{1}^{(n)},\ldots,\sigma_{R}^{(n)})$ , where $\sigma_{i}^{(n)}$ are obtained from Lemma 1. Then

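$$\textup{cost}(\Sigma^{(n)})=\sum_{i=1}^{R}\|\sigma_{i}^{(n)}\|\leq\sum_{i=1}^{R}(1+o(1))\|\sigma_{i}\|=(1+o(1))\cdot\textup{cost}(\Sigma).$$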

Lemma 3

A path $\Sigma^{(n)}=(\sigma_{1}^{(n)}\ldots\sigma_{R}^{(n)})$ exists, that satisfies the properties of Lemma 1, and is collision free both in terms of robot-obstacle and robot-robot.

Proof

Every constituent roadmap $\mathbb{G}_{i}$ of $n$ samples is constructed to satisfy Lemma 1 and contains individual robot paths $\sigma_{i}^{(n)}$ . $\hat{}\mathbb{G}$ defines the tensor-product graph in the composite configuration space $\mathbb{C}$ . The path $\Sigma^{(n)}$ is a combination of the individual robot paths $\sigma_{i}^{(n)}$ . Lemma 1 implies that $\hat{}\mathbb{G}$ contains a path $\Sigma^{(n)}$ in $\mathbb{C}$ , that represents collision-free motions relative to obstacles, and minimizes the cost function. Nevertheless, it is not clear whether this ensures the existence of a path where robot-robot collisions are avoided. That is, although $\textup{Im}(\sigma^{(n)}_{i})\subset\mathbb{C}^{\textup{f}}_{i}$ , it might be the case that $\textup{Im}(\Sigma^{(n)})\cap\mathbb{C}^{\textup{o}}\neq\emptyset$ . Next, it is shown that $\sigma_{1}^{(n)},\ldots,\sigma_{R}^{(n)}$ can be reparametrized to induce a composite-space path whose image is fully contained in $\mathbb{C}^{\textup{f}}$ , with length equivalent to $\Sigma^{(n)}$ .

For each robot $i$ , denote by $V_{i}=(v_{i}^{1},\ldots,v_{i}^{\ell_{i}})$ the chain of $\mathbb{G}_{i}$ vertices traversed by $\sigma^{(n)}_{i}$ . For every $v_{i}^{j}\in V_{i}$ assign a timestamp $\tau_{i}^{j}$ of the closest configuration along $\sigma_{i}$ , i.e.,

$$\tau_{i}^{j}=\underset{\tau\in[0,1]}{\textup{argmin}}\ \|v_{i}^{j}-\sigma_{i}(\tau)\|.$$

Also, define $\mathcal{T}_{i}=(\tau_{i}^{1},\ldots,\tau_{i}^{\ell_{i}})$ and denote by $\mathcal{T}$ the ordered list of $\bigcup_{i=1}^{R}\mathcal{T}_{i}$ , according to the timestamp values. Now, for every $i$ , define a global timestamp function $T\!S_{i}:\mathcal{T}\rightarrow V_{i}$ , which assigns to each global timestamp in $\mathcal{T}$ a single-robot configuration from $V_{i}$ . It thus specifies in which vertex robot $i$ resides at time $\tau\in\mathcal{T}$ . For $\tau\in\mathcal{T}$ , let $j$ be the largest index, such that $\tau_{i}^{j}\leq\tau$ . Then simply assign $T\!S_{i}(\tau)=v_{i}^{j}$ . From property (iii) in Lemma 1 and Claim 1 it follows that no robot-robot collisions are induced by the reparametrization. This concludes the proof of Theorem 6.1.
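The global timestamp construction can be sketched in a few lines. The sketch assumes each robot's chain is given as a list of `(timestamp, vertex)` pairs sorted by timestamp, with the first timestamp equal to the smallest global timestamp (e.g., all chains start at 0). The function name `synchronize` is hypothetical, and duplicate timestamps across robots are merged.

```python
def synchronize(chains):
    """Build the composite vertex sequence induced by the global timestamps.

    chains[i] is the sorted list of (timestamp, vertex) pairs for robot i.
    For each global timestamp tau, robot i resides at the vertex whose
    timestamp is the largest one not exceeding tau.
    """
    times = sorted({t for chain in chains for t, _ in chain})
    schedule = []
    for tau in times:
        composite = []
        for chain in chains:
            current = [v for t, v in chain if t <= tau][-1]
            composite.append(current)
        schedule.append((tau, tuple(composite)))
    return schedule
```

For two robots with chains `[(0.0, "a0"), (0.5, "a1"), (1.0, "a2")]` and `[(0.0, "b0"), (0.25, "b1"), (1.0, "b2")]`, the second robot advances at 0.25 while the first waits, then the first advances at 0.5, and both reach their final vertices at 1.0.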

6.1.2 Optimal Convergence for Euclidean arc length

Arguments for convergence of the cost of the solution in terms of the Euclidean arc length of the composite path $\Sigma$ can be made to extend the results of Lemma 2. A robot having $d$ DoFs, $(F_{1},\dots F_{d})$ exists in an $\mathbb{R}^{d}$ space. The motion of the robot constitutes a curve in $\mathbb{R}^{d}$ , defined as

[TABLE]

where $f_{i}(t)$ is the coordinate function of the curve $\Sigma$ along the DoF $F_{i}$ , where $t\in[0,1]$ and $i\in[1,d]$ . The Euclidean arc-length of the path in $\mathbb{R}^{d}$ :

[TABLE]

The coordinate function is assumed to be continuous with respect to the Lebesgue measure on $t$ and of bounded variation. The Lebesgue measure assigns a measure to subsets of $n$ -dimensional Euclidean space. For $n=1,2,$ or $3$ , it coincides with the standard measure of length, area, or volume. The assumption here is that the curve is Lebesgue integrable over the coordinate function. This relates to the variation of the curve being smooth for the parametrization, over subsets of the curve that correspond to the Lebesgue measure over $t$ . $\Sigma$ is also a rectifiable curve, i.e., the curve has finite length.

Definition 6

The partial arc length is defined as,

[TABLE]

By definition and assuming smoothness and bounded variation (Pelling, 1977), for $t=0$ to $t=x\leq 1:$

[TABLE]

The value is the supremum over all possible finite partitions $P:0=t_{0}<\dots<t_{m}=x$ , that can divide $t$ . This generates a finite set of $m$ parts.

We denote the value of $s(x)$ for some partition $P$ as:

[TABLE]

A part shall refer to the curve between $\Sigma(P(j-1))$ and $\Sigma(P(j))$ . The measure of the part would be

[TABLE]

Let $P^{*}$ be the finite supremum partitioning over $t$ , that has $m$ parts. This means, $s(x)=s(x)_{P^{*}}$ . Without loss of generality, let us assume that $P^{*}$ corresponds to the supremum partitioning that has the least number of parts, $|P^{*}|=m+1$ , i.e., there are no degenerate partitions. A finer partition can introduce additional parameterization over $t$ , and hence is a superset of $P^{*}$ , but cannot increase the value of $s(x)$ since $P^{*}$ is the supremum. Note that a partition sequence with $m$ parts has a cardinality of $m+1$ ; the finer partition has all these $m+1$ parametrization values in addition to others. Since $s(x)$ is finite and $m$ is finite, $\exists P^{*}\ |\ \mu_{j-1,j}^{P^{*}}\in\mathbb{R}_{+}\ \forall\ j\in t$ .

Claim 2

Given a finite set of paths $\xi$ , there exists a finer partitioning, $Q^{*}$ for $P^{*}$ over each $\Sigma\in\xi$ , which yields $L$ number of equal measure parts (Figure 6) for every $\Sigma\in\xi$ .

Proof

[TABLE]

This holds true for every $\Sigma$ for a corresponding $Q^{*}\supseteq P^{*}$ . The measure of every part $\Sigma(l)$ is equal, and is denoted by $\|\Sigma(l)\|=\frac{\|\Sigma\|}{L}$ . This simplifies Equation 5 to $\|\Sigma\|=\sum_{l=1}^{L}\|\Sigma(l)\|$ .
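The behavior of partition sums under refinement can be checked numerically on a small piecewise-linear example, which is the kind of path a roadmap produces. The curve below (an L-shaped path in the plane with its corner at $t=0.5$ ) and the names `partition_length` and `l_shaped` are illustrative assumptions, not part of the paper.

```python
import math

def partition_length(curve, partition):
    """Sum of straight-line distances between consecutive partition points."""
    pts = [curve(t) for t in partition]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def l_shaped(t):
    """(0,0) -> (1,0) -> (1,1), parametrized over t in [0, 1]."""
    return (2 * t, 0.0) if t <= 0.5 else (1.0, 2 * t - 1)

coarse = partition_length(l_shaped, [0.0, 1.0])                    # misses the corner
exact = partition_length(l_shaped, [0.0, 0.5, 1.0])                # contains the corner
finer = partition_length(l_shaped, [0.0, 0.25, 0.5, 0.75, 1.0])    # refinement
```

Here the partition containing the corner attains the full arc length, the coarser one underestimates it, and the four-part refinement leaves the total unchanged while splitting the path into parts of equal measure.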

Claim 3

Additionally, by this simplification, Equation 5 in the composite space is restated for $\Sigma_{Rd}=\{\Sigma_{1}\ldots\Sigma_{R}\}$ where $\Sigma_{Rd},\Sigma_{1}\ldots\Sigma_{R}\in\xi$ .

Proof

In the multi-robot Euclidean space $\mathbb{R}^{Rd}$ , the arc length in the composite space can be expressed in terms of the arc lengths traversed in the individual robot spaces.

[TABLE]

where, with a slight abuse of notation $\delta f_{i}$ is a shorthand representation for some $f_{i}(Q^{*}(j))-f_{i}(Q^{*}({j-1}))$ .

Lemma 4

For a $\Sigma^{(n)}=(\sigma_{1}^{(n)}\ldots\sigma_{R}^{(n)})$ , where $\sigma_{i}^{(n)}$ is obtained from Lemma 1, given that $\|\sigma_{i}^{(n)}\|\leq(1+o(1))\|\sigma_{i}\|$ , Equation 1 holds for the Euclidean arc lengths.

Proof

Partitioning the arcs $\sigma_{i}^{(n)}$ , and $\sigma_{i}$ , into $L$ (chosen as per Claim 2) pieces of equal length, yields two trajectory sequences, for $\ l\in N_{+},l\leq L$ .

The high level idea is that leveraging the uniformity in the parameterized parts introduced by $L$ , Lemma 1(ii) has to be recombined to represent the Euclidean arc length in the composite space.

[TABLE]

Following the same parameterization described in Lemma 3, Theorem 6.1 can be shown for the Euclidean metric as well.

6.2 Asymptotic Optimality of ${\tt dRRT^{*}}$

Finally, ${\tt dRRT^{*}}$ is shown to be AO. Denote by $m$ the time budget in Algorithm 6, i.e., the number of iterations of the loop. Denote by $\Sigma^{(n,m)}$ the solution returned by ${\tt dRRT^{*}}$ for $n$ samples in the individual constituent roadmaps and $m$ iterations of the ${\tt dRRT^{*}}$ algorithm.

Theorem 6.2

If $r(n)>r^{*}(n)$ then for every fixed $\epsilon>0$ it holds that

$$\lim_{n,m\rightarrow\infty}\Pr\left[\textup{cost}(\Sigma^{(n,m)})\leq(1+\epsilon)c^{*}\right]=1.$$

Since $\hat{}\mathbb{G}$ is AO (Theorem 6.1), it suffices to show that for any fixed $n$ , and a fixed instance of $\hat{}\mathbb{G}$ , defined over $R$ PRMs with $n$ samples each, ${\tt dRRT^{*}}$ eventually (as $m$ tends to infinity), finds the optimal trajectory over $\hat{}\mathbb{G}$ . This property is stated in Lemma 5 and proven subsequently. The same arguments hold for both ${\tt dRRT^{*}}$ and ${\tt ao\mbox{-}dRRT}$ , with the difference highlighted explicitly in the proof.

Lemma 5 (Optimal Tree Convergence of ${\tt dRRT^{*}}$ )

Consider an arbitrary optimal path $\pi^{*}$ originating from $v_{0}$ and ending at $v_{t}$ , then let $O^{(m)}_{k}$ be the event such that after $m$ iterations of ${\tt dRRT^{*}}$ , the search tree $\mathbb{T}$ contains the optimal path up to segment $k$ . Then,

$$\liminf_{m\to\infty}\mathbb{P}\big(O^{(m)}_{t}\big)=1.$$

Proof

This is shown using Markov chain results (Grinstead and Snell, 2012, Theorem 11.3). Specifically, absorbing Markov chains can be leveraged to show that ${\tt dRRT^{*}}$ will eventually contain the optimal path over $\hat{}\mathbb{G}$ . An absorbing Markov chain has some subset of its states in which the transition matrix only allows self-transitions.

The proof follows by showing that the ${\tt dRRT^{*}}$ method can be described as an absorbing Markov chain, where the target state of a query is represented as an absorbing state in a Markov chain. For completeness, the theorem is re-stated here.

Theorem 6.3 (Thm 11.3 in Grinstead & Snell)

In an absorbing Markov chain, the probability that the process will be absorbed is 1 (i.e., $Q(m)\to 0$ as $m\to\infty$ ), where $Q(m)$ is the transition submatrix for all non-absorbing states.

The first part is that the ${\tt dRRT^{*}}$ search is cast as an absorbing Markov chain, and second, that the transition probability from each state to the next is nonzero, i.e., each state eventually connects to the target.

For query $(S,T)$ , let the sequence $V=\{v_{1},v_{2},\dots,v_{\textup{t}}\}$ of length $t$ represent the vertices of $\hat{}\mathbb{G}$ corresponding to the optimal path through the graph which connects these points, where $v_{\textup{t}}$ corresponds to the target vertex, and furthermore, let $v_{\textup{t}}$ be an absorbing state. Theorem 6.3 works under the assumption that each vertex $v_{\textup{i}}$ is connected to an absorbing state $v_{\textup{t}}$ .

Then, let the transition probability for each state have two values, one for each state transitioning to itself, which corresponds to the ${\tt dRRT^{*}}$ search expanding along some other arbitrary path. The other value is a transition probability from $v_{\textup{i}}$ to $v_{\textup{i}+1}$ . This corresponds to two slightly different cases for ${\tt ao\mbox{-}dRRT}$ and ${\tt dRRT^{*}}$ .

Case ${\tt ao\mbox{-}dRRT}$ : The transition probability from $v_{\textup{i}}$ to $v_{\textup{i}+1}$ corresponds to the method sampling within the volume $\textup{Vol}(v_{\textup{i}})$ . Then, as the second step, it must be shown that this volume has a positive probability of being sampled in each iteration. It is sufficient then to argue that $\frac{\mu(\textup{Vol}(v_{\textup{i}}))}{\mu(\mathbb{C}^{\textup{f}})}>0$ . Fortunately, for any finite $n$ , previous work has already shown that this is the case given general position assumptions (Solovey et al, 2015a, Lemma 2).

Case ${\tt dRRT^{*}}$ : In the case of ${\tt dRRT^{*}}$ due to the random neighborhood selection in the expansion $\mathbb{I}_{d}$ , there is a positive transition probability from $v_{\textup{i}}$ to $v_{\textup{i}+1}$ .

Given these results, the ${\tt dRRT^{*}}$ is cast as an absorbing Markov chain, which satisfies the assumptions of Theorem 6.3, and therefore, the matrix $Q(m)\to 0$ . This implies that the optimal path to the goal has been expanded in the tree, and therefore $\liminf_{m\to\infty}\mathbb{P}\big{(}O^{(m)}_{t}\big{)}=1.$
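The absorption argument can be illustrated on a simplified chain. The sketch below assumes, purely for illustration, that every transient state advances to the next state with the same probability `p` and otherwise stays put; in the actual algorithm these probabilities differ per state and are only known to be positive. The function name `unabsorbed_mass` is hypothetical.

```python
def unabsorbed_mass(t, p, m):
    """Probability that a forward chain with t states is not yet absorbed after m steps.

    States 0..t-2 are transient and state t-1 is absorbing (requires t >= 2).
    Each step advances with probability p and self-loops with probability 1 - p.
    The returned value is the total mass left in the transient states, i.e., the
    first-row sum of the m-th power of the transient submatrix.
    """
    dist = [1.0] + [0.0] * (t - 2)
    for _ in range(m):
        nxt = [0.0] * (t - 1)
        for k, mass in enumerate(dist):
            nxt[k] += (1 - p) * mass
            if k + 1 < t - 1:
                nxt[k + 1] += p * mass   # mass moving into the absorbing state is dropped
        dist = nxt
    return sum(dist)
```

For any fixed positive `p`, the remaining mass decays to zero as `m` grows, which mirrors the statement that the tree eventually contains the full optimal path.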

7 Extension to Shared Degrees of Freedom

This section describes an extension of the ${\tt dRRT^{*}}$ approach to systems with shared degrees of freedom (DoF), with specific focus on humanoid robots with two arms. The challenge here arises because of the high dimensionality of the robots. The shared DoF is a general formulation, which can refer to either degrees of freedom in a torso or a mobile base etc.

This section is structured in the same way as the rest of the algorithmic descriptions, and a lot of the shared notations and details are omitted for the sake of brevity. Instead, the interesting insights into the problems that arise due to the shared DoF are highlighted, and resolved. A high level overview of the differences of dual-arm ${\tt dRRT^{*}}$ ( $\tt da\mbox{-}dRRT^{*}$ ) from the previously stated methods includes:

•

$\tt da\mbox{-}dRRT^{*}$ decomposes the space by grouping the shared DoF with one of the arms.

•

$\tt da\mbox{-}dRRT^{*}$ implicitly builds two trees online that explore two tensor roadmaps.

•

$\tt da\mbox{-}dRRT^{*}$ needs additional arguments for proving robustness in Claim 1.

The current work does not get into aspects related to manipulation. Nevertheless, the primitives designed here can speed up dual-arm manipulation task planning, where computational benefits can be achieved by operating over multiple roadmaps (Gravot et al, 2002; Gravot and Alami, 2003b). The topology of dual-arm manipulation has been formalized (Koga and Latombe, 1994; Harada et al, 2014) and extended to the $N$ -arm case (Dobson and Bekris, 2015). It requires the consideration of multi-robot grasp planning (Vahrenkamp et al, 2010; Dogar et al, 2015), regrasping (Vahrenkamp et al, 2009), as well as closed kinematic chain constraints (Cortés and Siméon, 2005; Bonilla et al, 2017). Furthermore, force control strategies are helpful for multi-arm manipulation of a common object (Caccavale and Uchiyama, 2008). Recently coordinated control has been applied to solve human-robot interaction tasks (Sina Mirrazavi Salehian et al, 2016).

The algorithm is meant to address the applicability of ${\tt dRRT^{*}}$ to high dimensional humanoid robots with shared DoF.

7.1 Problem Setup and Notation

As shown in Fig. 7, the DoF $[F_{1},\ldots,F_{d}]$ can be grouped into left, right, and shared DoF subsets, so that: $\mathbb{C}=\mathbb{C}_{\textrm{{l}}}\times\mathbb{C}_{\textrm{{s}}}\times\mathbb{C}_{\textrm{{r}}}$ . A candidate solution path $\Sigma:[0,1]\rightarrow\mathbb{C}^{\textup{f}}$ can be decomposed to projections $[\Sigma_{l},\Sigma_{s},\Sigma_{r}]$ along $\mathbb{C}_{\textrm{{l}}}$ , $\mathbb{C}_{\textrm{{s}}}$ and $\mathbb{C}_{\textrm{{r}}}$ respectively.

The method proposes the construction of the following roadmaps, as shown in Fig. 8:

•

A left-shared $\mathbb{R}_{ls}(\mathbb{V}_{ls},\mathbb{E}_{ls})$ and a right-shared DoF roadmap $\mathbb{R}_{sr}(\mathbb{V}_{sr},\mathbb{E}_{sr})$ , where $\mathbb{V}_{ls}\subset\mathbb{C}_{\textrm{{l}}}\times\mathbb{C}_{\textrm{{s}}}$ and $\mathbb{V}_{sr}\subset\mathbb{C}_{\textrm{{s}}}\times\mathbb{C}_{\textrm{{r}}}$ . The edges are collision-free paths in the same spaces, i.e., no collisions with the static geometry, or self-collisions among the arm or the shared DoFs.

•

A left arm $\mathbb{P}_{\textrm{l}}(\mathbb{V}_{l},\mathbb{E}_{l})$ and a right arm roadmap $\mathbb{P}_{\textrm{r}}(\mathbb{V}_{r},\mathbb{E}_{r})$ , such that $\mathbb{V}_{l}\subset\mathbb{C}_{\textrm{{l}}}$ , and $\mathbb{V}_{r}\subset\mathbb{C}_{\textrm{{r}}}$ . These roadmaps do not consider the static geometry as they are not grounded by the shared DoFs. So, only self-collisions between arm links are avoided.

The method focuses on two tensor product roadmaps: $\hat{\mathbb{G}}_{l}=\mathbb{R}_{ls}\times\mathbb{P}_{r}$, and $\hat{\mathbb{G}}_{r}=\mathbb{R}_{sr}\times\mathbb{P}_{l}$. The method then simultaneously searches over $\hat{\mathbb{G}}_{l}$ and $\hat{\mathbb{G}}_{r}$ in a ${\tt dRRT^{*}}$-esque fashion.
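A tensor product roadmap never has to be stored explicitly, because the neighbors of a composite vertex can be enumerated from the constituent roadmaps on demand. The following is a minimal sketch under stated assumptions: the function name `tensor_neighbors`, the adjacency-dictionary representation, and the toy roadmaps are hypothetical, and the edge rule (each component either moves along one of its own edges or stays put) is assumed here rather than taken from the definition in Section 4.1.

```python
from itertools import product

def tensor_neighbors(vertex, roadmaps):
    """Enumerate neighbors of a composite vertex without building the product graph.

    vertex   -- tuple with one node id per constituent roadmap
    roadmaps -- list of adjacency dicts {node: set(neighbor nodes)}
    Assumption: each component may move along one of its own edges or stay put.
    """
    options = [sorted(adj[v] | {v}) for v, adj in zip(vertex, roadmaps)]
    # Drop the combination in which no component moves.
    return [cand for cand in product(*options) if cand != vertex]

# Hypothetical toy roadmaps standing in for R_ls and P_r.
R_ls = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
P_r = {0: {1}, 1: {0}}

neighbors = tensor_neighbors(("b", 0), [R_ls, P_r])
```

Only the two small adjacency dictionaries are kept in memory; the six composite vertices of this toy product are generated only when a query touches them.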

7.2 Methodology

This section describes the proposed method and the way $\tt da\mbox{-}dRRT^{*}$ builds a forest $\mathbb{T}$ of two trees, which explores both $\hat{\mathbb{G}}_{l}$ and $\hat{\mathbb{G}}_{r}$. In terms of the method’s properties it is sufficient to consider only one roadmap, but in practice, exploring both simultaneously helps convergence, since more candidate solutions and rewires can be evaluated. The approach converges faster than $\tt RRT^{\text{*}}$ in $\mathbb{C}$, and scales better than ${\tt PRM^{*}}$.

At a high level, the proposed dual-arm ${\tt dRRT^{*}}$ ( $\tt da\mbox{-}dRRT^{*}$ ) simultaneously explores the tensor product roadmaps $\hat{\mathbb{G}}_{l}$ and $\hat{\mathbb{G}}_{r}$ by building a search tree for each one, so as to find a solution from the start configuration $S$ to the target configuration $T$. For every vertex, the algorithm keeps track of which tensor product roadmap the vertex belongs to. Upon initialization, the tree starts with two vertices, $S_{l}$ and $S_{r}$, one corresponding to tensor product roadmap $\hat{\mathbb{G}}_{l}$ and the other to $\hat{\mathbb{G}}_{r}$. Then, at every iteration, the tree data structure $\mathbb{T}$ is expanded by adding a new edge and a node by calling an expand subroutine like Algorithm 7. The difference arises in the neighborhood calculation in Algorithm 7, Line 8. The neighborhood $N$ for $V^{\textup{new}}$ considers the tensor roadmap neighborhoods that are part of the tree for both roadmaps. $V^{\textup{new}}$ belongs to either $\hat{\mathbb{G}}_{l}$ or $\hat{\mathbb{G}}_{r}$. $\hat{V^{\textup{new}}}$ is chosen to be the nearest tree vertex that was generated on the other tensor roadmap. $N$ is the set of all tree vertices that are tensor roadmap neighbors of $V^{\textup{new}}$ or $\hat{V^{\textup{new}}}$. While doing rewires, care is taken to only rewire nodes belonging to the same tensor roadmap. The consideration of a richer neighborhood lets the algorithm ensure adequate exploration of both tensor roadmaps. The informed oracle $\mathbb{I}_{d}$ is similar to Algorithm 8, with the difference arising for the constituent roadmap $\mathbb{P}$, where the $\mathbb{H}$ estimate is simply the shortest Euclidean distance to the goal.
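The neighborhood rule described above can be sketched as follows. This is an illustration only, not the authors' implementation: the dictionary-based vertex representation, the function names, and the `are_neighbors` predicate (here a distance threshold restricted to vertices of the same roadmap) are all hypothetical stand-ins for the tensor roadmap adjacency test.

```python
import math

def nearest_on_other_roadmap(tree, v_new):
    """Nearest tree vertex (Euclidean) that was generated on the other tensor roadmap."""
    others = [v for v in tree if v["roadmap"] != v_new["roadmap"]]
    return min(others, key=lambda v: math.dist(v["q"], v_new["q"]), default=None)

def neighborhood(tree, v_new, are_neighbors):
    """Tree vertices adjacent to v_new or to its counterpart on the other roadmap."""
    v_hat = nearest_on_other_roadmap(tree, v_new)
    anchors = [v_new] + ([v_hat] if v_hat is not None else [])
    return [v for v in tree if any(are_neighbors(a, v) for a in anchors)]

def rewire_candidates(nbrs, v_new):
    """Only vertices from the same tensor roadmap as v_new may be rewired."""
    return [v for v in nbrs if v["roadmap"] == v_new["roadmap"]]

def toy_adjacency(a, b):
    # Hypothetical adjacency test: same roadmap and within unit distance.
    return a["roadmap"] == b["roadmap"] and 0.0 < math.dist(a["q"], b["q"]) <= 1.0

tree = [
    {"id": 0, "roadmap": "l", "q": (0.0, 0.0)},
    {"id": 1, "roadmap": "r", "q": (0.5, 0.0)},
    {"id": 2, "roadmap": "r", "q": (1.2, 0.0)},
    {"id": 3, "roadmap": "l", "q": (5.0, 5.0)},
]
v_new = {"id": 4, "roadmap": "l", "q": (0.6, 0.0)}

nbrs = neighborhood(tree, v_new, toy_adjacency)
rewirable = rewire_candidates(nbrs, v_new)
```

In this toy instance the neighborhood contains one vertex from each roadmap, but only the vertex that shares a roadmap with the new vertex is eligible for rewiring.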

Notes on Efficiency: The difference of the decomposition for shared degrees of freedom compared to ${\tt dRRT^{*}}$ is that $\mathbb{G}=\mathbb{R}\times\mathbb{P}$ does not give two kinematically independent spaces. Specifically, $\mathbb{P}$ depends on the shared ${\tt DoF}$ to be grounded to the frame of the robot. This means that the heuristic $\mathbb{H}$ is less informed for $\mathbb{P}$ and can only use the straight line distance. The ${\tt dRRT^{*}}$ algorithm does not work out of the box in the case of robots with shared degrees of freedom. The effect of the less expressive heuristic in $\tt da\mbox{-}dRRT^{*}$ translates into some degradation in performance relative to the case of two kinematically independent robotic arms. Nevertheless, $\tt da\mbox{-}dRRT^{*}$ is still significantly faster than operating directly in the composite space of the entire robot. There are not many methods that can practically compute solutions for such high-dimensional (e.g., 15 degrees of freedom) systems with kinematic dependences. The proposed $\tt da\mbox{-}dRRT^{*}$ method preserves some of the scalability benefits of ${\tt dRRT^{*}}$ and addresses the kinematic dependence that arises for many popular humanoid robots.

7.3 Analysis

Asymptotic optimality of tensor roadmaps: Given the decomposition, $\mathbb{C}$ is divided into two parts: $\mathbb{R}_{\textrm{ls}}$ and $\mathbb{P}_{\textrm{r}}$. If a robust optimal path $\Sigma^{*}$ exists in $\mathbb{C}$, most of the arguments of Section 6 still hold for this decomposition. Due to the nature of the space decomposition, since the constituent spaces do not correspond to kinematically independent robots, the clearance assumption in Claim 1 needs to be reworked.

Claim 4

Robustness in $\mathbb{C}$ implies robustness in $\mathbb{C}_{ls}$ and $\mathbb{C}_{r}$ . For every decomposition, $\tau\in[0,1]$ , and $q_{i}\in\mathbb{C}^{\textup{o}}_{i}(\tau)$ , $\|\sigma_{i}(\tau)-q_{i}\|_{2}\geq\delta$ .

Proof

Consider any $Q=(\Sigma_{ls}(\tau),q_{r})$ , where $q_{r}$ is a configuration in $\mathbb{C}_{r}$ so that the right arm collides either with the static geometry or with the left-shared part of the robot, which is at $\sigma_{ls}(\tau)$ . Given a robust $\Sigma$ , $Q$ is a colliding configuration: $\delta\leq\|\Sigma(\tau)-Q\|$ . But $Q$ and $\Sigma(\tau)$ only differ in $q_{r}$ , so the path $\sigma_{r}$ has clearance $\delta$

[TABLE]

By switching the decomposition of $\Sigma$ in $\hat{\mathbb{G}}_{l}$ into $(\sigma_{l},\sigma_{sr})$, by the above reasoning:

[TABLE]

This proves the robustness for $\sigma_{ls}$ . The same reasoning can be applied to $\mathbb{C}_{sr}$ and $\mathbb{C}_{l}$ .
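The norm step used in the proof can be spelled out as a short worked derivation. This is a sketch based on the surrounding text, not a reproduction of the displays missing from this copy; it assumes the Euclidean norm on $\mathbb{C}$ and that $Q$ agrees with $\Sigma(\tau)$ on the left-shared component.

```latex
\begin{align*}
\Sigma(\tau) &= \big(\sigma_{ls}(\tau),\, \sigma_{r}(\tau)\big), \qquad Q = \big(\sigma_{ls}(\tau),\, q_{r}\big),\\
\|\Sigma(\tau)-Q\|_{2}^{2} &= \|\sigma_{ls}(\tau)-\sigma_{ls}(\tau)\|_{2}^{2} + \|\sigma_{r}(\tau)-q_{r}\|_{2}^{2}
 = \|\sigma_{r}(\tau)-q_{r}\|_{2}^{2},\\
\|\sigma_{r}(\tau)-q_{r}\|_{2} &= \|\Sigma(\tau)-Q\|_{2} \;\geq\; \delta .
\end{align*}
```

The same computation applies when the roles of the components are exchanged, since only the block in which $Q$ differs from $\Sigma(\tau)$ contributes to the norm.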

It suffices to follow the proof structures outlined in Section 6 to argue asymptotic optimality for the method. It should be noted that due to the coupled nature of $\mathbb{C}$ introduced by the shared ${\tt DoF}$ , the use of the Euclidean cost metric is more applicable.

8 Experimental Validation

This section provides an experimental evaluation of ${\tt dRRT^{*}}$ by demonstrating practical convergence, scalability for disk robots, and applicability to dual-arm manipulation. The choice of a cost metric depends on the type of application and the underlying system properties. For systems without shared degrees of freedom, the considered cost function is the sum of individual Euclidean arc lengths, which is a popular choice for multi-robot systems. For systems with shared degrees of freedom, the combined nature of the underlying configuration space motivates the use of Euclidean arc length in the composite space as the metric. The results show that the properties and benefits of the proposed algorithms stay robust for both choices of cost functions.
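The two cost functions mentioned above can be contrasted on a discretized path. The snippet is a minimal sketch, assuming piecewise-linear paths sampled at common time steps; the function names and the two toy robot paths are hypothetical.

```python
import math

def arc_length(path):
    """Euclidean length of a piecewise-linear path given as a list of points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def sum_of_individual_lengths(paths):
    """Sum of each robot's own Euclidean arc length."""
    return sum(arc_length(p) for p in paths)

def composite_length(paths):
    """Euclidean arc length of the stacked path in the composite space.

    Assumes all robot paths are sampled at the same time steps.
    """
    # Concatenate the per-robot points at each time step into one composite point.
    stacked = [sum(points, ()) for points in zip(*paths)]
    return arc_length(stacked)

# Two hypothetical planar robots moving in lockstep.
robot_a = [(0.0, 0.0), (3.0, 0.0)]
robot_b = [(0.0, 0.0), (0.0, 4.0)]

summed = sum_of_individual_lengths([robot_a, robot_b])
composite = composite_length([robot_a, robot_b])
```

For this toy motion the summed metric is 7 while the composite metric is 5, which shows that the two cost functions can rank the same motion differently.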

8.1 Tests on Systems without Shared DoF

The approach and alternatives are executed on a cluster with Intel(R) Xeon(R) CPU E5-4650 @ 2.70GHz processors, and 128GB of RAM. The solution costs are evaluated in terms of the sum of Euclidean arc lengths.

2 Disk Robots among 2D Polygons: This base-case test involves $2$ disks ( $\mathbb{C}_{i}:=\mathbb{R}^{2}$ ) of radius $0.2$ with bounded velocity, in a $10\times 10$ region, inflated by the radius, as in Figure 9. The disks have to swap positions between $(0,0)$ and $(9,9)$. This is a setup where it is possible to compute the explicit roadmap, which is not practical in more involved scenarios. In particular, ${\tt dRRT^{*}}$ is tested against: a) running ${\tt A^{\text{*}}}$ on the implicit tensor roadmap $\hat{\mathbb{G}}$ (referred to as “Implicit $\tt A^{\text{*}}$ ”), where $\hat{\mathbb{G}}$ is defined over the same individual roadmaps with $N$ nodes as those used by ${\tt dRRT^{*}}$; b) an explicitly constructed ${\tt PRM^{*}}$ roadmap with $N^{2}$ nodes in $\mathbb{C}$; and c) the ${\tt ao\mbox{-}dRRT}$ variant of the algorithm.

Results are shown in Figure 10. ${\tt dRRT^{*}}$ converges to the optimal path over $\hat{\mathbb{G}}$, similar to the one discovered by Implicit $\tt A^{\text{*}}$, while quickly finding an initial solution of high quality. Furthermore, the implicit tensor product roadmap $\hat{\mathbb{G}}$ is of comparable quality to the explicitly constructed roadmap. ${\tt dRRT^{*}}$ converges faster than the corresponding ${\tt ao\mbox{-}dRRT}$ variant, as is evident from Figure 10 (left).

Table 1 presents running times. ${\tt dRRT^{*}}$ and Implicit $\tt A^{\text{*}}$ construct $2$ $N$-sized roadmaps (row 3), which are faster to construct than the ${\tt PRM^{*}}$ roadmap in $\mathbb{C}$ (row 1). ${\tt PRM^{*}}$ becomes very costly as $N$ increases. For $N=500$, the explicit roadmap contains $250,000$ vertices, taking $1.7$ GB of RAM to store, which was the upper limit for the machine used. When the roadmap can be constructed, it is fast to query (row 2). ${\tt dRRT^{*}}$ quickly returns an initial solution (row 5), on par with the solution times from the explicit roadmap and well before Implicit $\tt A^{\text{*}}$ returns a solution (row 4). The initial solution times are compared visually in Figure 10, which demonstrates the efficiency of ${\tt dRRT^{*}}$ compared to ${\tt ao\mbox{-}dRRT}$ as well. The next benchmark further emphasizes this point.

The comparison between the early solution time required to find a suboptimal solution by the proposed method against the computation time needed by the optimal $\tt A^{\text{*}}$ highlights the impact of roadmaps of increasing sizes. While ${\tt dRRT^{*}}$ ’s initial solution times barely change, the time taken by any variant of heuristic search over the composite roadmap increases with the size of the roadmap. This indicates that roadmaps of size similar to the tensor roadmaps considered here would rapidly cease to be solvable without anytime performance similar to that of ${\tt dRRT^{*}}$ .

Many Disk Robots among 2D Polygons: In the same environment as above, the number of robots $R$ is increased to evaluate scalability. The environment is kept unchanged so that the additional complexity comes purely from adding more robots to the planning problem. The effect of more difficult and practical planning scenarios is explored in the subsequent benchmarks with manipulators. Each robot starts on the perimeter of the environment and is tasked with reaching the opposite side. An $N=50$ roadmap is constructed for every robot. It quickly becomes intractable to construct a ${\tt PRM^{*}}$ roadmap in the composite space of many robots.

Figure 11 shows the inability of alternatives to compete with ${\tt dRRT^{*}}$ in scalability. Solution costs are normalized by an optimistic estimate of the path cost for each case, which is the sum of the optimal solutions for each robot, disregarding robot-robot interactions. The colored vertical bars represent the range of the average initial and final solution costs. Implicit $\tt A^{\text{*}}$ fails to return solutions even for 3 robots. Directly executing $\tt RRT^{\text{*}}$ in the composite space fails to do so for $R\geq 6$. The original ${\tt dRRT}$ method (without the informed search component) starts suffering in success ratio for $R\geq 4$ and returns worse quality solutions than ${\tt dRRT^{*}}$. The ${\tt ao\mbox{-}dRRT}$ variant performs similarly to ${\tt dRRT}$ in terms of success ratio but, as expected, finds better solutions than ${\tt dRRT}$. ${\tt dRRT^{*}}$ finds solutions up to $R=10$.
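The normalization described above can be sketched as follows, assuming each individual roadmap is a weighted graph and that the per-robot optimum is obtained with a shortest-path search. The helper names, the toy roadmaps, and the multi-robot solution cost of 8.75 are hypothetical and are not values from the experiments.

```python
import heapq

def shortest_cost(adj, start, goal):
    """Dijkstra over a weighted roadmap {node: {neighbor: edge_cost}}."""
    best = {start: 0.0}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, w in adj[node].items():
            new_cost = cost + w
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt))
    return float("inf")

def normalized_cost(solution_cost, roadmaps, queries):
    """Solution cost divided by the sum of per-robot optimal costs,
    which ignores robot-robot interactions and is therefore optimistic."""
    bound = sum(shortest_cost(adj, s, g) for adj, (s, g) in zip(roadmaps, queries))
    return solution_cost / bound

# Hypothetical roadmaps for two robots.
roadmap_1 = {"s": {"m": 1.0, "g": 5.0}, "m": {"s": 1.0, "g": 2.0}, "g": {"s": 5.0, "m": 2.0}}
roadmap_2 = {"s": {"g": 4.0}, "g": {"s": 4.0}}

ratio = normalized_cost(8.75, [roadmap_1, roadmap_2], [("s", "g"), ("s", "g")])
```

A ratio close to 1 means the coordinated solution costs little more than letting each robot follow its own optimal path.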

To give a sense of the size of the search space: for $R=10$, the tensor-product roadmap is an implicit structure with $50^{10}\approx 9.8\times 10^{16}$ vertices, i.e., roughly $100$ million billion.
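The gap between what the tensor product represents and what is actually stored can be checked with a few lines of arithmetic. The function names below are hypothetical; the inputs ($N=50$, $R=10$, and $N=500$ for two robots) are the figures quoted in this section.

```python
def implicit_vertex_count(nodes_per_roadmap, num_robots):
    """Number of composite vertices represented by the tensor product."""
    return nodes_per_roadmap ** num_robots

def stored_vertex_count(nodes_per_roadmap, num_robots):
    """Number of vertices actually stored across the individual roadmaps."""
    return nodes_per_roadmap * num_robots

implicit = implicit_vertex_count(50, 10)            # vertices in the implicit structure
stored = stored_vertex_count(50, 10)                # vertices kept in memory
explicit_two_disks = implicit_vertex_count(500, 2)  # explicit roadmap size for N = 500
```

The stored count grows linearly with the number of robots, whereas the represented count grows exponentially, which is why the explicit roadmap becomes impractical so quickly.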

Dual-arm manipulator: This test (Figure 12) shows the benefits of ${\tt dRRT^{*}}$ when planning for two $7$-dimensional arms. Figure 13 shows that $\tt RRT^{\text{*}}$ fails to return solutions within $100K$ iterations. Using small roadmaps is also insufficient for this problem. Both ${\tt dRRT^{*}}$ and Implicit $\tt A^{\text{*}}$ require larger roadmaps to begin succeeding. But with $N\geq 500$, Implicit $\tt A^{\text{*}}$ always fails, while ${\tt dRRT^{*}}$ maintains a $100\%$ success ratio. As expected, roadmaps of increasing size result in higher quality paths. The informed nature of ${\tt dRRT^{*}}$ also allows it to find initial solutions quickly, which together with the branch-and-bound primitive allows for good convergence. The initial solution times in Figure 13 indicate that the heuristic guidance succeeds in finding fast initial solutions even for larger roadmaps.

8.2 Tests on Systems with Shared DoF

This section showcases three benchmarks of increasing difficulty, which are used to evaluate the performance of $\tt da\mbox{-}dRRT^{*}$. All the experiments were run on a cluster with Intel(R) Xeon(R) CPU E5-4650 @ 2.70GHz processors, and 128GB of RAM. In each benchmark, different sizes $n$ of the constituent roadmaps $\mathbb{R}_{\textrm{ls}}$ and $\mathbb{R}_{\textrm{sr}}$ were evaluated. The $\tt da\mbox{-}dRRT^{*}$ algorithm is compared against $\tt RRT^{\text{*}}$ and ${\tt PRM^{*}}$. The platforms used are ${\tt Motoman}$ SDA10F, with a torsional DoF, and ${\tt Baxter}$ on a mobile base that can rotate and translate. For the ${\tt PRM^{*}}$ algorithm and all benchmarks, $20$ randomly seeded roadmaps with $50,000$ nodes are constructed in $\mathbb{C}$ and data are gathered from $20$ experiments. A $50,000$ node roadmap has $\approx 1$ million edges, and takes $\approx 7$ hours to construct in these high dimensional spaces. Larger roadmaps run into memory scalability issues. These roadmaps in the full space occupied $\approx 50$ MB. In comparison, the space requirement for the two arm roadmaps was $<1$ MB.

For all benchmarks, both $\tt RRT^{\text{*}}$ and $\tt da\mbox{-}dRRT^{*}$ were allowed to run for $100,000$ iterations. $\tt RRT^{\text{*}}$ was run in $20$ different randomly seeded experiments for every benchmark. For the $\tt da\mbox{-}dRRT^{*}$ algorithm, $20$ experiments were run for every benchmark and for each constituent roadmap size $n$, by building $4$ pairs of randomly seeded constituent roadmaps and running $5$ randomly seeded experiments over each roadmap combination.

Motoman Tabletop Benchmark: A set of $20$ random collision-free starts and goals are selected in the tabletop environment, shown in Fig. 14.

They are only used if they are sufficiently far away from each other. $\tt da\mbox{-}dRRT^{*}$ is tested with constituent roadmap sizes of $100$, $250$, and $500$. All the algorithms succeed in every experiment. In this simpler problem, smaller roadmaps are quicker to search, and generate initial solutions faster compared to $\tt RRT^{\text{*}}$, as shown in Fig. 15 (top).

Searching the ${\tt PRM^{*}}$ is the fastest (online), but the solution quality is worse than that obtained from the other methods. $\tt da\mbox{-}dRRT^{*}$ converges to better solutions, compared to the other algorithms, as shown in Fig. 15 (bottom).

Motoman Shelf Benchmark: This benchmark sets up the ${\tt Motoman}$ in front of $3$ shelves. The robot has to plan between two states where both arms are inside different shelving units, which require the rotation of its torso (Fig. 16 (top)).

This is a significantly harder problem, and $\tt RRT^{\text{*}}$ suffers in terms of success ratio (Fig. 16 (second)). $\tt RRT^{\text{*}}$ takes much longer to find the initial solution, as indicated by Fig. 16 (middle). ${\tt PRM^{*}}$ is still the fastest in finding solutions (only online cost considered again here). The $\tt da\mbox{-}dRRT^{*}$ solution cost is much better than both the average ${\tt PRM^{*}}$ solution, and $\tt RRT^{\text{*}}$ , as shown in Fig. 16 (bottom). $\tt da\mbox{-}dRRT^{*}$ will quickly converge for smaller roadmaps, and then stop improving the cost. The larger roadmaps contain better solutions, causing $\tt da\mbox{-}dRRT^{*}$ to converge slower.

Mobile Baxter Benchmark: This benchmark uses a Rethink ${\tt Baxter}$ robot with a mobile base. The robot is grasping two long objects inside a shelf (Fig. 17 (top)). The robot has to navigate across a cramped, walled room, to a placing configuration inside a shelf on the other side of the room.

This proves to be the most challenging problem among the three benchmarks. As shown in Fig. 17 (middle), $\tt RRT^{\text{*}}$ fails to find a solution. It should be noted that, when tested on a simpler version of the benchmark without the pillar in the room, $\tt RRT^{\text{*}}$ could find solutions. ${\tt PRM^{*}}$ also falters by showing a very low success rate. This indicates that we need even larger roadmaps in $\mathbb{C}$ to solve harder problems. The problem is solved when a dense implicit structure, with $n=1000$ is explored by $\tt da\mbox{-}dRRT^{*}$ .

Fig. 17 (bottom) shows that $\tt da\mbox{-}dRRT^{*}$ finds better initial and converged solutions when compared to the instances in which ${\tt PRM^{*}}$ succeeded.

8.3 Real world experiments

Experiments were performed in a 28-dimensional space with two dual-armed manipulators (a ${\tt Motoman}$ SDA10F and a ${\tt Baxter}$ ). Initial solutions were obtained in a fraction of a second for the two experimental setups, with the method allowed to run for $1000$ iterations to improve the quality of the demonstrated trajectories. The two setups are chosen to demonstrate, in the first instance, a typical application of simultaneous grasping that may arise in real world scenarios and, in the second instance, a problem that forces the arms to interact in very close proximity.

Pre-grasp Demonstration: As shown in Figure 18, the demonstration simulates an application to multi-arm manipulation, where the goal of the motion planning problem for 4 arms is to reach pre-grasping configurations for 4 objects placed on a table in the shared workspace between the robots. $1000$ node roadmaps were constructed for each arm and ${\tt dRRT^{*}}$ was used to search for a solution to the motion planning problem. The solution was computed offline and an open-loop execution was performed on the real system.

Coupled Workspace Demonstration: As shown in Figure 19, a pole is positioned between the two robots so that the arms cannot cross over. The objective is for the 4 arms to a) approach the pole at alternating heights, b) then swap the height of their approaching configurations, and c) finally return to the start state. $1000$ node roadmaps were constructed for each arm and ${\tt dRRT^{*}}$ was used to search for a solution to the three motion planning problems. The solutions, which were computed offline, were stitched together and replayed in an open-loop execution on the real system.

9 Discussion

This work proves asymptotic optimality of sampling-based multi-robot planning over implicit structures by extending the ${\tt dRRT}$ approach. Asymptotic optimality is achieved by making a modification, resulting in ${\tt ao\mbox{-}dRRT}$ which expands a spanning tree over an implicitly defined tensor product roadmap, and leverages a simple re-wiring scheme.

This method already has the advantage of avoiding the construction of a large, dense roadmap in the composite configuration space of many robots. This can be further improved to use heuristics so as to search in an informed manner, in the ${\tt dRRT^{*}}$ method. The method is also extended to work with robot systems, which share degrees of freedom, resulting in $\tt da\mbox{-}dRRT^{*}$ .

Experimental results show the efficacy of the proposed approaches. Furthermore, by leveraging heuristics, ${\tt dRRT^{*}}$ is able to solve more challenging problem instances than the baseline ${\tt ao\mbox{-}dRRT}$ method, and the approach is demonstrated to solve complex, real-world problems with robot manipulators operating in a shared workspace with a high degree of coupling.

In terms of practical applicability, ${\tt dRRT^{*}}$ promises fast initial solution times (Figures 11 and 13) on the order of a fraction of a second for most problems, including for high-dimensional, kinematically independent multi-robot problems, which is an exciting result. The solution quality improvement indicates the anytime properties of the approach, where paths of improved quality are discovered as more computation time is invested. While problems with shared degrees of freedom provide less guidance and result in performance degradation, the scalability benefits remain even in this case relative to composite planning. Future work includes the consideration of dynamics. The existing theoretical analysis of ${\tt dRRT^{*}}$ assumes that the individual robot systems are holonomic, which guarantees the existence of near-optimal single-robot paths (see Lemma 1 and (Janson et al, 2015, Theorem 4.1)). Recent results concerning asymptotic optimality of PRM for non-holonomic systems (see Schmerling et al (2015a, b)) bring the hope of achieving a more general analysis of the current work as well. The proposed framework can also be leveraged toward efficiently solving simultaneous task and motion planning for many robot manipulators (Dobson and Bekris, 2015). The demonstrated applications to manipulators also motivate dual-arm rearrangement challenges (Shome et al, 2018).

Bibliography

[1] Adler A, De Berg M, Halperin D, Solovey K (2015) Efficient Multi-Robot Motion Planning for Unlabeled Discs in Simple Polygons. In: IEEE Transactions on Automation Science and Engineering, vol 12, Springer, pp 1309–1317, DOI 10.1109/TASE.2015.2470096, 1312.1038
[2] Atias A, Solovey K, Halperin D (2017) Effective Metrics for Multi-Robot Motion-Planning. In: Proceedings of Robotics: Science and Systems, Cambridge, Massachusetts, DOI 10.1177/0278364918784660, 1705.10300
[3] Bonilla M, Pallottino L, Bicchi A (2017) Noninteracting constrained motion planning and control for robot manipulators. In: IEEE International Conference on Robotics and Automation, pp 4038–4043, DOI 10.1109/ICRA.2017.7989463
[4] Caccavale F, Uchiyama M (2008) Cooperative Manipulators. In: Siciliano B, Khatib O (eds) Springer Handbook of Robotics, Springer, pp 701–718, DOI 10.1007/978-3-540-30301-5_30
[5] Canny JF (1988) The Complexity of Robot Motion Planning, vol Doctoral D. MIT press, DOI 10.1016/j.scriptamat.2009.11.029
[6] Cortés J, Siméon T (2005) Sampling-Based Motion Planning under Kinematic Loop-Closure Constraints. In: Workshop on the Algorithmic Foundations of Robotics, pp 75–90, DOI 10.1007/10991541_7
[7] Dobson A, Bekris KE (2013) A study on the finite-time near-optimality properties of sampling-based motion planners. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp 1236–1241, DOI 10.1109/IROS.2013.6696508
[8] Dobson A, Bekris KE (2015) Planning representations and algorithms for prehensile multi-arm manipulation. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, vol 2015-Decem, pp 6381–6386, DOI 10.1109/IROS.2015.7354289
