Log Diameter Rounds Algorithms for $2$-Vertex and $2$-Edge Connectivity

Alexandr Andoni; Clifford Stein; Peilin Zhong

arXiv:1905.00850·cs.DS·May 3, 2019

TL;DR

This paper develops efficient parallel algorithms for 2-vertex and 2-edge connectivity problems in the MPC model, improving on previous bounds and establishing lower bounds, with scalability and practical relevance for large distributed systems.

Contribution

It introduces new MPC algorithms for 2-edge and 2-vertex connectivity with improved parallel time bounds based on graph diameter, and provides lower bounds for biconnectivity.

Findings

01. Algorithms run in roughly log diameter rounds.

02. Achieves linear total memory and scalable per-processor memory.

03. Provides a conditional lower bound of $\Omega(\log D^{\prime})$ for biconnectivity.

Abstract

Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures well coarse-grained computation on large data: data is distributed to processors, each of which has a sublinear (in the input data) amount of memory and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model. In this paper, we study two fundamental problems, $2$-edge connectivity and $2$-vertex connectivity (biconnectivity). PRAM algorithms which run in $O(\log n)$ time have been known for many years. We give algorithms using roughly log diameter rounds in the MPC model. Our main results are,…

Equations

$(v,a_{1,1},a_{1,2},\dots,a_{1,n_{1}},v,a_{2,1},a_{2,2},\dots,a_{2,n_{2}},v,\dots,a_{k,1},a_{k,2},\dots,a_{k,n_{k}},v)$

$\operatorname{lev}(v)\leftarrow\min\left(\operatorname{dep}_{\operatorname{par}}(v),\ \min_{w\in V\setminus\{\operatorname{par}(v)\}:(v,w)\in E}\operatorname{dep}_{\operatorname{par}}(\text{the LCA of }(v,w))\right)$

$(u,\operatorname{par}^{(1)}(u),\operatorname{par}^{(2)}(u),\dots,\text{the LCA of }(u,v),\dots,\operatorname{par}^{(2)}(v),\operatorname{par}^{(1)}(v),v,u)$

$(x,\operatorname{par}^{(1)}(x),\operatorname{par}^{(2)}(x),\dots,v,u,\operatorname{par}(u),\dots,\text{the LCA of }(x,y),\dots,\operatorname{par}^{(2)}(y),\operatorname{par}^{(1)}(y),y,x)$

$(u,\operatorname{par}^{(1)}(u),\dots,\text{the LCA of }(u,v),\dots,\operatorname{par}^{(1)}(v),v,u)$

$(u,\operatorname{par}^{(1)}(u),\operatorname{par}^{(2)}(u),\dots,\operatorname{par}^{(s)}(u),v,u)$

$(u,\operatorname{par}^{(1)}(u),\dots,\operatorname{par}^{(s_{1})}(u),\text{the LCA of }(u,v),\operatorname{par}^{(s_{2})}(v),\dots,\operatorname{par}^{(1)}(v),v,u)$

$(\operatorname{par}^{(s_{1})}(u),\operatorname{par}^{(s_{1}-1)}(u),\dots,\operatorname{par}^{(1)}(u),u,v,\operatorname{par}^{(1)}(v),\operatorname{par}^{(2)}(v),\dots,\operatorname{par}^{(s_{2})}(v))$

$S(v)=\{u\in V\mid\operatorname{dep}_{\operatorname{par}}(u)>\operatorname{dep}_{\operatorname{par}}(v),\ \exists i\in[t-1],\ \operatorname{par}^{(i)}(u)=v\}$

$\forall i\in[\lceil n/t\rceil],\ a^{\prime}_{i}\leftarrow\min_{j\in[n]:(i-1)\cdot t<j\leq i\cdot t}a_{j}$

Full text

Author affiliations and funding, in author order:

Columbia University. Partly supported by NSF Grants (CCF-1617955 and CCF-1740833), Simons Foundation (#491119) and Google Research Award.

Columbia University. Partly supported by NSF Grants CCF-1714818 and CCF-1822809.

Columbia University. Partly supported by NSF Grants (CCF-1703925, CCF-1421161, CCF-1714818, CCF-1617955 and CCF-1740833), Simons Foundation (#491119) and Google Research Award.

Copyright: Alexandr Andoni, Clifford Stein and Peilin Zhong.

Subject classification: Theory of computation → MapReduce algorithms; Mathematics of computing → Paths and connectivity problems.

Published in: 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), July 9–12, 2019, Patras, Greece. Editors: Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi. LIPIcs series volume 132, article no. 9.

Log Diameter Rounds Algorithms for $2$ -Vertex and $2$ -Edge Connectivity

Alexandr Andoni

Clifford Stein

Peilin Zhong

Abstract

Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures well coarse-grained computation on large data — data is distributed to processors, each of which has a sublinear (in the input data) amount of memory and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model.

In this paper, we study two fundamental problems, $2$-edge connectivity and $2$-vertex connectivity (biconnectivity). PRAM algorithms which run in $O(\log n)$ time have been known for many years. We give algorithms using roughly log diameter rounds in the MPC model. Our main results are, for an $n$-vertex, $m$-edge graph of diameter $D$ and bi-diameter $D^{\prime}$, 1) an $O(\log D\log\log_{m/n}n)$ parallel time $2$-edge connectivity algorithm, 2) an $O(\log D\log^{2}\log_{m/n}n+\log D^{\prime}\log\log_{m/n}n)$ parallel time biconnectivity algorithm, where the bi-diameter $D^{\prime}$ is the largest cycle length over all the vertex pairs in the same biconnected component. Our results are fully scalable, meaning that the memory per processor can be $O(n^{\delta})$ for an arbitrary constant $\delta>0$, and the total memory used is linear in the problem size. Our $2$-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [4]. We also show an $\Omega(\log D^{\prime})$ conditional lower bound for the biconnectivity problem.

Keywords: parallel algorithms, biconnectivity, $2$-edge connectivity, the MPC model

1 Introduction

The success of modern parallel and distributed systems such as MapReduce [16, 17], Spark [42], Hadoop [40], Dryad [24], together with the need to solve problems on massive data, is driving the development of new algorithms which are more efficient and scalable in these large-scale systems. An important theoretical problem is to develop models which are good abstractions of these computational frameworks. The Massively Parallel Computation (MPC) model [26, 22, 11, 3, 9, 15, 4] captures the capabilities of these computational systems while keeping the description of the model itself simple. In the MPC model, there are machines (processors), each with $\Theta(N^{\delta})$ local memory, where $N$ denotes the size of the input and $\delta\in(0,1)$. The computation proceeds in rounds, where each machine can perform unlimited local computation in a round and exchange $O(N^{\delta})$ data at the end of the round. The parallel time of an algorithm is measured by the total number of computation-communication rounds. The MPC model is a variant of the Bulk Synchronous Parallel (BSP) model [39]. It is also a more powerful model than the PRAM, since any PRAM algorithm can be simulated in the MPC model [26, 22] while some problems can be solved in faster parallel time in the MPC model. For example, computing the XOR of $N$ bits takes $O(1/\delta)$ parallel time in the MPC model but needs near-logarithmic parallel time on the most powerful CRCW PRAM [10].

A natural question to ask is: which problems can be solved in faster parallel time in the MPC model than on a PRAM? This question has been studied by a line of recent papers [26, 19, 30, 3, 1, 6, 23, 15, 7, 14, 13, 33, 20]. Most of these papers study graph problems, which are the usual benchmarks of parallel/distributed models. Many graph problems such as graph connectivity [36, 34, 31], graph biconnectivity [38, 37], maximal matching [27], minimum spanning tree [28] and maximal independent set [32, 2] can be solved in the standard logarithmic time in the PRAM model, but these problems have been shown to have a better parallel time in the MPC model.

In addition, we hope to develop fully scalable algorithms for graph problems, i.e., the algorithm should work for any constant $\delta>0$. Previous work shows that a graph problem in the MPC model with large local memory may be much easier than the same problem in the MPC model with smaller local memory. In particular, when the local memory size per machine is close to the number of vertices $n$, many graph problems have efficient algorithms. For example, if the local memory size per machine is $n/\log^{O(1)}n$, the connectivity problem [7] and the approximate matching problem [5] can be solved in $O(\log\log n)$ parallel time. If the local memory size per machine is $\Omega(n)$, then the MPC model meets the congested clique model [12]. In this setting, the connectivity problem and the minimum spanning tree problem can be solved in $O(1)$ parallel time [25]. If the local memory size per machine is $n^{1+\Omega(1)}$, many graph problems such as maximal matching, approximate weighted matchings, approximate vertex and edge covers, minimum cuts, and the biconnectivity problem can be solved in $O(1)$ parallel time [30, 8]. The landscape of graph algorithms in the MPC model with small local memory is more nuanced and challenging for algorithm designers. If the local memory size per machine is $n^{1-\Omega(1)}$, then the best connectivity algorithm takes parallel time $O(\log D\log\log n)$ where $D$ is the diameter of the graph [4], and the best approximate maximum matching algorithm takes parallel time $\widetilde{O}(\sqrt{\log n})$ [33].

Therefore, the main open question is: which kinds of graph problems have fully scalable MPC algorithms that are faster than the standard logarithmic PRAM algorithms?

Two fundamental problems in graph theory are $2$-edge connectivity and $2$-vertex connectivity (biconnectivity). In this work, we study these two problems in the MPC model. Consider an $n$-vertex, $m$-edge undirected graph $G$. A bridge of $G$ is an edge whose removal increases the number of connected components of $G$. In the $2$-edge connectivity problem, the goal is to find all the bridges of $G$. For any two different edges $e,e^{\prime}$ of $G$, $e,e^{\prime}$ are in the same biconnected component (block) of $G$ if and only if there is a simple cycle which contains both $e,e^{\prime}$. If we define a relation $R$ such that $eRe^{\prime}$ if and only if $e=e^{\prime}$ or $e,e^{\prime}$ are contained in a simple cycle, then $R$ is an equivalence relation [18]. Thus, a biconnected component is the subgraph induced by an equivalence class of $R$. In the biconnectivity problem, the goal is to output all the biconnected components of $G$. We propose faster, fully scalable algorithms for both the $2$-edge connectivity problem and the biconnectivity problem by parameterizing the running time as a function of the diameter and the bi-diameter of the graph. The diameter $D$ of $G$ is the largest diameter of its connected components. The definition of bi-diameter is a natural generalization of the definition of diameter. If vertices $u,v$ are in the same biconnected component, then the cycle length of $(u,v)$ is defined as the minimum length of a simple cycle which contains both $u$ and $v$. The bi-diameter $D^{\prime}$ of $G$ is the largest cycle length over all the vertex pairs $(u,v)$ where both $u$ and $v$ are in the same biconnected component. Our main results are 1) a fully scalable $O(\log D\log\log_{m/n}n)$ parallel time $2$-edge connectivity algorithm, 2) a fully scalable $O(\log D\log^{2}\log_{m/n}n+\log D^{\prime}\log\log_{m/n}n)$ parallel time biconnectivity algorithm. Our $2$-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [4]. We also show an $\Omega(\log D^{\prime})$ conditional lower bound for the biconnectivity problem.

1.1 The Model

Our model of computation is the Massively Parallel Computation (MPC) model [26, 22, 11].

Consider two non-negative parameters $\gamma\geq 0,\delta>0$ . In the $(\gamma,\delta)$ -MPC model [4], there are $p$ machines (processors) each with local memory size $s$ , where $p\cdot s=\Theta(N^{1+\gamma}),s=\Theta(N^{\delta})$ and $N$ denotes the size of the input data. Thus, the space per machine is sublinear in $N$ , and the total space is only an $O(N^{\gamma})$ factor more than the input size. In particular, if $\gamma=0$ , the total space available in the system is linear in the input size $N$ . The space size is measured by words each containing $\Theta(\log(s\cdot p))$ bits. Before the computation starts, the input data is distributed on $\Theta(N/s)$ input machines. The computation proceeds in rounds. In each round, each machine can perform local computation on its local data, and send messages to other machines at the end of the round. In a round, the total size of messages sent/received by a machine should be bounded by its local memory size $s=\Theta(N^{\delta})$ . For example, a machine can send $s$ size $1$ messages to $s$ machines or send a size $s$ message to $1$ machine in a single round. However, it cannot broadcast a size $s$ message to every machine. In the next round, each machine only holds the received messages in its local memory. At the end of the computation, the output data is distributed on the output machines. An algorithm in this model is called a $(\gamma,\delta)$ -MPC algorithm. The parallel time of an algorithm is the total number of rounds needed to finish its computation. In this paper, we consider $\delta$ an arbitrary constant in $(0,1)$ .

1.2 Our Results

Our main results are efficient MPC algorithms for $2$ -edge connectivity and biconnectivity problems. In our algorithms, one important subroutine is computing the Depth-First-Search (DFS) sequence [4] which is a variant of the Euler tour representation proposed by [38, 37] in 1984. We show how to efficiently compute the DFS sequence in the MPC model with linear total space. Conditioned on the hardness of the connectivity problem in the MPC model, we prove a hardness result on the biconnectivity problem.

For $2$ -edge connectivity and biconnectivity, the input is an undirected graph $G=(V,E)$ with $n=|V|$ vertices and $m=|E|$ edges. $N=n+m$ denotes the size of the representation of $G$ , $D$ denotes the diameter of $G$ , and $D^{\prime}$ denotes the bi-diameter of $G$ . We state our results in the following.

Biconnectivity. In the biconnectivity problem, we want to find all the biconnected components (blocks) of the input graph $G$ . Since the biconnected components of $G$ define a partition on $E$ , we just need to color each edge, i.e., at the end of the computation, $\forall e\in E$ , there is a unique tuple $(x,c)$ with $x=e$ stored on an output machine, where $c$ is called the color of $e$ , such that the edges $e_{1},e_{2}$ are in the same biconnected components if and only if they have the same color.

Theorem 1.1 (Biconnectivity in MPC).

For any $\gamma\in[0,2]$ and any constant $\delta\in(0,1)$ , there is a randomized $(\gamma,\delta)$ -MPC algorithm which outputs all the biconnected components of the graph $G$ in $O\left(\log D\cdot\log^{2}\frac{\log n}{\log(N^{1+\gamma}/n)}+\log D^{\prime}\cdot\log\frac{\log n}{\log(N^{1+\gamma}/n)}\right)$ parallel time. The success probability is at least $0.95$ . If the algorithm fails, then it returns FAIL.

The worst case is when the input graph is sparse and the total space available is linear in the input size, i.e., $N=n+m=O(n)$ and $\gamma=0$ . In this case, the parallel running time of our algorithm is $O(\log D\cdot\log^{2}\log n+\log D^{\prime}\cdot\log\log n)$ . If the graph is slightly denser ( $m=n^{1+c}$ for some constant $c>0$ ), or the total space is slightly larger ( $\gamma>0$ is a constant), then we obtain $O(\log D+\log D^{\prime})$ time.
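The two favorable cases can be checked by substituting into the bound of Theorem 1.1. The worked step below is a sketch that uses only $N=n+m\geq\max(n,m)$; it additionally assumes that the logarithmic factors are read as $\log\max(2,\cdot)$ so that they stay positive, which is a convention of this sketch rather than something stated in the text.

```latex
% Dense case: m = n^{1+c} for a constant c > 0, so N^{1+\gamma} \ge N \ge m = n^{1+c}.
\frac{\log n}{\log\left(N^{1+\gamma}/n\right)}
  \;\le\; \frac{\log n}{\log\left(n^{1+c}/n\right)}
  \;=\; \frac{1}{c} \;=\; O(1).

% Extra-space case: \gamma > 0 is a constant, so N^{1+\gamma} \ge n^{1+\gamma}.
\frac{\log n}{\log\left(N^{1+\gamma}/n\right)}
  \;\le\; \frac{\log n}{\log\left(n^{1+\gamma}/n\right)}
  \;=\; \frac{1}{\gamma} \;=\; O(1).

% In both cases the two iterated-logarithm factors are O(1), so the bound becomes
O\left(\log D\cdot O(1)+\log D^{\prime}\cdot O(1)\right) \;=\; O\left(\log D+\log D^{\prime}\right).
```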

A cut vertex (articulation point) in the graph $G$ is a vertex whose removal increases the number of connected components of $G$ . Since a vertex $v$ is a cut vertex if and only if there are two edges $e_{1},e_{2}$ which share the endpoint $v$ and $e_{1},e_{2}$ are not in the same biconnected component, our algorithm can also find all the cut vertices of $G$ .
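The characterization above translates directly into a post-processing step on the edge coloring. The following is a minimal sequential sketch, not an MPC implementation; the function name `cut_vertices` and the dictionary input format are assumptions made for illustration.

```python
from collections import defaultdict


def cut_vertices(edge_colors):
    """Return the cut vertices implied by a block coloring of the edges.

    edge_colors maps each edge (u, v) to the color of its biconnected
    component (hypothetical input format). A vertex is a cut vertex
    exactly when two of its incident edges have different colors.
    """
    colors_at = defaultdict(set)
    for (u, v), color in edge_colors.items():
        colors_at[u].add(color)
        colors_at[v].add(color)
    return {v for v, colors in colors_at.items() if len(colors) >= 2}


# Triangle 1-2-3 (block "A") with a pendant edge 3-4 (block "B").
example = {(1, 2): "A", (2, 3): "A", (1, 3): "A", (3, 4): "B"}
print(cut_vertices(example))
```

In the example, vertex 3 is the only vertex that sees two colors.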

$2$-Edge connectivity. In the $2$-edge connectivity problem, we want to output all the bridges of the input graph $G$. Since an edge is a bridge if and only if it forms a biconnected component by itself, the $2$-edge connectivity problem is no harder than the biconnectivity problem. We show how to solve $2$-edge connectivity in the same parallel time as the algorithm proposed by [4] for solving connectivity.

Theorem 1.2 ( $2$ -Edge connectivity in MPC).

For any $\gamma\in[0,2]$ and any constant $\delta\in(0,1)$ , there is a randomized $(\gamma,\delta)$ -MPC algorithm which outputs all the bridges of the graph $G$ in $O\left(\log D\cdot\log\frac{\log n}{\log(N^{1+\gamma}/n)}\right)$ parallel time. The success probability is at least $0.97$ . If the algorithm fails, then it returns FAIL.

DFS sequence. A rooted tree with a vertex set $V$ can be represented by $n=|V|$ pairs $(v_{1},\operatorname{par}(v_{1})),(v_{2},\operatorname{par}(v_{2})),\cdots,(v_{n},\operatorname{par}(v_{n}))$ where $\operatorname{par}:V\rightarrow V$ is a set of parent pointers, i.e., for a non-root vertex $v$ , $\operatorname{par}(v)$ denotes the parent of $v$ , and for the root vertex $v$ , $\operatorname{par}(v)=v$ . We show an algorithm which can compute the DFS sequence (Definition 2.2) of the rooted tree in the MPC model with linear total space.

Theorem 1.3 (DFS sequence of a tree in MPC).

Given a rooted tree represented by a set of parent pointers $\operatorname{par}:V\rightarrow V$, there is a randomized $(0,\delta)$-MPC algorithm which outputs the DFS sequence in $O(\log D)$ parallel time, where $\delta\in(0,1)$ is an arbitrary constant and $D$ is the depth of the tree. The success probability is at least $0.99$. If the algorithm fails, then it returns FAIL.

Conditional hardness for biconnectivity. A conjectured hardness for the connectivity problem is the one cycle vs. two cycles conjecture: for any $\gamma\geq 0$ and any constant $\delta\in(0,1)$ , any $(\gamma,\delta)$ -MPC algorithm requires $\Omega(\log n)$ parallel time to determine whether the input $n$ -vertex graph is a single cycle or contains two disjoint length $n/2$ cycles. This conjectured hardness result is widely used in the MPC literature [26, 11, 29, 35, 41]. Under this conjecture, we show that $\Omega(\log D^{\prime})$ parallel time is necessary for the biconnectivity problem, and this is true even when $D=O(1)$ , i.e., the diameter of the graph is a constant.

Theorem 1.4 (Hardness of biconnectivity in MPC).

For any $\gamma\geq 0$ and any constant $\delta\in(0,1)$, unless there is a $(\gamma,\delta)$-MPC algorithm which can distinguish the following two instances: 1) a single cycle with $n$ vertices, 2) two disjoint cycles, each containing $n/2$ vertices, in $o(\log n)$ parallel time, any $(\gamma,\delta)$-MPC algorithm requires $\Omega(\log D^{\prime})$ parallel time for testing whether a graph $G$ with a constant diameter is biconnected.

1.3 Our Techniques

Biconnectivity. At a high level our biconnectivity algorithm is based on a framework proposed by [37]. The main idea is to construct a new graph and reduce the problem of finding biconnected components of $G$ to the problem of finding connected components of the new graph $G^{\prime}$ . At first glance, it should be efficiently solved by the connectivity algorithm [4]. However, there are two main issues: 1) since the parallel time of the MPC connectivity algorithm of [4] depends on the diameter of the input graph, we need to make the diameter of $G^{\prime}$ small, 2) we need to construct $G^{\prime}$ efficiently. Let us first consider the first issue, and we will discuss the second issue later.

We give an analysis of the diameter of $G^{\prime}=(V^{\prime},E^{\prime})$ constructed by [37]. Without loss of generality, we can suppose the input $G=(V,E)$ is connected. Each vertex in $G^{\prime}$ corresponds to an edge of $G$. Let $T$ be an arbitrary spanning tree of $G$ with depth $d$. Each non-tree edge $e$ defines a simple cycle $C_{e}$ which contains the edge $e$ and the unique path between the endpoints of $e$ in the tree $T$. Thus, the length of $C_{e}$ is at most $2d+1$. If such a cycle contains two tree edges $(u,v),(v,w)$, then the vertices $(u,v),(v,w)$ are connected in $G^{\prime}$. For each non-tree edge $e$, we connect the vertex $e$ to the vertex $e^{\prime}$ in graph $G^{\prime}$, where $e^{\prime}$ is an arbitrary tree edge in the cycle $C_{e}$. By the construction of $G^{\prime}$, any $e,e^{\prime}$ from the same connected component of $G^{\prime}$ are in the same biconnected component of $G$. Now consider two arbitrary edges $e,e^{\prime}$ in the same biconnected component of $G$. There must be a simple cycle $C$ which contains both edges $e,e^{\prime}$ in $G$. Since the simple cycles defined by the non-tree edges form a cycle basis of $G$ [18], the edge set of $C$ can be represented by the xor sum of the edge sets of $k$ basis cycles $C_{1},C_{2},\cdots,C_{k}$, where $C_{i}$ is a simple cycle defined by a non-tree edge $e_{i}$ on the cycle $C$. The number $k$ is upper bounded by the bi-diameter of $G$. Furthermore, we can assume $C_{i}$ intersects $C_{i+1}$. Hence there is a path between $e,e^{\prime}$ in $G^{\prime}$, and the length of the path is at most $\sum_{i=1}^{k}|C_{i}|\leq O(k\cdot d)$. So, the diameter of $G^{\prime}$ is upper bounded by $O(k\cdot d)$. Thus, according to [4], we can find the connected components of $G^{\prime}$ in $\sim(\log k+\log d)$ parallel time, where $d$ and $k$ are upper bounded by the diameter and the bi-diameter of $G$ respectively.

Now let us consider how to construct $G^{\prime}$ efficiently. The bottleneck is to determine whether the tree edges $(u,v),(v,w)$ should be connected in $G^{\prime}$ or not. Suppose $w$ is the parent of $v$ and $v$ is the parent of $u$ . The vertex $(u,v)$ should connect to the vertex $(v,w)$ in $G^{\prime}$ if and only if there is a non-tree edge that connects a vertex $x$ in the subtree of $u$ and a vertex $y$ which is on the outside of the subtree of $v$ . For each vertex $x$ , let $\operatorname{lev}(x)$ be the minimum depth of the least common ancestor (LCA) of $(x,y)$ over all the non-tree edges $(x,y)$ . Then $(u,v)$ should be connected to $(v,w)$ in $G^{\prime}$ if and only if there is a vertex $x$ in the subtree of $u$ in $G$ such that $\operatorname{lev}(x)$ is smaller than the depth of $v$ . Since the vertices in a subtree should appear consecutively in the DFS sequence, this question can be solved by some range queries over the DFS sequence. Next, we will discuss how to compute the DFS sequence of a tree.
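The construction of $G^{\prime}$ described above can be sketched sequentially as follows. This is an illustrative single-machine sketch, not the MPC algorithm: it assumes a connected simple graph, uses a BFS spanning tree, computes LCAs by walking parent pointers, aggregates $\operatorname{lev}$ over subtrees bottom-up instead of with range queries, and uses union-find in place of an MPC connectivity algorithm. The name `block_colors` and all helper names are hypothetical.

```python
from collections import deque


def block_colors(edges, root):
    """Color the edges of a connected simple graph by biconnected component."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    # BFS spanning tree: parent pointers, depths, and the visiting order.
    par, dep, order = {root: root}, {root: 0}, [root]
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in par:
                par[w], dep[w] = v, dep[v] + 1
                order.append(w)
                queue.append(w)

    def lca(x, y):
        # Repeatedly lift the deeper endpoint until both meet.
        while x != y:
            if dep[x] < dep[y]:
                x, y = y, x
            x = par[x]
        return x

    # Union-find over edge indices; each index is one vertex of G'.
    leader = list(range(len(edges)))

    def find(i):
        while leader[i] != i:
            leader[i] = leader[leader[i]]
            i = leader[i]
        return i

    def union(i, j):
        leader[find(i)] = find(j)

    # tree_edge[u] is the index of the tree edge (u, par(u)).
    tree_edge, non_tree = {}, []
    for i, (u, v) in enumerate(edges):
        if par[u] == v:
            tree_edge[u] = i
        elif par[v] == u:
            tree_edge[v] = i
        else:
            non_tree.append(i)

    # lev(x): minimum depth of the LCA over the non-tree edges at x.
    lev = dict(dep)
    for i in non_tree:
        x, y = edges[i]
        a = lca(x, y)
        lev[x] = min(lev[x], dep[a])
        lev[y] = min(lev[y], dep[a])
        for z in (x, y):
            if z != a:  # the tree edge (z, par(z)) lies on the cycle C_e
                union(i, tree_edge[z])

    # low[u]: minimum lev over the subtree of u (children come later in BFS order).
    low = dict(lev)
    for u in reversed(order):
        if u != root:
            low[par[u]] = min(low[par[u]], low[u])

    # Connect (u, v) and (v, par(v)) when the subtree of u escapes above v.
    for u in order:
        v = par[u]
        if u != root and v != root and low[u] < dep[v]:
            union(tree_edge[u], tree_edge[v])

    return {edges[i]: find(i) for i in range(len(edges))}


colors = block_colors([(1, 2), (2, 3), (1, 3), (3, 4)], root=1)
print(colors[(1, 2)] == colors[(2, 3)], colors[(2, 3)] == colors[(3, 4)])
```

For the triangle with a pendant edge, the three triangle edges share a color and the pendant edge gets its own.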

DFS sequence. The DFS sequence of a tree is a variant of the Euler tour representation of the tree. For an $n$-vertex tree $T$, [37] gives an $O(\log n)$ parallel time PRAM algorithm for the Euler tour representation of $T$. However, since their construction method destroys the tree structure, it is hard to get a faster MPC algorithm based on this framework. Instead, we follow the leaf sampling framework proposed by [4]. Although the DFS sequence algorithm proposed by [4] takes $O(\log d)$ time where $d$ is the depth of $T$, it needs $\Omega(n\log d)$ total space. The bottleneck is the subroutine which needs to solve the least common ancestors problem and generate multiple path sequences. The previous algorithm uses the doubling algorithm for the subroutine, i.e., for each vertex $v$, it stores the $2^{i}$-th ancestor of $v$ for every $i\in[\lceil\log d\rceil]$. This is the reason why [4] cannot achieve linear total space. We show how to compress the tree $T$ into a new tree $T^{\prime}$ which contains at most $n/\lceil\log d\rceil$ vertices. We argue that applying the doubling algorithm on $T^{\prime}$ is sufficient for us to find the DFS sequence of $T$.

$2$ -Edge connectivity. Without loss of generality, we can assume the input graph $G$ is connected. Consider a rooted spanning tree $T$ and an edge $e=(u,v)$ in $G$ . Suppose the depth of $u$ is at least the depth of $v$ in $T$ , i.e., $v$ cannot be a child of $u$ . The edge $e$ is not a bridge if and only if either $e$ is a non-tree edge or there is a non-tree edge $(x,y)$ connecting the subtree of $u$ and a vertex on the outside of the subtree of $u$ . Similarly, the second case can be solved by some range queries over the DFS sequence of $T$ .
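The range-query idea can be illustrated with a small sequential sketch. It assumes the spanning tree is already given as parent pointers together with the list of non-tree edges, and it answers each range-minimum query naively by scanning a slice of the DFS sequence; the function name `bridges` and the input format are assumptions, and the quadratic scan stands in for the efficient range queries used in the MPC setting.

```python
def bridges(par, root, non_tree_edges):
    """Return the tree edges (u, par(u)) that are bridges."""
    children = {v: [] for v in par}
    for u in sorted(par):
        if par[u] != u:
            children[par[u]].append(u)

    # Depths and DFS sequence (recursive, so only for small trees).
    dep, seq = {root: 0}, []

    def visit(v):
        seq.append(v)
        for c in children[v]:
            dep[c] = dep[v] + 1
            visit(c)
            seq.append(v)

    visit(root)

    def lca(x, y):
        while x != y:
            if dep[x] < dep[y]:
                x, y = y, x
            x = par[x]
        return x

    # lev(x): minimum depth of the LCA over the non-tree edges at x.
    lev = dict(dep)
    for x, y in non_tree_edges:
        a = dep[lca(x, y)]
        lev[x] = min(lev[x], a)
        lev[y] = min(lev[y], a)

    # The subtree of u occupies seq[first[u] .. last[u]].
    first, last = {}, {}
    for i, v in enumerate(seq):
        first.setdefault(v, i)
        last[v] = i

    result = []
    for u in sorted(par):
        if u == root:
            continue
        low = min(lev[a] for a in seq[first[u]:last[u] + 1])
        if low >= dep[u]:  # no non-tree edge leaves the subtree of u
            result.append((u, par[u]))
    return result


# Tree 1-2, 1-3, 3-4 plus the non-tree edge (2, 3): only (4, 3) is a bridge.
print(bridges({1: 1, 2: 1, 3: 1, 4: 3}, 1, [(2, 3)]))
```

Non-tree edges are never bridges, so only tree edges need to be tested.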

Conditional hardness for biconnectivity. We want to reduce the connectivity problem to the biconnectivity problem. For an undirected graph $G$, if we add an additional vertex $v^{*}$ and connect $v^{*}$ to every vertex of $G$, then the diameter of the resulting graph $G^{\prime}$ is at most $2$ and each biconnected component of $G^{\prime}$ corresponds to a connected component of $G$. Furthermore, the bi-diameter of $G^{\prime}$ is upper bounded by the diameter of $G$ plus $2$. Therefore, if the parallel time of an algorithm $\mathcal{A}^{\prime}$ for finding the biconnected components of $G^{\prime}$ depends on the bi-diameter of $G^{\prime}$, there exists an algorithm $\mathcal{A}$ which can find all the connected components of $G$ in parallel time with the same dependence on the diameter of $G$.
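The reduction itself is a one-line graph transformation. The sketch below builds $G^{\prime}$ and checks the diameter claim by BFS on a small example; the function names and the label `"v*"` for the added vertex are assumptions made for illustration.

```python
from collections import deque


def add_apex(vertices, edges, apex="v*"):
    """Return the edge list of G': G plus one new vertex joined to every vertex."""
    if apex in vertices:
        raise ValueError("apex label must be a fresh vertex")
    return list(edges) + [(apex, v) for v in vertices]


def eccentricity(vertices, edges, source):
    """Largest BFS distance from source to any reachable vertex."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


# G has two connected components; G' is connected with diameter at most 2.
vertices = [1, 2, 3, 4]
new_edges = add_apex(vertices, [(1, 2), (3, 4)])
new_vertices = vertices + ["v*"]
print(max(eccentricity(new_vertices, new_edges, s) for s in new_vertices))
```

Every path between two original vertices can be routed through the added vertex, which is why the diameter never exceeds 2.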

1.4 A Roadmap

The rest of this paper is organized as follows. Section 2 includes the notation and some useful definitions. Section 3 describes the offline algorithms for $2$ -edge connectivity and biconnectivity. It also includes the analysis of some crucial properties and the correctness of the algorithms. In Section 4, we show how to find the DFS sequence of a tree in the MPC model with linear total space. Section 5 discusses the implementations of the $2$ -edge connectivity algorithm and the biconnectivity algorithm in the MPC model. Section 6 contains the conditional hardness result for the biconnectivity problem in the MPC model.

2 Preliminaries

We follow the notation of [4]. $[n]$ denotes the set of integers $\{1,2,\cdots,n\}$ .

Diameter and bi-diameter. Consider an undirected graph $G$ with a vertex set $V$ and an edge set $E$ . For any two vertices $u,v$ , we use $\operatorname{dist}_{G}(u,v)$ to denote the distance between $u$ and $v$ in graph $G$ . If $u,v$ are not in the same (connected) component of $G$ , then $\operatorname{dist}_{G}(u,v)=\infty$ . The diameter $\operatorname{diam}(G)$ of $G$ is the largest diameter of its connected components, i.e., $\operatorname{diam}(G)=\max_{u,v\in V:\operatorname{dist}_{G}(u,v)\not=\infty}\operatorname{dist}_{G}(u,v)$ . $(v_{1},v_{2},\cdots,v_{k})\in V^{k}$ is a cycle of length $k-1$ if $v_{1}=v_{k}$ and $\forall i\in[k-1],(v_{i},v_{i+1})\in E$ . We say a cycle $(v_{1},v_{2},\cdots,v_{k})$ is simple if $k\geq 4$ and each vertex only appears once in the cycle except $v_{1}\ (v_{k})$ . Consider two different vertices $u,v\in V$ . We use $\operatorname{cyclen}_{G}(u,v)$ to denote the minimum length of a simple cycle which contains both vertices $u$ and $v$ . If there is no simple cycle which contains both $u$ and $v$ , $\operatorname{cyclen}_{G}(u,v)=\infty$ . $\operatorname{cyclen}_{G}(u,u)$ is defined as [math]. The bi-diameter of $G$ , $\operatorname{bi-diam}(G)$ , is defined as $\max_{u,v\in V:\operatorname{cyclen}_{G}(u,v)\not=\infty}\operatorname{cyclen}_{G}(u,v)$ .

Representation of a rooted forest. Let $V$ denote a set of vertices. We represent a rooted forest in the same manner as [4]. Consider a mapping $\operatorname{par}:V\rightarrow V$. For $i\in\mathbb{N}_{>0}$ and $v\in V$, we define $\operatorname{par}^{(i)}(v)$ as $\operatorname{par}(\operatorname{par}^{(i-1)}(v))$, and $\operatorname{par}^{(0)}(v)$ is defined as $v$ itself. If $\forall v\in V,\exists i>0$ such that $\operatorname{par}^{(i)}(v)=\operatorname{par}^{(i+1)}(v)$, then we call $\operatorname{par}$ a set of parent pointers on $V$. For $v\in V$, if $\operatorname{par}(v)=v$, then we say $v$ is a root of $\operatorname{par}$. Notice that $\operatorname{par}$ actually can represent a rooted forest, thus $\operatorname{par}$ can have more than one root. The depth of $v\in V$, $\operatorname{dep}_{\operatorname{par}}(v)$, is the smallest $i\in\mathbb{N}$ such that $\operatorname{par}^{(i)}(v)$ is the same as $\operatorname{par}^{(i+1)}(v)$. The root of $v\in V$, $\operatorname{par}^{(\infty)}(v)$, is defined as $\operatorname{par}^{(\operatorname{dep}_{\operatorname{par}}(v))}(v)$. The depth of $\operatorname{par}$, $\operatorname{dep}(\operatorname{par})$, is defined as $\max_{v\in V}\operatorname{dep}_{\operatorname{par}}(v)$.
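These definitions map onto a plain dictionary of parent pointers. The following is a minimal sketch with hypothetical function names, assuming the forest is stored as a Python `dict` from each vertex to its parent.

```python
def is_parent_pointers(par):
    """Check that every vertex reaches a fixed point of par (no cycles)."""
    for v in par:
        seen = set()
        while par[v] != v:
            if v in seen:
                return False
            seen.add(v)
            v = par[v]
    return True


def depth(par, v):
    """Smallest i such that par^(i)(v) == par^(i+1)(v)."""
    i = 0
    while par[v] != v:
        v = par[v]
        i += 1
    return i


def root(par, v):
    """par^(infinity)(v): the root of the tree containing v."""
    while par[v] != v:
        v = par[v]
    return v


# A forest with two roots (1 and 4).
par = {1: 1, 2: 1, 3: 2, 4: 4}
print(depth(par, 3), root(par, 3), max(depth(par, v) for v in par))
```

The last printed value corresponds to $\operatorname{dep}(\operatorname{par})$, the maximum depth over all vertices.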

Ancestor and path. For two vertices $u,v\in V$ , if $\exists i\in\mathbb{N}$ such that $u=\operatorname{par}^{(i)}(v),$ then $u$ is an ancestor of $v$ (in $\operatorname{par}$ ). If $u$ is an ancestor of $v$ , then the path $P(v,u)$ (in $\operatorname{par}$ ) from $v$ to $u$ is a sequence $(v,\operatorname{par}(v),\operatorname{par}^{(2)}(v),\cdots,u)$ and the path $P(u,v)$ is the reverse of $P(v,u)$ , i.e., $P(u,v)=(u,\cdots,\operatorname{par}^{(2)}(v),\operatorname{par}(v),v)$ . If an ancestor $u$ of $v$ is also an ancestor of $w$ , then $u$ is a common ancestor of $(v,w)$ . Furthermore, if a common ancestor $u$ of $(v,w)$ satisfies $\operatorname{dep}_{\operatorname{par}}(u)\geq\operatorname{dep}_{\operatorname{par}}(x)$ for any common ancestor $x$ of $(v,w)$ , then $u$ is the lowest common ancestor (LCA) of $(v,w)$ .
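A direct, unoptimized reading of these definitions is sketched below. It walks parent pointers one step at a time, so it takes time proportional to the depth; the function names are hypothetical, and returning `None` for vertices in different trees is a choice of this sketch.

```python
def path_to_ancestor(par, v, u):
    """P(v, u): the sequence (v, par(v), par^(2)(v), ..., u)."""
    path = [v]
    while v != u:
        if par[v] == v:
            raise ValueError("u is not an ancestor of v")
        v = par[v]
        path.append(v)
    return path


def lca(par, v, w):
    """Lowest common ancestor of (v, w), or None if they share no ancestor."""
    ancestors_of_v = set()
    x = v
    while True:
        ancestors_of_v.add(x)
        if par[x] == x:
            break
        x = par[x]
    # The first ancestor of w that is also an ancestor of v is the deepest one.
    y = w
    while y not in ancestors_of_v:
        if par[y] == y:
            return None
        y = par[y]
    return y


par = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 6}
print(lca(par, 4, 5), lca(par, 4, 3), path_to_ancestor(par, 4, 1))
```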

Children and leaves. For any non-root vertex $u$ of $\operatorname{par}$ , $u$ is a child of $\operatorname{par}(u)$ . For any vertex $v\in V$ , $\operatorname{child}_{\operatorname{par}}(v)$ denotes the set of all the children of $v$ , i.e., $\operatorname{child}_{\operatorname{par}}(v)=\{u\in V\mid u\not=v,\operatorname{par}(u)=v\}.$ If $u$ is the $k^{\text{th}}$ smallest vertex in the set $\operatorname{child}_{\operatorname{par}}(v),$ then we define $\operatorname{rank}_{\operatorname{par}}(u)=k$ , or in other words, $u$ is the $k^{\text{th}}$ child of $v$ . If $v$ is a root vertex of $\operatorname{par}$ , then $\operatorname{rank}_{\operatorname{par}}(v)$ is defined as $1$ . $\operatorname{child}_{\operatorname{par}}(v,k)$ denotes the $k^{\text{th}}$ child of $v$ . For simplicity, if $\operatorname{par}$ is clear in the context, we just use $\operatorname{child}(v)$ , $\operatorname{rank}(v)$ and $\operatorname{child}(v,k)$ to denote $\operatorname{child}_{\operatorname{par}}(v)$ , $\operatorname{rank}_{\operatorname{par}}(v)$ and $\operatorname{child}_{\operatorname{par}}(v,k)$ for short. If $\operatorname{child}(v)=\emptyset$ , then $v$ is a leaf of $\operatorname{par}$ . We denote $\operatorname{leaves}(\operatorname{par})$ as the set of all the leaves of $\operatorname{par}$ , i.e., $\operatorname{leaves}(\operatorname{par})=\{v\mid\operatorname{child}(v)=\emptyset\}$ .
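As a small illustration of the notation, the sketch below derives the child lists, ranks, and leaves from a dictionary of parent pointers. It assumes that vertices are comparable, so that "the $k^{\text{th}}$ smallest vertex" is well defined; the function name is hypothetical.

```python
def children_ranks_leaves(par):
    """Return (child, rank, leaves) for a set of parent pointers."""
    child = {v: [] for v in par}
    for u in sorted(par):  # sorted, so each child list is in increasing order
        if par[u] != u:
            child[par[u]].append(u)

    rank = {}
    for v, kids in child.items():
        for k, u in enumerate(kids, start=1):
            rank[u] = k  # u is the k-th child of v
    for v in par:
        if par[v] == v:
            rank[v] = 1  # roots have rank 1 by definition

    leaves = {v for v, kids in child.items() if not kids}
    return child, rank, leaves


child, rank, leaves = children_ranks_leaves({1: 1, 2: 1, 3: 1, 4: 2})
print(child[1], rank[3], sorted(leaves))
```

With this representation, `child[v][k - 1]` plays the role of $\operatorname{child}(v,k)$.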

2.1 Depth-First-Search Sequence

The Euler tour representation of a tree was proposed in [38, 37]. It is a crucial building block in many graph algorithms, including biconnectivity algorithms. The Depth-First-Search (DFS) sequence [4] of a rooted tree is a variant of the Euler tour representation. Let us first introduce some relevant concepts of the DFS sequence.

Definition 2.1 (Subtree [4]).

Consider a set of parent pointers $\operatorname{par}:V\rightarrow V$ on a vertex set $V$ . Let $v$ be a vertex in $V$ , and let $V^{\prime}=\{u\in V\mid v\text{ is an ancestor of }u\}$ . $\operatorname{par}^{\prime}:V^{\prime}\rightarrow V^{\prime}$ is a set of parent pointers on $V^{\prime}$ . If $\forall u\in V^{\prime}\setminus\{v\}$ , $\operatorname{par}^{\prime}(u)=\operatorname{par}(u)$ and $\operatorname{par}^{\prime}(v)=v$ , then $\operatorname{par}^{\prime}$ is a subtree of $v$ in $\operatorname{par}$ . For $u\in V^{\prime}$ , we say $u$ is in the subtree of $v$ .

The definition of the DFS sequence is the following:

Definition 2.2 (DFS sequence [4]).

Consider a set of parent pointers $\operatorname{par}:V\rightarrow V$ on a vertex set $V$ . Let $v$ be a vertex in $V$ . If $v$ is a leaf in $\operatorname{par}$ , then the DFS sequence of the subtree of $v$ is $(v)$ . Otherwise, the DFS sequence of the subtree of $v$ is defined recursively as

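$$(v,a_{1,1},a_{1,2},\cdots,a_{1,n_{1}},v,a_{2,1},a_{2,2},\cdots,a_{2,n_{2}},v,\cdots,a_{k,1},a_{k,2},\cdots,a_{k,n_{k}},v),$$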

where $k=|\operatorname{child}(v)|$ and $\forall i\in[k],$ $(a_{i,1},a_{i,2},\cdots,a_{i,n_{i}})$ is the DFS sequence of the subtree of $\operatorname{child}(v,i)$ , i.e., the $i^{\text{th}}$ child of $v$ .

If $\operatorname{par}:V\rightarrow V$ has a unique root $v$ , then we define the DFS sequence of $\operatorname{par}$ as the DFS sequence of the subtree of $v$ . By the definition of the DFS sequence, for any two consecutive elements $a_{i}$ and $a_{i+1}$ in the sequence, $a_{i}$ is either the parent of $a_{i+1}$ or a child of $a_{i+1}$ . Furthermore, for any vertex $v$ , if both elements $a_{i}$ and $a_{j}$ $(i<j)$ in the DFS sequence $A$ are $v$ , then every element $a_{k}$ between $a_{i}$ and $a_{j}$ (i.e., $i\leq k\leq j$ ) is a vertex in the subtree of $v$ .
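Definition 2.2 translates directly into a recursive procedure. The sketch below is a sequential illustration only, not the parallel algorithm developed later; the dictionary representation of $\operatorname{par}$ and the function name are assumptions, and the recursion depth equals the tree depth, so very deep trees would need an explicit stack.

```python
def dfs_sequence(par, v):
    """DFS sequence of the subtree of v, following the recursive definition."""
    child = {x: [] for x in par}
    for u, p in par.items():
        if u != p:
            child[p].append(u)
    for x in child:
        child[x].sort()  # the i-th child is the i-th smallest

    def visit(x):
        if not child[x]:  # a leaf contributes the one-element sequence (x)
            return [x]
        seq = [x]
        for c in child[x]:
            seq.extend(visit(c))  # DFS sequence of the i-th child ...
            seq.append(x)         # ... followed by a return to x
        return seq

    return visit(v)
```

For `par = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 3}` and root `0`, the sequence is `[0, 1, 3, 5, 3, 1, 4, 1, 0, 2, 0]`. Every consecutive pair is a parent and child, and everything between the first and last `1` lies in the subtree of `1`.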

2.2 Data Organization and Basic Algorithms in the MPC Model

We organize the data in the MPC model as in [4].

Set. Consider a set of $m$ items $S=\{x_{1},x_{2},\cdots,x_{m}\}$ where each $x_{i}$ can be described by a constant number of words. If $x\in S$ $\Leftrightarrow$ there is a unique machine which stores a pair $(``S",x)$ in its local memory, then the set $S$ is stored in the system. $``S"$ is the name of the set $S$ and can be represented by a constant number of words. Let $\mathcal{S}=\{S_{1},S_{2},\cdots,S_{m}\}$ be a family of sets, where $\forall i\in[m],S_{i}$ is stored in the system and the name of $S_{i}$ can be represented by a constant number of words. If $S\in\mathcal{S}$ $\Leftrightarrow$ there is a unique machine which stores a pair $(``\mathcal{S}",``S")$ in its local memory, then we say $\mathcal{S}$ is stored in the system. The total space for storing $S$ is $\Theta(|S|)$ .

An undirected graph $G$ can be represented by a pair of the sets $(V,E)$ , where $V=\{v_{1},v_{2},\cdots,v_{n}\}$ denotes the set of the vertices and $E=\{(u_{1},v_{1}),(u_{2},v_{2}),\cdots,(u_{m},v_{m})\}\subseteq V\times V$ denotes the set of the edges. To store the graph $G$ in the system, we just need to store both $V$ and $E$ in the system.

Mapping. Consider a mapping $f:A\rightarrow B$ where $A,B$ are two finite sets and every element from $A$ or $B$ only requires a constant number of words to describe. Let $S=\{(a,b)\mid a\in A,b=f(a)\}$ . Then $S$ is a set representation of the mapping $f$ , and the name of $S$ is $``f"$ . If the set $S$ is stored in the system, then we say the mapping $f$ is stored in the system. The total space needed for storing $f$ is $\Theta(|A|)$ .

A set of parent pointers on a vertex set $V$ can be regarded as a mapping $\operatorname{par}:V\rightarrow V$ .

Sequence. Let $A=(a_{1},a_{2},\cdots,a_{m})$ be a sequence of $m$ elements, where each element $a_{i}$ can be represented by a constant number of words. Let $S=\{(x_{1},a_{1}),(x_{2},a_{2}),\cdots,(x_{m},a_{m})\}$ where $x_{1}<x_{2}<\cdots<x_{m}\in\mathbb{R}$ . Then $S$ is a set representation of the sequence $A$ , and the name of $S$ is $``A"$ . If $S$ is stored in the system, then we say the sequence $A$ is stored in the system. The total space needed for storing $A$ is $\Theta(m)$ .

Basic MPC operations. One of the most basic algorithms in the MPC model is sorting.

Theorem 2.3 ([21, 22]).

Sorting can be solved in $c/\delta$ parallel time in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ , where $c\geq 0$ is a universal constant.

For any $\delta^{\prime}\geq\delta,$ $O(n^{\delta^{\prime}-\delta})$ machines with $\Theta(n^{\delta})$ local memory can always be simulated by $O(1)$ machines with $\Theta(n^{\delta^{\prime}})$ local memory. Therefore, if an algorithm can solve a problem in the $(\gamma,\delta)$ -MPC model in $R(n)$ rounds, then the algorithm can be simulated in the $(\gamma^{\prime},\delta^{\prime})$ -MPC model in $O(R(n))$ rounds for any $\gamma^{\prime}\geq\gamma,\delta^{\prime}\geq\delta$ . Thus, for any $\gamma\geq 0$ and any constant $\delta\in(0,1)$ , sorting takes $O(1)$ parallel time in the $(\gamma,\delta)$ -MPC model.

Sorting is an important tool for building MPC subroutines. One such subroutine handles multiple queries at the same time. Roughly speaking, a random access shared memory can be simulated in the MPC model. Suppose there are $k$ sets $S_{1},S_{2},\cdots,S_{k}$ stored in the system, and $t$ of them are set representations of mappings $f_{1}:A_{1}\rightarrow B_{1},f_{2}:A_{2}\rightarrow B_{2},\cdots,f_{t}:A_{t}\rightarrow B_{t}$ . Suppose each machine has several queries where each query requires the value $f_{i}(a)$ for some $i\in[t],a\in A_{i}$ . All the queries can be simultaneously handled in constant parallel time in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ . For more basic MPC operations, we refer readers to [4].
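To give some intuition for why sorting suffices to answer many lookups at once, here is a single-process emulation of the sort-then-scan idea. It is only a sketch under stated assumptions: the actual MPC procedure is the one described in [4], the record layout and the name `answer_queries` are hypothetical, and a real implementation would also have to pass the last seen definition across machine boundaries.

```python
def answer_queries(mapping_pairs, queries):
    """mapping_pairs: list of (a, f(a)); queries: list of (query_id, a).

    Returns {query_id: f(a)} using one sort and one linear scan.
    Queries whose key has no definition are left unanswered.
    """
    records = [(a, 0, b) for a, b in mapping_pairs]   # tag 0: definition of f(a)
    records += [(a, 1, qid) for qid, a in queries]    # tag 1: request for f(a)
    # Sort by key; on equal keys the definition comes before the requests.
    records.sort(key=lambda r: (r[0], r[1]))

    answers = {}
    have_def, cur_key, cur_val = False, None, None
    for a, tag, payload in records:
        if tag == 0:
            have_def, cur_key, cur_val = True, a, payload
        elif have_def and a == cur_key:
            answers[payload] = cur_val
    return answers
```

After sorting, every request sits directly behind the definition it needs, so no random access is required during the scan.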

3 $2$ -Edge Connectivity and Biconnectivity

Consider a connected undirected graph $G$ with a vertex set $V$ and an edge set $E$ . In the $2$ -edge connectivity problem, the goal is to find all the bridges of $G$ , where an edge $e\in E$ is called a bridge if its removal disconnects $G$ . In the biconnectivity problem, the goal is to partition the edges into several groups $E_{1},E_{2},\cdots,E_{k}$ , i.e., $E=\bigcup_{i=1}^{k}E_{i},\forall i\not=j,E_{i}\cap E_{j}=\emptyset$ , such that $\forall e\not=e^{\prime}\in E$ , $e$ and $e^{\prime}$ are in the same group if and only if there is a simple cycle in $G$ which contains both $e$ and $e^{\prime}$ . A subgraph induced by an edge group $E_{i}$ is called a biconnected component (block). In other words, the goal of the biconnectivity problem is to find all the blocks of $G$ .

In this section, we describe the algorithms for both the $2$ -edge connectivity problem and the biconnectivity problem in the offline setting. In Section 5, we will discuss how to implement them in the MPC model.

3.1 $2$ -Edge Connectivity

The $2$ -edge connectivity problem is much simpler than the biconnectivity problem. We first compute a spanning tree of the graph. Only a tree edge can be a bridge. Then for any non-root vertex $v$ , if there is no non-tree edge which crosses between the subtree of $v$ and the outside of the subtree of $v$ , then the tree edge which connects $v$ to its parent is a bridge.
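The following sequential sketch illustrates this criterion. The steps of Bridges $(G)$ are not reproduced in this section, so the code is a reconstruction from how the proof of Lemma 3.1 uses them: $\operatorname{lev}(x)$ is taken to be the smallest LCA depth over the non-tree edges at $x$ (capped by the depth of $x$), and a tree edge into $v$ is reported when the minimum of $\operatorname{lev}$ over the subtree of $v$ is at least the depth of $v$. The BFS spanning tree, the naive LCA walk, vertex labels `0..n-1` and the function name are all assumptions made for illustration; the graph is assumed connected and simple.

```python
from collections import deque


def bridges(n, edges):
    """Bridges of a connected simple graph on vertices 0..n-1, as (parent, child) pairs."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Spanning tree by BFS from vertex 0; order lists parents before children.
    par, dep, order = [-1] * n, [0] * n, []
    par[0] = 0
    queue = deque([0])
    while queue:
        x = queue.popleft()
        order.append(x)
        for y in adj[x]:
            if par[y] == -1:
                par[y], dep[y] = x, dep[x] + 1
                queue.append(y)

    def lca_depth(x, y):
        while dep[x] > dep[y]:
            x = par[x]
        while dep[y] > dep[x]:
            y = par[y]
        while x != y:
            x, y = par[x], par[y]
        return dep[x]

    # lev[x]: smallest LCA depth over non-tree edges at x, never above dep[x].
    lev = dep[:]
    for u, v in edges:
        if par[u] == v or par[v] == u:
            continue  # tree edge
        d = lca_depth(u, v)
        lev[u] = min(lev[u], d)
        lev[v] = min(lev[v], d)

    # low[v]: minimum of lev over the subtree of v (children processed first).
    low = lev[:]
    for x in reversed(order):
        if par[x] != x:
            low[par[x]] = min(low[par[x]], low[x])

    return {(par[v], v) for v in range(1, n) if low[v] >= dep[v]}
```

The paper evaluates the subtree minimum as a range minimum over the DFS sequence between the first and last appearance of $v$; the bottom-up pass above computes the same quantity sequentially.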

Lemma 3.1 ( $2$ -Edge connectivity).

Consider an undirected graph $G=(V,E)$ . Let $B$ be the output of Bridges $(G)$ . Then $B$ is the set of all the bridges of $G$ .

Proof 3.2.

Suppose $(u,v)\in E$ is not a bridge. If $(u,v)$ is a non-tree edge in $\operatorname{par}$ , then since $B$ only contains tree edges, $(u,v)\not\in B$ . Otherwise, suppose $\operatorname{par}(v)=u$ . There must be a non-tree edge $(x,y)\in E$ such that $x$ is in the subtree of $v$ but $y$ is not in the subtree of $v$ . Thus, the LCA of $(x,y)$ is not $v$ , and it is an ancestor of $v$ , which means that the depth of the LCA of $(x,y)$ is smaller than $\operatorname{dep}_{\operatorname{par}}(v)$ . By step 2, we have $\operatorname{lev}(x)<\operatorname{dep}_{\operatorname{par}}(v)$ . Let $a_{i},a_{j}$ be the first and the last appearance of $v$ in the DFS sequence of $\operatorname{par}$ . Since $x$ is in the subtree of $v$ , there exists $k\in\{i,i+1,\cdots,j\}$ such that $x=a_{k}$ . By step 4, since $\operatorname{lev}(a_{k})<\operatorname{dep}_{\operatorname{par}}(v)$ , $(u,v)\not\in B$ .

Now suppose $(u,v)\in E$ is a bridge. Then $(u,v)$ must be a tree edge in $\operatorname{par}$ , i.e., either $\operatorname{par}(u)=v$ or $\operatorname{par}(v)=u$ . Suppose $\operatorname{par}(v)=u$ . Then for any non-tree edge $(x,y)$ with $x$ in the subtree of $v$ , $y$ must also be in the subtree of $v$ . Thus, the depth of the LCA of $(x,y)$ is at least $\operatorname{dep}_{\operatorname{par}}(v)$ . By step 2, for any $x$ in the subtree of $v$ , we have $\operatorname{lev}(x)\geq\operatorname{dep}_{\operatorname{par}}(v)$ . Let $a_{i},a_{j}$ be the first and the last appearance of $v$ in the DFS sequence of $\operatorname{par}$ . Since all the vertices $a_{i},a_{i+1},\cdots,a_{j}$ are in the subtree of $v$ , we have $(u,v)\in B$ by step 4.

3.2 Biconnectivity

In this section, we will show a biconnectivity algorithm. It is a modification of the algorithm proposed by [37]. The high level idea is to construct a new graph $G^{\prime}$ based on the input graph $G$ , and reduce the biconnectivity problem of $G$ to the connectivity problem of $G^{\prime}$ . Since the running time of the connectivity algorithm [4] depends on the diameter of the graph, we also give an analysis of the diameter of the graph $G^{\prime}$ .
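A sequential sketch of this reduction is given below. Since the steps of Biconn $(G)$ are not listed in this section, the construction is reconstructed from how the proof of Lemma 3.3 refers to them and should be read as an assumption: every non-root vertex $v$ stands for the tree edge $(v,\operatorname{par}(v))$ ; a non-tree edge whose endpoints are unrelated joins its endpoints; a vertex $v$ is joined to a non-root parent $u$ when some non-tree edge leaves the subtree of $u$ from inside the subtree of $v$ ; and each edge of $G$ inherits the component label of its deeper endpoint. The BFS spanning tree and the union-find structure stand in for the spanning tree and connectivity subroutines, and all names are hypothetical.

```python
from collections import deque


def biconnected_colors(n, edges):
    """Label every edge of a connected simple graph on 0..n-1 with a block id."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    par, dep, order = [-1] * n, [0] * n, []
    par[0] = 0
    queue = deque([0])
    while queue:  # BFS spanning tree rooted at 0
        x = queue.popleft()
        order.append(x)
        for y in adj[x]:
            if par[y] == -1:
                par[y], dep[y] = x, dep[x] + 1
                queue.append(y)

    def lca(x, y):
        while dep[x] > dep[y]:
            x = par[x]
        while dep[y] > dep[x]:
            y = par[y]
        while x != y:
            x, y = par[x], par[y]
        return x

    comp = list(range(n))  # union-find over the vertices of G'

    def find(x):
        while comp[x] != x:
            comp[x] = comp[comp[x]]
            x = comp[x]
        return x

    lev = dep[:]
    for u, v in edges:
        if par[u] == v or par[v] == u:
            continue  # tree edge
        a = lca(u, v)
        lev[u] = min(lev[u], dep[a])
        lev[v] = min(lev[v], dep[a])
        if a != u and a != v:  # unrelated endpoints: edge (u, v) of G'
            comp[find(u)] = find(v)

    low = lev[:]  # minimum of lev over each subtree
    for x in reversed(order):
        if par[x] != x:
            low[par[x]] = min(low[par[x]], low[x])

    for v in range(1, n):
        u = par[v]
        if par[u] != u and low[v] < dep[u]:  # edge (v, par(v)) of G'
            comp[find(v)] = find(u)

    # Each edge takes the label of its deeper endpoint.
    return {(u, v): find(u if dep[u] >= dep[v] else v) for u, v in edges}
```

On two triangles sharing vertex `2` with a pendant edge attached, the sketch yields three labels: one per triangle and one for the pendant edge.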

Lemma 3.3 (Biconnectivity).

Consider an undirected graph $G=(V,E)$ . Let $\operatorname{col}:E\rightarrow V$ be the output of Biconn $(G)$ . Then $\forall e,e^{\prime}\in E,e\not=e^{\prime},$ $\operatorname{col}$ satisfies $\operatorname{col}(e)=\operatorname{col}(e^{\prime})$ $\Leftrightarrow$ there is a simple cycle in $G$ which contains both $e$ and $e^{\prime}$ . Furthermore, the diameter of the graph $G^{\prime}$ constructed by Biconn $(G)$ is at most $O(\operatorname{dep}(\operatorname{par})\cdot\operatorname{bi-diam}(G))$ , the number of vertices of $G^{\prime}$ is at most $|V|$ , and the number of edges of $G^{\prime}$ is at most $|E|$ .

Proof 3.4.

Each $v\in V^{\prime}$ corresponds to a tree edge $(\operatorname{par}(v),v)\in E$ . Since $V^{\prime}\subset V$ , $|V^{\prime}|\leq|V|$ . By step 5 and step 6, each edge of $G$ creates at most $1$ edge of $G^{\prime}$ . Thus, $|E^{\prime}|\leq|E|$ .

Claim 1.

If $\operatorname{dist}_{G^{\prime}}(u,v)<\infty$ , i.e., vertices $u,v\in V^{\prime}$ are in the same connected component of $G^{\prime}$ , then there is a simple cycle in $G$ which contains both edges $(u,\operatorname{par}(u))$ and $(v,\operatorname{par}(v))$ .

Proof 3.5.

Firstly, let us consider the case when $(u,v)\in E^{\prime}$ . If $(u,v)$ is added into $E^{\prime}$ by step 6, then there is a simple cycle in $G$ :

[TABLE]

Both edges $(u,\operatorname{par}(u))$ and $(v,\operatorname{par}(v))$ are in this cycle. If $(u,v)$ is added into $E^{\prime}$ by step 5, then $u=\operatorname{par}(v)$ . Let $a_{i},a_{j}$ be the first and the last appearance of $v$ in $A$ respectively. By step 5, there exists $k$ with $i\leq k\leq j$ such that $\operatorname{lev}(a_{k})<\operatorname{dep}_{\operatorname{par}}(v)$ . Thus, there is a vertex $x$ in the subtree of $v$ such that $\operatorname{lev}(x)<\operatorname{dep}_{\operatorname{par}}(u)$ . By step 2, there is an edge $(x,y)\in E$ such that the depth of the LCA of $(x,y)$ is smaller than $\operatorname{dep}_{\operatorname{par}}(u)$ , which means that $y$ is not in the subtree of $u$ . In this case, there is a simple cycle in $G$ :

[TABLE]

Since $u=\operatorname{par}(v)$ , both edges $(v,\operatorname{par}(v))$ , $(u,\operatorname{par}(u))$ are in this cycle.

Suppose $v,u\in V^{\prime}$ are in the same connected component of $G^{\prime}$ and $(v,\operatorname{par}(v))$ , $(u,\operatorname{par}(u))$ are in a simple cycle $C_{1}$ in $G$ . Suppose $u,w\in V^{\prime}$ are in the same connected component of $G^{\prime}$ and $(u,\operatorname{par}(u))$ , $(w,\operatorname{par}(w))$ are in a simple cycle $C_{2}$ in $G$ . Then, $v$ and $w$ are in the same connected component of $G^{\prime}$ . The symmetric difference of the edge set of $C_{1}$ and the edge set of $C_{2}$ should form another simple cycle $C_{3}$ in $G$ which contains both edges $(v,\operatorname{par}(v))$ and $(w,\operatorname{par}(w))$ . By induction on $\operatorname{dist}_{G^{\prime}}(v,w)$ , the claim holds.

By Claim 1 and step 8, $\forall u,v\in V^{\prime}$ , if $\operatorname{col}((u,\operatorname{par}(u)))=\operatorname{col}((v,\operatorname{par}(v)))$ , then there should be a simple cycle in $G$ which contains both edges $(u,\operatorname{par}(u))$ and $(v,\operatorname{par}(v))$ . Consider an edge $(u,v)\in E$ such that neither $u$ nor $v$ is the LCA of $(u,v)$ , i.e., $(u,v)$ is a non-tree edge. Without loss of generality, suppose $\operatorname{dep}_{\operatorname{par}}(u)\geq\operatorname{dep}_{\operatorname{par}}(v)$ . There is always a cycle in $G$ :

[TABLE]

which contains both edges $(u,v),(u,\operatorname{par}(u))$ . By step 8, we have $\operatorname{col}((u,v))=\operatorname{col}((u,\operatorname{par}(u)))=\operatorname{col}^{\prime}(u)$ . Therefore, $\forall e_{1},e_{2}\in E$ , there are always tree edges $e_{1}^{\prime},e_{2}^{\prime}\in E$ such that $\operatorname{col}(e_{1}^{\prime})=\operatorname{col}(e_{1}),\operatorname{col}(e_{2}^{\prime})=\operatorname{col}(e_{2})$ , $e_{1},e_{1}^{\prime}$ are either in a simple cycle in $G$ or $e_{1}=e_{1}^{\prime}$ , and $e_{2},e_{2}^{\prime}$ are either in a simple cycle in $G$ or $e_{2}=e_{2}^{\prime}$ . If $\operatorname{col}(e_{1})=\operatorname{col}(e_{2})$ , then $\operatorname{col}(e_{1}^{\prime})=\operatorname{col}(e_{2}^{\prime})$ which implies that $e_{1}^{\prime},e_{2}^{\prime}$ are either in a simple cycle in $G$ or $e_{1}^{\prime}=e_{2}^{\prime}$ . Hence if $\operatorname{col}(e_{1})=\operatorname{col}(e_{2}),$ then either there is a simple cycle in $G$ which contains both $e_{1},e_{2}$ or $e_{1}=e_{2}$ .

Next, let us show that if there is a simple cycle in $G$ which contains both edges $e,e^{\prime}\in E$ , then $\operatorname{col}(e)=\operatorname{col}(e^{\prime})$ . An observation is that each non-tree edge $e=(u,v)$ (i.e., neither $u$ nor $v$ is the LCA of $(u,v)$ in $\operatorname{par}$ ) defines a simple cycle $C_{e}$ in $G$ :

[TABLE]

Claim 2.

For any simple cycle $C_{e}$ defined by a non-tree edge $e=(u,v)$ , there is a path $P_{e}$ in $G^{\prime}$ such that $P_{e}$ contains every vertex in $C_{e}$ except the LCA of $(u,v)$ in $\operatorname{par}$ . Furthermore, the length of $P_{e}$ is at most $2\operatorname{dep}(\operatorname{par})$ .

Proof 3.6.

Without loss of generality, we can assume $\operatorname{dep}_{\operatorname{par}}(u)\geq\operatorname{dep}_{\operatorname{par}}(v)$ . If $v$ is an ancestor of $u$ , then the cycle $C_{e}$ is

[TABLE]

for some $s\geq 1$ . For each $j\in[s]$ , $u$ is in the subtree of $\operatorname{par}^{(j-1)}(u)$ . By step 5, since $\operatorname{lev}(u)\leq\operatorname{dep}_{\operatorname{par}}(v)<\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{(j)}(u))$ for any $j\in[s]$ , we have $(\operatorname{par}^{(j-1)}(u),\operatorname{par}^{(j)}(u))\in E^{\prime}$ . Thus, there is a path $P_{e}$ in $G^{\prime}$ : $(u,\operatorname{par}^{(1)}(u),\operatorname{par}^{(2)}(u),\cdots,\operatorname{par}^{(s)}(u))$ . In this case, the length of $P_{e}$ is at most $\operatorname{dep}(\operatorname{par})$ .

If $v$ is not an ancestor of $u$ , then the cycle $C_{e}$ is

[TABLE]

for some $s_{1},s_{2}\geq 1$ . By a similar argument, $\forall j\in[s_{1}]$ the edge $(\operatorname{par}^{(j-1)}(u),\operatorname{par}^{(j)}(u))$ (respectively, $\forall j^{\prime}\in[s_{2}]$ the edge $(\operatorname{par}^{(j^{\prime}-1)}(v),\operatorname{par}^{(j^{\prime})}(v))$ ) is added into $E^{\prime}$ by step 5. By step 6, $(u,v)$ is added into $E^{\prime}$ . Therefore, there is a path $P_{e}$ in $G^{\prime}$ :

[TABLE]

In this case, the length of $P_{e}$ should be at most $2\operatorname{dep}(\operatorname{par})-1$ .

Notice that all the simple cycles defined by the non-tree edges form a cycle basis of the cycle space of $G$ , i.e., the edge set of any simple cycle in $G$ can be represented by an xor sum of the edge sets of cycles $C_{e_{1}},C_{e_{2}},\cdots,C_{e_{s}}$ defined by some non-tree edges $e_{1},e_{2},\cdots,e_{s}\in E$ [18]. Consider any two tree edges $(u,\operatorname{par}(u)),(v,\operatorname{par}(v))\in E$ contained in a simple cycle $C$ . Let $e_{1},e_{2},\cdots,e_{s}\in E$ be all the non-tree edges in $C$ . Then $C$ can be represented by an xor sum of $C_{e_{1}},C_{e_{2}},\cdots,C_{e_{s}}$ . Furthermore, $\forall i\in[s-1],$ $C_{e_{i}}$ and $C_{e_{i+1}}$ should have a common tree edge. According to Claim 2, for each $i\in[s]$ , we can find a path $P_{e_{i}}$ in $G^{\prime}$ and $\forall j\in[s-1]$ , $P_{e_{j}}$ intersects $P_{e_{j+1}}$ . Therefore, $u$ and $v$ are in the same connected component in $G^{\prime}$ . By step 8, $\operatorname{col}((u,\operatorname{par}(u)))=\operatorname{col}^{\prime}(u)=\operatorname{col}^{\prime}(v)=\operatorname{col}((v,\operatorname{par}(v)))$ . Now consider a non-tree edge $e=(u,v)\in E$ . Without loss of generality, we can assume $\operatorname{dep}_{\operatorname{par}}(u)\geq\operatorname{dep}_{\operatorname{par}}(v)$ . The tree edge $(u,\operatorname{par}(u))$ is in the simple cycle $C_{e}$ defined by $e$ . By step 8, we know that $\operatorname{col}(e)=\operatorname{col}^{\prime}(u)=\operatorname{col}((u,\operatorname{par}(u)))$ . Therefore, we can conclude that $\forall e_{1},e_{2}\in E$ , if there is a simple cycle in $G$ which contains both $e_{1},e_{2}$ , then $\operatorname{col}(e_{1})=\operatorname{col}(e_{2})$ .

The only thing remaining to prove is the diameter of $G^{\prime}$ . According to Claim 1, $\forall u,v\in V^{\prime}$ with $\operatorname{dist}_{G^{\prime}}(u,v)<\infty$ , there is a cycle $C$ in $G$ which contains both edges $(u,\operatorname{par}(u))$ and $(v,\operatorname{par}(v))$ .

Claim 3.

$\forall u,v\in V^{\prime}$ , if there is a cycle in $G$ which contains both edges $(u,\operatorname{par}(u))$ , $(v,\operatorname{par}(v))$ , then there is a cycle $C$ in $G$ with length $O(\operatorname{bi-diam}(G))$ which contains both edges $(u,\operatorname{par}(u))$ , $(v,\operatorname{par}(v))$ .

Proof 3.7.

By the definition of $\operatorname{bi-diam}(G)$ , there is a cycle $C_{1}$ with length at most $\operatorname{bi-diam}(G)$ which contains both vertices $u,v$ . If $C_{1}$ already contains both edges $(u,\operatorname{par}(u))$ , $(v,\operatorname{par}(v))$ , then we are done. Otherwise, suppose $C_{1}$ does not contain $(u,\operatorname{par}(u))$ . There is another cycle $C_{2}$ with length at most $\operatorname{bi-diam}(G)$ which contains both vertices $\operatorname{par}(u),v$ . We can regard $C_{2}$ as two disjoint paths from $\operatorname{par}(u)$ to $v$ . Thus at least one of the paths does not contain the edge $(u,\operatorname{par}(u))$ . Suppose this path is $(\operatorname{par}(u),\cdots,x,\cdots,v)$ where $x$ is the first vertex which appears in $C_{1}$ . Then we can combine the path $(u,\operatorname{par}(u),\cdots,x)$ with the path obtained by removing the sub-path from $u$ to $x$ of $C_{1}$ to get a new cycle which contains both the edge $(u,\operatorname{par}(u))$ and $v$ . The length of the new cycle is at most $2\cdot\operatorname{bi-diam}(G)$ . We can apply a similar operation to add the edge $(v,\operatorname{par}(v))$ into the cycle. Thus, we finally get a cycle which contains both $(u,\operatorname{par}(u))$ , $(v,\operatorname{par}(v))$ with length at most $3\cdot\operatorname{bi-diam}(G)$ .

According to the above claim, we can find a cycle $C$ in $G$ which contains both edges $(u,\operatorname{par}(u))$ , $(v,\operatorname{par}(v))$ with length at most $O(\operatorname{bi-diam}(G))$ . It means that $C$ can be represented by an xor sum of $s\leq O(\operatorname{bi-diam}(G))$ basis cycles $C_{e_{1}},C_{e_{2}},\cdots,C_{e_{s}}$ defined by non-tree edges $e_{1},e_{2},\cdots,e_{s}$ . Furthermore, $\forall i\in[s-1],$ $C_{e_{i}}$ and $C_{e_{i+1}}$ have at least one common tree edge. By Claim 2, we can find $s$ paths $P_{e_{1}},P_{e_{2}},\cdots,P_{e_{s}}$ defined by $e_{1},e_{2},\cdots,e_{s}$ in $G^{\prime}$ such that $\forall i\in[s-1],$ $P_{e_{i}}$ intersects $P_{e_{i+1}}$ at some vertex, and $u,v$ are on some paths $P_{e_{x}},P_{e_{y}}$ respectively. Thus, $\operatorname{dist}_{G^{\prime}}(u,v)\leq\sum_{i=1}^{s}|P_{e_{i}}|\leq s\cdot O(\operatorname{dep}(\operatorname{par}))\leq O(\operatorname{dep}(\operatorname{par})\cdot\operatorname{bi-diam}(G))$ , where the second inequality follows from Claim 2. To conclude, $\operatorname{diam}(G^{\prime})\leq O(\operatorname{dep}(\operatorname{par})\cdot\operatorname{bi-diam}(G))$ .

4 Parallel DFS Sequence in Linear Total Space

In Section 4.1, we will review an algorithmic framework proposed by [4] for the DFS sequence. In Sections 4.2, 4.3, and 4.4, we will discuss the subroutines needed for our DFS sequence algorithm in the offline setting. In Section 4.5, we will discuss the implementation in the MPC model.

4.1 DFS Sequence via Leaf Sampling

In the following, we review the leaf sampling algorithmic framework proposed by [4] for finding the DFS sequence of a rooted tree.

Theorem 4.1 (Leaf sampling algorithm [4]).

Consider a set of parent pointers $\operatorname{par}:V\rightarrow V$ on a set $V$ of $n$ vertices. Suppose $\operatorname{par}$ has a unique root. For any $\gamma\geq 0$ and any constant $\delta\in(0,1)$ , if both step 4 and step 6 in LeafSampling $(n^{\delta},\operatorname{par})$ can be implemented in the $(\gamma,\delta)$ -MPC model with $O(\log(\operatorname{dep}(\operatorname{par})))$ parallel time, then the leaf sampling algorithm with parameter $s=n^{\delta}$ on input $\operatorname{par}:V\rightarrow V$ can be implemented in the $(\gamma,\delta)$ -MPC model. Furthermore, with probability at least $0.99$ , LeafSampling $(n^{\delta},\operatorname{par})$ outputs the DFS sequence of $\operatorname{par}$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ parallel time. If the algorithm fails, then it returns FAIL.

By Theorem 4.1, we only need to give a linear total space MPC algorithm for the LCA problem and the path generation problem to design an efficient DFS sequence algorithm in the $(0,\delta)$ -MPC model.

The authors of [4] proposed using doubling algorithms to compute the LCA and generate the paths. Since these algorithms need to store every $2^{i}$ -th ancestor of each vertex, the total space needed is $\Theta(n\cdot\log(\text{the depth of the tree}))$ . We will show that it suffices to apply the doubling algorithm to a compressed tree, instead of applying it to the original tree.
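To see where the logarithmic factor in the space bound comes from, consider the following sequential sketch of a doubling table. The dictionary representation, the parameter `levels` and the function names are assumptions for illustration; the table stores one ancestor per vertex per level, which is exactly the $n\cdot\log(\text{depth})$ cost that the compressed tree is meant to avoid.

```python
def build_doubling_table(par, levels):
    """g[k][v] = par^(2^k)(v) for k = 0..levels; the root maps to itself."""
    g = [dict(par)]
    for _ in range(levels):
        prev = g[-1]
        # Jumping 2^k steps is two jumps of 2^(k-1) steps.
        g.append({v: prev[prev[v]] for v in par})
    return g


def kth_ancestor(g, v, k):
    """par^(k)(v), read off the binary expansion of k; requires k < 2^len(g)."""
    bit = 0
    while k:
        if k & 1:
            v = g[bit][v]
        k >>= 1
        bit += 1
    return v
```

On a path with 9 vertices and `levels = 3`, the table holds `4 * 9 = 36` entries, while the parent map alone holds 9.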

4.2 Compressed Rooted Tree

Given a set of parent pointers $\operatorname{par}:V\rightarrow V$ , we will show how to compress the rooted tree represented by $\operatorname{par}$ .

Lemma 4.2 (Properties of a compressed rooted tree).

Let $\operatorname{par}:V\rightarrow V$ be a set of parent pointers on a vertex set $V$ with $|V|>1$ , and $\operatorname{par}$ has a unique root. Let $t=\lceil\log(\operatorname{dep}(\operatorname{par}))\rceil$ and let $(V^{\prime},\operatorname{par}^{\prime})=$ Compress $(\operatorname{par})$ . Then it has the following properties:

1. $|V^{\prime}|\leq|V|/\log(\operatorname{dep}(\operatorname{par}))$ .

2. $\forall v\in V^{\prime},i\in\mathbb{N},$ $\operatorname{par}^{\prime(i)}(v)=\operatorname{par}^{(i\cdot t)}(v)\in V^{\prime}$ .

3. $\forall v\in V,$ $\exists i\in\{0,1,\cdots,2t\},$ such that $\operatorname{par}^{(i)}(v)\in V^{\prime}$ .

Proof 4.3.

Consider the first property. For each $v\in V^{\prime}$ , we define a set

[TABLE]

$\forall u\in S(v),$ we have $\operatorname{dep}_{\operatorname{par}}(u)-\operatorname{dep}_{\operatorname{par}}(v)<t$ . Since $\forall v\in V^{\prime}$ , $\operatorname{dep}_{\operatorname{par}}(v)\mod t=0$ , we have $S(v)\cap V^{\prime}=\emptyset$ . Furthermore, it is easy to show that $\forall u\not=v\in V^{\prime}$ , $S(u)\cap S(v)=\emptyset$ . Thus, $|V^{\prime}|+\sum_{v\in V^{\prime}}|S(v)|=\sum_{v\in V^{\prime}}(|S(v)|+1)\leq|V|$ . On the other hand, since $\forall v\in V^{\prime}$ , $\operatorname{dep}_{\operatorname{par}}(v)+t\leq\operatorname{dep}(\operatorname{par}),$ we know that $|S(v)|\geq t-1$ . Therefore $\sum_{v\in V^{\prime}}(|S(v)|+1)\geq|V^{\prime}|\cdot t$ . To conclude, $|V^{\prime}|\leq|V|/t\leq|V|/\log(\operatorname{dep}(\operatorname{par}))$ .

Consider the second property. If $v$ is a root vertex, $\operatorname{par}^{\prime}(v)=\operatorname{par}^{(t)}(v)=v\in V^{\prime}$ . For a non-root vertex $v\in V^{\prime}$ , $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{(t)}(v))=\operatorname{dep}_{\operatorname{par}}(v)-t$ . Since $\operatorname{dep}_{\operatorname{par}}(v)\mod t=0$ , we have $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{(t)}(v))\mod t=0$ which means that $\operatorname{par}^{\prime}(v)=\operatorname{par}^{(t)}(v)\in V^{\prime}$ . Now we prove by induction. Suppose $\operatorname{par}^{\prime(i-1)}(v)=\operatorname{par}^{((i-1)\cdot t)}(v)$ , then $\operatorname{par}^{\prime(i)}(v)=\operatorname{par}^{\prime}(\operatorname{par}^{\prime(i-1)}(v))=\operatorname{par}^{(t)}(\operatorname{par}^{((i-1)\cdot t)}(v))=\operatorname{par}^{(i\cdot t)}(v)$ .

Consider the third property. For $v\in V$ , $\exists j\in\{0,1,\cdots,t-1\},$ such that $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{(j)}(v))\mod t=0$ . Since $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{(j+t)}(v))\mod t=0$ and $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{(j+t)}(v))+t\leq\operatorname{dep}(\operatorname{par})$ , we know that $\operatorname{par}^{(j+t)}(v)\in V^{\prime}$ . Since $j+t\leq 2t$ , the property holds.
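The three properties can be checked on small examples with the sketch below. The procedure Compress $(\operatorname{par})$ itself is not reproduced in this section, so the membership rule used here is an assumption chosen so that the properties of Lemma 4.2 hold: a vertex is kept when its depth is a multiple of $t$ and it has a descendant at least $t$ levels below it. The paper's exact rule may differ. The dictionary representation, the guard `max(1, ...)` for very shallow trees and the function name are likewise hypothetical.

```python
import math


def compress(par):
    """Return (t, V2, par2): a compressed tree with par2(v) = par^(t)(v)."""
    root = next(v for v in par if par[v] == v)
    child = {v: [] for v in par}
    for u, p in par.items():
        if u != p:
            child[p].append(u)

    # Depths by BFS from the root; order lists parents before children.
    dep, order, i = {root: 0}, [root], 0
    while i < len(order):
        v = order[i]
        i += 1
        for c in child[v]:
            dep[c] = dep[v] + 1
            order.append(c)

    tree_depth = max(dep.values())  # at least 1 because |V| > 1
    t = max(1, math.ceil(math.log2(tree_depth)))

    # height[v]: length of the longest downward path starting at v.
    height = {v: 0 for v in par}
    for v in reversed(order):
        if v != root:
            height[par[v]] = max(height[par[v]], height[v] + 1)

    # Assumed rule: depth divisible by t and at least t levels below v.
    V2 = {v for v in par if dep[v] % t == 0 and height[v] >= t}

    def up(v, steps):
        for _ in range(steps):
            v = par[v]
        return v

    par2 = {v: up(v, t) for v in V2}
    return t, V2, par2
```

On a path of 17 vertices with four extra leaves attached at depth 4, this gives $t=4$ and keeps only the path vertices at depths 0, 4, 8 and 12, so a doubling table over the compressed tree needs far fewer entries than one over the original tree.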

4.3 Least Common Ancestor

Given a rooted tree represented by a set of parent pointers $\operatorname{par}:V\rightarrow V$ on a vertex set $V$ , and a set of $q$ queries $Q=\{(u_{1},v_{1}),(u_{2},v_{2}),\cdots,(u_{q},v_{q})\}$ where $\forall i\in[q],u_{i}\not=v_{i},u_{i},v_{i}\in\operatorname{leaves}(\operatorname{par})$ , we show a space efficient algorithm which can output the LCA of each queried pair of vertices. Notice that the assumption that queries only contain leaves is without loss of generality: we can attach an additional child vertex $v$ to each non-leaf vertex $u$ . Thus, $v$ is a leaf vertex. When a query contains $u$ , we can use $v$ to replace $u$ in the query, and the result will not change.

Before we analyze the algorithm LCA $(\operatorname{par},Q)$ , let us discuss some details of the algorithm.

1. We pre-compute $\operatorname{dep}_{\operatorname{par}}(v)$ and $\operatorname{dep}_{\operatorname{par}^{\prime}}(u)$ for every $v\in V$ and $u\in V^{\prime}$ .

2. To implement step 3a, we first check whether $\operatorname{dep}_{\operatorname{par}}(u_{i})>\operatorname{dep}_{\operatorname{par}}(v_{i})+2t$ . If it is not true, we can set $\widehat{u}_{i}$ to be $u_{i}$ directly. Otherwise, according to Lemma 4.2, there is a $j\in\{0,1,\cdots,2t\}$ such that $\operatorname{par}^{(j)}(u_{i})\in V^{\prime}$ . Since $\operatorname{dep}_{\operatorname{par}}(u_{i})>\operatorname{dep}_{\operatorname{par}}(v_{i})+2t$ , $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{(j)}(u_{i}))>\operatorname{dep}_{\operatorname{par}}(v_{i})$ . We initialize $\widehat{u}_{i}$ to be $\operatorname{par}^{(j)}(u_{i})\in V^{\prime}$ . For $k=t\rightarrow 0$ , if $\operatorname{dep}_{\operatorname{par}}(g_{k}(\widehat{u}_{i}))>\operatorname{dep}_{\operatorname{par}}(v_{i})$ (i.e., $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{\prime(2^{k})}(\widehat{u}_{i}))>\operatorname{dep}_{\operatorname{par}}(v_{i})$ ), we set $\widehat{u}_{i}\leftarrow g_{k}(\widehat{u}_{i})=\operatorname{par}^{\prime(2^{k})}(\widehat{u}_{i})$ . Due to Lemma 4.2 again, the final $\widehat{u}_{i}$ must satisfy $\operatorname{dep}_{\operatorname{par}}(\widehat{u}_{i})\geq\operatorname{dep}_{\operatorname{par}}(v_{i})$ and $\operatorname{dep}_{\operatorname{par}}(\widehat{u}_{i})\leq\operatorname{dep}_{\operatorname{par}}(v_{i})+2t$ . This step takes time $O(t)$ .
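The loop "for $k=t\rightarrow 0$" is the usual greedy descent over a doubling table: try the largest jump first and take it only if the landing vertex is still strictly deeper than the target. The sketch below shows that loop on an ordinary parent map. In the algorithm above the table $g_{k}$ is built over the compressed tree $\operatorname{par}^{\prime}$ while depths are measured in $\operatorname{par}$ ; here, as a simplifying assumption, both refer to the same tree, and all names are hypothetical.

```python
def build_doubling_table(par, levels):
    """g[k][v] = par^(2^k)(v) for k = 0..levels; the root maps to itself."""
    g = [dict(par)]
    for _ in range(levels):
        prev = g[-1]
        g.append({v: prev[prev[v]] for v in par})
    return g


def climb_above(g, dep, u, target_depth):
    """Lift u as far as possible while its depth stays strictly above target_depth."""
    for k in range(len(g) - 1, -1, -1):  # largest jump first, each used at most once
        nxt = g[k][u]
        if dep[nxt] > target_depth:
            u = nxt
    return u
```

Each level is tried once, so the loop takes a number of steps equal to the number of levels. On the compressed tree a single $\operatorname{par}^{\prime}$ step covers $t$ levels of $\operatorname{par}$ , which is why the text only guarantees that the final vertex lies within $2t$ levels of the target depth.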

Lemma 4.4 (LCA algorithm).

Let $\operatorname{par}:V\rightarrow V$ be a set of parent pointers on a vertex set $V$ . $\operatorname{par}$ has a unique root. Let $Q=\{(u_{1},v_{1}),(u_{2},v_{2}),\cdots,(u_{q},v_{q})\}$ be a set of $q$ pairs of vertices where $\forall i\in[q],u_{i}\not=v_{i},u_{i},v_{i}\in\operatorname{leaves}(\operatorname{par})$ . Let $\operatorname{lca}:Q\rightarrow V\times V\times V$ be the output of LCA $(\operatorname{par},Q)$ . For $(u_{i},v_{i})\in Q$ , $(p_{i},p_{i,u_{i}},p_{i,v_{i}})=\operatorname{lca}(u_{i},v_{i})$ satisfies that $p_{i}$ is the LCA of $(u_{i},v_{i})$ , $p_{i,u_{i}},p_{i,v_{i}}$ are ancestors of $u_{i},v_{i}$ respectively, and $p_{i,u_{i}},p_{i,v_{i}}$ are children of $p_{i}$ .

Proof 4.5.

Without loss of generality, we can assume $\operatorname{dep}_{\operatorname{par}}(u_{i})\geq\operatorname{dep}_{\operatorname{par}}(v_{i})$ . After step 3a, $\widehat{u}_{i}$ satisfies $\operatorname{dep}_{\operatorname{par}}(\widehat{u}_{i})\geq\operatorname{dep}_{\operatorname{par}}(v_{i})$ and $\operatorname{dep}_{\operatorname{par}}(\widehat{u}_{i})\leq\operatorname{dep}_{\operatorname{par}}(v_{i})+2t$ . Notice that the LCA of $(u_{i},v_{i})$ in $\operatorname{par}$ is the same as the LCA of $(\widehat{u}_{i},v_{i})$ in $\operatorname{par}$ . In step 3b, if we find the LCA of $(\widehat{u}_{i},v_{i})$ , then the lemma holds for $\operatorname{lca}(u_{i},v_{i})$ . Otherwise, the depth of the LCA of $(\widehat{u}_{i},v_{i})$ is smaller than $\operatorname{dep}_{\operatorname{par}}(\widehat{u}_{i})-4t\leq\operatorname{dep}_{\operatorname{par}}(v_{i})-2t$ . By combining with Lemma 4.2, neither $u^{\prime}_{i}$ nor $v^{\prime}_{i}$ in step 3c can be the LCA of $(\widehat{u}_{i},v_{i})$ in $\operatorname{par}$ . Thus, the LCA of $(u_{i},v_{i})$ in $\operatorname{par}$ is the same as the LCA of $(u^{\prime}_{i},v^{\prime}_{i})$ in $\operatorname{par}$ . According to step 3d, $u^{\prime\prime}_{i},v^{\prime\prime}_{i}$ are ancestors of $u^{\prime}_{i},v^{\prime}_{i}$ respectively in both $\operatorname{par}$ and $\operatorname{par}^{\prime}$ , but neither $u^{\prime\prime}_{i}$ nor $v^{\prime\prime}_{i}$ is a common ancestor of $(u^{\prime}_{i},v^{\prime}_{i})$ . Furthermore, $\operatorname{par}^{\prime}(u^{\prime\prime}_{i})=\operatorname{par}^{\prime}(v^{\prime\prime}_{i})$ is the LCA of $(u^{\prime}_{i},v^{\prime}_{i})$ in $\operatorname{par}^{\prime}$ . Thus, $\operatorname{par}^{\prime}(u^{\prime\prime}_{i})$ is a common ancestor of $(u^{\prime}_{i},v^{\prime}_{i})$ in $\operatorname{par}$ . By combining with Lemma 4.2, we know that there exists $j\in[2t]$ such that $\operatorname{par}^{(j)}(u^{\prime\prime}_{i})$ is the LCA of $(u^{\prime}_{i},v^{\prime}_{i})$ in $\operatorname{par}$ . In step 3e, we can find the LCA of $(u^{\prime}_{i},v^{\prime}_{i})$ in $\operatorname{par}$ and thus the LCA of $(u_{i},v_{i})$ .

4.4 Multi-Paths Generation

Consider a rooted tree represented by a set of parent pointers $\operatorname{par}:V\rightarrow V$ on a vertex set $V$ and a set of $q$ vertex-ancestor pairs $Q=\{(u_{1},v_{1}),(u_{2},v_{2}),\cdots,(u_{q},v_{q})\}$ where $\forall i\in[q],$ $v_{i}$ is an ancestor of $u_{i}$ . We show a space efficient algorithm MultiPaths $(\operatorname{par},Q)$ which can generate all the paths $P(u_{1},v_{1}),P(u_{2},v_{2}),\cdots,P(u_{q},v_{q})$ .

Before we analyze the correctness of the algorithm, let us discuss some details.

1. In step 3a, if the length of the path is at most $2t$ , then we can generate the path in $O(t)$ rounds. In the $j$ -th round, we can find the vertex $\operatorname{par}^{(j)}(u_{i})=\operatorname{par}(\operatorname{par}^{(j-1)}(u_{i}))$ .

2. In step 3b, we find $v^{\prime}_{i}$ as follows. We initialize $v^{\prime}_{i}$ as $u^{\prime}_{i}$ . For $k=t\rightarrow 0$ , if $\operatorname{dep}_{\operatorname{par}}(g_{k}(v^{\prime}_{i}))>\operatorname{dep}_{\operatorname{par}}(v_{i})$ (i.e., $\operatorname{dep}_{\operatorname{par}}(\operatorname{par}^{\prime(2^{k})}(v^{\prime}_{i}))>\operatorname{dep}_{\operatorname{par}}(v_{i})$ ), we set $v^{\prime}_{i}\leftarrow g_{k}(v^{\prime}_{i})=\operatorname{par}^{\prime(2^{k})}(v^{\prime}_{i})$ .
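The descent in step 3b is a binary-lifting search. The following is a minimal sequential sketch of it in Python, not the MPC implementation. It assumes a single tree whose root points to itself, and the names `build_jump_tables` and `highest_below` are hypothetical; in the paper the jumps are taken in $\operatorname{par}^{\prime}$ while depths are measured in $\operatorname{par}$ .

```python
def build_jump_tables(par, t):
    """Return g with g[k][x] = par^(2^k)(x) for k = 0..t.

    Assumption: `par` maps every vertex to its parent and the root to itself.
    """
    g = [dict(par)]
    for _ in range(t):
        prev = g[-1]
        g.append({x: prev[prev[x]] for x in prev})
    return g


def highest_below(g, dep, start, target_depth):
    """Highest ancestor of `start` whose depth is still > target_depth.

    Mirrors the loop "for k = t down to 0: jump by 2^k if the landing
    vertex is still deeper than the target". Returns `start` if no jump
    is possible.
    """
    x = start
    for k in range(len(g) - 1, -1, -1):
        y = g[k][x]
        if dep[y] > target_depth:
            x = y
    return x


# Usage on a path 0 <- 1 <- ... <- 9 rooted at 0, where dep[i] = i.
par = {i: max(i - 1, 0) for i in range(10)}
dep = {i: i for i in range(10)}
g = build_jump_tables(par, 4)
print(highest_below(g, dep, 9, 3))
```

Scanning $k$ from large to small works because the set of jump lengths that stay below the target depth is a prefix of the ancestor chain, so the greedy choice of powers of two reconstructs the largest admissible jump.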

Lemma 4.6 (Generation of multiple paths).

Let $\operatorname{par}:V\rightarrow V$ be a set of parent pointers on a vertex set $V$ . $\operatorname{par}$ has a unique root. Let $Q=\{(u_{1},v_{1}),(u_{2},v_{2}),\cdots,(u_{q},v_{q})\}\subseteq V\times V$ be a set of pairs of vertices where $\forall j\in[q],$ $v_{j}$ is an ancestor of $u_{j}$ in $\operatorname{par}$ . Let $P_{1},P_{2},\cdots,P_{q}$ be the output of MultiPaths $(\operatorname{par},Q)$ . Then $\forall j\in[q],P_{j}=P(u_{j},v_{j})$ , i.e., $P_{j}$ is a sequence which denotes a path from $u_{j}$ to $v_{j}$ in $\operatorname{par}$ .

Proof 4.7.

Consider a pair $(u_{i},v_{i})\in Q$ . If $\operatorname{dep}_{\operatorname{par}}(u_{i})-\operatorname{dep}_{\operatorname{par}}(v_{i})\leq 2t$ , then $P_{i}$ will be the path from $u_{i}$ to $v_{i}$ in $\operatorname{par}$ by step 3a.

We only need to consider the case when $\operatorname{dep}_{\operatorname{par}}(u_{i})>\operatorname{dep}_{\operatorname{par}}(v_{i})+2t$ . According to Lemma 4.2, $\exists j\in[2t]$ such that $\operatorname{par}^{(j)}(u_{i})\in V^{\prime}$ . Thus, $u^{\prime}_{i}\in V^{\prime}$ can be found by step 3b. Then $v^{\prime}_{i}$ can be found. $v_{i}$ is an ancestor of $v^{\prime}_{i}$ . $v^{\prime}_{i}$ is an ancestor of $u^{\prime}_{i}$ . $u^{\prime}_{i}$ is an ancestor of $u_{i}$ . In step 3d, the initialization of $A$ should be $(u_{i},u^{\prime}_{i},\operatorname{par}^{\prime(1)}(u^{\prime}_{i}),\operatorname{par}^{\prime(2)}(u^{\prime}_{i}),\cdots,v^{\prime}_{i},v_{i}).$ By Lemma 4.2, the initialization of $A$ is also $(u_{i},u^{\prime}_{i},\operatorname{par}^{(t)}(u^{\prime}_{i}),\operatorname{par}^{(2t)}(u^{\prime}_{i}),\cdots,v^{\prime}_{i},v_{i})$ . Then by step 3e, the final sequence $P_{i}=A$ will be $(u_{i},\operatorname{par}^{(1)}(u_{i}),\operatorname{par}^{(2)}(u_{i}),\cdots,v_{i})$ which denotes the path from $u_{i}$ to $v_{i}$ in $\operatorname{par}$ .
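To illustrate how a sparse chain of anchors turns into the full path, here is a small sequential sketch. The exact rule of step 3e is not reproduced in this section, so the refinement rule below (insert $\operatorname{par}(a)$ after every entry $a$ whose successor is not yet its parent) is an assumption, and `expand_path` is a hypothetical name.

```python
def expand_path(par, anchors):
    """Fill the gaps of a sparse ancestor chain until it is a full path.

    `anchors` lists vertices from a descendant up to an ancestor; each entry
    is an ancestor of (or equal to) the previous one. Returns the completed
    path and the number of refinement rounds that changed something.
    """
    # Drop consecutive duplicates, e.g. when u_i already equals u'_i.
    path = [a for j, a in enumerate(anchors) if j == 0 or a != anchors[j - 1]]
    rounds = 0
    while True:
        refined, changed = [], False
        for j, a in enumerate(path):
            refined.append(a)
            if j + 1 < len(path) and par[a] != path[j + 1]:
                refined.append(par[a])  # one more step towards the next anchor
                changed = True
        if not changed:
            return path, rounds
        path, rounds = refined, rounds + 1


# Usage: a path 0 <- 1 <- ... <- 9 and anchors spaced up to 3 apart.
par = {i: max(i - 1, 0) for i in range(10)}
print(expand_path(par, [9, 8, 5, 2, 1]))
```

Under this rule a gap of length $g$ between two anchors closes after $g-1$ rounds, which is consistent with the $O(t)$ bound on the number of repetitions when consecutive anchors are at most $2t$ apart.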

4.5 Implementation of the DFS Sequence Algorithm in MPC

Here, we discuss how to implement the subroutines described in Sections 4.2, 4.3, and 4.4 in the MPC model. See Section 2.2 for the organization of the data in the MPC model and the basic MPC operations.

Compressed rooted tree. Consider the implementation of Compress $(\operatorname{par}:V\rightarrow V)$ (Section 4.2) in the MPC model. The input size is $|V|=n$ . In the first step, we need to compute the depth of every vertex in $\operatorname{par}$ . As shown by [4], this can be computed in the MPC model with $O(n)$ total space and $\Theta(n^{\delta})$ local memory size per machine for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ time. In the next step, $V^{\prime}$ can be computed in $O(1)$ time. Finally, we can simultaneously compute $\operatorname{par}^{\prime}(v)$ for every vertex $v\in V^{\prime}$ . Since $\operatorname{par}^{\prime}(v)=\operatorname{par}^{(t)}(v)$ for $t=\lceil\log(\operatorname{dep}(\operatorname{par}))\rceil$ , it takes $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time. Therefore, Compress $(\operatorname{par})$ can be implemented in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ time.
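One standard way to obtain all depths in a logarithmic number of rounds is pointer doubling. The sketch below simulates the synchronous rounds sequentially; it is only an illustration of that general technique and is not claimed to be the procedure of [4]. The function name is hypothetical and the root is assumed to point to itself.

```python
def depths_by_pointer_doubling(par):
    """Return (dep, rounds) for a tree given by parent pointers.

    Invariant: dist[v] is the number of tree edges between v and anc[v].
    Each round every vertex doubles the length of its pointer, so the loop
    ends after about log2(depth of the tree) rounds.
    """
    anc = dict(par)
    dist = {v: 0 if par[v] == v else 1 for v in par}
    rounds = 0
    # Only the root points to itself, so the loop runs while some pointer
    # has not reached the root yet.
    while any(anc[anc[v]] != anc[v] for v in par):
        # All vertices read the old state, as in one parallel round.
        new_dist = {v: dist[v] + dist[anc[v]] for v in par}
        new_anc = {v: anc[anc[v]] for v in par}
        anc, dist, rounds = new_anc, new_dist, rounds + 1
    return dist, rounds


# Usage: a path with 9 vertices has depth 8 and needs 3 doubling rounds.
chain = {i: max(i - 1, 0) for i in range(9)}
print(depths_by_pointer_doubling(chain))
```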

Least common ancestor. Consider the implementation of LCA $(\operatorname{par}:V\rightarrow V,Q)$ (Section 4.3) in the MPC model. The input size is $|V|+|Q|=n+q$ . The first step computes a compressed rooted tree $\operatorname{par}^{\prime}:V^{\prime}\rightarrow V^{\prime}$ . As discussed in the previous paragraph, this only requires $O(n)$ total space and $\Theta(n^{\delta})$ local memory per machine for any constant $\delta\in(0,1)$ . Before the next step, we need to compute the depth of each vertex in $\operatorname{par}$ and the depth of each vertex in $\operatorname{par}^{\prime}$ . Since $\operatorname{dep}(\operatorname{par}^{\prime})\leq\operatorname{dep}(\operatorname{par})$ , it takes $O(\log(\operatorname{dep}(\operatorname{par})))$ time. In step 2, as shown in [4], $g_{0}(\cdot)\equiv\operatorname{par}^{\prime(2^{0})}(\cdot),g_{1}\equiv\operatorname{par}^{\prime(2^{1})}(\cdot),\cdots,g_{t}\equiv\operatorname{par}^{\prime(2^{t})}(\cdot):V^{\prime}\rightarrow V^{\prime}$ for $t=\lceil\log(\operatorname{dep}(\operatorname{par}))\rceil$ can be computed in the MPC model with $O(|V^{\prime}|\log(\operatorname{dep}(\operatorname{par}^{\prime})))$ total space and $O(|V^{\prime}|^{\delta})$ local memory per machine for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par}^{\prime})))=O(\log(\operatorname{dep}(\operatorname{par})))$ time. According to Lemma 4.2, $|V^{\prime}|\leq|V|/\log(\operatorname{dep}(\operatorname{par}))$ . Thus, step 2 only needs $O(n)$ total space and takes time $O(\log(\operatorname{dep}(\operatorname{par})))$ . For step 3, we can handle all the queries in $Q$ simultaneously. For step 3a, we can use $O(1)$ time to check whether $\operatorname{dep}_{\operatorname{par}}(u_{i})>\operatorname{dep}_{\operatorname{par}}(v_{i})+2t$ . If it is true, we can use $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time to find a $j\in\{0,1,\cdots,2t\}$ such that $\operatorname{par}^{(j)}(u_{i})\in V^{\prime}$ . 
Then, we apply an exponential search by using $g_{0},g_{1},\cdots,g_{t}$ to find $\widehat{u}_{i}$ . This takes $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time. Step 3b checks whether $\operatorname{par}^{(j)}(\widehat{u}_{i})$ is the LCA for every $j\in[4t]$ . Thus, it takes $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time. In step 3c, according to Lemma 4.2, there exists $j\in\{0,1,2,\cdots,2t\}$ such that $\operatorname{par}^{(j)}(\widehat{u}_{i})\in V^{\prime}$ . Thus, we only need time $O(t)$ to find $u^{\prime}_{i}$ . Similarly, we only need time $O(t)$ to find $v^{\prime}_{i}$ . In step 3d, by [4], the LCA of each $(u^{\prime}_{i},v^{\prime}_{i})$ in $\operatorname{par}^{\prime}$ can be computed simultaneously in the MPC model with $O(|V^{\prime}|\log|V^{\prime}|+|Q|)=O(n)$ total space in $O(\log(\operatorname{dep}(\operatorname{par}^{\prime})))=O(\log(\operatorname{dep}(\operatorname{par})))$ time. The last step checks whether $\operatorname{par}^{(j)}(u^{\prime\prime}_{i})=\operatorname{par}^{(j)}(v^{\prime\prime}_{i})$ for each $j\in[2t]$ . Thus it requires $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time. To conclude, LCA $(\operatorname{par}:V\rightarrow V,Q)$ can be implemented in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ parallel time.
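For intuition, the doubling tables $g_{0},\cdots,g_{t}$ support the classical binary-lifting LCA query sketched below. This is a sequential illustration on a single tree, without the compression and the $O(t)$ local checks of steps 3a to 3e; the helper names are hypothetical and the root is assumed to point to itself.

```python
def depths(par):
    """Depth of every vertex; the root is the vertex with par[r] == r."""
    dep = {}
    for v in par:
        chain, x = [], v
        while x not in dep and par[x] != x:
            chain.append(x)
            x = par[x]
        if x not in dep:
            dep[x] = 0  # x is the root
        d = dep[x]
        for y in reversed(chain):
            d += 1
            dep[y] = d
    return dep


def jump_tables(par, dep):
    """g[k][x] = par^(2^k)(x), with enough levels to cover the tree depth."""
    levels = max(1, max(dep.values()).bit_length())
    g = [dict(par)]
    for _ in range(levels - 1):
        prev = g[-1]
        g.append({x: prev[prev[x]] for x in prev})
    return g


def lca(g, dep, u, v):
    if dep[u] < dep[v]:
        u, v = v, u
    diff, k = dep[u] - dep[v], 0
    while diff:  # lift the deeper vertex to the depth of the other one
        if diff & 1:
            u = g[k][u]
        diff >>= 1
        k += 1
    if u == v:
        return u
    for k in range(len(g) - 1, -1, -1):  # jump while the ancestors differ
        if g[k][u] != g[k][v]:
            u, v = g[k][u], g[k][v]
    return g[0][u]


par = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 3, 6: 2}
dep = depths(par)
g = jump_tables(par, dep)
print(lca(g, dep, 5, 4), lca(g, dep, 5, 6))
```

The tables take $O(|V|\log(\operatorname{dep}))$ entries when built on the full tree, which is why the text builds them only on the compressed vertex set $V^{\prime}$ to stay within linear total space.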

Multiple paths generation. Consider the implementation of MultiPaths $(\operatorname{par}:V\rightarrow V,Q)$ (Section 4.4) in the MPC model. The first two steps are the same as the first two in the LCA subroutine mentioned in the previous paragraph. They can be implemented in the MPC model with $O(|V|)=O(n)$ total space and $\Theta(n^{\delta})$ local memory per machine for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ time. We compute the depth of each vertex in $\operatorname{par}$ and the depth of each vertex in $\operatorname{par}^{\prime}$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ time before the next step. In step 3, all the queries $(u_{i},v_{i})\in Q$ can be handled simultaneously. In step 3a, if $\operatorname{dep}_{\operatorname{par}}(u_{i})\leq\operatorname{dep}_{\operatorname{par}}(v_{i})+2t$ , the length of the path from $u_{i}$ to $v_{i}$ is at most $2t$ , and thus $P(u_{i},v_{i})$ can be computed in $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time. In step 3b, we can use $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time to find the minimum $j\in[2t]$ such that $\operatorname{par}^{(j)}(u_{i})\in V^{\prime}$ . Then we can apply exponential search to find $v^{\prime}_{i}$ by using $g_{0},g_{1},\cdots,g_{t}$ in $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ time. In step 3c, by [4], each path $P^{\prime}(u^{\prime}_{i},v^{\prime}_{i})$ in $\operatorname{par}^{\prime}$ can be generated simultaneously in the MPC model with $O(|V^{\prime}|\log|V^{\prime}|+\sum_{i\in[q]}|P^{\prime}(u^{\prime}_{i},v^{\prime}_{i})|)=O(n+\sum_{i\in[q]}|P(u_{i},v_{i})|)$ total space in $O(\log(\operatorname{dep}(\operatorname{par}^{\prime})))=O(\log(\operatorname{dep}(\operatorname{par})))$ time. Consider the initialization of $A=(a_{1},a_{2},\cdots,a_{h})$ in step 3d. $a_{1}$ should be $u_{i}$ and $a_{h}$ should be $v_{i}$ . 
By Lemma 4.2, $\forall j\in[h-1]$ , $\operatorname{dep}(a_{j})-\operatorname{dep}(a_{j+1})\leq 2t$ . Thus, the number of repetitions in the final step is at most $O(t)=O(\log(\operatorname{dep}(\operatorname{par})))$ . To conclude, MultiPaths $(\operatorname{par}:V\rightarrow V,Q=\{(u_{1},v_{1}),(u_{2},v_{2}),\cdots,(u_{q},v_{q})\})$ can be implemented in the MPC model with total space linear in $O(|V|+\sum_{i\in[q]}|P(u_{i},v_{i})|)$ and local memory size $\Theta(|V|^{\delta})$ per machine for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ time.

DFS sequence in the MPC model. Consider LeafSampling $(n^{\delta},\operatorname{par}:V\rightarrow V)$ where $n=|V|$ and $\delta$ is an arbitrary constant from $(0,1)$ . For step 4 of LeafSampling $(n^{\delta},\operatorname{par})$ , we run our LCA (Section 4.3) algorithm. The correctness of our LCA algorithm is guaranteed by Lemma 4.4. According to [4], the total number of queries generated in step 4 of LeafSampling $(n^{\delta},\operatorname{par})$ is at most $O(n^{\delta})$ with high probability. Then due to the discussion in the previous paragraphs, the step 4 of LeafSampling $(n^{\delta},\operatorname{par})$ can be implemented in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ time. For step 6 of LeafSampling $(n^{\delta},\operatorname{par})$ , we run our multiple paths generation (Section 4.4) algorithm. The correctness of our multiple paths generation algorithm is guaranteed by Lemma 4.6. Notice that the total length of all the queried paths in the step 6 of LeafSampling $(n^{\delta},\operatorname{par})$ is at most the length of the DFS sequence which is $O(n)$ . According to the discussion in the previous paragraphs, the step 6 of LeafSampling $(n^{\delta},\operatorname{par})$ can be implemented in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ in $O(\log(\operatorname{dep}(\operatorname{par})))$ time. Together with Theorem 4.1, we conclude Theorem 1.3.

5 $2$ -Edge Connectivity and Biconnectivity in MPC

In this section, we discuss how to implement the $2$ -edge connectivity algorithm and the biconnectivity algorithm in the MPC model. We first show how to implement a subroutine called range minimum query (RMQ) in the MPC model.

5.1 Parallel Range Minimum Query in Linear Total Space

The range minimum query (RMQ) problem is as follows. Given a sequence $A=(a_{1},a_{2},\cdots,a_{n})$ and a set of queries $Q=\{(l_{1},r_{1}),(l_{2},r_{2}),\cdots,(l_{q},r_{q})\}$ where $\forall i\in[q],l_{i}\leq r_{i}\in[n]$ , we want to find the value $\min_{l_{i}\leq j\leq r_{i}}a_{j}$ for each query $(l_{i},r_{i})\in Q$ . [4] shows an MPC algorithm which requires total space $O(n\log n+q)$ and takes $O(1)$ parallel time for solving the RMQ problem. Their space is not linear in the input size. In this section, we show that if every query $(l_{i},r_{i})\in Q$ satisfies $r_{i}-l_{i}\geq 2\lceil\log n\rceil$ , then we can solve such an RMQ problem in the MPC model with total space $O(n+q)$ in $O(1)$ parallel time. The offline description is shown in the algorithm RMQ $(A,Q)$ .

Lemma 5.1 (Range minimum query).

Let $A=(a_{1},a_{2},\cdots,a_{n})\in\mathbb{Z}^{n}$ be a sequence of $n$ numbers and $Q=\{(l_{1},r_{1}),(l_{2},r_{2}),\cdots,(l_{q},r_{q})\}$ where $\forall i\in[q],l_{i},r_{i}\in[n],l_{i}+\lceil\log n\rceil\leq r_{i}$ . Let $\operatorname{rmq}:Q\rightarrow\mathbb{Z}$ be the output of RMQ $(A,Q)$ . Then $\forall(l_{i},r_{i})\in Q$ , $\operatorname{rmq}((l_{i},r_{i}))=\min_{j\in[n]\cap[l_{i},r_{i}]}a_{j}.$ In addition, RMQ can be implemented in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ in $O(1)$ parallel time.

Proof 5.2.

Firstly, let us consider the correctness of RMQ $(A,Q)$ . Let $t=\lceil\log n\rceil$ . For a query $(l_{i},r_{i})\in Q$ , since $l_{i}+t\leq r_{i}$ , the $l^{\prime}_{i},r^{\prime}_{i}$ found by the step 3a will satisfy $l^{\prime}_{i}\leq r^{\prime}_{i}$ . If $l^{\prime}_{i}=r^{\prime}_{i}$ , then $m_{i}=\infty$ and $\operatorname{rmq}((l_{i},r_{i}))=\min(\min_{l_{i}\leq j\leq l_{i}^{\prime}}a_{j},\min_{l_{i}^{\prime}\leq j\leq r_{i}}a_{j})=\min_{l_{i}\leq j\leq r_{i}}a_{j}$ . Otherwise, by step 3b, $m_{i}=\min_{l_{i}^{\prime}+1\leq j\leq r_{i}^{\prime}}a_{j}$ . By step 3c, $\operatorname{rmq}((l_{i},r_{i}))=\min(\min_{l_{i}\leq j\leq l_{i}^{\prime}}a_{j},\min_{l_{i}^{\prime}+1\leq j\leq r_{i}^{\prime}}a_{j},\min_{r_{i}^{\prime}\leq j\leq r_{i}}a_{j})=\min_{l_{i}\leq j\leq r_{i}}a_{j}$ .

Let us analyze the total space required and the parallel time for running RMQ $(A,Q)$ in the MPC model. According to Theorem 2.3, the sorting takes $O(1)$ time and requires linear total space. Notice that $\delta\in(0,1)$ is a constant and each machine has $\Theta(n^{\delta})$ local memory. We can sort $a_{1},a_{2},\cdots,a_{n}$ by their indexes and $o(n)$ number of duplicates of some elements in $A$ such that $a_{i\cdot n^{\delta}+1},\cdots,a_{(i+1)\cdot n^{\delta}},a_{(i+1)\cdot n^{\delta}+1},\cdots,a_{(i+1)\cdot n^{\delta}+t}$ are on the $i^{\text{th}}$ machine. Therefore, the first two steps of RMQ $(A,Q)$ can be implemented in the MPC model with $O(n)$ total space and in time $O(1)$ . For step 3, we can handle all the queries $(l_{i},r_{i})\in Q$ simultaneously. Step 3a only requires local computations. Step 3b needs to handle at most $|Q|$ RMQ on the sequence $A^{\prime}$ . Due to [4], this can be implemented in the MPC model with $O(|A^{\prime}|\log|A^{\prime}|+|Q|)=O(n+q)$ total space and $O(1)$ parallel time. Step 3c can be done in $O(1)$ time. To conclude, RMQ $(A,Q)$ can be implemented in the $(0,\delta)$ -MPC model for any constant $\delta\in(0,1)$ and the parallel time is $O(1)$ .
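The space saving comes from building the $O(|A^{\prime}|\log|A^{\prime}|)$ structure only on block minima. The following sequential sketch shows that decomposition under stated assumptions: indices are 0-based, blocks have length $t=\lceil\log_{2}n\rceil$ , a sparse table stands in for the RMQ structure of [4], and all function names are hypothetical. The indexing of $l^{\prime}_{i},r^{\prime}_{i}$ in the actual algorithm may differ.

```python
def build_sparse_table(b):
    """table[j][i] = min(b[i .. i + 2^j - 1])."""
    table = [list(b)]
    j = 1
    while (1 << j) <= len(b):
        prev, half = table[-1], 1 << (j - 1)
        table.append([min(prev[i], prev[i + half])
                      for i in range(len(b) - (1 << j) + 1)])
        j += 1
    return table


def sparse_min(table, lo, hi):
    """Minimum of b[lo..hi] (inclusive); requires lo <= hi."""
    j = (hi - lo + 1).bit_length() - 1
    return min(table[j][lo], table[j][hi - (1 << j) + 1])


def rmq(a, queries):
    """Answer inclusive 0-based range-minimum queries on `a`."""
    n = len(a)
    t = max(1, (n - 1).bit_length())  # equals ceil(log2(n)) for n >= 2
    blocks = [min(a[s:s + t]) for s in range(0, n, t)]  # the sequence A'
    table = build_sparse_table(blocks)  # O((n/t) log n) = O(n) entries
    answers = []
    for l, r in queries:
        first_full = -(-l // t)       # first block starting at or after l
        last_full = (r + 1) // t - 1  # last block ending at or before r
        if first_full <= last_full:
            mid = sparse_min(table, first_full, last_full)
            left = a[l:first_full * t]         # fewer than t elements
            right = a[(last_full + 1) * t:r + 1]  # fewer than t elements
            answers.append(min([mid, *left, *right]))
        else:
            answers.append(min(a[l:r + 1]))    # fewer than 2t elements
    return answers


print(rmq([5, 3, 8, 1, 9, 2, 7, 4, 6, 0], [(0, 9), (4, 8)]))
```

Each query touches at most $2t$ original elements plus one lookup in the block structure, which matches the split into a left part, a middle part $m_{i}$ , and a right part used in the correctness argument.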

5.2 MPC Implementation of $2$ -Edge Connectivity and Biconnectivity

The input is a connected undirected graph $G=(V,E)$ . $G$ has $|V|=n$ vertices and $|E|=m$ edges. Thus, the input size is $m+n$ . Consider the $(\gamma,\delta)$ -MPC model for $\gamma\in[0,2]$ and an arbitrary constant $\delta\in(0,1)$ . The total space in the system should be $\Theta(m^{1+\gamma})$ and the local memory size of each machine is $\Theta(m^{\delta})$ . There is an efficient algorithm for the connected components and spanning tree problems.

Theorem 5.3 ([4]).

For any $\gamma\in[0,2]$ and any constant $\delta\in(0,1)$ , there is a randomized $(\gamma,\delta)$ -MPC algorithm which outputs the connected components together with a rooted spanning forest of an undirected graph $G$ with $n$ vertices and $m$ edges in $O(\min(\log\operatorname{diam}(G)\cdot\log\frac{\log n}{\log((n+m)^{1+\gamma}/n)},\log n))$ parallel time. Furthermore, the depth of the spanning forest is at most $\min\left(\operatorname{diam}(G)^{O\left(\log\frac{\log n}{\log((n+m)^{1+\gamma}/n)}\right)},n\right)$ . The success probability is at least $0.98$ . If the algorithm fails, then it returns FAIL.

$2$ -Edge connectivity. In the first step of Bridges $(G)$ (Section 3.1), according to Theorem 5.3, with probability $0.98$ , the rooted spanning tree of $G$ can be computed in the MPC model with total space $O(m^{1+\gamma})$ in $O(\log\operatorname{diam}(G)\cdot\log\log_{m^{1+\gamma}/n}n)$ time, and the depth of the spanning tree is at most $\operatorname{diam}(G)^{O(\log\log_{m^{1+\gamma}/n}n)}$ . In step 2, to compute $\operatorname{lev}(v)$ for each $v\in V$ , we can query the LCA of $(v,w)$ in $\operatorname{par}$ for each edge $(v,w)\in E$ . We can use our LCA algorithm (Section 4.3) as the subroutine for this purpose. It takes the total space $O(m)$ and the running time $O(\log(\operatorname{dep}(\operatorname{par})))=O(\log\operatorname{diam}(G)\cdot\log\log_{m^{1+\gamma}/n}n)$ (Section 4.5). In step 3, with probability at least $0.99$ , the DFS sequence can be computed using $O(n)$ total space in time $O(\log(\operatorname{dep}(\operatorname{par})))=O(\log\operatorname{diam}(G)\cdot\log\log_{m^{1+\gamma}/n}n)$ (Theorem 1.3). In step 4, we can use sorting to find the first appearance $a_{i}$ and the last appearance $a_{j}$ in the DFS sequence of each vertex $v$ , and $\min_{k\in\{i,i+1,\cdots,j\}}\operatorname{lev}(a_{k})$ corresponds to a range minimum query. If the size of the subtree of $v$ is at most $\log n$ , the corresponding RMQ can be solved by local computation. Otherwise, we use our RMQ algorithm (Section 5.1) to handle the corresponding RMQ of $v$ . By Lemma 5.1, this step only takes $O(1)$ time and requires $O(n)$ space. To conclude, Bridges $(G)$ only takes total space $O(m^{1+\gamma})$ and has parallel time $O(\log\operatorname{diam}(G)\cdot\log\log_{m^{1+\gamma}/n}n)$ .
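The four steps can be traced on a small graph with the sequential sketch below. Section 3.1 is not reproduced here, so the definition used for $\operatorname{lev}$ (the minimum of $\operatorname{dep}(v)$ and the depth of $\operatorname{lca}(v,w)$ over non-tree edges $(v,w)$ ) and the bridge test are assumptions, as are the function name and the restriction to connected simple graphs. BFS, a naive LCA, and a linear scan stand in for Theorem 5.3, the LCA subroutine, and the RMQ subroutine.

```python
from collections import deque


def bridges(n, edges):
    """Bridges of a connected simple graph on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Step 1: rooted spanning tree (BFS from vertex 0 as a stand-in).
    par, dep = [-1] * n, [0] * n
    children = [[] for _ in range(n)]
    par[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if par[v] == -1:
                par[v], dep[v] = u, dep[u] + 1
                children[u].append(v)
                queue.append(v)

    def lca(u, v):  # naive walk-up; the paper uses the Section 4.3 routine
        while u != v:
            if dep[u] < dep[v]:
                u, v = v, u
            u = par[u]
        return u

    # Step 2: lev(v) from one LCA query per non-tree edge.
    lev = dep[:]
    for u, v in edges:
        if par[u] != v and par[v] != u:
            d = dep[lca(u, v)]
            lev[u], lev[v] = min(lev[u], d), min(lev[v], d)

    # Step 3: DFS sequence (v, subtree of child 1, v, subtree of child 2, v, ...).
    seq, stack = [], [(0, 0)]
    while stack:
        u, i = stack.pop()
        seq.append(u)
        if i < len(children[u]):
            stack.append((u, i + 1))
            stack.append((children[u][i], 0))

    # Step 4: a range minimum between first and last appearance of v.
    first, last = {}, {}
    for i, v in enumerate(seq):
        first.setdefault(v, i)
        last[v] = i
    result = []
    for v in range(1, n):
        low = min(lev[x] for x in seq[first[v]:last[v] + 1])
        if low >= dep[v]:  # no non-tree edge leaves the subtree of v
            result.append((par[v], v))
    return result


# Two triangles joined by the single edge (2, 3).
print(bridges(6, [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]))
```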

Since the correctness of Bridges $(G)$ (Section 3.1) is guaranteed by Lemma 3.1, we can conclude Theorem 1.2.

Biconnectivity. The first three steps of Biconn $(G)$ (Section 3.2) are the same as the first three steps of Bridges $(G)$ (Section 3.1). Thus, the success probability of the first three steps is at least $0.97$ . The total space used is at most $O(m^{1+\gamma})$ and the running time is at most $O(\log\operatorname{diam}(G)\cdot\log\log_{m^{1+\gamma}/n}n)$ . Step 5 of Biconn $(G)$ corresponds to the RMQ problem which is almost the same as the step 4 of Bridges $(G)$ . Thus, it takes $O(n)$ total space and $O(1)$ parallel time. Step 6 requires $m$ LCA queries. We can run our LCA algorithm (Section 4.3) for this step. It takes $O(m+n)$ space and $O(\log(\operatorname{dep}(\operatorname{par})))=O(\log\operatorname{diam}(G)\cdot\log\log_{m^{1+\gamma}/n}n)$ time (Section 4.5). By Lemma 3.3, we have $\operatorname{diam}(G^{\prime})\leq\operatorname{diam}(G)^{O(\log\log_{m^{1+\gamma}/n}n)}\cdot\operatorname{bi-diam}(G)$ . According to Theorem 5.3, with probability at least $0.98$ , the connected components of $G^{\prime}$ can be computed in step 7, the total space needed is $O(m^{1+\gamma})$ , and the running time is $O(\log\operatorname{diam}(G)\log^{2}\log_{m^{1+\gamma}/n}n+\log\operatorname{bi-diam}(G)\log\log_{m^{1+\gamma}/n}n)$ . To conclude, the total space needed is at most $O(m^{1+\gamma})$ , and the parallel running time is $O(\log\operatorname{diam}(G)\log^{2}\log_{m^{1+\gamma}/n}n+\log\operatorname{bi-diam}(G)\log\log_{m^{1+\gamma}/n}n)$ .
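The running time of step 7 follows by substituting the diameter bound for $G^{\prime}$ into Theorem 5.3. The calculation can be sketched as follows, writing $L=\log\log_{m^{1+\gamma}/n}n$ (a shorthand introduced only for this sketch) and assuming that the logarithmic factor of Theorem 5.3 applied to $G^{\prime}$ is also $O(L)$ :

```latex
\begin{align*}
\log\operatorname{diam}(G^{\prime})
  &\leq \log\left(\operatorname{diam}(G)^{O(L)}\cdot\operatorname{bi-diam}(G)\right)
   = O(L)\cdot\log\operatorname{diam}(G)+\log\operatorname{bi-diam}(G),\\
\log\operatorname{diam}(G^{\prime})\cdot L
  &= O\left(\log\operatorname{diam}(G)\cdot L^{2}+\log\operatorname{bi-diam}(G)\cdot L\right).
\end{align*}
```

The second line is the bound stated for step 7, and it dominates the $O(\log\operatorname{diam}(G)\cdot L)$ cost of the earlier steps.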

Since the correctness of Biconn $(G)$ (Section 3.2) is guaranteed by Lemma 3.3, we can conclude Theorem 1.1.

6 Hardness of Biconnectivity in MPC

There is a conjectured hardness result which is widely used in the MPC literature [26, 11, 29, 35, 41].

Conjecture 6.1 (One cycle vs. two cycles).

For any $\gamma\geq 0$ and any constant $\delta\in(0,1)$ , distinguishing the following two graph instances in the $(\gamma,\delta)$ -MPC model requires $\Omega(\log n)$ parallel time:

1. a single cycle containing $n$ vertices,

2. two disjoint cycles, each containing $n/2$ vertices.

Under the above conjecture, we show that $\Omega(\log\operatorname{bi-diam}(G))$ parallel time is necessary to compute the biconnected components of $G$ . This claim is true even for the constant diameter graph $G$ , i.e., $\operatorname{diam}(G)=O(1)$ .

Theorem 6.2 (Hardness of biconnectivity in MPC).

For any $\gamma\geq 0$ and any constant $\delta\in(0,1)$ , unless the one cycle vs. two cycles conjecture (Conjecture 6.1) is false, any $(\gamma,\delta)$ -MPC algorithm requires $\Omega(\log\operatorname{bi-diam}(G))$ parallel time for testing whether a graph $G$ with a constant diameter is biconnected.

Proof 6.3.

For $\gamma\geq 0$ and an arbitrary constant $\delta\in(0,1)$ , suppose there is a $(\gamma,\delta)$ -MPC algorithm $\mathcal{A}$ which can determine whether an arbitrary constant diameter graph $G$ is biconnected in $o(\log\operatorname{bi-diam}(G))$ parallel time. Then we give a $(\gamma,\delta)$ -MPC algorithm for solving the one cycle vs. two cycles problem as follows:

1. For a one cycle vs. two cycles instance, an $n$ -vertex graph $G^{\prime}=(V^{\prime},E^{\prime})$ , construct a new graph $G=(V,E)$ : $V=V^{\prime}\cup\{v^{*}\},E=E^{\prime}\cup\{(v,v^{*})\mid v\in V^{\prime}\}$ .

2. Run $\mathcal{A}$ on $G$ . If $G$ is not biconnected, $G^{\prime}$ contains two cycles. Otherwise $G^{\prime}$ is a single cycle.

It is easy to see that the diameter of $G$ is $2$ . If $G^{\prime}$ is a single cycle, then $G$ is biconnected and $\operatorname{bi-diam}(G)=\Theta(n)$ . If $G^{\prime}$ contains two cycles, then $G$ contains two biconnected components and $\operatorname{bi-diam}(G)=\Theta(n)$ .

The first step of the above algorithm takes $O(1)$ parallel time and only requires linear total space. The graph $G$ has $n+1$ vertices and $2n$ edges. Thus, the above algorithm is also a $(\gamma,\delta)$ -MPC algorithm. The parallel time of the above algorithm is the same as the time needed for running $\mathcal{A}$ on $G$ which is $o(\log\operatorname{bi-diam}(G))=o(\log n)$ . Thus the existence of the algorithm $\mathcal{A}$ implies that the one cycle vs. two cycles conjecture (Conjecture 6.1) is false.
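The reduction can be checked on small instances with the sketch below. A brute-force sequential biconnectivity test stands in for the hypothetical MPC algorithm $\mathcal{A}$ , and all function names are illustrative only.

```python
def add_apex(n, edges):
    """Build G from G': add a vertex v* (index n) adjacent to every vertex."""
    return n + 1, list(edges) + [(v, n) for v in range(n)]


def connected(vertices, edges):
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: [] for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].append(v)
            adj[v].append(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def is_biconnected(n, edges):
    """Brute force: connected, and still connected after deleting any vertex."""
    return connected(range(n), edges) and all(
        connected([v for v in range(n) if v != x], edges) for x in range(n))


def cycle(vertices):
    k = len(vertices)
    return [(vertices[i], vertices[(i + 1) % k]) for i in range(k)]


def is_single_cycle(n, edges):
    """Decide one cycle vs. two cycles through the biconnectivity test."""
    return is_biconnected(*add_apex(n, edges))


one = cycle(list(range(8)))
two = cycle([0, 1, 2, 3]) + cycle([4, 5, 6, 7])
print(is_single_cycle(8, one), is_single_cycle(8, two))
```

With a single cycle, $G$ is a wheel and removing any one vertex leaves it connected. With two cycles, removing $v^{*}$ separates them, so $v^{*}$ is a cut vertex.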

Bibliography

[1] Kook Jin Ahn and Sudipto Guha. Access to data and number of iterations: Dual primal algorithms for maximum matching under resource constraints. ACM Transactions on Parallel Computing (TOPC), 4(4):17, 2018.
[2] Noga Alon, László Babai, and Alon Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 7(4):567–583, 1986.
[3] Alexandr Andoni, Aleksandar Nikolov, Krzysztof Onak, and Grigory Yaroslavtsev. Parallel algorithms for geometric graph problems. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 574–583. ACM, 2014.
[4] Alexandr Andoni, Zhao Song, Clifford Stein, Zhengyu Wang, and Peilin Zhong. Parallel graph connectivity in log diameter rounds. In FOCS. https://arxiv.org/pdf/1805.03055, 2018.
[5] Sepehr Assadi, Mohammad Hossein Bateni, Aaron Bernstein, Vahab Mirrokni, and Cliff Stein. Coresets meet EDCS: algorithms for matching and vertex cover on massive graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1616–1635. SIAM, 2019.
[6] Sepehr Assadi and Sanjeev Khanna. Randomized composable coresets for matching and vertex cover. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, pages 3–12. ACM, 2017.
[7] Sepehr Assadi, Xiaorui Sun, and Omri Weinstein. Massively parallel algorithms for finding well-connected components in sparse graphs. In arXiv preprint. https://arxiv.org/pdf/1805.02974, 2018.
[8] Giorgio Ausiello, Donatella Firmani, Luigi Laura, and Emanuele Paracone. Large-scale graph biconnectivity in MapReduce. Department of Computer and System Sciences Antonio Ruberti Technical Reports, 4(4), 2012.
