Quantum Distributed Algorithm for the All-Pairs Shortest Path Problem in the CONGEST-CLIQUE Model

Taisuke Izumi; François Le Gall

arXiv:1906.02456·quant-ph·October 5, 2021

TL;DR

This paper introduces a quantum distributed algorithm that significantly speeds up solving the All-Pairs Shortest Path problem in the CONGEST-CLIQUE model, surpassing classical limitations by leveraging quantum communication and search techniques.

Contribution

It presents the first quantum distributed algorithm for APSP in the CONGEST-CLIQUE model, breaking the classical $\tilde{O}(n^{1/3})$ round barrier to $\tilde{O}(n^{1/4})$ rounds.

Findings

01

Quantum algorithm achieves faster APSP computation in CONGEST-CLIQUE

02

Quantum communication offers advantages over classical in this model

03

Parallel quantum searches are efficiently implemented without congestion

Abstract

The All-Pairs Shortest Path problem (APSP) is one of the most central problems in distributed computation. In the CONGEST-CLIQUE model, in which $n$ nodes communicate with each other over a fully connected network by exchanging messages of $O(\log n)$ bits in synchronous rounds, the best known general algorithm for APSP uses $\tilde{O}(n^{1/3})$ rounds. Breaking this barrier is a fundamental challenge in distributed graph algorithms. In this paper we investigate for the first time quantum distributed algorithms in the CONGEST-CLIQUE model, where nodes can exchange messages of $O(\log n)$ quantum bits, and show that this barrier can be broken: we construct a $\tilde{O}(n^{1/4})$-round quantum distributed algorithm for the APSP over directed graphs with polynomial weights in the CONGEST-CLIQUE model. This speedup in the quantum setting contrasts with the case of the standard CONGEST model,…

Full text

Quantum Distributed Algorithm for the All-Pairs Shortest Path Problem in the CONGEST-CLIQUE Model

Taisuke Izumi

Graduate School of Engineering

Nagoya Institute of Technology

François Le Gall

Graduate School of Informatics

Kyoto University

Abstract

The All-Pairs Shortest Path problem (APSP) is one of the most central problems in distributed computation. In the CONGEST-CLIQUE model, in which $n$ nodes communicate with each other over a fully connected network by exchanging messages of $O(\log n)$ bits in synchronous rounds, the best known general algorithm for APSP uses $\tilde{O}(n^{1/3})$ rounds. Breaking this barrier is a fundamental challenge in distributed graph algorithms. In this paper we investigate for the first time quantum distributed algorithms in the CONGEST-CLIQUE model, where nodes can exchange messages of $O(\log n)$ quantum bits, and show that this barrier can be broken: we construct a $\tilde{O}(n^{1/4})$ -round quantum distributed algorithm for the APSP over directed graphs with polynomial weights in the CONGEST-CLIQUE model. This speedup in the quantum setting contrasts with the case of the standard CONGEST model, for which Elkin et al. (PODC 2014) showed that quantum communication does not offer significant advantages over classical communication.

Our quantum algorithm is based on a relationship discovered by Vassilevska Williams and Williams (JACM 2018) between the APSP and the detection of negative triangles in a graph. The quantum part of our algorithm exploits the framework for quantum distributed search recently developed by Le Gall and Magniez (PODC 2018). Our main technical contribution is a method showing how to implement multiple quantum searches (one for each edge in the graph) in parallel without introducing congestion.

1 Introduction

Background. The CONGEST-CLIQUE model is a model in distributed computing that has recently been the subject of intensive research [4, 8, 9, 18, 19, 20, 21, 29, 30, 31, 32, 35, 25, 34, 16]. In this model $n$ nodes communicate with each other over a fully connected network (i.e., a clique) by exchanging messages of $O(\log n)$ bits in synchronous rounds. Compared with the more traditional CONGEST model [36], the CONGEST-CLIQUE model removes the effect of distances between nodes in the computation and thus focuses solely on understanding the role of congestion in distributed computing.

The study of shortest path problems is one of the central topics in the context of distributed graph algorithms. In the CONGEST model, much progress has been made in the past years [29, 22, 32, 21, 15, 23, 12, 2, 1, 10, 4]: while for exact computation of the Single-Source Shortest Path problem (SSSP) there is still a small gap between the upper bounds and the lower bounds [37, 12], for the All-Pairs Shortest Path problem (APSP) an algorithm with optimal time complexity (up to possible polylogarithmic factors) has been constructed very recently [2]. In the CONGEST-CLIQUE model, the first non-trivial result attaining sublinear running time was an algorithm by Nanongkai [32], which solves the $(2+\epsilon)$-approximate APSP over undirected weighted graphs within $\tilde{O}(\sqrt{n})$ rounds. This was improved by Censor-Hillel et al. [4], who gave a $\tilde{O}(n^{1/3})$-round exact algorithm for the general APSP (i.e., the APSP over directed graphs with polynomial weights). While faster algorithms based on fast matrix multiplication have been designed for the APSP over graphs with small weights or for approximating the shortest paths [4, 26], the above $\tilde{O}(n^{1/3})$-round algorithm is still not only the best known exact algorithm for the general APSP, but also the best known exact algorithm for SSSP in the CONGEST-CLIQUE model.

Quantum distributed computing. The power of distributed network computation in the quantum CONGEST model was first investigated by Elkin et al. [11]. In this model the nodes can use quantum processing and communicate using quantum bits (qubits): each edge of the network corresponds to a quantum channel (e.g., an optical fiber if qubits are implemented using photons) of bandwidth $O(\log n)$ qubits. Their main conclusion was that for many fundamental problems in distributed computing, including the computation of the $s$-$t$ shortest path in weighted graphs, quantum communication does not offer significant advantages over classical communication. A significant development occurred recently: Le Gall and Magniez [27] constructed a quantum distributed algorithm in the CONGEST model computing the exact diameter within $\tilde{O}(\sqrt{nD})$ rounds, where $D$ denotes the diameter. Since Frischknecht et al. [13] have shown that any classical algorithm requires $\tilde{\Omega}(n)$ rounds, even in the case $D=O(1)$, this gives a speedup (up to quadratic when the diameter is small). At the core of this quantum algorithm lies a distributed implementation of Grover’s seminal quantum algorithm [17]. Grover’s algorithm achieves a quadratic speedup over brute-force search for generic search problems in the centralized setting. The algorithm from [27] carefully adapts Grover’s algorithm to the distributed CONGEST model and shows how to combine it with a classical distributed algorithm in a completely black-box way. Due to its versatility, this approach has the potential of accelerating many graph algorithms. A pressing open question is to understand for which problems in distributed computing it can actually help.

Our result. While it is tempting to consider potential quantum acceleration of computing shortest paths using the distributed version of Grover’s algorithm, there are several significant obstacles. The distributed quantum diameter algorithm from [27] crucially relies on reducing the computation of the diameter to the search problem of finding a node with the maximum eccentricity, and this strategy does not directly work for shortest path problems. Indeed, as already mentioned, it is known that for $s$ - $t$ shortest paths over weighted graphs, quantum communication cannot offer any significant speedup in the CONGEST model. Even over unweighted graphs, in the CONGEST model it is easy to extend the classical lower bound from [13] to show a $\tilde{\Omega}(n)$ -round lower bound for the APSP that holds even in the quantum setting.

In this paper we show that a speedup is possible in the CONGEST-CLIQUE model. Our main result is the following theorem.

Theorem 1.

There is a quantum algorithm that solves with high probability the All-Pairs Shortest Path problem over directed graphs with integer weights in $\{-W,\ldots,W\}$ using $\tilde{O}(n^{1/4}\log W)$ rounds in the CONGEST-CLIQUE model.

As already mentioned, the best known upper bound for the APSP in the classical CONGEST-CLIQUE model is due to Censor-Hillel et al. [4]: for graphs with integer weights in $\{-W,\ldots,W\}$ their upper bound is $\tilde{O}(n^{1/3}\log W)$ rounds. While no nontrivial lower bound is known on the classical complexity of APSP in the CONGEST-CLIQUE model, which is not surprising due to the technical challenges of proving any nontrivial lower bound in this model, the current $\tilde{O}(n^{1/3}\log W)$ bound appears as a significant barrier for classical algorithms. Our quantum algorithm breaks this barrier. This gives strong evidence for the superiority of quantum distributed computing over classical distributed computing in the CONGEST-CLIQUE model as well (unless the barrier can also be broken in the classical setting, which would in itself be a significant breakthrough). Another interesting observation is that this quantum speedup occurs for a problem (the APSP) for which no quantum speedup can be achieved in the standard CONGEST model, as already mentioned.

Technical overview. The first step of our approach consists in reducing the APSP problem to the problem of detecting negative triangles (triangles in which the sum of the weights of the three edges is negative). This reduction is inspired by the recent breakthrough by Vassilevska Williams and Williams [39] in centralized algorithms that revealed the relationship between the APSP and triangle detection, via the computation of the distance product of two matrices. More precisely, our approach reduces the APSP to the problem of identifying all the edges of the graph that are involved in (at least) one negative triangle, under the promise that each edge is involved in at most $O(\log n)$ negative triangles.

In order to solve the latter problem, we would like to design an algorithm running a quadratic number of instances of negative-triangle detection simultaneously, since in the worst case $\Theta(n^{2})$ edges involved in negative triangles need to be detected. Because the query sequence generated by a single run of the distributed version of Grover’s algorithm is a quantum superposition, a naive parallelization would result in high congestion of query messages, causing delays and degradation of the running time. To overcome this difficulty, we develop a novel machinery ensuring that all the parallel runs of the quantum searches are fairly load balanced, which resolves the congestion problem. This is done by carefully analyzing the error probability of multiple quantum searches and showing that (for the problem considered) ignoring the queries that are not load balanced does not significantly decrease the success probability.

Other related works. As already mentioned, triangle detection and matrix multiplication are closely related to the APSP problem. There are several results considering those problems in the CONGEST or CONGEST-CLIQUE models [8, 4, 26, 24, 33, 5, 6]. In the CONGEST-CLIQUE model, in particular, an $\tilde{O}(n^{1/3})$-round algorithm for listing all triangles was proposed by Dolev et al. [8]. This algorithm is combinatorial (i.e., non-algebraic) and thus works for listing negative triangles as well. Combined with our reduction from APSP to negative triangles, this can be used to construct a classical distributed APSP algorithm in the CONGEST-CLIQUE model with the same complexity $\tilde{O}(n^{1/3}\log W)$ as the algorithm by Censor-Hillel et al. [4]. While there exist faster algorithms for triangle detection [4, 9, 26], all these faster algorithms are based on an algebraic approach (more precisely, a reduction to matrix multiplication over a ring), and cannot be used to find negative triangles (which corresponds to matrix multiplication over a semiring).

To our knowledge the present work is the first to consider the quantum CONGEST-CLIQUE model. We already mentioned the prior works [11, 27] on the quantum CONGEST model. Besides the vast literature on two-party quantum communication complexity (see, e.g., [40, 3, 7]), there exist a few works that considered other settings in quantum distributed computing. First, exact quantum protocols for leader election in anonymous networks have been developed by Tani et al. [38]. Gavoille et al. [14] then considered quantum distributed computing in the LOCAL model, and showed that for several fundamental problems, allowing quantum communication does not lead to any significant advantage. Very recently Le Gall et al. [28] showed that there nevertheless exist some computational problems for which quantum distributed computing can be much more powerful than classical distributed computing in the LOCAL model.

2 Preliminaries

General notations.

Given any positive integer $p$, we use the notation $[p]$ to represent the set $\{1,2,\ldots,p\}$. Given a graph $G=(V,E)$ and any two sets $U,U^{\prime}\subseteq V$, we write $\mathcal{P}(U,U^{\prime})$ for the set of pairs of vertices $\{u,v\}$ with $u\in U$, $v\in U^{\prime}$ and $u\neq v$. When $U^{\prime}=U$ we simply write $\mathcal{P}(U)=\mathcal{P}(U,U)$. Finally, for any vertex $v\in V$ we write $\mathcal{N}_{G}(v)$ for the set of neighbors of $v$.

Quantum CONGEST-CLIQUE model.

Recent definitions of the quantum CONGEST model [27] and the quantum LOCAL model [28] are obtained by starting with the corresponding classical model (classical CONGEST model and LOCAL model, respectively) and simply allowing nodes to send quantum information instead of classical information. In this paper we use the same approach to define a natural quantum version of the CONGEST-CLIQUE model.

In the classical CONGEST-CLIQUE model, $n$ nodes communicate with each other over a fully connected network by exchanging messages of $O(\log n)$ bits in synchronous rounds. All links and nodes (corresponding to the edges and vertices of $G$ , respectively) are reliable and suffer no faults. Each node has a distinct identifier. In the quantum CONGEST-CLIQUE model the only difference is that the nodes can exchange quantum information: each message exchanged consists of $O(\log n)$ quantum bits instead of $O(\log n)$ bits in the classical case. In particular, initially the nodes of the network do not share any entanglement.

This paper will describe many classical algorithms and procedures that will be used for pre-processing and post-processing (or even used inside the main quantum part as a subprocedure). We will use many times (sometimes implicitly) the following Lemma by Dolev et al. [8].

Lemma 1.

[8] In the CONGEST-CLIQUE model a set of messages in which no node is the source of more than $n$ messages and no node is the destination of more than $n$ messages can be delivered within two rounds if the source and destination of each message is known in advance to all nodes.

Graph-theoretic problems in the CONGEST-CLIQUE model.

When studying graph-theoretic problems such as the APSP problem in the classical or quantum CONGEST-CLIQUE model, the input is a graph $G=(V,E)$ consisting of $n$ nodes, i.e., the number of nodes of the graph is the same as the number of nodes of the communication network. This means that we can assign to each node of the network a distinct label $u\in V$ . The input is given as follows: each node with label $u$ of the network receives the row of the adjacency matrix of $G$ corresponding to vertex $u$ of $G$ . The result of the computation is defined similarly: for the APSP the node with label $u$ should output the shortest distance from $u$ to all the other nodes in $G$ . We refer to [4] for details.

3 APSP and negative triangles

In this section we show how to reduce the APSP to finding all the edges involved in a negative triangle. We first define the latter problem and state the main technical result of this paper (Theorem 2). Then we show that the computation of the distance product of two matrices reduces to this problem. Finally, we recall the standard reduction from APSP to the computation of the distance product and derive Theorem 1 from Theorem 2.

Finding the edges in negative triangles.

Consider an undirected weighted graph $G=(V,E,f)$ with weight function $f\colon E\to\mathbb{Z}$ . For an edge $\{u,v\}\in E$ , we use the notation $f(u,v)$ instead of $f(\{u,v\})$ .

Definition 1.

Given three vertices $u,v,w\in V$ , we say that the triple $\{u,v,w\}$ is a negative triangle in $G$ if $\{u,v\}$ , $\{u,w\}$ and $\{v,w\}$ are edges and the inequality $f(u,v)+f(u,w)+f(v,w)<0$ holds.

For any pair $\{u,v\}\in\mathcal{P}(V)$ , we use the notation $\Gamma_{G}(u,v)$ to denote the number of negative triangles involving $\{u,v\}$ , i.e., $\Gamma_{G}(u,v)=\left|\left\{w\in V\>|\>\{u,v,w\}\textrm{ is a negative triangle in }G\right\}\right|.$ We simply write $\Gamma(u,v)$ when the graph $G$ is clear from the context.
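As an illustration of Definition 1 and of the quantity $\Gamma_{G}(u,v)$, here is a minimal centralized sketch in Python. It is a brute-force check, not part of the distributed algorithm; the dictionary-based graph encoding, the function names and the toy weights are assumptions made for this example only.

```python
from itertools import combinations

def is_negative_triangle(f, u, v, w):
    """Return True if {u, v, w} is a negative triangle.

    `f` maps frozenset({a, b}) to the integer weight of edge {a, b};
    a missing key means the edge is absent (illustrative encoding).
    """
    edges = [frozenset(p) for p in combinations((u, v, w), 2)]
    if not all(e in f for e in edges):
        return False
    return sum(f[e] for e in edges) < 0

def gamma(f, vertices, u, v):
    """Gamma_G(u, v): number of vertices w such that {u, v, w} is a negative triangle."""
    return sum(
        1 for w in vertices
        if w != u and w != v and is_negative_triangle(f, u, v, w)
    )

# Toy graph on 4 vertices; the weights are arbitrary illustration values.
V = [0, 1, 2, 3]
f = {
    frozenset({0, 1}): -5,
    frozenset({0, 2}): 1,
    frozenset({1, 2}): 2,
    frozenset({0, 3}): 4,
    frozenset({1, 3}): 3,
}
print(gamma(f, V, 0, 1))  # {0, 1, 2} has total weight -2, {0, 1, 3} has total weight 2
```

In this toy graph the pair $\{0,1\}$ lies in exactly one negative triangle, so a FindEdges solver would report it at nodes 0 and 1.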

We now define the main problem considered in this paper. This problem, which we denote FindEdges, asks to compute the list of all edges involved in a negative triangle. The formal definition of the problem is as follows.

FindEdges

Input: an undirected weighted graph $G=(V,E,f)$ distributed among the $n$ nodes

of the network (each node $u$ gets $\mathcal{N}_{G}(u)$ )

Output: each node $u$ outputs the list of all pairs $\{u,v\}\in\mathcal{P}(V)$ such that $\Gamma(u,v)>0$

Let us now consider the version of this problem in which we have the promise $\Gamma(u,v)=O(\log n)$ for all pairs $\{u,v\}$ . It will actually be convenient to define a more general problem where there is an additional input $S\subseteq\mathcal{P}(V)$ , the promise only holds for the pairs in $S$ and we only require each node to output the edges in $S$ that are involved in a negative triangle. The definition of this version with promise, which we call FindEdgesWithPromise, follows.

FindEdgesWithPromise

Input: an undirected weighted graph $G=(V,E,f)$ and a set $S\subseteq\mathcal{P}(V)$ distributed

among the $n$ nodes of the network

(each node $u$ gets $\mathcal{N}_{G}(u)$ and the list of all pairs in $S$ containing $u$ )

Promise: $\Gamma(u,v)\leq 90\log n$ for all pairs $\{u,v\}\in S$

Output: each node $u$ outputs the list of all pairs $\{u,v\}\in S$ such that $\Gamma(u,v)>0$

It is not difficult to show a randomized reduction from solving FindEdges to solving $O(\log n)$ instances of FindEdgesWithPromise. We state this reduction in the following proposition.

Proposition 1.

Assume there exists a $T(n)$ -round algorithm that solves FindEdgesWithPromise with probability at least $1-\varepsilon$ for some $\varepsilon>0$ . Then there exists a $O(T(n)\log n)$ -round algorithm that solves the problem FindEdges with probability at least $1-O((\varepsilon+1/n^{3})\log n)$ .

Proof.

Let $\mathcal{A}$ denote the $T(n)$ -round algorithm for FindEdgesWithPromise. We construct an algorithm for FindEdges as follows.

1. $S\leftarrow\mathcal{P}(V)$; $M\leftarrow\emptyset$; $i\leftarrow 0$.

2. While $60\cdot 2^{i}\log n\leq n$ do:

2.1. Sample each edge of $G$ with probability $\sqrt{\frac{60\cdot 2^{i}\log n}{n}}$. Let $G^{\prime}$ be the subgraph of $G$ consisting only of the sampled edges.

2.2. Apply the algorithm $\mathcal{A}$ on input $(G^{\prime},S)$. Let $S^{\prime}$ be the output of the algorithm.

2.3. $S\leftarrow S\setminus S^{\prime}$; $M\leftarrow M\cup S^{\prime}$; $i\leftarrow i+1$.

3. Apply the algorithm $\mathcal{A}$ on input $(G,S)$. Let $S^{\prime\prime}$ be the output of the algorithm.

4. Output $M\cup S^{\prime\prime}$.

Let us call Algorithm $\mathcal{B}$ the algorithm we just described. Its round complexity is $O(T(n)\log n)$ .

Let us first analyze Algorithm $\mathcal{B}$ under the assumption that Algorithm $\mathcal{A}$ never makes any error. We will prove below by induction the following invariant for the while loop: when testing the exit condition “ $60\cdot 2^{i}\log n\leq n$ ” at Step 2 of the while loop for some value $i$ , we have $\Gamma_{G}(u,v)\leq n/2^{i}$ for all $\{u,v\}\in S$ , and all the pairs $\{u,v\}\in\mathcal{P}(V)$ such that $\Gamma_{G}(u,v)>n/2^{i}$ are already contained in $M$ . This shows that at the end of the while loop we have $\Gamma_{G}(u,v)\leq n/2^{c}$ for all $\{u,v\}\in S$ , and all the pairs $\{u,v\}$ such that $\Gamma_{G}(u,v)>n/2^{c}$ are contained in $M$ , where $c$ is the smallest integer such that $60\cdot 2^{c}\log n>n$ . Since $n/2^{c}<90\log n$ , the call to Algorithm $\mathcal{A}$ at Step 3 then finds all the remaining pairs involved in negative triangles and the output at Step 4 is precisely the output of FindEdges.

The loop invariant is obviously satisfied for $i=0$. Now assume that it is satisfied when $i=k$ for some $k\geq 0$ and let us consider what is happening at Step 2.2. Consider any pair $\{u,v\}\in S$. From the induction hypothesis we have $\Gamma_{G}(u,v)\leq n/2^{i}$. Note that $\mathbb{E}[\Gamma_{G^{\prime}}(u,v)]=\Gamma_{G}(u,v)\times\frac{60\cdot 2^{i}\log n}{n}\leq 60\log n$. Chernoff’s bound then implies

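$$\Pr\left[\Gamma_{G^{\prime}}(u,v)>90\log n\right]\leq\exp\left(-\frac{60\log n}{12}\right)<\frac{1}{n^{5}},$$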

which means that with probability at least $1-1/n^{3}$ the promise required to execute Algorithm $\mathcal{A}$ is satisfied for all $\{u,v\}\in S$ . Let us now consider a pair $\{u,v\}\in S$ such that the inequality $\Gamma_{G}(u,v)>n/2^{i+1}$ holds. We have

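$$\Pr\left[\Gamma_{G^{\prime}}(u,v)=0\right]=\left(1-\frac{60\cdot 2^{i}\log n}{n}\right)^{\Gamma_{G}(u,v)}<\exp\left(-\Gamma_{G}(u,v)\times\frac{60\cdot 2^{i}\log n}{n}\right)<\frac{1}{n^{30}}$$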

and thus the pair $\{u,v\}$ is included in the output $S^{\prime}$ of Algorithm $\mathcal{A}$ , and thus removed from $S$ (and added to $M$ ) at Step 2.3, with high probability. This proves that the loop invariant is satisfied for $i=k+1$ as well with probability at least $1-1/n^{3}-1/n^{28}$ .

We have thus shown that under the assumption that Algorithm $\mathcal{A}$ never makes any error, our algorithm solves FindEdges with probability at least $1-c/n^{3}-c/n^{28}$ . Since the error probability of Algorithm $\mathcal{A}$ is at most $\varepsilon$ and $\mathcal{A}$ is applied $c+1$ times, the union bound implies that our algorithm solves the problem FindEdges with probability at least $1-c/n^{3}-c/n^{28}-(c+1)\varepsilon=1-O((\varepsilon+1/n^{3})\log n)$ . ∎
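The steps of Algorithm $\mathcal{B}$ can be sketched as a centralized Python simulation. This is only an illustration under stated assumptions: the oracle `find_edges_with_promise` is a brute-force classical stand-in for Algorithm $\mathcal{A}$ (it never errs and does not need the promise), the logarithm is taken in base 2, and the constant 60 is exposed as a hypothetical parameter `c` so that the loop can be exercised on small graphs.

```python
import math
import random
from itertools import combinations

def find_edges_with_promise(vertices, f, S):
    """Brute-force stand-in for Algorithm A: pairs of S lying in a negative triangle of f.

    Assumption: this classical oracle is exact, so the promise is not needed here.
    """
    found = set()
    for pair in S:
        u, v = tuple(pair)
        for w in vertices:
            if w in pair:
                continue
            tri = [frozenset({u, v}), frozenset({u, w}), frozenset({v, w})]
            if all(e in f for e in tri) and sum(f[e] for e in tri) < 0:
                found.add(pair)
                break
    return found

def find_edges(vertices, f, c=60, rng=random):
    """Sketch of Algorithm B; `c` and `rng` are illustrative parameters."""
    n = len(vertices)
    if n < 2:
        return set()
    log_n = math.log2(n)  # assumption: base-2 logarithm
    S = {frozenset(p) for p in combinations(vertices, 2)}      # Step 1
    M = set()
    i = 0
    while c * 2**i * log_n <= n:                                # Step 2
        p = math.sqrt(c * 2**i * log_n / n)                     # p <= 1 by the loop test
        sampled = {e: wt for e, wt in f.items() if rng.random() < p}   # Step 2.1
        S_prime = find_edges_with_promise(vertices, sampled, S)        # Step 2.2
        S -= S_prime                                                    # Step 2.3
        M |= S_prime
        i += 1
    S_final = find_edges_with_promise(vertices, f, S)           # Step 3
    return M | S_final                                          # Step 4

# Small demo: complete graph on 8 vertices with arbitrary weights in {-3, ..., 3}.
rng = random.Random(0)
V = list(range(8))
f = {frozenset(p): rng.randint(-3, 3) for p in combinations(V, 2)}
result = find_edges(V, f, c=0.5, rng=rng)  # c=0.5 only so that the loop runs for n=8
```

With an exact oracle the output does not depend on the random samples: every pair found on a sampled subgraph is also in a negative triangle of $G$, and Step 3 catches the rest. The sampling only matters for keeping the promise satisfied when $\mathcal{A}$ genuinely requires it.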

The main technical contribution of this paper is the following theorem, which is proved in Section 5.

Theorem 2.

There is a $\tilde{O}(n^{1/4})$ -round quantum algorithm that solves with probability $1-O(1/n)$ the problem FindEdgesWithPromise in the CONGEST-CLIQUE model.

From distance products to negative triangles.

We first recall the definition of the distance product of two matrices.

Definition 2.

Let $A$ and $B$ be two $n\times n$ matrices with entries in $\mathbb{Z}\cup\{-\infty,\infty\}$. The distance product of $A$ and $B$, denoted $A\star B$, is the $n\times n$ matrix $C$ such that $C[i,j]=\min_{k\in[n]}\{A[i,k]+B[k,j]\}$ for all $(i,j)\in[n]\times[n]$.
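For concreteness, the distance product of Definition 2 can be computed centrally as follows. This is a minimal Python sketch; it assumes entries are integers or $+\infty$ (mixing $+\infty$ and $-\infty$ in one sum is not handled), and the function name is chosen for illustration.

```python
INF = float("inf")

def distance_product(A, B):
    """Distance (min-plus) product: C[i][j] = min over k of A[i][k] + B[k][j].

    Assumption: entries are integers or +inf; -inf entries are not supported here.
    """
    n = len(A)
    return [
        [min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

A = [[0, 2], [INF, 0]]
B = [[0, 5], [1, 0]]
print(distance_product(A, B))  # [[0, 2], [1, 0]]
```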

Vassilevska Williams and Williams [39] proved a reduction from the computation of the distance product of two $n\times n$ matrices $A$ and $B$ to computing the edges involved in negative triangles in a graph. We state this reduction in the following proposition.

Proposition 2.

[39] Assume that there exists a $T(n)$-round algorithm for FindEdges. Then there exists a $O(T(n)\log M)$-round algorithm that computes the distance product of any two $n\times n$ matrices with entries in $\{-M,\ldots,M\}\cup\{-\infty,\infty\}$.

Sketch of the proof.

Let $D$ be an arbitrary symmetric $n\times n$ matrix with integer entries. Consider the undirected tripartite graph $G=(\mathcal{I}\cup\mathcal{J}\cup\mathcal{K},E,f)$ with $|\mathcal{I}|=|\mathcal{J}|=|\mathcal{K}|=n$ and weight function $f(i,k)=A[i,k]$ for all $(i,k)\in\mathcal{I}\times\mathcal{K}$, $f(j,k)=B[k,j]$ for all $(j,k)\in\mathcal{J}\times\mathcal{K}$ and $f(i,j)=-D[i,j]$ for all $(i,j)\in\mathcal{I}\times\mathcal{J}$. Observe that a triple $\{i,j,k\}$ with $i\in\mathcal{I}$, $j\in\mathcal{J}$ and $k\in\mathcal{K}$ is a negative triangle if and only if

$$A[i,k]+B[k,j]<D[i,j],$$

which implies that the pair $\{i,j\}$ is involved in a negative triangle of $G$ if and only if

$$\min_{k\in[n]}\{A[i,k]+B[k,j]\}<D[i,j].$$

Thus by finding all the pairs $\{i,j\}$ involved in a negative triangle, we learn for which pairs $\{i,j\}$ the above inequality holds. By starting with the all-zero matrix $D$ and doing binary search (adjusting each time each entry of the matrix $D$ ), the distance product can thus be computed by calling $O(\log M)$ times an algorithm for FindEdges. More details can be found in [39]. ∎
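The binary search in this proof sketch can be illustrated with a centralized Python simulation. The assumptions are loud ones: entries are restricted to finite integers in $\{-M,\ldots,M\}$, the function `find_edges_tripartite` is a brute-force stand-in for a FindEdges algorithm on the tripartite graph (it tests the triangle inequality directly on indices), and all names are hypothetical.

```python
def find_edges_tripartite(A, B, D):
    """Brute-force stand-in for FindEdges on the tripartite graph.

    With f(i,k) = A[i][k], f(j,k) = B[k][j] and f(i,j) = -D[i][j], the triple
    {i, j, k} is a negative triangle iff A[i][k] + B[k][j] - D[i][j] < 0.
    Returns the pairs (i, j) involved in at least one negative triangle.
    """
    n = len(A)
    return {
        (i, j)
        for i in range(n) for j in range(n)
        if any(A[i][k] + B[k][j] - D[i][j] < 0 for k in range(n))
    }

def distance_product_via_find_edges(A, B, M):
    """Recover the distance product C with O(log M) oracle calls.

    Assumption: every entry of A and B is a finite integer in {-M, ..., M},
    so every entry of C lies in {-2M, ..., 2M}.
    """
    n = len(A)
    # Invariant for every entry: lo[i][j] <= C[i][j] < hi[i][j].
    lo = [[-2 * M] * n for _ in range(n)]
    hi = [[2 * M + 1] * n for _ in range(n)]
    calls = 0
    while any(hi[i][j] - lo[i][j] > 1 for i in range(n) for j in range(n)):
        # One threshold per entry; the first round uses the all-zero matrix.
        D = [[(lo[i][j] + hi[i][j]) // 2 for j in range(n)] for i in range(n)]
        below = find_edges_tripartite(A, B, D)  # pairs with C[i][j] < D[i][j]
        calls += 1
        for i in range(n):
            for j in range(n):
                if (i, j) in below:
                    hi[i][j] = D[i][j]
                else:
                    lo[i][j] = D[i][j]
    return lo, calls

A = [[0, 2], [3, 0]]
B = [[0, 5], [1, 0]]
C, calls = distance_product_via_find_edges(A, B, M=5)
print(C)  # [[0, 2], [1, 0]]
```

Each oracle call adjusts every entry of $D$ at once, which is why the number of calls depends only on $M$ and not on $n$.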

From APSP to distance products and proof of Theorem 1.

We now recall how the APSP reduces to the computation of the distance product. This is a standard reduction: we refer to, e.g., [41] for a reference in the centralized setting and to [4] for a discussion of the reduction in the CONGEST-CLIQUE model. (Footnote 1: Our explanations focus on computing the lengths of the shortest paths. Using standard techniques (see for instance [4]), the approach can be adapted to return the shortest paths as well, at a cost of increasing the complexity only by a polylogarithmic factor.)

Let $G=(V,E,w)$ be a weighted directed graph on $n$ vertices with no self-loop. Assume that the graph has no negative cycle. Let us associate $V$ with the set $[n]$ . The graph can be encoded as an $n\times n$ matrix $A_{G}$ in which

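$$A_{G}[i,j]=\begin{cases}0&\text{if }i=j,\\ w(i,j)&\text{if }i\neq j\text{ and }(i,j)\in E,\\ \infty&\text{if }i\neq j\text{ and }(i,j)\notin E,\end{cases}$$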

for each $(i,j)\in[n]\times[n]$ . It is easy to check that the matrix $A_{G}^{n}$ , the $n$ -th power of the matrix $A_{G}$ with respect to the distance product, contains the distances between all pairs of vertices of $G$ . Moreover, this matrix can be computed using only $O(\log n)$ matrix products. If the weights of the graph are integers in $\{-W,\ldots,W\}$ , then all the finite entries of the matrices arising during the computation of $A_{G}^{n}$ are between $-nW$ and $nW$ . We summarize this result in the following proposition.

Proposition 3.

Assume that there exists a $T(n,M)$ -round algorithm that computes the distance product of any two $n\times n$ matrices with entries in $\{-M,\ldots,M\}\cup\{-\infty,\infty\}$ . Then there exists a $O(T(n,nW)\log n)$ -round algorithm for the APSP with integer weights in $\{-W,\ldots,W\}$ .
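The reduction behind Proposition 3 can be sketched centrally in Python: build $A_{G}$ and square it with respect to the distance product until the exponent reaches $n$. This is an illustration only, assuming a graph without negative cycles given as a dictionary of directed edge weights; the helper names and the toy graph are hypothetical.

```python
INF = float("inf")

def distance_product(A, B):
    """Min-plus product of two square matrices (entries: integers or +inf)."""
    n = len(A)
    return [
        [min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def adjacency_matrix(n, weights):
    """Build A_G from a dict mapping directed edges (i, j) to integer weights."""
    return [
        [0 if i == j else weights.get((i, j), INF) for j in range(n)]
        for i in range(n)
    ]

def apsp(n, weights):
    """All-pairs distances using O(log n) distance-product squarings.

    Assumption: the graph has no negative cycle, so powers beyond the
    (n-1)-th leave the matrix unchanged and overshooting is harmless.
    """
    P = adjacency_matrix(n, weights)
    power = 1
    while power < n:
        P = distance_product(P, P)
        power *= 2
    return P

# Toy directed graph on 4 vertices with one negative edge and no negative cycle.
weights = {(0, 1): 3, (1, 2): -2, (2, 3): 4, (0, 3): 10, (3, 0): 1}
dist = apsp(4, weights)
print(dist[0])  # [0, 3, 1, 5]
```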

Theorem 1 then follows from the reductions described in Propositions 1, 2, 3 and from Theorem 2. The success probability of the final quantum algorithm is $1-\tilde{O}((\log W)/n)$ , i.e., with probability $1-\tilde{O}((\log W)/n)$ all the nodes of the network output the correct answer.

4 Distributed multiple quantum searches

In this section we describe our quantum technique: distributed multiple quantum searches only using typical inputs.

4.1 Distributed quantum search

Here, we explain the basic framework for quantum distributed search developed in [27].

Description of the result. Let $X$ be a finite set and $g\colon X\to\{0,1\}$ be a Boolean function over $X$ . Let $u$ be an arbitrary node of the network (e.g., an elected leader). Assume that node $u$ can evaluate the function $g$ in $r$ rounds: assume that there exists an $r$ -round classical distributed algorithm $\mathcal{C}$ such that node $u$ , when receiving as input $x\in X$ , outputs $g(x)$ . Now consider the following problem: node $u$ should find one element $x\in X$ such that $g(x)=1$ (or decide that no such element exists). The trivial strategy is to compute $g(x)$ for each $x\in X$ one by one, which requires $r|X|$ rounds. Le Gall and Magniez [27] showed that there exists a quantum distributed algorithm that solves this problem with high probability in $\tilde{O}(r\sqrt{|X|})$ rounds. While this result is described in [27] for the CONGEST model, it holds for the CONGEST-CLIQUE model as well.

Example. Let us show how the quantum distributed algorithm for the diameter from [27] can be described in this setting. Let $X=V$ be the vertex set of the graph considered. Fix an integer $d$ and define the function $g\colon V\to\{0,1\}$ as follows: for any vertex $v\in V$ we have $g(v)=1$ if and only if the eccentricity of vertex $v$ is larger than $d$. Solving the problem described in the previous paragraph enables us to decide whether the maximum eccentricity of a vertex (i.e., the diameter of the graph) is larger than $d$, and repeating this process a logarithmic number of times for different values of $d$ (chosen via binary search) enables us to compute the diameter (see Footnote 2).

Footnote 2: Ref. [27] then observed that in the CONGEST model the function $g$ can be evaluated by computing the eccentricity of the vertex $v$ and sending the information to the node $u$, which can be done in $O(D)$ rounds, where $D$ denotes the diameter of the graph. Thus the diameter can be computed in $\tilde{O}(\sqrt{n}D)$ rounds. With a few additional improvements it is possible to obtain the better bound $\tilde{O}(\sqrt{nD})$, see [27].

Technical details. The quantum distributed algorithm for search is obtained by implementing Grover’s well-known quantum search algorithm [17] in the distributed setting. We now explain how this works.

Let us define the two sets $A^{0}=\{x\in X\>|\>g(x)=0\}$ and $A^{1}=X\setminus A^{0}=\{x\in X\>|\>g(x)=1\}$ and assume that $|A^{1}|>0$. As usual when analyzing Grover’s algorithm, we make the convenient assumption $|A^{1}|<|X|/2$ (otherwise finding a solution is easy). Define the following two quantum states: $|\psi^{0}\rangle=\frac{1}{\sqrt{|A^{0}|}}\sum_{x\in A^{0}}|x\rangle$ and $|\psi^{1}\rangle=\frac{1}{\sqrt{|A^{1}|}}\sum_{x\in A^{1}}|x\rangle,$ which are the uniform superpositions over all the elements in $A^{0}$ and $A^{1}$, respectively. Let $\mathcal{H}$ denote the subspace generated by these two quantum states. Grover’s algorithm starts from the quantum state $|\Phi_{0}\rangle=\frac{1}{\sqrt{|X|}}\sum_{x\in X}|x\rangle$ corresponding to the uniform superposition over all the elements in $X$. Note that $|\Phi_{0}\rangle$ belongs to $\mathcal{H}$ and is easy to create. Grover’s algorithm then repeatedly applies the unitary operator corresponding to $\mathcal{C}$ followed by a unitary operator $U$ independent of the function $g$ (see Footnote 3).

Footnote 3: Here the term “unitary operator corresponding to $\mathcal{C}$” means the unitary operator corresponding to the quantum circuit obtained by converting the classical algorithm $\mathcal{C}$ into a quantum circuit. We typically denote this unitary operator by the same symbol $\mathcal{C}$, since there is no risk of confusion. The key observation of [27] is that this conversion preserves the complexity: if $\mathcal{C}$ is an $r$-round classical algorithm then the corresponding unitary operator can be implemented in $O(r)$ rounds.

A crucial property is that any state in $\mathcal{H}$ is mapped by $U\mathcal{C}$ to a state in $\mathcal{H}$ , which means that for any $k\geq 0$ the state of the system after the $k$ -th iteration can be written as $|\Phi_{k}\rangle=(U\mathcal{C})^{k}|\Phi_{0}\rangle=\alpha_{k}|\psi^{0}\rangle+\beta_{k}|\psi^{1}\rangle$ for some complex numbers $\alpha_{k}$ and $\beta_{k}$ such that $|\alpha_{k}|^{2}+|\beta_{k}|^{2}=1$ . The analysis of Grover’s algorithm shows that by choosing $k$ such that $k=O(\sqrt{|X|})$ we can guarantee that $|\beta_{k}|^{2}\approx 1$ , which means that measuring the state $|\Phi_{k}\rangle$ gives an element $x\in A^{1}$ with high probability. The total round complexity is thus $\tilde{O}(r\sqrt{|X|})$ .
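To make the choice of $k$ concrete, the following Python sketch evaluates the success probability $|\beta_{k}|^{2}$ under the standard textbook analysis of Grover’s algorithm, in which $|\beta_{k}|=|\sin((2k+1)\theta)|$ with $\sin\theta=\sqrt{|A^{1}|/|X|}$. The closed form and the function names are assumptions of this illustration and are not taken from the text above.

```python
import math


def grover_success_probability(num_items, num_solutions, iterations):
    """|beta_k|^2 after `iterations` Grover iterations (textbook formula)."""
    theta = math.asin(math.sqrt(num_solutions / num_items))
    return math.sin((2 * iterations + 1) * theta) ** 2


def optimal_iterations(num_items, num_solutions):
    """Number of iterations bringing the state closest to |psi^1>."""
    theta = math.asin(math.sqrt(num_solutions / num_items))
    return math.floor(math.pi / (4 * theta))
```

For $|X|=1024$ and a single solution, `optimal_iterations` stays below $\sqrt{|X|}=32$, in line with the $O(\sqrt{|X|})$ bound, and each iteration costs $O(r)$ rounds in the distributed setting.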

Multiple searches. We now describe an easy generalization to multiple searches of the framework presented above. Let $X$ be a finite set and $g_{1},\ldots,g_{m}\colon X\to\{0,1\}$ be $m$ Boolean functions over $X$, for some integer $m\geq 1$. For each $i\in[m]$ define the set $A^{1}_{i}=\{x_{i}\in X\>|\>g_{i}(x_{i})=1\}$ and assume for convenience that $|A_{i}^{1}|>0$ (see Footnote 4). Let $u$ be an arbitrary node of the network. Assume now that node $u$ can evaluate the functions $g_{1},\ldots,g_{m}$ simultaneously in $r$ rounds. More precisely, assume that there exists an $r$-round classical distributed algorithm $\mathcal{C}_{m}$ such that node $u$, when receiving as input $(x_{1},\ldots,x_{m})\in X^{m}$, outputs $(g_{1}(x_{1}),\ldots,g_{m}(x_{m}))$. Now consider the following problem: node $u$ should find one element in $A_{1}^{1}\times\cdots\times A_{m}^{1}$. The framework described above can easily be generalized to this problem by having node $u$ implement in parallel $m$ independent distributed quantum searches. This gives a quantum algorithm solving this problem with high probability in $\tilde{O}(r\sqrt{|X|})$ rounds.

Footnote 4: This can easily be enforced by adding dummy solutions. Even if a dummy solution is introduced, if there is a real solution then Grover’s algorithm will output it after a few repetitions (since when there is more than one solution Grover’s algorithm outputs a random solution).

4.2 Multiple searches only using typical inputs

We now show a stronger result for the multiple searches problem introduced in the previous subsection: we construct a quantum algorithm that solves the problem even if the evaluation procedure is correct only on inputs close to typical inputs. The motivation for this assumption on the evaluation procedure is as follows. In a typical application (e.g., the example in Section 4.1), the search domain $X$ represents nodes of the network and evaluation should be delegated to these nodes. In this case, at each evaluation step, the node $u$ then needs to send a query to the nodes corresponding to each coordinate of the input $\bm{x}=(x_{1},\ldots,x_{m})\in X^{m}$ . Then if $\bm{x}\in X^{m}$ is mostly dominated by a single $x\in X$ , e.g., $\bm{x}=(x,x,\ldots,x,x)$ , the communication link $(u,x)$ suffers high congestion due to the queries injected by $u$ . In this subsection we show that in some cases such “non-typical” inputs $\bm{x}$ can be completely ignored, which solves this congestion problem.

Let us first introduce the following notation: for any real number $\beta\geq 0$ , let $\Upsilon_{\beta}(m,X)\subseteq X^{m}$ denote the set of all $\bm{x}=(x_{1},\ldots,x_{m})\in X^{m}$ such that for each $x\in X$ its frequency in $\bm{x}$ is at most $\beta$ (i.e., $x$ appears at most $\beta$ times in $\bm{x}$ ). Note that when $\beta>(1+\delta)m/|X|$ for some large enough $\delta>0$ , the set $\Upsilon_{\beta}(m,X)$ includes the “typical” elements of $X^{m}$ , i.e., the elements in which the frequencies of all $x\in X$ are close to the frequencies in an element of $X^{m}$ chosen uniformly at random.
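The definition of $\Upsilon_{\beta}(m,X)$ translates directly into a frequency count. The Python sketch below is only an illustration: the names `in_upsilon` and `fraction_typical` are ours, and the sampling experiment merely suggests, for one choice of parameters, that uniformly random tuples fall inside $\Upsilon_{\beta}(m,X)$ once $\beta$ is a constant factor above $m/|X|$.

```python
import random
from collections import Counter


def max_frequency(xs):
    """Largest number of occurrences of a single symbol in the non-empty tuple xs."""
    return max(Counter(xs).values())


def in_upsilon(xs, beta):
    """True if every symbol appears at most beta times in xs."""
    return max_frequency(xs) <= beta


def fraction_typical(m, domain_size, beta, trials=20, seed=0):
    """Fraction of uniformly random m-tuples over a domain of the given size
    that belong to Upsilon_beta (seeded for reproducibility)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.randrange(domain_size) for _ in range(m)]
        hits += in_upsilon(xs, beta)
    return hits / trials
```

For example, the tuple $(x,x,\ldots,x)$ fails the test for any $\beta<m$, which is exactly the kind of input that would congest a single communication link.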

Suppose that instead of assuming the existence of an $r$ -round algorithm $\mathcal{C}_{m}$ that simultaneously evaluates the functions $g_{1},\ldots,g_{m}$ on $X^{m}$ , we only assume that we have an $r$ -round classical distributed algorithm $\tilde{\mathcal{C}}_{m}$ in which node $u$ outputs $(g_{1}(x_{1}),\ldots,g_{m}(x_{m}))$ on an input $(x_{1},\ldots,x_{m})\in\Upsilon_{\beta}(m,X)$ but may output an error message (or an arbitrary output) on an input $(x_{1},\ldots,x_{m})\in X^{m}\setminus\Upsilon_{\beta}(m,X)$ . Our key result is the following theorem, which shows that the same complexity as in Section 4.1 can be achieved in case $\beta$ is large enough so that $\tilde{\mathcal{C}}_{m}$ works correctly both on typical elements from $X^{m}$ and on the solutions of the search problem.

Theorem 3.

Assume that $|X|<m/(36\log m)$ . Assume the existence of an evaluation algorithm $\tilde{\mathcal{C}}_{m}$ , as just described, for some real number $\beta$ such that $\beta>8m/|X|$ . Finally, assume that

[TABLE]

There exists a $\tilde{O}(r\sqrt{|X|})$ -round quantum algorithm that outputs an element of $A_{1}^{1}\times\cdots\times A_{m}^{1}$ with probability at least $1-2/m^{2}$ .

The proof of Theorem 3 can be found in the appendix. The basic idea behind the proof is fairly easy to describe. Let $\mathcal{Q}$ denote the $\tilde{O}(r\sqrt{|X|})$ -round quantum algorithm described in Section 4.1, which uses Algorithm $\mathcal{C}_{m}$ (or more precisely, the quantum operator corresponding to $\mathcal{C}_{m}$ ). The initial state of this algorithm is the uniform superposition over all elements of $X^{m}$ ; since $\beta$ is large enough most of these elements are in $\Upsilon_{\beta}(m,X)$ . The final state is close to the uniform superposition over all elements of $A_{1}^{1}\times\cdots\times A_{m}^{1}$ ; all these elements are also in $\Upsilon_{\beta}(m,X)$ from the assumption. The algorithm of Theorem 3 is exactly the same as $\mathcal{Q}$ but uses (the quantum operator corresponding to) $\tilde{\mathcal{C}}_{m}$ instead of (the quantum operator corresponding to) $\mathcal{C}_{m}$ . Using $\tilde{\mathcal{C}}_{m}$ instead of $\mathcal{C}_{m}$ obviously only has a negligible impact at the beginning of computation and at the end of the computation. The main technical difficulty is to show that this has no significant impact at each step of the computation as well. We show this by proving that at any step of the computation the quantum state of the system is close to its projection on the vector space spanned by the basis vectors that are in $\Upsilon_{\beta}(m,X)$ .

5 Detecting Negative Triangles

In this section we present a $\tilde{O}(n^{1/4})$-round quantum distributed algorithm that solves the problem FindEdgesWithPromise, which proves Theorem 2. Throughout this section $G=(V,E,f)$ represents the input of FindEdgesWithPromise, i.e., an undirected weighted graph on $n$ vertices that satisfies the promise $\Gamma(u,v)\leq 90\log n$ for all $\{u,v\}\in S$.

5.1 Overall description of the algorithm

The description of our algorithm will use two partitions of the vertex set $V$, which we now introduce. For ease of presentation we assume that the three numbers $n^{1/4}$, $\sqrt{n}$ and $n^{3/4}$ are integers (otherwise we can simply round them to the next integers and slightly adjust the sizes of the sets). The first partition is an arbitrary partition of $V$ into $n^{1/4}$ subsets each containing $n^{3/4}$ elements. We denote by $\mathcal{V}$ the collection of subsets making up this partition. The second partition is an arbitrary partition of $V$ into $\sqrt{n}$ subsets each containing $\sqrt{n}$ elements. We denote by $\mathcal{V}^{\prime}$ the collection of subsets making up this partition. In addition to the labeling scheme described in Section 2, which labels the nodes of the network by elements of $V$, our algorithm will also use the following two labeling schemes.

Second labeling scheme.

Let us write $\mathcal{T}=\mathcal{V}\times\mathcal{V}\times\mathcal{V}^{\prime}$ and observe that $|\mathcal{T}|=n$ . We assign one distinct label from $\mathcal{T}$ to each node of the network. We will simply write “node $(\bm{u},\bm{v},\bm{w})$ ” to refer to the node with label $(\bm{u},\bm{v},\bm{w})$ , for each triple $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}$ . This labeling scheme will be used by the algorithm to decide which node should gather the information of the graph: node $(\bm{u},\bm{v},\bm{w})$ will gather the weights of all the edges $\{u,w\}\in\mathcal{P}(\bm{u},\bm{w})$ and $\{w,v\}\in\mathcal{P}(\bm{w},\bm{v})$ of the graph.

Third labeling scheme.

For each $(\bm{u},\bm{v})\in\mathcal{V}\times\mathcal{V}$ we have $|\mathcal{P}(\bm{u},\bm{v})|=\Theta(n^{3/2})$. We will describe below a procedure that partitions the set $\mathcal{P}(\bm{u},\bm{v})$ into $\sqrt{n}$ sets each of size $\tilde{\Theta}(n)$. These sets will be denoted $\Lambda_{x}(\bm{u},\bm{v})$, for $x\in[\sqrt{n}]$. Our third labeling scheme assigns one distinct label $(\bm{u},\bm{v},x)\in\mathcal{V}\times\mathcal{V}\times[\sqrt{n}]$ to each node of the network. Again, we will simply write “node $(\bm{u},\bm{v},x)$” to refer to the node with label $(\bm{u},\bm{v},x)$, for each triple $(\bm{u},\bm{v},x)\in\mathcal{V}\times\mathcal{V}\times[\sqrt{n}]$. This labeling scheme will be used by the algorithm to distribute the search for triangles: node $(\bm{u},\bm{v},x)$ will be in charge of checking the existence of all the triangles involving one edge in the set $\Lambda_{x}(\bm{u},\bm{v})$.

The partition procedure.

We now describe how to construct the sets $\Lambda_{x}(\bm{u},\bm{v})$ . For technical reasons it will be much more convenient to use a covering instead of a partition of $\mathcal{P}(\bm{u},\bm{v})$ , i.e., to allow some elements to appear more than once, and to construct the covering randomly rather than deterministically.

Consider the following process. Each node $(\bm{u},\bm{v},x)\in\mathcal{V}\times\mathcal{V}\times[\sqrt{n}]$ constructs the set $\Lambda_{x}(\bm{u},\bm{v})\subseteq\mathcal{P}(\bm{u},\bm{v})$ as follows: starting with the empty set, each pair $\{u,v\}\in\mathcal{P}(\bm{u},\bm{v})$ is added by the node to its set $\Lambda_{x}(\bm{u},\bm{v})$ independently with probability $10\log n/\sqrt{n}$. We say that the set $\Lambda_{x}(\bm{u},\bm{v})$ is well-balanced if the inequality $\big|\{v\in\bm{v}\>|\>\{u,v\}\in\Lambda_{x}(\bm{u},\bm{v})\}\big|\leq 100\cdot n^{1/4}\log n$ holds for all $u\in\bm{u}$. The following lemma, which is proved by standard probabilistic arguments, shows that with high probability the sets created by this process are well-balanced and cover the whole set $\mathcal{P}(\bm{u},\bm{v})$.

Lemma 2.

With probability at least $1-2/n$ the following statements hold for all $(\bm{u},\bm{v})\in\mathcal{V}\times\mathcal{V}$ :

(i) $\Lambda_{x}(\bm{u},\bm{v})$ is well-balanced for each $x\in[\sqrt{n}]$;

(ii) $\bigcup_{x\in[\sqrt{n}]}\Lambda_{x}(\bm{u},\bm{v})=\mathcal{P}(\bm{u},\bm{v})$.

Proof.

Let us fix $(\bm{u},\bm{v})\in\mathcal{V}\times\mathcal{V}$ .

For any $x\in[\sqrt{n}]$ we have

[TABLE]

for each $u\in\bm{u}$ . Chernoff’s bound and the union bound imply that Condition (i) of the lemma thus holds with probability at least $1-1/n^{2}$ .

Let $\{u,v\}$ be an arbitrary pair in $\mathcal{P}(\bm{u},\bm{v})$ . For any $x\in[\sqrt{n}]$ , this pair is included in $\Lambda_{x}(\bm{u},\bm{v})$ with probability $10\log n/\sqrt{n}$ . The probability that this pair is not included in any $\Lambda_{x}(\bm{u},\bm{v})$ is thus

[TABLE]

Condition (ii) of the lemma thus holds with probability at least $1-1/n^{2}$ .

The statement of the lemma then follows from the above analyses and the union bound. ∎
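The two quantities driving this proof can be checked numerically. The Python sketch below is an illustration only: it assumes natural logarithms, and it assumes that each $u\in\bm{u}$ has about $n^{3/4}$ candidate partners $v\in\bm{v}$. The function names are ours.

```python
import math


def sampling_probability(n):
    """Inclusion probability 10 log n / sqrt(n), capped at 1 for small n."""
    return min(1.0, 10 * math.log(n) / math.sqrt(n))


def miss_probability(n):
    """Probability that one fixed pair is left out of all sqrt(n) sets."""
    return (1 - sampling_probability(n)) ** math.sqrt(n)


def expected_row_load(n):
    """Expected number of partners v sampled for a fixed u (assumes n^{3/4} candidates)."""
    return n ** 0.75 * sampling_probability(n)


def balance_threshold(n):
    """Threshold 100 * n^{1/4} * log n from the definition of well-balanced."""
    return 100 * n ** 0.25 * math.log(n)
```

Since $(1-p)^{\sqrt{n}}\leq e^{-p\sqrt{n}}$, the miss probability is at most $n^{-10}$ whenever $p<1$, and the well-balanced threshold sits a factor 10 above the expected load, which is the slack that Chernoff’s bound exploits.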

Description and analysis of the algorithm.

Our algorithm is called ComputePairs and is described in Figure 1. Let us analyze it step by step. Step 1 requires $O(n^{1/4})$ rounds, since $|\mathcal{P}(\bm{u},\bm{w})|=|\mathcal{P}(\bm{w},\bm{v})|=O(n^{5/4})$. Step 2 performs the sampling described in Section 5.1, checks which sampled pairs are in $S$ and loads their weights. Step 2 can be implemented in $O(\log n)$ rounds, since communication occurs only when all the sets $\Lambda_{x}(\bm{u},\bm{v})$ are well-balanced. Lemma 2 implies that with probability at least $1-2/n$ the following two statements hold:

(a) Algorithm ComputePairs does not abort at Step 2;

(b) at the end of Step 2, each pair $\{u,v\}\in S$ appears at least once at some node.

Note that when the algorithm does not abort, at the end of Step 2 each node $k=(\bm{u},\bm{v},x)$ keeps at most $100n\log n$ pairs (since the sets $\Lambda_{x}(\bm{u},\bm{v})$ are all well-balanced). While the exact number of remaining pairs may naturally depend on the node, in order to simplify the notation we will assume that each node keeps precisely $m=100n\log n$ pairs.

Step 3 of Algorithm ComputePairs can easily be implemented in $O(\sqrt{n})$ rounds in the classical setting. In the next subsections we prove the following statement, which shows that a quadratic speedup can be achieved in the quantum setting.

Proposition 4.

Step 3 of Algorithm ComputePairs can be implemented by a $\tilde{O}(n^{1/4})$ -round quantum algorithm that succeeds with probability at least $1-O(1/n)$ .

Proposition 4 combined with the analysis done in this subsection shows that Algorithm ComputePairs solves the problem FindEdgesWithPromise with probability at least $1-O(1/n)$ (from the union bound). Its overall complexity is $\tilde{O}(n^{1/4})$ . This proves Theorem 2.

Overview of the proof of Proposition 4.

Proposition 4 is proved by applying the methodology of Section 4 to perform simultaneous quantum searches over the search space $\mathcal{V}^{\prime}$ . A crucial point of the analysis is to show how to implement the checking procedure in $\tilde{O}(1)$ rounds. Let us discuss below the main difficulties that need to be overcome.

Consider the problem of checking, for some pair $(u^{k}_{\ell},v^{k}_{\ell})\in\bm{u}\times\bm{v}$ and some fixed $\bm{w}\in\mathcal{V}^{\prime}$ , whether there exists $w\in\bm{w}$ for which $(u^{k}_{\ell},v^{k}_{\ell},w)$ is a negative triangle. This can be done easily as follows: node $(\bm{u},\bm{v},x)$ first sends the pair $(u^{k}_{\ell},v^{k}_{\ell})$ and the weight $f(u^{k}_{\ell},v^{k}_{\ell})$ to node $(\bm{u},\bm{v},\bm{w})$ . Node $(\bm{u},\bm{v},\bm{w})$ then checks whether the inequality

[TABLE]

holds, which can be done locally from the information gathered at Step 1 of Algorithm ComputePairs, and sends back this information to node $(\bm{u},\bm{v},x)$ .
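The local test performed by node $(\bm{u},\bm{v},\bm{w})$ can be sketched as follows. This Python fragment is an illustration under assumptions: a triangle is taken to be negative when its three edge weights sum to a negative value, edge weights are stored in a dictionary keyed by unordered pairs, and the function name is ours.

```python
def has_negative_triangle(f, u, v, block):
    """Return True if some w in `block` forms a negative triangle with {u, v}.

    f maps frozenset({a, b}) to the weight of edge {a, b}; missing keys mean
    that the edge is absent from the graph.
    """
    base = f.get(frozenset((u, v)))
    if base is None:
        return False
    for w in block:
        first = f.get(frozenset((u, w)))
        second = f.get(frozenset((w, v)))
        if first is not None and second is not None and base + first + second < 0:
            return True
    return False
```

Only the weight $f(u,v)$ has to travel over the network: the weights of the edges towards $\bm{w}$ are already stored at node $(\bm{u},\bm{v},\bm{w})$ after Step 1.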

For each $(\bm{u},\bm{v},x)\in\mathcal{V}\times\mathcal{V}\times[\sqrt{n}]$, node $(\bm{u},\bm{v},x)$ will run $m$ instances of this checking procedure simultaneously (one for each value of $\ell$). Each node $(\bm{u},\bm{v},\bm{w})$ can thus receive, in the worst case, $m\sqrt{n}=\tilde{\Theta}(n^{3/2})$ pairs during one call of the checking procedure, which would require $\tilde{\Theta}(\sqrt{n})$ rounds. To reduce the checking cost to $\tilde{O}(1)$ rounds, as needed, we will partition the set $\mathcal{T}$ into classes and use this partition to balance the load of the checking queries in order to avoid congestion.

The partitioning of $\mathcal{T}$ is described in Section 5.2. It will in particular identify the triples of $\mathcal{T}$ containing many edges from $S$ involved in negative triangles. These triples are the main source of possible congestion in the checking procedure. A simple, but crucial, observation is that there cannot exist many such triples, since the promise of FindEdgesWithPromise guarantees that the total number of negative triangles in the graph is low. This observation is the key idea on which the implementation of the load balancing is based.

5.2 Implementation of Step 3: Dividing the set $\mathcal{T}$ into classes

Let us first introduce a crucial definition.

Definition 3.

For any $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}$ , let $\Delta(\bm{u},\bm{v};\bm{w})$ be the following quantity:

[TABLE]

The goal of this subsection is to divide the set of triples $\mathcal{T}$ into classes according to the value of $|\Delta(\bm{u},\bm{v};\bm{w})|$. Since we do not know how to compute this value exactly and efficiently, we actually need to define the classification based on an approximation of $|\Delta(\bm{u},\bm{v};\bm{w})|$ that can be computed efficiently. In Figure 2 we describe a classical algorithm called IdentifyClass that either aborts or assigns a nonnegative integer $c_{\bm{u}\bm{v}\bm{w}}$ to each node $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}$. Note that the complexity of the algorithm is $O(\log n)$ rounds: Step 1 can obviously be implemented in $20\log n$ rounds and Step 3 does not require any communication. In the case where the algorithm does not abort, we write $\mathcal{T}_{\alpha}$ for the set of all triples $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}$ such that $c_{\bm{u}\bm{v}\bm{w}}=\alpha$, for each integer $\alpha\geq 0$. This defines a partition of the set $\mathcal{T}$. We now show that with high probability the algorithm does not abort and the partition indeed classifies the triples according to the value of $|\Delta(\bm{u},\bm{v};\bm{w})|$.

Proposition 5.

With probability at least $1-2/n$ , Algorithm IdentifyClass does not abort and the partition $\{\mathcal{T}_{\alpha}\}_{\alpha\geq 0}$ satisfies the following conditions:

(i) for any $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}_{0}$, the inequality $|\Delta(\bm{u},\bm{v};\bm{w})|\leq 2n$ holds;

(ii) for any $\alpha>0$ and any $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}_{\alpha}$, the inequalities $2^{\alpha-3}n\leq|\Delta(\bm{u},\bm{v};\bm{w})|\leq 2^{\alpha+1}n$ hold.

Proof of Proposition 5.

Let us first compute the probability that the protocol does not abort. For each node $u\in V$ , let $X_{u}$ be the random variable representing the number of neighbors chosen by $u$ , i.e., $X_{u}=|\Lambda(u)|$ . Observe that

[TABLE]

Chernoff’s bound implies the inequality

[TABLE]

The probability that the protocol does not abort is thus at least $1-1/n$, from the union bound.

We now consider the probability that Conditions (i) and (ii) hold. Let us consider an arbitrary triple $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}$ . The expectation of the random variable $\delta_{\bm{u},\bm{v},\bm{w}}$ is

[TABLE]

We divide our analysis into three cases.

• The case where $|\Delta(\bm{u},\bm{v};\bm{w})|\leq n/6$. Chernoff’s bound shows that

[TABLE]

Thus $c_{\bm{u}\bm{v}\bm{w}}=0$ with probability at least $1-1/n^{2}$ .

• The case where $|\Delta(\bm{u},\bm{v};\bm{w})|>n/6$ and $|\Delta(\bm{u},\bm{v};\bm{w})|<2^{c-3}n$, for some $c\geq 1$. Chernoff’s bound implies that

[TABLE]

Thus $c_{\bm{u}\bm{v}\bm{w}}\geq c$ with probability at most $1/n^{2}$ .

• Finally, the case $|\Delta(\bm{u},\bm{v};\bm{w})|>2^{c+1}n$ for some $c\geq 0$. Chernoff’s bound implies that

[TABLE]

Thus $c_{\bm{u}\bm{v}\bm{w}}\leq c$ with probability at most $1/n^{2}$ .

We conclude that the probability that the outputs of all the nodes $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}$ satisfy Conditions (i) and (ii) is at least $1-1/n$ , from the union bound.

Finally, the union bound again guarantees that the probability that the protocol does not abort and all the nodes $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}$ satisfy Conditions (i) and (ii) is at least $1-2/n$ . ∎

5.3 Implementation of Step 3: Details and proof of Proposition 4

In this subsection we describe the details of the implementation of Step 3 in Algorithm ComputePairs, which is the only part of the algorithm that uses quantum computation.

The nodes first apply Algorithm IdentifyClass. Proposition 5 guarantees that with probability at least $1-2/n$ Algorithm IdentifyClass does not abort and the partition $\{\mathcal{T}_{\alpha}\}_{\alpha\geq 0}$ satisfies the two conditions of the proposition. Throughout this subsection we will assume that this happens.

Let us write

[TABLE]

for any $(\bm{u},\bm{v})\in\mathcal{V}\times\mathcal{V}$ and any $\alpha\geq 0$ . We will later use the following two lemmas that are direct consequences of the bounds given in Proposition 5.

Lemma 3.

With probability at least $1-1/n^{2}$ the inequality

[TABLE]

holds for all $(\bm{u},\bm{v},x)\in\mathcal{V}\times\mathcal{V}\times[\sqrt{n}]$ , all $\alpha\geq 0$ and all $\bm{w}\in\mathcal{T}_{\alpha}[\bm{u},\bm{v}]$ .

Proof.

Let us fix $(\bm{u},\bm{v})\in\mathcal{V}\times\mathcal{V}$ . Consider any $x\in[\sqrt{n}]$ , any $\alpha\geq 0$ and any $\bm{w}\in\mathcal{T}_{\alpha}[\bm{u},\bm{v}]$ . Observe that

[TABLE]

Chernoff’s bound implies that the inequality $|\Lambda_{x}(\bm{u},\bm{v})\cap\Delta(\bm{u},\bm{v};\bm{w})|\leq 100\cdot 2^{\alpha}\sqrt{n}\log n$ is violated with probability at most $1/n^{5}$. The statement of the lemma then follows from the union bound. ∎

Lemma 4.

The following inequality holds for all $\alpha\geq 0$ and all $(\bm{u},\bm{v})\in\mathcal{V}\times\mathcal{V}$ :

[TABLE]

Proof.

This is obviously true for $\alpha=0$ . Let us now consider any $\alpha>0$ and any $(\bm{u},\bm{v})\in\mathcal{V}\times\mathcal{V}$ . Remember that we are assuming that $\Gamma(u,v)\leq 90\log n$ for all pairs $\{u,v\}\in S$ . We thus have

[TABLE]

Combining this upper bound with the lower bound of Statement (ii) of Proposition 5 gives the claimed upper bound on $|\mathcal{T}_{\alpha}[\bm{u},\bm{v}]|$ . ∎

To implement Step 3 of Algorithm ComputePairs, the strategy is to consider each $\alpha$ separately and perform simultaneous quantum searches over $\mathcal{T}_{\alpha}[\bm{u},\bm{v}]$, as outlined in Figure 3. We first describe in Section 5.3.1 how to implement these quantum searches in $\tilde{O}(n^{1/4})$ rounds for the case $\alpha=0$, and then in Section 5.3.2 how to achieve the same complexity for the case $\alpha>0$.

5.3.1 Analysis of Step 3.2 for $\bm{\alpha=0}$

In Step 3.2 each node $k=(\bm{u},\bm{v},x)$ executes $m$ simultaneous quantum searches. In order to describe this process using the framework presented in Section 4, with $X=\mathcal{T}_{0}[\bm{u},\bm{v}]$ and $m=100n\log n$ , we need to explain the evaluation procedure. Since $\mathcal{T}_{0}[\bm{u},\bm{v}]\subseteq\mathcal{V}^{\prime}$ , we have $|\mathcal{T}_{0}[\bm{u},\bm{v}]|\leq\sqrt{n}$ . For simplicity (but without loss of generality) we assume below that $|\mathcal{T}_{0}[\bm{u},\bm{v}]|=\sqrt{n}$ . Observe that the evaluation procedure should implement the following test: each node $k$ , when evaluating a list $(\bm{w}^{k}_{1},\ldots,\bm{w}^{k}_{m})$ of $m$ elements in $\mathcal{T}_{0}[\bm{u},\bm{v}]$ , should check for each $\ell\in[m]$ whether there exists a vertex $w\in\bm{w}^{k}_{\ell}$ such that $\{u^{k}_{\ell},v^{k}_{\ell},w\}$ is a negative triangle.

Let $L^{k}_{\bm{w}}\subseteq\mathcal{P}(\bm{u},\bm{v})$ denote the list consisting of all the pairs $\{u_{i}^{k},v_{i}^{k}\}$ such that $\bm{w}^{k}_{i}=\bm{w}$ , for each node $k=(\bm{u},\bm{v},x)$ and each $\bm{w}\in\mathcal{T}_{0}[\bm{u},\bm{v}]$ . We make the assumption that $|L^{k}_{\bm{w}}|\leq 800\sqrt{n}\log n$ for all $k=(\bm{u},\bm{v},x)$ and all $\bm{w}\in\mathcal{T}_{0}[\bm{u},\bm{v}]$ and describe an evaluation procedure that works under this assumption. The procedure is described in Figure 4.

The procedure of Figure 4 obviously always outputs the correct answers since for each $k$ and each $\ell\in[m]$ , Inequality (2) at Step 2 precisely checks if there exists some $w\in\bm{w}_{\ell}^{k}$ such that $\{u_{\ell}^{k},v_{\ell}^{k},w\}$ is a negative triangle (because the pair $\{u_{\ell}^{k},v_{\ell}^{k}\}$ is sent to node $(\bm{u},\bm{v},\bm{w}_{\ell}^{k})$ ). We now analyze its complexity. Since each list $L^{k}_{\bm{w}}$ contains at most $800\sqrt{n}\log n$ elements, at Step 1 each node $k=(\bm{u},\bm{v},x)$ sends at most $800\sqrt{n}\log n$ elements to $(\bm{u},\bm{v},\bm{w})$ for each $\bm{w}\in\mathcal{T}_{0}[\bm{u},\bm{v}]$ . Conversely, each node $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}_{0}$ receives at most $800\sqrt{n}\log n$ elements from $(\bm{u},\bm{v},x)$ for each $x\in[\sqrt{n}]$ . Thus, in the CONGEST-CLIQUE model, Step 1 can be implemented in $O(\log n)$ rounds. Testing whether Inequality (2) holds or not at Step 2 can be done locally using the information collected at Step 1 of Algorithm ComputePairs. Sending back the information at Step 2 can be done with the same complexity as in Step 1. The complexity of the checking procedure is thus $O(\log n)$ rounds.
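The load argument above relies on grouping the $m$ queries of a node by destination. A minimal Python sketch of this bookkeeping is given below. The function names are hypothetical, `pairs` stands for the list of pairs $\{u^{k}_{\ell},v^{k}_{\ell}\}$ held by node $k$, and `blocks` stands for the candidate tuple $(\bm{w}^{k}_{1},\ldots,\bm{w}^{k}_{m})$.

```python
from collections import defaultdict


def build_destination_lists(pairs, blocks):
    """Group the pairs by the block w assigned to them, i.e. build the lists L_w."""
    lists = defaultdict(list)
    for pair, block in zip(pairs, blocks):
        lists[block].append(pair)
    return dict(lists)


def within_load_bound(lists, bound):
    """True if no destination receives more than `bound` pairs."""
    return all(len(queries) <= bound for queries in lists.values())
```

With `bound` set to $800\sqrt{n}\log n$, the second function expresses the assumption on $|L^{k}_{\bm{w}}|$ under which the evaluation procedure is analyzed.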

We can apply Theorem 3 with $X=\mathcal{T}_{0}[\bm{u},\bm{v}]$ and $\beta=800\sqrt{n}\log n$ . Lemma 3 guarantees that with probability at least $1-1/n^{2}$ , the assumptions in the statement of Theorem 3 are satisfied. Theorem 3 then implies that for $\alpha=0$ the quantum searches of Step 3.2 of Algorithm ComputePairs can be implemented in $\tilde{O}(n^{1/4})$ rounds and succeed with probability at least $1-2/m^{2}$ .

5.3.2 Analysis of Step 3.2 for $\bm{\alpha>0}$

The analysis of the complexity of the approach presented in Section 5.3.1 crucially relied on the inequality from Lemma 3, which guarantees that $|\Lambda_{x}(\bm{u},\bm{v})\cap\Delta(\bm{u},\bm{v};\bm{w})|\leq 100\cdot\sqrt{n}\log n$ . For $\alpha>0$ and $(\bm{u},\bm{v},\bm{w})\in\mathcal{T}_{\alpha}$ , Lemma 3 only gives the weaker upper bound

[TABLE]

The upper bound from Lemma 4 is the key observation that will make it possible to resolve this technical issue.

In Section 5.3.1 each node $(\bm{u},\bm{v},x)$ communicated with node $(\bm{u},\bm{v},\bm{w})$ for each $\bm{w}\in\mathcal{T}_{0}[\bm{u},\bm{v}]$. We used the upper bound $|\mathcal{T}_{0}[\bm{u},\bm{v}]|\leq\sqrt{n}$ in the analysis. In the case $\alpha>0$ we can use the better upper bound from Lemma 4, which reduces the number of destination nodes by (roughly) a factor $2^{\alpha}$. In consequence, we can increase the bandwidth towards these destination nodes by (roughly) a factor $2^{\alpha}$. (This can be done by duplicating the information owned by the destination nodes.) We will show that this is enough to counterbalance the increase by a factor $2^{\alpha}$ of the message size due to Inequality (3).

We now give more details about the idea of duplicating information to increase the bandwidth. We introduce a new labeling scheme. For ease of presentation let us assume that $2^{\alpha}/(720\log n)$ is an integer (if this is not the case the scheme just needs to be slightly adapted). In the new scheme each node is assigned a distinct label $(\bm{u},\bm{v},\bm{w},y)\in\mathcal{T}_{\alpha}\times[2^{\alpha}/(720\log n)]$. Lemma 4 ensures that this can be done.
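The claim that enough nodes are available for the new labels can be checked by a short count. The Python sketch below rests on an assumption: the bound $|\mathcal{T}_{\alpha}[\bm{u},\bm{v}]|\leq 720\sqrt{n}\log n/2^{\alpha}$ used here is inferred from the constants appearing in the text (the promise $90\log n$, the factor $2^{\alpha-3}$ of Proposition 5 and $|\mathcal{P}(\bm{u},\bm{v})|\leq n^{3/2}$), not copied from the statement of Lemma 4. Natural logarithms and the function names are also choices of this illustration.

```python
import math


def assumed_lemma4_bound(n, alpha):
    """Inferred upper bound on |T_alpha[u, v]| (see the assumptions above)."""
    return 720 * math.sqrt(n) * math.log(n) / 2 ** alpha


def labels_needed(n, alpha):
    """Upper bound on the number of labels (u, v, w, y) in the new scheme."""
    copies = 2 ** alpha / (720 * math.log(n))          # range of y
    per_pair = assumed_lemma4_bound(n, alpha) * copies  # labels per (u, v)
    num_pairs = math.sqrt(n)                            # |V x V| = n^{1/4} * n^{1/4}
    return per_pair * num_pairs
```

Under this assumption the count is $\sqrt{n}\cdot\sqrt{n}=n$ for every $\alpha$, so the $n$ nodes of the network suffice to carry all the labels.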

Similarly to Section 5.3.1, let $L^{k}_{\bm{w}}\subseteq\mathcal{P}(\bm{u},\bm{v})$ denote the list consisting of all the pairs $\{u_{i}^{k},v_{i}^{k}\}$ such that $\bm{w}^{k}_{i}=\bm{w}$, for each node $k=(\bm{u},\bm{v},x)$ and each $\bm{w}\in\mathcal{T}_{\alpha}[\bm{u},\bm{v}]$. We make the assumption that $|L^{k}_{\bm{w}}|\leq 800\cdot 2^{\alpha}\sqrt{n}\log n$ for all $\bm{w}\in\mathcal{T}_{\alpha}[\bm{u},\bm{v}]$ and describe an evaluation procedure that works under this assumption. The procedure is described in Figure 5. The main difference with the procedure in Section 5.3.1 is that instead of sending the whole list we divide it into sublists and send the sublist $L^{k}_{\bm{w},y}$ to $(\bm{u},\bm{v},\bm{w},y)$ for each $y\in[2^{\alpha}/(720\log n)]$. Another difference is Step 0: each node $(\bm{u},\bm{v},\bm{w})$ first duplicates its input by broadcasting it to all the nodes $(\bm{u},\bm{v},\bm{w},y)$, which can be done in $O(n^{1/4})$ rounds using a randomized routing scheme.

We now analyze the complexity of Steps 1 and 2 of the evaluation procedure. Since each list $L^{k}_{\bm{w},y}$ contains at most $O(\sqrt{n}(\log n)^{2})$ elements, at Step 1 each node $k=(\bm{u},\bm{v},x)$ sends a list containing $O(\sqrt{n}(\log n)^{2})$ elements to

[TABLE]

nodes (here we used Lemma 4). Conversely, each node $(\bm{u},\bm{v},\bm{w},y)\in\mathcal{T}_{\alpha}\times[2^{\alpha}/(720\log n)]$ receives $O(\sqrt{n}(\log n)^{2})$ elements from $(\bm{u},\bm{v},x)$ for each $x\in[\sqrt{n}]$. Thus, in the CONGEST-CLIQUE model, Step 1 can be implemented in $O((\log n)^{2})$ rounds. Testing whether Inequality (2) holds or not at Step 2 can be done locally using the information collected at Step 1. Sending back the information at the end of Step 2 can be done with the same complexity as in Step 1. The complexity of the checking procedure is thus $O((\log n)^{2})$ rounds.
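The sublist size used in this analysis can be checked with a short sketch. The Python snippet below is illustrative only: the names `split_into_sublists` and `sublist_bound` are hypothetical, the base-2 logarithm is an assumption, and contiguous chunking is just one possible way to form the sublists $L^{k}_{\bm{w},y}$ (the text does not fix how the list is divided).

```python
import math


def split_into_sublists(pairs, num_copies):
    """Split `pairs` into `num_copies` contiguous sublists of near-equal size.

    Sublist y plays the role of L^k_{w,y}, sent to node (u, v, w, y).
    Contiguous chunking is an assumption made for this sketch.
    """
    size = math.ceil(len(pairs) / num_copies) if pairs else 0
    return [pairs[y * size:(y + 1) * size] for y in range(num_copies)]


def sublist_bound(n, alpha):
    """Upper bound on |L^k_{w,y}| implied by the assumption on |L^k_w|.

    Divides the list bound 800 * 2^alpha * sqrt(n) * log n by the number
    of copies 2^alpha / (720 log n). Logarithms are taken in base 2 here,
    which is an assumption of this sketch.
    """
    log_n = math.log2(n)
    list_bound = 800 * 2**alpha * math.sqrt(n) * log_n
    num_copies = 2**alpha / (720 * log_n)
    return list_bound / num_copies
```

The factor $2^{\alpha}$ cancels in `sublist_bound`, leaving $576000\sqrt{n}(\log n)^{2}$ regardless of $\alpha$, which is the $O(\sqrt{n}(\log n)^{2})$ bound used above.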

We can apply Theorem 3 with $X=\mathcal{T}_{\alpha}[\bm{u},\bm{v}]$ and $\beta=800\cdot 2^{\alpha}\sqrt{n}\log n$ . Lemma 3 guarantees that with probability at least $1-1/n^{2}$ , the assumptions in the statement of Theorem 3 are satisfied. Theorem 3 then implies that for $\alpha>0$ as well the quantum searches of Step 4 of Algorithm ComputePairs can be implemented in $\tilde{O}(n^{1/4})$ rounds and succeed with probability at least $1-2/m^{2}$ .

Acknowledgements

TI was partially supported by JST SICORP and JSPS KAKENHI grants No. 16H02878 and No. 19K11824. FLG was partially supported by JSPS KAKENHI grants No. 15H01677, No. 16H01705, No. 16H05853 and No. 19H04066.

Appendix A Distributed multiple quantum searches

In this appendix we prove Theorem 3.

Let $\mathcal{Q}$ denote the $\tilde{O}(r\sqrt{|X|})$-round quantum algorithm described at the end of Section 4.1. Remember that this algorithm implements in parallel $m$ independent executions of Grover’s algorithm and uses Algorithm $\mathcal{C}_{m}$ as a global evaluation procedure. We first analyze this algorithm in more detail. For any string $b\in\{0,1\}^{m}$ let us define the quantum state

$$|\psi^{b}\rangle=|\psi_{1}^{b_{1}}\rangle\otimes\cdots\otimes|\psi_{m}^{b_{m}}\rangle,$$

where

$$|\psi_{i}^{b_{i}}\rangle=\frac{1}{\sqrt{|A_{i}^{b_{i}}|}}\sum_{x\in A_{i}^{b_{i}}}|x\rangle$$

for each $i\in[m]$. Let $\mathcal{H}_{m}$ denote the Hilbert space spanned by all the quantum states in the set $\{|\psi^{b}\rangle\}_{b\in\{0,1\}^{m}}$. An important observation is that Algorithm $\mathcal{Q}$ leaves the space $\mathcal{H}_{m}$ invariant. The initial state of Algorithm $\mathcal{Q}$ is

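$$|\Phi^{m}_{0}\rangle=\frac{1}{\sqrt{|X|^{m}}}\sum_{(x_{1},\ldots,x_{m})\in X^{m}}|x_{1}\rangle\otimes\cdots\otimes|x_{m}\rangle,$$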

which is in $\mathcal{H}_{m}$ . For each $k\geq 0$ , one step of the algorithm maps the state $|\Phi^{m}_{k}\rangle$ to the state

$$|\Phi^{m}_{k+1}\rangle=U_{m}\,\mathcal{C}_{m}|\Phi^{m}_{k}\rangle,$$

where $U_{m}$ is a unitary operator independent of the functions $g_{1},\ldots,g_{m}$ and $\mathcal{C}_{m}$ represents the unitary operator corresponding to the quantum circuit obtained by converting the classical algorithm $\mathcal{C}_{m}$ into a quantum circuit. Analyzing Grover’s algorithm shows that after $k=O(\sqrt{|X|})$ iterations the quantum state $|\Phi_{k}^{m}\rangle$ becomes close to the state $|\psi_{1}^{1}\rangle\otimes\cdots\otimes|\psi_{m}^{1}\rangle$, and thus measuring this state gives an element from $A_{1}^{1}\times\cdots\times A_{m}^{1}$ with high probability. This success probability can be amplified to (for instance) $1-1/m^{2}$ by repeating the algorithm a logarithmic number of times.
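The behaviour of a single one of the $m$ searches can be illustrated with a small classical simulation. The sketch below is not the distributed algorithm $\mathcal{Q}$: it simulates one textbook Grover search over a set of `size` elements, assumes the number of marked elements is known, and uses hypothetical names (`grover_success_probability`, `optimal_iterations`). The set `marked` plays the role of one set $A_{i}^{1}$.

```python
import math


def grover_success_probability(size, marked, iterations):
    """Probability of measuring a marked element after `iterations` Grover steps.

    Elements are the integers 0..size-1; `marked` is the set of solutions.
    """
    # Start from the uniform superposition over all elements.
    amp = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        # Oracle step: flip the sign of the amplitude of each marked element.
        amp = [-a if x in marked else a for x, a in enumerate(amp)]
        # Diffusion step: reflect every amplitude about the mean amplitude.
        mean = sum(amp) / size
        amp = [2.0 * mean - a for a in amp]
    return sum(amp[x] ** 2 for x in marked)


def optimal_iterations(size, num_marked):
    """Iteration count floor(pi / (4 * theta)), with sin(theta) = sqrt(num_marked / size)."""
    theta = math.asin(math.sqrt(num_marked / size))
    return math.floor(math.pi / (4.0 * theta))
```

With a single marked element the iteration count grows as $\sqrt{\text{size}}$, matching the $O(\sqrt{|X|})$ iterations mentioned above. In the ideal algorithm $\mathcal{Q}$ the $m$ searches are independent, so the global state is the tensor product of $m$ such single-search states.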

Let $\tilde{\mathcal{Q}}$ be exactly the same algorithm as $\mathcal{Q}$ but with each application of the quantum circuit corresponding to $\mathcal{C}_{m}$ replaced by an application of the quantum circuit corresponding to $\tilde{\mathcal{C}}_{m}$ . Let us analyze the output of $\tilde{\mathcal{Q}}$ . As in Section 4.1, we make the assumption $|A^{1}_{i}|\leq|X|/2$ , for all $i\in[m]$ . Let $\mathcal{H}^{\prime}_{m}$ denote the Hilbert space spanned by all vectors $|x_{1}\rangle\otimes\cdots\otimes|x_{m}\rangle$ with $(x_{1},\ldots,x_{m})\in\Upsilon_{\beta}(m,X)$ , and $\mathcal{H}^{\prime\prime}_{m}$ denote the Hilbert space spanned by all $|x_{1}\rangle\otimes\cdots\otimes|x_{m}\rangle$ with $(x_{1},\ldots,x_{m})\in X^{m}\setminus\Upsilon_{\beta}(m,X)$ . Let $\Pi_{m}$ denote the projection into $\mathcal{H}^{\prime\prime}_{m}$ . We first show the following crucial lemma.

Lemma 5.

Assume that $\beta>8m/|X|$ and $A_{1}^{1}\times\cdots\times A_{m}^{1}\subseteq\Upsilon_{\beta/2}(m,X)$ . For any quantum state $|\varphi\rangle\in\mathcal{H}_{m}$ we have

[TABLE]

Proof.

The state $|\varphi\rangle$ can be written as

$$|\varphi\rangle=\sum_{b\in\{0,1\}^{m}}\alpha_{b}|\psi^{b}\rangle$$

for some amplitude $\alpha_{b}\in\mathbb{C}$ such that $\sum_{b\in\{0,1\}^{m}}|\alpha_{b}|^{2}=1$ . Observe that

$$\|\Pi_{m}|\varphi\rangle\|^{2}=\sum_{b\in\{0,1\}^{m}}|\alpha_{b}|^{2}\,\|\Pi_{m}|\psi^{b}\rangle\|^{2}$$

since all the vectors $\Pi_{m}|\psi^{b}\rangle$ are orthogonal. We show below that the inequality

[TABLE]

holds for any $b\in\{0,1\}^{m}$ . The claimed upper bound on $\|\Pi_{m}|\varphi\rangle\|$ then immediately follows.

Consider a string $b\in\{0,1\}^{m}$. Let us assume, without loss of generality, that $b$ is the string with $0$s in the first $\ell$ positions, followed by $1$s in the next $m-\ell$ positions, for some integer $\ell\in\{0,1,\ldots,m\}$. The state $|\psi^{b}\rangle$ is thus the uniform superposition of all the states $|x_{1}\rangle\otimes\cdots\otimes|x_{m}\rangle$ for all $(x_{1},\ldots,x_{\ell},x_{\ell+1},\ldots,x_{m})\in A^{0}_{1}\times\cdots\times A^{0}_{\ell}\times A^{1}_{\ell+1}\times\cdots\times A^{1}_{m}$. For any choice of $(x_{\ell+1},\ldots,x_{m})\in A^{1}_{\ell+1}\times\cdots\times A^{1}_{m}$, we claim that the fraction of $(x_{1},\ldots,x_{\ell})\in A^{0}_{1}\times\cdots\times A^{0}_{\ell}$ such that $(x_{1},\ldots,x_{m})\notin\Upsilon_{\beta}(m,X)$ is at most

[TABLE]

This immediately implies Inequality (4).

Let us prove the claim. Remember that we are assuming $A_{1}^{1}\times\cdots\times A_{m}^{1}\subseteq\Upsilon_{\beta/2}(m,X)$ . For any $x\in X$ , we thus know that there are at most $\beta/2$ indices $i\in\{\ell+1,\ldots,m\}$ such that $x_{i}=x$ . For each $i\in\{1,\ldots,\ell\}$ , the probability that an element taken uniformly at random from $A_{i}^{0}$ equals $x$ is at most $1/|A_{i}^{0}|\leq 2/|X|$ . When $(x_{1},\ldots,x_{\ell})$ is chosen uniformly at random in $A_{1}^{0}\times\cdots\times A_{\ell}^{0}$ , the expected number of times $x$ appears is thus at most

$$\ell\cdot\frac{2}{|X|}\leq\frac{2m}{|X|}<\frac{\beta}{4},$$

where we used the assumption $\beta>8m/|X|$ for the last inequality. Chernoff’s bound implies that the probability that $x$ appears more than $\beta$ times is at most

[TABLE]

and the claim then follows from the union bound. ∎
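The typicality condition at the heart of this claim can be explored numerically. The sketch below assumes, as the proof suggests, that $\Upsilon_{\beta}(m,X)$ is the set of tuples in $X^{m}$ in which no element of $X$ appears more than $\beta$ times. The names `is_typical` and `atypical_fraction` are hypothetical, and sampling each coordinate uniformly from all of $X$ (rather than from the sets $A_{i}^{0}$) is a simplification made for illustration.

```python
import random
from collections import Counter


def is_typical(xs, beta):
    """True if no value occurs more than `beta` times in the tuple `xs`."""
    return max(Counter(xs).values(), default=0) <= beta


def atypical_fraction(size, m, beta, trials, seed=0):
    """Monte Carlo estimate of the fraction of uniform tuples that are not typical.

    Each of the `m` coordinates is drawn uniformly from range(size), which
    stands in for the set X. This is an empirical check, not a proof.
    """
    rng = random.Random(seed)
    bad = sum(
        not is_typical([rng.randrange(size) for _ in range(m)], beta)
        for _ in range(trials)
    )
    return bad / trials
```

When $\beta$ is several times larger than the expected multiplicity $m/|X|$, the estimated fraction should be very small, in line with the Chernoff-plus-union-bound argument above.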

We can now analyze the output of Algorithm $\tilde{\mathcal{Q}}$ and prove Theorem 3.

Proof of Theorem 3.

Let $|\tilde{\Phi}_{k}^{m}\rangle$ denote the state at the $k$ -th iteration when executing Algorithm $\tilde{\mathcal{Q}}$ . Initially we have $|\tilde{\Phi}_{0}^{m}\rangle=|{\Phi}_{0}^{m}\rangle$ . For any $k\geq 0$ let us write

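$$|\Phi_{k}^{m}\rangle=|\Phi^{\prime}_{k}\rangle+|\Phi^{\prime\prime}_{k}\rangle\quad\text{and}\quad|\tilde{\Phi}_{k}^{m}\rangle=|\tilde{\Phi}^{\prime}_{k}\rangle+|\tilde{\Phi}^{\prime\prime}_{k}\rangle,$$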

where $|{\Phi}^{\prime}_{k}\rangle$ and $|{\Phi}^{\prime\prime}_{k}\rangle$ are the projections of $|{\Phi}_{k}^{m}\rangle$ into $\mathcal{H}_{m}^{\prime}$ and $\mathcal{H}_{m}^{\prime\prime}$ , respectively, and $|{\tilde{\Phi}}^{\prime}_{k}\rangle$ and $|{\tilde{\Phi}}^{\prime\prime}_{k}\rangle$ are the projections of $|{\tilde{\Phi}}_{k}^{m}\rangle$ into $\mathcal{H}_{m}^{\prime}$ and $\mathcal{H}_{m}^{\prime\prime}$ , respectively. Note that $\mathcal{C}_{m}|\Phi_{k}^{\prime}\rangle=\tilde{\mathcal{C}}_{m}|\Phi_{k}^{\prime}\rangle$ for all $k\geq 0$ .

We have

[TABLE]

where we used Lemma 5 to obtain the last inequality. We conclude that for any $k\geq 0$ we have

[TABLE]

where we used the assumption $|X|<m/(36\log m)$ to derive the last inequality. This implies that the output of Algorithm $\tilde{\mathcal{Q}}$ is the same as the output of Algorithm $\mathcal{Q}$ with probability at least $1-1/m^{2}$ . The output of $\tilde{\mathcal{Q}}$ is thus correct with probability at least $1-2/m^{2}$ , from the union bound. ∎
