Connectivity Lower Bounds in Broadcast Congested Clique
Shreyas Pai, Sriram V. Pemmaraju

TL;DR
This paper establishes new lower bounds of $\Omega(\log n)$ rounds for solving graph connectivity in the broadcast congested clique model, using combinatorial, reduction-based, and information-theoretic techniques.
Contribution
It introduces three novel lower bounds for connectivity in BCC(1), extending known results to randomized and deterministic algorithms and different knowledge models.
Findings
Lower bound of $\Omega(\log n)$ rounds in the KT-0 BCC(1) model.
Lower bound extends to KT-1 deterministic algorithms.
Lower bound applies to constant-error Monte Carlo algorithms for connected components.
Connectivity Lower Bounds in Broadcast Congested Clique
(A short version of this paper has appeared as a brief announcement in PODC 2019.)
Shreyas Pai
The University of Iowa
Sriram V. Pemmaraju
The University of Iowa
Abstract
We prove three new lower bounds for graph connectivity in the 1-bit broadcast congested clique model, BCC(1). First, in the KT-0 version of BCC(1), in which nodes are aware of neighbors only through port numbers, we show an $\Omega(\log n)$ round lower bound for Connectivity even for constant-error randomized Monte Carlo algorithms. The deterministic version of this result can be obtained via the well-known “edge-crossing” argument, but the randomized version requires establishing new combinatorial results regarding the indistinguishability graph induced by inputs. In our second result, we show that the $\Omega(\log n)$ lower bound extends to the KT-1 version of the BCC(1) model, in which nodes are aware of IDs of all neighbors, though our proof works only for deterministic algorithms. Since nodes know IDs of their neighbors in the KT-1 model, it is no longer possible to play “edge-crossing” tricks; instead we present a reduction from the 2-party communication complexity problem Partition, in which Alice and Bob are given two set partitions on $[n]$ and are required to determine if the join of these two set partitions equals the trivial one-part set partition. While our KT-1 Connectivity lower bound holds only for deterministic algorithms, in our third result we extend this $\Omega(\log n)$ KT-1 lower bound to constant-error Monte Carlo algorithms for the closely related ConnectedComponents problem. We use information-theoretic techniques to obtain this result. All our results hold for the seemingly easy special case of Connectivity in which an algorithm has to distinguish an instance with one cycle from an instance with multiple cycles. Our results showcase three rather different lower bound techniques and lay the groundwork for further improvements in lower bounds for Connectivity in the BCC(1) model.
1 Introduction
We are given an $n$-node, completely connected communication network in which each node can broadcast at most $b$ bits in each round. These nodes and a subset of the edges of the communication network form the input graph. The question we ask is this: how many rounds of communication does it take to determine if the input graph is connected? This is the well-known Connectivity problem in the $b$-bit Broadcast Congested Clique, i.e., the BCC($b$) model.
A series of recent rapid improvements [Heg+15, GP16, JN18] have shown that Connectivity, and in fact MST, can be solved in $O(1)$ rounds w.h.p. in the $b$-bit Congested Clique model, CC($b$), when $b = \log n$. (We use “w.h.p.” as short for “with high probability”, which refers to probability at least $1 - 1/n^c$ for a constant $c \ge 1$.) The CC($b$) model allows each node to send a possibly different $b$-bit message to each of the other $n-1$ nodes in the network, in each round. In contrast, the fastest known algorithm for Connectivity in the BCC($\log n$) model, due to Jurdziński and Nowicki [JN17], is deterministic and runs in $O(\log n / \log\log n)$ rounds. This contrast between BCC($\log n$) and CC($\log n$) is not surprising, given how much larger the overall bandwidth in CC($b$) is compared to BCC($b$). Becker et al. [Bec+16] show that the pair-wise set disjointness problem can be solved in $O(1)$ rounds in CC(1), but needs $\Omega(n)$ rounds in BCC(1). But, despite the fact that Connectivity is such a fundamental problem, no non-trivial lower bound is known for Connectivity in BCC($b$). In fact, prior to this paper, we could not even rule out an $O(1)$-round Connectivity algorithm in BCC(1).
Lower bound arguments in “congested” distributed computing models typically use a “bottleneck” technique [CKP17, CK18, DS+11, DKO14, Fis+18, HP15]. At a high level, this technique consists of showing that there is a low bandwidth cut in the communication network across which a high volume of information has to flow in order to solve the given problem. The lower bound on information flow is usually obtained via 2-party communication complexity lower bounds [KN97]. Not surprisingly, the “bottleneck” technique does not work in the CC($b$) model because any cut with $\Omega(n)$ vertices in each part has a high bandwidth of $\Omega(n^2 b)$ bits. In fact, a result of Drucker et al. [DKO14], showing that circuits can be simulated efficiently in the Congested Clique model, indicates that no technique we currently know of can prove non-trivial lower bounds in the CC($b$) model. However, as further shown by [DKO14], “bottlenecks” are possible for some problems in the weaker BCC($b$) model. In this model, every cut has bandwidth $O(nb)$ and, for example, Drucker et al. [DKO14] provide a reduction showing that for the problem of detecting the presence of a $K_4$ in the input graph there is a cut across which $\Omega(n^2)$ bits of information have to flow. This leads to an $\Omega(n/b)$ lower bound for $K_4$-detection in BCC($b$).
All known lower bounds [DKO14, HP15] in the BCC model have this general structure and these techniques work for problems such as fixed subgraph detection, all pairs shortest paths, diameter computation, etc., that are relatively difficult, requiring polynomially many rounds to solve. For “simpler” problems such as Connectivity and MST, we need more fine-grained lower bound techniques that allow us to prove polylogarithmic lower bounds. Specifically, since Connectivity can be solved in BCC($b$) for any $b \ge 1$ in just polylogarithmically many rounds, the best we can expect is to show the existence of a cut across which an $n \cdot \mathrm{polylog}(n)$ volume of information needs to flow. In fact, the connected components of a subgraph can be represented in $O(n \log n)$ bits and this is all that needs to be communicated across a cut to solve Connectivity. Thus the best lower bound we can expect for Connectivity via this technique is an $\Omega(\log n)$. However, even this was unknown prior to this paper and one contribution of this paper is an $\Omega(\log n)$ lower bound for Connectivity using the “bottleneck” technique.
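The arithmetic behind the “bottleneck” ceiling can be sketched as follows. This is a reading of the argument, not a statement taken from the paper: it assumes that in BCC($b$) at most $O(nb)$ bits cross any cut per round (each of the $n$ nodes broadcasts $b$ bits), and that $\Omega(n \log n)$ bits must cross some cut, matching the $O(n \log n)$-bit description of connected components. The symbol $T$ for the number of rounds is introduced here only for illustration.

```latex
\[
  T \cdot \underbrace{O(nb)}_{\text{bits across the cut per round}}
  \;\ge\;
  \underbrace{\Omega(n \log n)}_{\text{bits that must cross the cut}}
  \quad\Longrightarrow\quad
  T = \Omega\!\left(\frac{\log n}{b}\right),
  \qquad\text{so } T = \Omega(\log n) \text{ when } b = 1 .
\]
```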
1.1 Our Contribution
We consider the Connectivity problem and the closely related ConnectedComponents problem in the BCC(1) model. In the latter problem, each node needs to output the label of the connected component it belongs to. We work in the BCC(1) model because it allows us to isolate barriers due to different levels of initial local knowledge (e.g., knowing IDs of neighbors vs not knowing IDs). This is also without loss of generality because a $t$-round lower bound in BCC(1) immediately translates to a $t/b$-round lower bound in BCC($b$). We consider two natural versions of the BCC(1) model, which we call KT-0 and KT-1 (using notation from [Awe+90]). In the KT-0 (“Knowledge Till 0 hops”) version, nodes are unaware of IDs of other nodes in the network and the communication ports at each node are arbitrarily numbered 1 through $n-1$. In the KT-1 (“Knowledge Till 1 hop”) version, nodes know all IDs in the network and the communication ports at each node are respectively labeled with the IDs of the nodes at the other end of the port. Note that if the bandwidth $b = \Omega(\log n)$, then there is essentially no distinction between the KT-0 and KT-1 versions since each node in the KT-0 version can send its ID to neighbors in constant rounds and then nodes would have as much knowledge as they initially do in the KT-1 version. But the difference in initial knowledge plays a critical role when $b = o(\log n)$ and in fact our best results in these two models use completely different techniques. We present three main lower bound results in this paper, derived using very different techniques.
- In the KT-0 version of BCC(1) we show an $\Omega(\log n)$ round lower bound for Connectivity even for constant-error randomized Monte Carlo algorithms. In fact, the lower bound is shown for the seemingly simpler “one cycle vs two cycles” problem in which the input graph is either a single cycle or consists of two disjoint cycles and the algorithm has to distinguish between these two possibilities. We use a well-known indistinguishability argument involving “edge crossing” [KKP10, BFP15, PP17] for this result, but the main novelty here is how this argument deals with the possibility that the algorithm can err on a constant fraction of the input instances. In a standard edge crossing argument one shows that for a particular YES instance (i.e., a connected or “one-cycle” instance) $I$, many of the NO instances obtained by crossing pairs of edges $e$ and $e'$ in $I$ cannot be distinguished from $I$ even after some $t$ rounds of a BCC(1) algorithm (see Definition 3.3 for the precise definition of a crossing). But for a randomized lower bound in BCC(1), it is not enough to consider a single YES instance. Instead, we use the bipartite indistinguishability graph induced by all YES and NO instances and show that this graph satisfies a polygamous version of Hall’s Theorem (see Theorem 2.1). This allows us to show the existence of a large generalized matching in the indistinguishability graph, which in turn shows that every $o(\log n)$ round constant-error Monte Carlo algorithm can be fooled into making more errors than it is allowed.
- We then show that the above $\Omega(\log n)$ lower bound result extends to the KT-1 version of the BCC(1) model, though our proof only works for deterministic algorithms. In KT-1, because of knowledge of IDs of neighbors, it is no longer possible to perform “edge crossing” tricks. But we are able to successfully use the “bottleneck” technique and show that there is a cut for the Connectivity problem across which $\Omega(n \log n)$ bits need to flow. We prove this result by presenting a reduction from the 2-party communication complexity problem Partition [HMT88]. In the Partition problem, we have a ground set $[n] = \{1, 2, \ldots, n\}$ and Alice and Bob respectively are given two set partitions $P_A$ and $P_B$ of $[n]$. The goal is to output 1 iff $P_A \vee P_B = \mathbf{1}$, where $P_A \vee P_B$ (read as “$P_A$ join $P_B$”) is the finest partition such that both $P_A$ and $P_B$ are refinements of $P_A \vee P_B$, and $\mathbf{1}$ is the trivial partition consisting of the single set $[n]$. (Given two set partitions $P$ and $P'$ of $[n]$, $P$ is said to be a refinement of $P'$ if for every part $S \in P$, there is a part $S' \in P'$ such that $S \subseteq S'$. For example, the partition $\{\{1,2\},\{3\},\{4\}\}$ is a refinement of $\{\{1,2\},\{3,4\}\}$.) For example, if $n = 4$, $P_A = \{\{1,2\},\{3\},\{4\}\}$, and $P_B = \{\{1\},\{2,3\},\{4\}\}$, then $P_A \vee P_B = \{\{1,2,3\},\{4\}\}$ and $P_A \vee P_B \ne \mathbf{1}$. We then use the fact that the deterministic communication complexity of Partition is $\Omega(n \log n)$ to obtain our result. Again, this time using a linear-algebraic argument, we show our result for a seemingly simple special case of Connectivity: “one cycle vs multiple cycles.” As far as we know, the randomized communication complexity of Partition is a long-standing unresolved problem. Showing an $\Omega(n \log n)$ lower bound on the randomized communication complexity of Partition will immediately lead to an $\Omega(\log n)$ KT-1 lower bound for randomized Connectivity algorithms, via our reduction.
- Our final result arises from our attempt to obtain a KT-1 lower bound even for constant-error Monte Carlo algorithms. We consider a version of the Partition problem, called PartitionComp, in which Alice and Bob are required to output the join $P_A \vee P_B$ of their respective input partitions $P_A$ and $P_B$ instead of just determining if $P_A \vee P_B = \mathbf{1}$. We use an information-theoretic argument to show that the mutual information of any algorithm, even a constant-error Monte Carlo algorithm, that solves this version of Partition is $\Omega(n \log n)$. This leads to an $\Omega(\log n)$-round lower bound for ConnectedComponents in the KT-1 version of BCC(1), even for constant-error randomized Monte Carlo algorithms.
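The join operation at the heart of Partition and PartitionComp can be made concrete with a short sketch. The code below is only an illustration, not the paper's construction: the function names `join` and `partition_predicate` and the example inputs are hypothetical, and partitions are assumed to be given as lists of sets over $\{1, \ldots, n\}$. It computes the join with union-find, which mirrors the link to connectivity: merging the blocks of both partitions is the same as taking connected components.

```python
def join(n, part_a, part_b):
    """Return the join of two partitions of {1..n} as a set of frozensets.

    The join is the finest partition that both inputs refine; it is computed
    by merging all elements that share a block in either input (union-find).
    """
    parent = list(range(n + 1))  # index 0 is unused

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for block in list(part_a) + list(part_b):
        members = sorted(block)
        for y in members[1:]:
            parent[find(y)] = find(members[0])

    blocks = {}
    for x in range(1, n + 1):
        blocks.setdefault(find(x), set()).add(x)
    return {frozenset(b) for b in blocks.values()}


def partition_predicate(n, part_a, part_b):
    """Partition outputs 1 iff the join is the trivial one-block partition."""
    return len(join(n, part_a, part_b)) == 1


# Hypothetical example inputs for n = 4.
example_join = join(4, [{1, 2}, {3}, {4}], [{1}, {2, 3}, {4}])
```

PartitionComp corresponds to returning the full output of `join`, while Partition only asks for the single bit returned by `partition_predicate`.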
We prove in this paper the first non-trivial lower bounds for Connectivity in the BCC(1) model. The fact that our lower bounds hold even in the KT-1 model implies that the difficulty of the problem does not arise just from lack of knowledge of IDs of other nodes. The fact that our lower bounds hold for extremely sparse (i.e., 2-regular) graphs suggests that there might be room to get stronger lower bounds by considering dense input graphs. In fact, using a deterministic sketching technique [MT16a, MT16], it is possible to obtain a deterministic $O(\log n)$-round BCC(1) algorithm for Connectivity for graphs with arboricity bounded by a constant. This implies that our lower bounds are tight for uniformly sparse graphs.
1.2 The BCC Model
A size-$n$ KT-0 *instance* of the BCC($b$) model consists of $n$ vertices, each with a unique $O(\log n)$-bit ID. Each vertex has $n-1$ communication ports labeled distinctly, 1 through $n-1$, in an arbitrary manner. A key feature of the KT-0 instance is that port labels have nothing to do with IDs. Pairs of communication ports are connected by network edges such that the underlying communication network is a clique. The vertices along with a subset of the edges form the input graph. Thus some edges are both network edges and input graph edges, whereas the remaining edges are just network edges. The initial knowledge of a vertex $v$ consists of its ID, its port numbering, an identification of ports that correspond to input edges, and an arbitrarily long string $r_v$ of random bits. In each round $t$, each vertex $v$ receives the messages broadcast by the remaining $n-1$ vertices in the previous round, performs local computation, and broadcasts a message of length at most $b$ bits. This message is received at the beginning of round $t+1$ by the remaining $n-1$ vertices along each of their communication ports that connect to $v$. After $t$ rounds, the at most $bt$ bits that $v$ sends and the at most $(n-1)bt$ bits that $v$ receives, along with the ports that they are received from, make up the transcript of $v$ at round $t$. A size-$n$ KT-1 *instance* of the BCC($b$) model differs from a KT-0 instance in one important way: each network edge $\{u, v\}$ is connected to $u$ at port number $\mathrm{ID}_v$ and connected to $v$ at port number $\mathrm{ID}_u$. Thus, in a KT-1 instance, IDs serve as port numbers and the initial knowledge of a vertex includes all vertex IDs.
Since the main focus of the paper is to derive lower bounds, we assume the public coin model in which all the random strings $r_v$ are identical. Lower bounds proved in the public coin model hold in the private coin model as well, in which all the $r_v$’s are distinct. For a decision problem, such as Connectivity, when we run a BCC algorithm $\mathcal{A}$ on an input graph $G$, each vertex outputs either YES or NO and the output of the system is YES if all vertices output YES and is NO otherwise. For a deterministic algorithm for Connectivity the system must output YES if $G$ is connected and NO if $G$ is disconnected. If $\mathcal{A}$ is an $\epsilon$-error randomized Monte Carlo algorithm, then in order to be correct, it must satisfy the following requirements: (i) if $G$ is connected then the system outputs YES with probability at least $1 - \epsilon$ and (ii) if $G$ is disconnected then the system outputs NO with probability at least $1 - \epsilon$.
1.3 Related Work
Congest model [Pel00] lower bounds via the “bottleneck technique” that rely on communication complexity lower bounds have been shown for MST and related connectivity problems in [DS+11] and for minimum vertex cover, maximum independent set, optimal graph coloring, all pairs shortest paths, and subgraph detection in [CKP17, CK18, Fis+18]. This approach has also been used to derive BCC lower bounds in [DKO14, HP15]. Becker et al. [Bec+16] define a spectrum of congested clique models parameterized by a range parameter $r$, denoting the number of distinct messages a node can send in a round. Setting $r = 1$ gives us the BCC model and setting $r = n-1$ gives us the CC model. They show that the pair-wise set disjointness problem is sensitive to the value of $r$ in the sense that for every pair of ranges $r < r'$, the problem can be solved provably faster in the model with range $r'$ than it can in the model with range $r$.
Distributed lower bounds via the “edge crossing” argument have a long history in distributed computing – see [KMZ87] for an example in the context of proving message complexity lower bounds. More recent examples [KKP10, BFP15, PP17] appear in the context of proof-labeling schemes. Informally speaking, a proof-labeling scheme consists of a prover who labels the vertices of the input configuration with labels and a distributed verifier who is required to verify a predicate (e.g., do the marked edges form an MST?) in one round, using the help of the prover’s labels. The verification complexity of a proof-labeling scheme is the size of the largest message sent by the verifier. Patt-Shamir and Perry [PP17] show an lower bound on the verification complexity of MST in the broadcast congested clique model. An lower bound in the KT-0 version of BCC for deterministic Connectivity algorithms follows from this result. The high level idea is that if there were a faster BCC Connectivity algorithm, the prover could use the transcript of the algorithm at each vertex as the label at . The verifier could then broadcast these transcripts and locally, at each vertex , simulate the algorithm at . Baruch et al. [BFP15] show that if there is a deterministic proof-labeling scheme with verification complexity , then there is a randomized proof-labeling scheme with one-sided error having verification complexity . Combining this with the fact that MST verification has a deterministic proof-labeling scheme with verification complexity [KKP10], leads to a randomized proof-labeling scheme with verification complexity for MST [BFP15, PP17]. This needs to be contrasted with the fact that we show an lower bound for Connectivity in KT-0 BCC even for constant-error Monte Carlo algorithms.
There have been recent attempts to combine the edge crossing and bottleneck techniques to obtain lower bounds for triangle detection in the Congest model [Abb+17, Fis+18]. In particular, [Fis+18] provide an lower bound for deterministic algorithms solving triangle detection in the KT-1 Congest model with -bit bandwidth.
2 Technical Preliminaries
Polygamous Hall’s Theorem.
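Let $G = (L, R, E)$ be a bipartite graph. A $k$-matching is a subgraph consisting of a set of nodes $L' \subseteq L$ where each $\ell \in L'$ has edges to $k$ nodes in the set $R_\ell \subseteq R$ such that $|R_\ell| = k$ and for $\ell \ne \ell'$, $R_\ell \cap R_{\ell'} = \emptyset$. The size of a $k$-matching is the number of connected components in the subgraph.
**Theorem 2.1 (Polygamous Hall’s Theorem).**
Let $G = (L, R, E)$ be a bipartite graph. If for every $S \subseteq L$ we have $|N(S)| \ge k|S|$, then $G$ has a $k$-matching of size $|L|$.
Proof.
Make $k$ copies of each node in $L$ while keeping $R$ the same. Now for every set $S$ of left nodes in the modified graph we have $|N(S)| \ge |S|$, and by Hall’s marriage theorem we have a matching in the modified bipartite graph that saturates its left side. This matching is a $k$-matching of size $|L|$ in the original graph. ∎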
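The copy-and-match construction used in the proof can be sketched in code. This is a minimal illustration under assumptions: the function name `k_matching`, the adjacency-dictionary input format, and the example graph are hypothetical, and the ordinary matching is found with the standard augmenting-path method rather than anything specified in the paper.

```python
def k_matching(adj, k):
    """Find a k-matching that covers every left node, or return None.

    adj maps each left node to an iterable of its right neighbors.
    Each left node is replaced by k copies and an ordinary bipartite
    matching is grown by augmenting paths; grouping the matched right
    nodes by original left node yields the disjoint stars.
    """
    copies = [(u, i) for u in adj for i in range(k)]
    match_right = {}  # right node -> left copy currently matched to it

    def try_augment(copy, seen):
        u, _ = copy
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be rematched elsewhere
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = copy
                return True
        return False

    for copy in copies:
        if not try_augment(copy, set()):
            return None  # some copy cannot be matched, so no full k-matching

    stars = {u: [] for u in adj}
    for v, (u, _) in match_right.items():
        stars[u].append(v)
    return stars


# Hypothetical example: every set S of left nodes has |N(S)| >= 2|S|.
example_adj = {"a": [1, 2, 3], "b": [2, 3, 4]}
example_stars = k_matching(example_adj, 2)
```

In the example, each left node ends up with two private right neighbors; asking for `k = 3` fails because only four right nodes are available for six copies.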
Yao’s Minimax Theorem.
The standard way to prove lower bounds on $\epsilon$-error randomized algorithms is by invoking Yao’s Minimax Theorem [Yao77]. Let $R_\epsilon(P)$ denote the minimum round complexity of any $\epsilon$-error randomized algorithm that solves $P$. Let $D^\mu_\epsilon(P)$ denote the distributional round complexity of $P$, which is the minimum round complexity of a deterministic algorithm whose input is drawn from the distribution $\mu$ (known to the algorithm) and which is allowed to err on at most an $\epsilon$ fraction of the input (weighted by $\mu$).
**Theorem 2.2 (Yao’s Minimax Theorem).**
For any problem $P$, $R_\epsilon(P) \ge \max_\mu D^\mu_\epsilon(P)$.
Yao’s Minimax Theorem reduces the problem of proving a randomized lower bound to the task of designing a “hard” distribution that produces high distributional complexity.
Lower bound for Partition.
The total number of distinct partitions on a ground set of $n$ elements is given by the *Bell number* $B_n$. It is well known that $\log B_n = \Theta(n \log n)$. This means that the number of different possible input pairs that Alice and Bob can receive in the Partition problem is $B_n^2$. Define the matrix $M_n$ such that $M_n(P_A, P_B) = 1$ if $P_A \vee P_B = \mathbf{1}$ and $0$ otherwise. Note that $M_n$ is a $B_n \times B_n$ matrix. Theorem 2.3 shows that this matrix is non-singular.
**Theorem 2.3 ([DW75, Wel10]).**
$\mathrm{rank}(M_n) = B_n$, where $B_n$ is the $n$-th Bell number.
Therefore by Lemma 1.28 of [KN97] we get the following corollary.
**Corollary 2.4.**
The deterministic 2-party communication complexity of Partition is $\Omega(n \log n)$.
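For small ground sets, the non-singularity claim can be checked by brute force. The sketch below is an illustration only: the helper names (`set_partitions`, `join_is_trivial`, `partition_matrix`, `rank`) are hypothetical, the ground set is taken to be $\{0, \ldots, n-1\}$ for convenience, and the rank is computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction


def set_partitions(elements):
    """Yield every set partition of a list, each as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        # place `first` into each existing block in turn
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # or give `first` a block of its own
        yield [[first]] + smaller


def join_is_trivial(n, pa, pb):
    """True iff the join of pa and pb is the single block {0..n-1}."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for block in pa + pb:
        for y in block[1:]:
            parent[find(y)] = find(block[0])
    return len({find(x) for x in range(n)}) == 1


def partition_matrix(n):
    """0/1 matrix indexed by pairs of partitions: 1 iff the join is trivial."""
    parts = list(set_partitions(list(range(n))))
    return [[1 if join_is_trivial(n, pa, pb) else 0 for pb in parts] for pa in parts]


def rank(matrix):
    """Exact rank over the rationals via Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk
```

Since the deterministic communication complexity is at least the logarithm of the rank, full rank $B_n$ together with $\log B_n = \Theta(n \log n)$ is what yields the corollary.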
Information Theory.
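Let $\mu$ be a distribution over a finite set $\Omega$ and let $X$ be a random variable distributed according to $\mu$. The entropy of $X$ is defined as $H(X) = \sum_{x \in \Omega} \mu(x) \log \frac{1}{\mu(x)}$ and the conditional entropy of $X$ given $Y$ is $H(X \mid Y) = \mathbb{E}_y[H(X \mid Y = y)]$, where $H(X \mid Y = y)$ is the entropy of the conditional distribution of $X$ given the event $\{Y = y\}$. The joint entropy of two random variables $X$ and $Y$, denoted by $H(X, Y)$, is just the entropy of their joint distribution.
The mutual information between random variables $X$ and $Y$ is $I(X; Y) = H(X) - H(X \mid Y)$ and the conditional mutual information between $X$ and $Y$ given $Z$ is $I(X; Y \mid Z) = H(X \mid Z) - H(X \mid Y, Z)$. See the first two chapters of [CT06] for an excellent introduction to the basics of information theory.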
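These quantities are easy to compute for small explicit distributions. The following sketch assumes a joint distribution given as a dictionary from outcome pairs `(x, y)` to probabilities; the function names are illustrative. It uses the chain rule $H(X \mid Y) = H(X, Y) - H(Y)$, which agrees with the expectation-based definition of conditional entropy.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, index):
    """Marginal of a joint distribution {(x, y): p} on coordinate `index`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[index]] += p
    return dict(out)


def conditional_entropy(joint):
    """H(X | Y) = H(X, Y) - H(Y) for a joint distribution {(x, y): p}."""
    return entropy(joint) - entropy(marginal(joint, 1))


def mutual_information(joint):
    """I(X; Y) = H(X) - H(X | Y)."""
    return entropy(marginal(joint, 0)) - conditional_entropy(joint)


# Two independent fair bits share no information; a copied bit shares one bit.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
copied = {(0, 0): 0.5, (1, 1): 0.5}
```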
3 Lower Bounds in the KT-0 model
This section is devoted to proving the following theorem. As mentioned earlier, our lower bound applies to the simpler “one cycle vs two cycles” problem which we will call TwoCycle. In this problem, the input is promised to be either a single cycle or two disconnected cycles, each of length at least 3 and the goal is to distinguish between these two types of inputs.
**Theorem 3.1.**
For a sufficiently small constant $\epsilon > 0$, the $\epsilon$-error randomized round complexity of the TwoCycle problem in the BCC(1) KT-0 model is bounded below by $\Omega(\log n)$.
Two KT-0 instances and are said to be indistinguishable after rounds of an algorithm if the state of each vertex (i.e., the initial knowledge and the transcript at that vertex) after rounds is the same in both the instances. We first introduce a technical tool called indistinguishability via port-preserving crossings. This tool has been used to show distributed computing lower bounds in several settings [KMZ87, KKP10, BFP15, PP17] and we heavily borrow notation from [PP17]. For an edge we use the notation to denote that is connected to port at and to port at . For this notation to be unambiguous, we must think of the edge as a directed edge even though the graph itself is undirected.
**Definition 3.2 (Independent Edges [PP17]).**
Let be an instance with input graph and let and be two edges of . The edges and are said to be independent if and only if are four distinct vertices and . A set of input graph edges is called independent if every pair of edges in the set is a pair of independent edges.
**Definition 3.3 (Port-Preserving Crossing [PP17]).**
Consider an instance with input graph . Let and be two independent edges of , and let and be two corresponding network edges in . Let be eight ports such that . The crossing of and in , denoted by , is the instance obtained from by replacing and in with the edges and and rewiring the edges so that and . (See Figure 1.)
The following lemma establishes a standard connection between indistinguishability and port-preserving crossings (henceforth “crossings”) and is in fact the main motivation for defining crossings. For simplicity, we say that a node sends the character $\bot$ to denote the fact that the node remains silent. Therefore, the events of a node broadcasting a $0$, a $1$, or remaining silent can be described as sending the characters $0$, $1$, or $\bot$ respectively.
**Lemma 3.4.**
Let be an instance with input graph and let and be two independent edges of . If send the same sequence and send the same sequence in the first rounds of the algorithm, then is indistinguishable from after rounds.
Proof.
We will prove the lemma by induction on . The initial knowledge of each vertex in and is the same so the statement is true for .
Assume that the lemma is true for some round . Therefore, the characters broadcast by the vertices in round will be the same in both the instances. From the definition of port preserving crossing it is clear that and differ only in four edges, , , , and . Therefore, all vertices except , and will receive the same characters across all their ports in round in both the instances and hence will have the same state in both instances after round .
Let the port names of the four edges in and be as in Definition 3.3 and Figure 1. In , the vertex will receive the characters broadcast by through ports respectively and in it will receive the characters broadcast by through ports respectively. Note that and broadcast the same message in round since they send the same sequence in the first rounds and therefore, the state of after round will be the same in both instances. We can make similar arguments for and as well. Therefore, the state of each vertex after round is the same in both and which proves the induction step as well as the lemma. ∎
As a “warm-up”, we first sketch an easy lower bound for randomized Monte Carlo algorithms that make polynomially small error, i.e., error for constant . By Yao’s minimax theorem (Theorem 2.2), it suffices to show a lower bound on the distributional complexity of a deterministic algorithm under a hard distribution. Consider the following hard distribution : Let be an arbitrary instance such that the input graph of is a one-cycle on vertices. Let be an arbitrarily chosen set of exactly independent edges 333Adding an edge to invalidates at most two other edges, and therefore we can always find an independent set of size . and let be the set of all instances where , and therefore, . The hard distribution places probability mass on the instance and uniformly distributes the remaining probability mass among the instances in . Now, given a -round deterministic algorithm we can assign a -character label to each edge obtained by concatenating the characters broadcast by and . Here each character in the label belongs to the alphabet . The pigeon-hole principle implies that there is a set , , of edges in with identical labels. Then by Lemma 3.4, for any , and are indistinguishable after -rounds of . Since cannot make an error on , it makes errors on all instances where . Since assigned the probability mass 1/2 uniformly to all instances in , the probability that makes an error is at least . Therefore, if , this error becomes which is much larger than – a contradiction, implying that and leading to the following theorem.
**Theorem 3.5.**
For any constant , if then the -error randomized round complexity of the Connectivity problem in the BCC KT-0 model is .
Proof.
Note that since the probability mass on is so large, any algorithm with permissible error probability must output YES on and therefore, it will also output YES on all instances that are indistinguishable from .
Given a -round deterministic algorithm we can assign a -character label to each edge where each character belongs to the alphabet . The label is assigned such that the head sends the character of the label and the tail sends the character of the label in round for all edges. By using the pigeon hole principle, we see that there is a set , , of edges in with identical labels. By Lemma 3.4, for any , and are indistinguishable after -rounds of . Therefore, any round algorithm will make an error on instances where and this makes the error at least . Therefore, if , this error becomes which is much larger than . ∎
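The pigeonhole count in this warm-up argument can be made concrete. The sketch below rests on assumptions that are a reading of the argument rather than values stated in the surviving text: each edge label has $2t$ characters ($t$ from each endpoint) over a 3-symbol alphabet (broadcast 0, broadcast 1, or silence), the independent set has $\lfloor n/3 \rfloor$ edges, and probability mass $1/2$ is spread uniformly over all crossings of pairs of edges from that set. The function name is hypothetical.

```python
from math import comb


def warmup_error_lower_bound(n, t):
    """Lower bound on the error of a t-round algorithm under the assumed
    warm-up distribution (assumptions are listed in the accompanying text)."""
    num_edges = n // 3                 # assumed size of the independent edge set
    num_labels = 3 ** (2 * t)          # assumed number of distinct edge labels
    # Pigeonhole: some label is shared by at least ceil(num_edges / num_labels) edges.
    largest_class = -(-num_edges // num_labels)
    fooled = comb(largest_class, 2)    # crossings indistinguishable from the one-cycle
    total = comb(num_edges, 2)         # all crossings in the support
    return 0.5 * fooled / total
```

Under these assumptions the bound decays roughly like $1/(2 \cdot 81^t)$, so it stays above an inverse polynomial in $n$ only while $t$ is at most a small constant times $\log n$, and it drops to zero once the number of labels exceeds the number of edges.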
The hard distribution that led to the above theorem fails to give even a super-constant round lower bound for constant error probability. This is because for any constant , there is a constant such that the error probability of algorithm is smaller than , leading to no contradiction.
3.1 A Lower Bound for Constant Error Probability
To get around this problem, we start with the observation that a two-cycle instance obtained from , can also be obtained by crossing edges in other one-cycle instances, i.e., for edges in an instance . Thus, as the algorithm executes, even though ceases to be indistinguishable from , it may continue to be indistinguishable from . This suggests that we should be considering all one-cycle and two-cycle instances and all the edge crossings that lead from one-cycle instances to two-cycle instances. This motivates the definition below of a bipartite indistinguishability graph with all one-cycle and two-cycle instances as vertices. In the proof of Theorem 3.5, when we placed the entire probability mass on a single “star” indistinguishability graph with being the central node and instances in being the leaves, we ran into trouble because the degree of in this “star” shrank too quickly with the number of rounds, . If we consider the full indistinguishability graph, we have more leeway. Specifically, showing the existence of a large matching in the indistinguishability graph would be helpful since the algorithm is forced to make an error at one of the two endpoints of each matching edge. We formalize this intuition below, first with some definitions.
Let the set of distinct one-cycle and two-cycle instances be and respectively let be a probability distribution on these. Let be a -round deterministic KT-0 algorithm which solves the TwoCycle problem correctly on fraction of input in the support of (recall, is a constant). For any instance , call an edge in the input graph of active with respect to strings iff broadcasts the sequence given by and broadcasts the sequence given by in the first rounds of the algorithm . We call an edge active if the strings are clear from the context.
**Definition 3.6 (Indistinguishability Graph).**
Let be a non-negative integer and let be two strings of length . The indistinguishability graph with respect to messages and after rounds of algorithm is a bipartite graph where is the set of all one-cycle instances and is the set of all two-cycle instances and there is an edge iff and and there exist two active independent directed edges and in the input graph of such that .
We now propose to use a rather natural hard distribution that assigns probability mass distributed uniformly among the instances in and the remaining probability mass distributed uniformly among the instances in . We first prove Lemma 3.7 that plays a crucial role in our overall proof by essentially showing that every one-cycle instance has sufficiently many two-cycle neighbors in with high degree. This in turn is used in Lemma 3.8 to prove that a Polygamous Hall’s Theorem (Theorem 2.1) condition holds for . This allows us to show that can be packed with “stars,” each with leaves. We need this generalized notion of a matching because as shown in Lemma 3.9, . Therefore, the probability mass assigned to an instance in is fraction of the probability mass assigned to an instance in . Thus, a “star” with its central node from and leaves from has roughly equal probability mass assigned to the YES instance and NO instances.
**Lemma 3.7.**
Consider an arbitrary instance that is a vertex of . If is the number of active edges of with respect to then for every , has at least neighbors of degree .
Proof.
A two-cycle instance will be a neighbor of iff and form a pair of crossed instances with respect to . Say where and . Note that will have two new input graph edges and both of which are active and all input graph edges of except for appear in the input graph of . Therefore, also has active edges with respect to . The degree of is determined by the number of active edges either cycle, i.e., if has active edges in one cycle and active edges in the other cycle then its degree in is since we can take one active edge from either cycle and cross them to produce a unique neighbor of .
For every active edge in the input graph of , we can associate a unique active edge such that has active edges in one cycle and active edges in the other cycle. Therefore, has exactly (or if ) neighbors having degree . This argument may not hold exactly for because and as described need not form a pair of independent edges in this case. Thus, the lemma follows. ∎
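The crossing operation used in this proof can be illustrated with a small sketch. It is only an illustration under simplifying assumptions: an instance is modeled as a successor map on vertex IDs (the clockwise orientation), port numbers and messages are ignored, and every edge is treated as active. The helper names `cycles_of` and `cross` are hypothetical and do not come from the paper.

```python
def cycles_of(succ):
    """Return the directed cycles of a successor map as lists of vertices."""
    seen, cycles = set(), []
    for start in sorted(succ):
        if start in seen:
            continue
        cycle, v = [], start
        while v not in seen:
            seen.add(v)
            cycle.append(v)
            v = succ[v]
        cycles.append(cycle)
    return cycles


def cross(succ, e1, e2):
    """Cross directed edges e1=(u, v) and e2=(x, y): replace them by (u, y) and (x, v)."""
    (u, v), (x, y) = e1, e2
    assert succ[u] == v and succ[x] == y, "both edges must be present"
    assert len({u, v, x, y}) == 4, "edges must be independent (no shared endpoint)"
    crossed = dict(succ)
    crossed[u], crossed[x] = y, v
    return crossed


n = 8
one_cycle = {i: (i + 1) % n for i in range(n)}  # 0 -> 1 -> ... -> 7 -> 0
two_cycle = cross(one_cycle, (1, 2), (5, 6))
print(cycles_of(two_cycle))  # [[0, 1, 6, 7], [2, 3, 4, 5]]

# Crossing one edge from each cycle merges the two cycles back into one cycle.
first, second = cycles_of(two_cycle)
merged = [cross(two_cycle, (u, two_cycle[u]), (x, two_cycle[x]))
          for u in first for x in second]
print(len(merged), all(len(cycles_of(m)) == 1 for m in merged))  # 16 True
```

In this toy setting, a two-cycle instance with 4 edges in each cycle has 4 · 4 = 16 one-cycle neighbors, which matches the product form of the degree described in the proof.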
Lemma 3.8**.**
For the graph , consider an arbitrary set of one-cycle instances with degree at least . Let be the neighborhood of in . Then where is the smallest number of active edges in any instance in .
Proof.
Every has at least active edges, therefore by Lemma 3.7, there are at least neighbors of having degree for . Thus there are at least two-cycle instances in having degree . Therefore, we have , where is the harmonic number. ∎
Lemma 3.9**.**
.
Proof.
Let ( is the empty string) be the indistinguishability graph at round 0. Note that in , every instance in has strictly positive degree since each instance has active edges. Therefore, we have and . Therefore, by Lemma 3.8, we have . Now we show that .
Since each instance has active edges, each one-cycle instance has degree because for each input graph edge of there are active edges independent of , which we can cross with to get a unique neighbor of . We need to divide by a factor of two because . And each two-cycle instance with the smaller cycle having length has degree since we can cross any two edges in different cycles to get a neighbor of .
Let denote the set of two-cycle instances with the smaller cycle having length for .
For every input graph edge in a one-cycle instance , there is exactly one input graph edge such that . Therefore, for , each one cycle instance has neighbors such that the smaller cycle is of length . And if is even, each one-cycle instance will have neighbors where both cycles have length instead.
We will now show that . To see this note that if we restrict our attention to the subgraph of spanned by instances in then we have a bipartite graph where each instance in has the same degree (or if ) and each instance in has the same degree . Therefore, the total number of edges incident on is and those incident on is . Since the number of edges should be the same counted from either side, we get . Now we finish the proof of the lemma with the following calculation:
[TABLE]
∎
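The counting in this proof can be sanity-checked numerically. The sketch below rests on assumptions that are not taken from the paper: instances are counted as labeled undirected graphs on n vertices, port numberings are ignored, and every cycle has length at least 3. Under these assumptions the ratio of two-cycle instances to one-cycle instances grows roughly like a harmonic sum, that is, logarithmically in n. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, log


def one_cycle_count(n):
    """Number of Hamiltonian cycles on n labeled vertices (n >= 3)."""
    return factorial(n - 1) // 2


def two_cycle_count(n, i):
    """Number of labeled graphs consisting of two disjoint cycles of lengths i and n - i."""
    assert 3 <= i <= n - i
    count = comb(n, i) * (factorial(i - 1) // 2) * (factorial(n - i - 1) // 2)
    # When both cycles have the same length, each graph was counted twice.
    return count // 2 if 2 * i == n else count


def ratio(n):
    """Ratio of two-cycle to one-cycle instances under the labeled-graph assumption."""
    total = sum(two_cycle_count(n, i) for i in range(3, n // 2 + 1))
    return Fraction(total, one_cycle_count(n))


for n in (16, 64, 256, 1024):
    r = ratio(n)
    print(n, float(r), float(r) / log(n))
```

The last printed column stays bounded while the ratio itself keeps growing, which is the behavior one expects if the number of two-cycle instances exceeds the number of one-cycle instances by a logarithmic factor.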
Proof.
(of Theorem 3.1) Consider an arbitrary one-cycle instance after rounds of algorithm . Let be the strings that correspond to the largest set of active edges after -rounds of algorithm . We would like to count the size of this set of active edges. Recall that we orient each input graph edge of in a clockwise direction. Therefore, each input graph edge in can be labeled with a string of length which denotes messages sent across it from the head and the tail (in order) across the rounds. This means that there are at least input graph edges in that have the same messages sent across them. Therefore, the size of the set of active edges with respect to is at least .
By Lemma 3.8 and Theorem 2.1, we can say that there exists a -matching in of size . No matter what the algorithm outputs on any one-cycle instance, it will produce the same output on the matched two-cycle instances. By Lemma 3.9, we know that for any and , Therefore, each instance contributes to the error of the algorithm which means that any -round BCC algorithm will have total error at least a constant. This implies the theorem. ∎
4 Lower Bounds in the KT-1 Model
Our lower bounds in the KT-1 model are inspired by the work of Hajnal et al. [HMT88], which is concerned with the 2-party communication complexity of several graph problems, including Connectivity. In their setup [HMT88], the input graph is edge-partitioned among Alice and Bob in such a way that both parties know and Alice and Bob respectively know edge sets and , where forms a partition of . One simple deterministic protocol that solves Connectivity in this setup is this: Alice sends all the connected components induced by to Bob, who can then determine if is connected. The worst-case communication complexity of this protocol is . Via reduction from Partition, Hajnal et al. [HMT88] show that there exists a family of input graphs such that for any equal-sized edge partition, the communication complexity of Connectivity is .
It does not seem possible to reduce from this edge-partitioned version of 2-party Connectivity to Connectivity in the KT-1 model because KT-1 algorithms are vertex-centric and Alice and Bob may not hold all the edges they need to simulate vertices executing a KT-1 algorithm. We resolve this issue by designing a new reduction, from Partition to a vertex-partition version of 2-party Connectivity. In the Hajnal et al. [HMT88] reduction, Partition is reduced to Connectivity on a family of dense graphs. Motivated by our KT-0 lower bound for Connectivity for the TwoCycle problem, we are interested in deriving a KT-1 Connectivity lower bound for a sparse class of graphs as well. In what follows, we extend the reduction of Hajnal et al. from Partition to Connectivity in two important ways: (i) we reduce to a vertex-partitioned version of Connectivity and (ii) we reduce to a sparse special case of Connectivity that we call the MultiCycle problem, in which the input is either a single cycle or two or more cycles, each having length at least .
4.1 A Special Case of the Partition Problem
In order to establish a lower bound for MultiCycle, we now consider a special case of the 2-party Partition problem, which we call TwoPartition. The input to TwoPartition consists of partitions and of , for even , such that each part in and has exactly two elements in it. We will now use a linear algebraic argument to show that there is an deterministic lower bound for this special case of Partition as well. The [math]- matrix associated with this problem is a sub-matrix of the matrix where if and otherwise (see Section 2). The matrix has dimension where . This fact follows from a simple counting argument. In the following lemma, we show that this sub-matrix has full rank.
Lemma 4.1**.**
* where .*
Proof.
We will prove a more general observation – every sub-matrix of a full rank matrix formed by choosing a subset of the rows and the corresponding columns has rank where . In other words, for all , is a full rank matrix.
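Let be a diagonal matrix where if and if . It is easy to see that . Using basic properties of rank, and by Sylvester’s rank inequality (footnote 4: for any two $m \times m$ matrices $A$ and $B$, $\operatorname{rank}(AB) \ge \operatorname{rank}(A) + \operatorname{rank}(B) - m$; this follows by applying the rank-nullity theorem to the inequality $\operatorname{nullity}(AB) \le \operatorname{nullity}(A) + \operatorname{nullity}(B)$), .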
Therefore, which means that some minor of having dimension needs to be of full rank. The only such candidate is the minor corresponding to the matrix because all other minors of dimension either have an all zero row or all zero column. Therefore, has full rank.
Now is a submatrix of where the rows and columns correspond to partitions of such that each part has exactly two elements in it. Therefore, the lemma follows since has full rank. ∎
By using Stirling’s approximation, it can be verified that . Then, by the rank bound and Lemma 1.28 of [KN97] we get the following corollary.
Corollary 4.2**.**
The deterministic 2-party communication complexity of TwoPartition is
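The counting step behind this corollary can be checked numerically. The sketch below counts the partitions of an n-element ground set into parts of size exactly two (the row and column index set of the TwoPartition sub-matrix) and compares the base-2 logarithm of that count with n log n. When the sub-matrix has full rank, this logarithm is what the rank bound yields. The function name `pairing_count` is hypothetical.

```python
from math import factorial, log2


def pairing_count(n):
    """Number of partitions of an n-element set into parts of size exactly two."""
    assert n >= 2 and n % 2 == 0
    return factorial(n) // (2 ** (n // 2) * factorial(n // 2))


for n in (4, 16, 64, 256, 1024):
    bits = log2(pairing_count(n))
    print(n, round(bits, 1), round(bits / (n * log2(n)), 3))
```

The last column tends slowly towards 1/2 as n grows, consistent with a bound that is linear in n log n.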
We describe our reductions in the next two subsections. In Section 4.2, we reduce the Partition (TwoPartition) problem to the vertex-partitioned 2-party Connectivity (2-party MultiCycle) problem and in Section 4.3, we reduce the 2-party Connectivity (2-party MultiCycle) problem to Connectivity (MultiCycle) in the KT-1 model.
4.2 Reductions from Partition and TwoPartition
Here we present two reductions, first from Partition to 2-party Connectivity and next from TwoPartition to 2-party MultiCycle. Alice is given a partition over the ground set where is the part of , which could possibly be empty if has fewer than parts. Similarly, Bob is given a partition . They construct a graph as follows: Alice creates vertex sets and whereas Bob creates the vertex sets and . Alice and Bob add edges for , independent of and . Alice adds edges between and that induce the partition on . That is, for every , Alice adds edges for all . There will be some vertices in that are not connected to any vertex, so Alice just adds an edge between these vertices and an arbitrary vertex . Bob similarly adds edges between the sets and . See Figure 2.
If and are instances of TwoPartition, that is, each part of and is of size exactly two, then we can modify the construction of by getting rid of the sets and . Note that in this case and where each and has size exactly two. If then Alice creates an edge between and and Bob does the same with for every pair in . With this modified construction, each vertex in has degree exactly and therefore, every connected component of will be a cycle. See Figure 2.
The following theorem encapsulates a crucial property of the graph which implies the correctness of our reductions.
Theorem 4.3**.**
If and are instances of Partition (or TwoPartition), then the partition induced by the connected components of on the vertices in and corresponds to the partition .
Proof.
Call two elements and reachable from each other if there exists a sequence of distinct elements such that , and each pair either belongs to the same part of or the same part of . Any partition in which all reachable elements are in the same part has both and as refinements.
We claim that two elements belong to the same part of if and only if they are reachable from each other. The backward direction is true because and are both refinements of . The forward direction is true because if and are not reachable from each other but still belong to the same part of then we can refine the part to be where is the set of all elements in that are reachable from and is the set of all elements in that are reachable from . It is easy to see that and are disjoint. Let be the partition where is further refined to be . Note that both and still remain refinements of the which contradicts the minimality of the join.
The theorem follows by observing that and are reachable from each other if and only if there is a path from to (and consequently from to ) in . ∎
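The reachability characterization of the join used in this proof is easy to compute with a union-find structure. The following sketch is an illustration only: partitions are given as lists of parts over the same ground set, and the function name `join` is hypothetical. Two elements end up in the same output block exactly when they are linked by a chain of elements that pairwise share a part of one of the two input partitions.

```python
def join(partition_a, partition_b):
    """Join of two partitions of the same ground set (finest common coarsening)."""
    parent = {x: x for part in partition_a for x in part}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge all elements that share a part in either partition.
    for part in list(partition_a) + list(partition_b):
        part = list(part)
        for x in part[1:]:
            parent[find(x)] = find(part[0])

    blocks = {}
    for x in parent:
        blocks.setdefault(find(x), set()).add(x)
    return sorted(sorted(block) for block in blocks.values())


alice = [[1, 2], [3, 4], [5, 6]]
print(join(alice, [[2, 3], [4, 1], [5, 6]]))  # [[1, 2, 3, 4], [5, 6]]
print(join(alice, [[2, 3], [4, 5], [6, 1]]))  # [[1, 2, 3, 4, 5, 6]]
```

Both inputs in the usage example are TwoPartition-style instances (every part has size two). The first has a join with two parts and the second has the trivial one-part join, which by Theorem 4.3 is the case in which the constructed graph is connected.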
4.3 Reductions from 2-party Connectivity and MultiCycle
We now show reductions from 2-party Connectivity to Connectivity in the KT-1 model and from 2-party MultiCycle to MultiCycle in the KT-1 model. Given an -round KT-1 algorithm , Alice and Bob will simulate the algorithm with as the input graph. Alice hosts vertices in and Bob hosts vertices in . For , the IDs of vertices , , , and are , , , and respectively. So both parties know the IDs of all vertices as well as the IDs of neighbors of all hosted vertices in and hence the initial knowledge of all hosted vertices.
In order to simulate round of , Alice and Bob need to compute the states of all hosted vertices after round of . The state of a vertex after round depends on the initial knowledge and the transcript of . Assume that Alice and Bob know the states of all the vertices they host after round . Alice and Bob send a message from to each other. These messages denote the characters their hosted vertices broadcast in round , in increasing order of ID. Therefore, they know the sender ID of a character from the position of the character in the message. This enables Alice and Bob to compute the transcript and hence the state after round of all hosted vertices .
Therefore, in simulating each round, Alice and Bob exchange exactly bits with each other and the total communication complexity of the protocol is . If solves the Connectivity or MultiCycle problems, then using Corollaries 2.4 and 4.2, respectively, and Theorem 4.3, we obtain the following result.
Theorem 4.4**.**
The round complexity of a deterministic algorithm for solving the Connectivity and MultiCycle problems in the KT-1 model is .
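The round-by-round simulation from Section 4.3 can be sketched as follows. This is a schematic illustration, not the paper's protocol verbatim: states are arbitrary Python values, the `broadcast` and `update` callbacks stand in for the (unspecified) KT-1 algorithm, and the toy algorithm in the usage example is hypothetical. The point it shows is that each party sends the characters of its hosted vertices in increasing ID order, so the position of a character identifies its sender and no IDs need to be transmitted.

```python
def simulate_round(alice_states, bob_states, broadcast, update):
    """Simulate one broadcast round; return (new_alice, new_bob, characters_exchanged)."""
    # Each party lists the broadcasts of its hosted vertices in increasing ID order.
    alice_order, bob_order = sorted(alice_states), sorted(bob_states)
    alice_msg = [broadcast(alice_states[v]) for v in alice_order]
    bob_msg = [broadcast(bob_states[v]) for v in bob_order]

    # Both parties know all vertex IDs in advance, so positions identify senders
    # and each side can rebuild the full transcript of the round.
    transcript = dict(zip(alice_order, alice_msg))
    transcript.update(zip(bob_order, bob_msg))

    new_alice = {v: update(s, transcript) for v, s in alice_states.items()}
    new_bob = {v: update(s, transcript) for v, s in bob_states.items()}
    return new_alice, new_bob, len(alice_msg) + len(bob_msg)


# Toy algorithm (an assumption for illustration): broadcast the parity of the
# state, then add the number of 1s heard in the round to the state.
alice, bob = {0: 0, 1: 1}, {2: 2, 3: 3}
alice, bob, sent = simulate_round(
    alice, bob,
    broadcast=lambda state: state % 2,
    update=lambda state, transcript: state + sum(transcript.values()),
)
print(alice, bob, sent)  # {0: 2, 1: 3} {2: 4, 3: 5} 4
```

Repeating this once per round of the simulated algorithm gives a total communication cost equal to the number of rounds times the per-round message length, which is the accounting used to derive Theorem 4.4.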
4.4 Information-theoretic Lower Bound for ConnectedComponents
JáJá [JJ84] proves a lower bound for 2-party ConnectedComponents and points out that his techniques may not work for decision problems, indicating that it might be easier to prove lower bounds for ConnectedComponents. This motivates us to consider the ConnectedComponents problem as a lower bound candidate, closely related to Connectivity, but for which we may be able to prove an lower bound in the KT-1 model, even for constant-error Monte Carlo algorithms. It turns out that we are able to prove this result by combining the reductions described in the previous section with information-theoretic techniques. We first define the 2-party problem PartitionComp, which is closely related to Partition but requires an output with a large representation. As in Partition, Alice and Bob are respectively given set partitions and of and at the end of the communication protocol for PartitionComp, Alice and Bob are required to output the join . From Theorem 4.3, we get that if there is a -round, -error Monte Carlo algorithm for ConnectedComponents in the KT-1 model, then there is an -error Monte Carlo protocol that solves PartitionComp with communication complexity .
Consider the following distribution over inputs of PartitionComp: Alice’s input is chosen uniformly at random from the set of all partitions and Bob’s partition is fixed to be the finest partition, i.e., . With fixed in this manner, and at the end of the protocol Bob learns . Since is chosen from the uniform distribution, its initial entropy is high – since the support of the distribution has size . Therefore, Bob will learn a lot of information by the end of the protocol. This idea is formalized in the proof of the following theorem. This proof also has to deal with the complication that the protocol has constant error probability.
Theorem 4.5**.**
For any constant , the round complexity of an -error randomized Monte Carlo algorithm that solves the ConnectedComponents problem in the KT-1 version of the BCC model is .
Proof.
Using Yao’s minimax theorem (Theorem 2.2) we can assume that all protocols are deterministic but are allowed to make an error on -fraction of the input, weighted by . Although appealing to Yao’s theorem is not necessary, it allows us to simplify the exposition. Let denote the transcript of a 2-party protocol that solves PartitionComp and let denote the length of the longest transcript produced by on any input. We know that
[TABLE]
where the last equality follows from the fact that is fixed according to . From the definition of mutual information, . Alice’s input is uniformly distributed among all set partitions according to the hard distribution . Therefore . Let be the set of protocol transcripts that produce an error on the input . If then since the output of the protocol is . We are guaranteed that . Therefore, the second term can be bounded as follows.
[TABLE]
where the last inequality follows from the fact that for any . This implies which proves that any -error randomized protocol that solves the PartitionComp problem has communication complexity of . This in turn implies that which proves the theorem. ∎
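The entropy of the hard distribution used in this proof can be checked numerically. The number of set partitions of an n-element set is the Bell number B_n, so a uniformly random partition has entropy log2(B_n) bits. The sketch below computes Bell numbers with the Bell triangle and compares this entropy with n log2(n). The function name `bell_numbers` is hypothetical.

```python
from math import log2


def bell_numbers(limit):
    """Return [B_0, ..., B_limit] computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(limit):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


bells = bell_numbers(256)
for n in (16, 64, 256):
    entropy = log2(bells[n])  # entropy of a uniform random partition, in bits
    print(n, round(entropy, 1), round(entropy / (n * log2(n)), 3))
```

The entropy per element keeps growing with n, which is the superlinear behavior that the information-theoretic argument relies on.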
5 Future Work
In this paper, we used various techniques to obtain better lower bounds for Connectivity in the BCC model. However, these bounds are still quite weak and the gap between these lower bounds and the best upper bound is substantial. The fundamental question that motivated this paper, and one that is still open, is the following.
Question 1**.**
Can we obtain round lower bounds for Connectivity in the BCC model or show that this is not possible by designing an algorithm running in rounds?
Another way to ask this question is: can we obtain super-constant round lower bounds in the BCC model? It is worth noting again that we have a deterministic upper bound for Connectivity of [JN18] in BCC, whereas our results do not imply a better than lower bound in BCC.
A second open question, one that is more relevant to the techniques used in this paper, is the following.
Question 2**.**
Can we get an lower bound on the randomized constant-error communication complexity for the Partition and TwoPartition problems?
Using the reductions in this paper, a positive answer to this question would imply an lower bound for Connectivity in BCC KT-1 model even for constant-error randomized algorithms.
- [Abb+17] Amir Abboud, Keren Censor-Hillel, Seri Khoury and Christoph Lenzen “Fooling Views: A New Lower Bound Technique for Distributed Computations Under Congestion” In CoRR, 2017 arXiv: http://arxiv.org/abs/1711.01623v3
- [Awe+90] Baruch Awerbuch, Oded Goldreich, David Peleg and Ronen Vainish “A Trade-Off Between Information and Communication in Broadcast Protocols” In J. ACM 37.2, 1990, pp. 238–256 DOI: 10.1145/77600.77618
- [Bec+16] Florent Becker, Antonio Fernández Anta, Ivan Rapaport and Eric Rémila “The Effect of Range and Bandwidth on the Round Complexity in the Congested Clique Model” In Computing and Combinatorics - 22nd International Conference, COCOON 2016, Ho Chi Minh City, Vietnam, August 2-4, 2016, Proceedings, 2016, pp. 182–193 DOI: 10.1007/978-3-319-42634-1_15
- [BFP15] Mor Baruch, Pierre Fraigniaud and Boaz Patt-Shamir “Randomized Proof-Labeling Schemes” In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC 2015, Donostia-San Sebastián, Spain, July 21 - 23, 2015, 2015, pp. 315–324 DOI: 10.1145/2767386.2767421
- [CK18] Artur Czumaj and Christian Konrad “Detecting cliques in CONGEST networks” In 32nd International Symposium on Distributed Computing (DISC 2018) Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018 URL: http://wrap.warwick.ac.uk/106950/
- [CKP17] Keren Censor-Hillel, Seri Khoury and Ami Paz “Quadratic and Near-Quadratic Lower Bounds for the CONGEST Model” In 31st International Symposium on Distributed Computing, DISC 2017, October 16-20, 2017, Vienna, Austria, 2017, pp. 10:1–10:16 DOI: 10.4230/LIPIcs.DISC.2017.10
- [CT06] Thomas M. Cover and Joy A. Thomas “Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing)” New York, NY, USA: Wiley-Interscience, 2006
- [DKO14] Andrew Drucker, Fabian Kuhn and Rotem Oshman “On the Power of the Congested Clique Model” In Proceedings of the 2014 ACM Symposium on Principles of Distributed Computing, PODC ’14 Paris, France: ACM, 2014, pp. 367–376 DOI: 10.1145/2611462.2611493
