New $(\alpha,\beta)$ Spanners and Hopsets
Uri Ben-Levy, Merav Parter

TL;DR
This paper introduces new constructions of $(\alpha,\beta)$ spanners and hopsets with nearly optimal stretch and size, improving bounds and extending applicability for large distances in unweighted graphs.
Contribution
It presents novel $(\alpha,\beta)$ spanner and hopset constructions with improved stretch, size, and hop-bound guarantees, advancing the state-of-the-art in graph sparsification.
Findings
Achieves nearly optimal stretch of $O(\frac{k}{d})$ for various distance ranges.
Constructs $(\alpha,\beta)$ spanners with $\alpha=O(\text{power of }k)$ and size $O(n^{1+1/k})$.
Develops $(\alpha,\beta)$ hopsets with improved hop-bound $O(k^{\log(3+9/\epsilon)})$.
Abstract
An $(\alpha,\beta)$-spanner of an unweighted $n$-vertex graph $G=(V,E)$ is a subgraph $H\subseteq G$ satisfying that $\mathrm{dist}_H(u,v)$ is at most $\alpha\cdot\mathrm{dist}_G(u,v)+\beta$ for every $u,v\in V$. We present new spanner constructions that achieve a nearly optimal stretch of for any distance value , and . We show the following: 1. There exists an -spanner with for any with expected size . This in particular gives spanners with and . 2. For any , there exists an -spanner with , and of expected size . This implies a stretch of for any , and for every $d\geq…
New $(\alpha,\beta)$ Spanners and Hopsets
Uri Ben-Levy The Weizmann Institute of Science, Israel. Email: [email protected].
Merav Parter The Weizmann Institute of Science, Israel. Email: [email protected]. Supported in part by an ISF grant (no. 2084/18).
An $(\alpha,\beta)$-spanner of an unweighted $n$-vertex graph $G=(V,E)$ is a subgraph $H\subseteq G$ satisfying that $\mathrm{dist}_H(u,v)$ is at most $\alpha\cdot\mathrm{dist}_G(u,v)+\beta$ for every $u,v\in V$. A simple girth argument implies that any -spanner with edges must satisfy that . A matching upper bound (even up to constants) for super-constant values of is currently known only for as given by the well known spanners of Elkin and Peleg, and its recent improvements by [Elkin-Neiman, SODA’17], and [Abboud-Bodwin-Pettie, SODA’18].
We present new spanner constructions that achieve a nearly optimal stretch of for any distance value and . We also show more optimized spanner constructions with nearly linear number of edges. Specifically, for every and integer , we show the construction of spanners for and edges.
In addition, we consider the related graph concept of hopsets introduced by [Cohen, J. ACM ’00]. Informally, an hopset is a weighted edge set that, when added to the graph , allows one to get a path from each node to a node with at most hops (i.e., edges) and length at most . We present a new family of hopsets with edges and . Turning to nearly linear-size hopsets, we show a construction of hopset with edges and hop-bound of , improving upon the state-of-the-art hop-bound of .
1 Introduction
Compressing the distance metric of an undirected input graph up to a small approximation, or stretch, has been the subject of extensive research over the years. An $(\alpha,\beta)$-spanner of a graph $G=(V,E)$ is a subgraph $H\subseteq G$ satisfying that $\mathrm{dist}_H(u,v)\leq\alpha\cdot\mathrm{dist}_G(u,v)+\beta$ for every $u,v\in V$. Letting for some fixed integer gives the standard multiplicative spanners [PU87, ADD+93a]. More generally, corresponds to spanners [PS89, BKMP05].
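The spanner definition can be checked mechanically on small instances. The following is a minimal sketch, assuming the standard reading of the definition ($\mathrm{dist}_H(u,v)\leq\alpha\cdot\mathrm{dist}_G(u,v)+\beta$ for all pairs); the function names `bfs_distances`, `make_adj` and `is_spanner` and the toy graph are illustrative and do not come from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph {vertex: set(neighbours)}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def make_adj(vertices, edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_spanner(adj_g, adj_h, alpha, beta):
    """Check dist_H(u, v) <= alpha * dist_G(u, v) + beta for every pair connected in G."""
    for u in adj_g:
        dist_g = bfs_distances(adj_g, u)
        dist_h = bfs_distances(adj_h, u)
        for v, d in dist_g.items():
            if dist_h.get(v, float("inf")) > alpha * d + beta:
                return False
    return True


# Toy instance (an assumption for illustration): G is a 6-cycle and H drops
# the edge (5, 0), so H is a path on the same vertices.
vertices = range(6)
cycle_edges = [(i, (i + 1) % 6) for i in range(6)]
G = make_adj(vertices, cycle_edges)
H = make_adj(vertices, cycle_edges[:-1])
```

On this toy instance the removed edge is the worst case: its endpoints are adjacent in the cycle but at distance 5 in the path, so purely multiplicative stretch 5 is needed, while an additive term of 4 with $\alpha=1$ also suffices.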
Althöfer et al. [ADD+93a] provided the first tight construction of multiplicative spanners with edges. These spanners are believed to provide the optimal size-stretch tradeoff assuming the girth conjecture of Erdős [EM70]. It has been widely noted, however, that this optimality notion has some caveats, as the girth argument by itself provides a stretch lower bound only for adjacent vertex pairs. The first indication that one can provide improved stretch for distant vertex pairs was given by the notion of spanners of Elkin and Peleg [EP04]. In their seminal work, they showed that one can compute an spanner with edges and , for every integer and . This, in particular, implies that one can compute -spanners where for and with edges. Recently, Abboud, Bodwin and Pettie [ABP18a] showed that this tradeoff is nearly optimal at least for constant values of , ruling out the possibility of obtaining stretch value for considerably closer vertex pairs while keeping the same bound on the number of edges.
Another approach for obtaining improved stretch for non-adjacent pairs was suggested by the hybrid spanners of Parter [Par14]. For every integer , these spanners have edges and provide non-adjacent pairs a stretch of rather than . This stretch value is optimal for vertex pairs at distance , assuming the girth conjecture, but does not provide a significant improvement for pairs at distance . For instance, for pairs at distance , current spanner constructions still provide a stretch of , rather than a stretch of as might be attainable, or else be proven otherwise.
To summarize, the existing -spanner constructions currently provide a nearly optimal stretch in two extreme regimes: short distances and large distances . Our paper zooms into the missing intermediate regime of distances, aiming at providing the ultimate stretch for any value of . Since our stretch values are optimal up to constants, these constructions are useful when the stretch parameter is super-constant, i.e., for some function of the number of nodes (e.g., ). Also note that a lower bound of is unconditional in the girth conjecture, and holds by a simple girth argument. While obtaining spanners will provide the holy grail stretch of for the entire range of distances, our results are asymptotically very close to this goal. That is, our spanner constructions achieve the optimal stretch values (up to constants) for almost the entire range of distances. See Fig. 1 for a pictorial illustration for the current -spanner constructions with edges.
Hopsets.
Hopsets are fundamental graph structures introduced by Cohen [Coh00]. They have received considerable attention in recent years [EN16a, ABP18a, HP19], due to their applications to shortest path computation in many computational settings, e.g., parallel computing [KS97, Coh00, MPVX15, FL18, EN19b], dynamic graph algorithms [HKN18], and streaming and distributed algorithms [Nan14, HKN16, EN16b, Elk17].
For an -vertex undirected weighted graph , a subset of weighted edges [footnote: the weight of each edge is ] (not in ) is called hopset, if for any , it holds that
[TABLE]
where , and the weight function is defined as follows: for every , and for every , . The distance is the length of the shortest path from to that uses at most edges in . The first hopset construction by Cohen [Coh00] had edges and hop-bound of . Elkin and Neiman [EN16a] presented an improved construction of hopsets with and edges. The state of the art result is by Huang and Pettie [HP19] who proved that the emulators by Thorup and Zwick are in fact also hopsets with edges. A similar construction with slightly worse bounds has been independently shown by Elkin and Neiman [EN17].
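To make the hop-bounded distance concrete, the sketch below assumes the usual reading of the hopset definition: after adding the hopset edges, every pair must have a path with at most $\beta$ edges whose length is at most $\alpha$ times the true distance. The hop-limited distance is computed by running `max_hops` rounds of Bellman-Ford relaxation. All names (`normalize`, `hop_bounded_distances`, `is_hopset`) and the toy path graph are illustrative assumptions, not the paper's notation.

```python
import math


def normalize(edges):
    """Map undirected weighted edges to sorted-tuple keys, keeping the lightest copy."""
    out = {}
    for (u, v), w in edges.items():
        key = (min(u, v), max(u, v))
        out[key] = min(w, out.get(key, math.inf))
    return out


def hop_bounded_distances(vertices, edges, source, max_hops):
    """Length of the shortest path from `source` that uses at most `max_hops` edges."""
    dist = {v: math.inf for v in vertices}
    dist[source] = 0
    for _ in range(max_hops):
        nxt = dict(dist)
        for (u, v), w in edges.items():
            nxt[v] = min(nxt[v], dist[u] + w)
            nxt[u] = min(nxt[u], dist[v] + w)
        dist = nxt
    return dist


def is_hopset(vertices, graph_edges, hop_edges, alpha, beta):
    """Check that G plus the hop edges gives beta-hop paths of length <= alpha * dist_G."""
    vertices = list(vertices)
    base = normalize(graph_edges)
    union = dict(base)
    for key, w in normalize(hop_edges).items():
        union[key] = min(w, union.get(key, math.inf))
    for s in vertices:
        exact = hop_bounded_distances(vertices, base, s, len(vertices) - 1)
        limited = hop_bounded_distances(vertices, union, s, beta)
        for t in vertices:
            if exact[t] < math.inf and limited[t] > alpha * exact[t] + 1e-9:
                return False
    return True


# Toy instance (illustrative): a unit-weight path 0-1-2-3-4 with two shortcuts.
path_vertices = list(range(5))
path_edges = {(i, i + 1): 1 for i in range(4)}
shortcuts = {(0, 2): 2, (2, 4): 2}
```

On the toy path, the two shortcut edges already give exact distances within two hops, whereas without any shortcuts four hops are needed.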
Klein and Sairam [KS97] and Shi and Spencer [SS99] gave efficient PRAM algorithms for computing exact hopsets with hop-bound and a linear number of edges. Abboud, Bodwin and Pettie [ABP18a] showed that any hopset with fewer than edges for any must have . This implies that the of Elkin and Neiman [EN16a] and Huang and Pettie [HP19] are nearly optimal for .
At the other extreme with respect to parameters, Huang and Pettie [HP19] observed that the distance oracle of Thorup and Zwick [TZ05] immediately implies hopsets with stretch , hop-bound and edges. In their paper, Huang and Pettie raised the following question concerning the existence of additional hopsets and specifically asked:
Are there other tradeoffs available when is a fixed constant (say 3 or 4), independent of ?
We answer this question in the affirmative, by presenting a new family of hopsets. For any and integer , we give the construction of hopsets with edges, in expectation. Thus taking gives constant hop-bound of , and an improved stretch of . We also show a construction of hopsets for the complementary range of . It is important to note that whereas for spanners with edges it must hold that by a girth argument, this lower bound does not hold for hopsets. For any constant value of , our new family of hopsets in fact satisfies that .
Application to Shortest Path Computation.
The efficient computation of spanners and hopsets leads to some immediate applications for fast shortest path computation, see e.g., [EZ06] and [EN16a]. Interestingly, the parameter of these structures affects not only the quality (or approximation) of the solution, but might also determine the time complexity. For example, in the streaming model, the number of passes for computing APSP approximation is linear in . This further motivates the study of spanners with a considerably improved at the expense of having a constant approximation rather than approximation. For example, our new spanners with lead to approximation for the APSP problem using only passes. This should be compared against the current approximation, but with passes.
Turning to the distributed computing models, the parameter also determines the locality of the spanner computation. Specifically, an immediate outcome of our constructions is an algorithm that computes an spanner with rounds, almost matching the (tight) round complexity of the standard multiplicative spanners (in the latter, all pairs suffer from a multiplicative stretch of ).
1.1 Our Contribution.
In this paper we provide improved -spanner constructions with nearly optimal stretch value, up to constants, for almost the entire range of distances. Our key result shows:
Theorem 1 (Almost Optimal -Spanners).
For any integer , and an unweighted -vertex graph , one can compute an -spanner with edges such that for any .
This -spanner is almost optimal in the following sense. The stretch of is the best possible up to a constant factor based on the girth argument, the size of the spanner is optimal up to logarithmic terms, and the bounded stretch is provided almost for the entire distance range, i.e., excluding . We note that for this “problematic range” of our spanners still provide a considerably improved stretch over previous constructions. The spanner of Theorem 1 is obtained by two separate constructions. The first construction, which is also simpler, considers the range of distances . For this distance range we show an -spanner with . Using the terminology of spanner, we can say that our spanner is an spanner with and .
Theorem 2 (Spanners for Pairs at Dist. ).
For any -vertex unweighted graph and integers , and , there is a subgraph of expected size such that for every pair of vertices and at distance in , it holds that . Hence, providing a stretch of .
The algorithm of Theorem 2 already provides a stretch of for all remaining distance values . Obtaining an improved stretch of for is considerably more challenging, and requires additional ideas and techniques. This has led to the construction of new spanners that provide the desired stretch of for almost the entire range of distances.
New Spanners.
Our key contribution is in providing a new spanner that provides a constant stretch already for vertices at distance at least . Up to the extra factor of in the exponent, this is the best that one can hope for based on a girth argument. In addition, these spanners also settle down the desired stretch of for any .
Theorem 3.
For any -vertex unweighted graph , any , and , there is a -spanner of of expected size . (Footnote: the statement can work for any at the cost of larger constants.)
By setting in the above, we get an -spanner with and , hence providing a constant stretch for every distance . On the other hand, for every constant value of , we can get and , thus providing a stretch of for . Prior to that construction, the known spanners are given by the spanners of Baswana-Kavitha-Mehlhorn-Pettie [BKMP05]. Note that the spanners of Elkin and Neiman [EN19a] also provide a stretch of for vertices at distance (this is implicit in their analysis). Our construction provides a constant stretch for this distance range, while keeping almost the same bound on the number of edges. (Footnote: in this paper we did not optimize for secondary order factors in the size bound. All our solutions have edges w.h.p.)
While the spanners of Theorem 3 provide a constant stretch for , this constant might be large. For that purpose, we also consider spanners that provide as small a stretch as possible, while keeping at most polynomial in . For instance, a simplification of the algorithm from Theorem 3 can also give -spanners with .
Lemma 1 (New Spanners).
For any -vertex unweighted graph , integer and constant , one can compute a spanner of for and expected size .
Pettie [Pet09] showed a construction of a nearly linear-size spanner that provides a constant stretch of for vertices at distance . The spanners of Lemma 1 provide a stretch of for this range of distances, and also work for any .
New Hopsets.
Hopsets, the cousins of spanners and emulators, have received quite a lot of attention recently from both the graph theoretical and the algorithmic perspective. Currently, hopset constructions are known only for a narrow regime of values. A particular setting that attracted a lot of attention is where . Since all existing constructions provide a fairly large hop-bound, in this paper we resort to the constant stretch of . We show that this relaxation can significantly reduce the hop-bound. Specifically, we discover a new family of hopsets that is technically related to the spanner constructions described above.
Theorem 4 (New Hopsets).
For any -vertex weighted graph , integer and such that , one can compute an hopset for and for some constant . The number of edges is bounded by edges in expectation, where is the aspect ratio of the graph (the ratio between the largest and smallest distances between vertex pairs in ).
For example, by setting , we get an hopset with constant stretch of , and an almost linear hop-bound . This brings us very close to the ultimate holy-grail construction of hopsets.
We also consider the other direction of minimizing the stretch as much as possible, while keeping the hop-bound to be polynomial in . As with spanners, this setting is considerably simpler (compared to that of Thm. 4), and we have the following:
Lemma 2 (New Hopsets).
For any -vertex weighted graph , integer and , one can compute a hopset where with expected size .
An interesting reference point is for , i.e., where the hopset has a nearly linear size of edges. In this setting, Lemma 2 gives for example stretch and . Lemma 2 gives a constant stretch but with hop-bound . This should be compared with the hopsets of [HP19] and [EN16b] that provide a hop-bound of . See Figure 2 for comparison with existing work.
Remark. We note that Gitlitz, Elkin and Neiman [GEN19] independently provided different constructions for spanners and hopsets with slightly larger values than those obtained in Theorems 1 and 2.
Applications to Shortest Paths.
We also show the efficient computation of our simplified spanners and hopsets in various computational settings. This has direct implications for APSP computation. Elkin and Neiman [EN19a, EN16a] specified the implementation details of their hopsets and spanners, along with some immediate applications to shortest paths computation in several computational settings. In the non-centralized settings (e.g., distributed, streaming, etc.), the value of affects not only the approximation quality of the solution, but also determines the locality of the problem. Our simplified spanners and hopsets are similar, implementation-wise (i.e., the steps that determine the computational cost are quite similar), to the spanners and hopsets of [EN19a] and [EN16a]. Therefore we can get these applications almost for free, while enjoying an improved running time due to our improved , at the cost of a slightly larger stretch of in the centralized regime and in the distributed and streaming regimes rather than . For example, we can have the following:
Lemma 3 (Approx. APSP).
For every -vertex unweighted graph and any parameters , there exists a streaming algorithm that computes a approximation for the APSP in the multi-pass streaming model for in either (1) space with high probability and passes, or (2) with space in expectation and passes with high probability.
This is the analogue of Cor. 21 in [EN19a] only that they have approximation with passes, rather than approximation with passes.
1.2 Technical Overview.
Throughout, we consider a fixed stretch parameter of , and restrict the number of edges in the output spanners and hopsets to edges in expectation.
1.2.1 New Spanners
The starting point for our algorithms is the observation that existing multiplicative spanner constructions (e.g., Baswana-Sen [BS07], Thorup-Zwick [TZ05]) provide a considerably improved multiplicative stretch, i.e., of , for edges incident on sparse vertices. By sparse, we mean vertices whose -ball contains a small number of vertices. To be more concrete, we start by describing some useful properties of the Baswana-Sen Algorithm.
Useful Properties of the Baswana-Sen Algorithm [BS07].
The Baswana-Sen algorithm consists of steps of clustering. A clustering is a collection of vertex disjoint sets which we call clusters. Every cluster has some special vertex which we call the cluster center. The set of clustered vertices is . At a high level, the Baswana-Sen algorithm computes levels of clustering . In each clustering step , given the clustering , the algorithm computes a clustering , along with a subset of edges that “takes care” of the newly unclustered vertices (i.e., those that belong to a cluster in , but do not belong to the clusters of ). The clustering and the output subgraph have the following useful properties:
- (1) and in expectation.
- (2) The radius of each cluster is at most . Specifically, for each cluster , the subgraph contains a tree rooted at the cluster center of , spanning all the vertices in and has depth at most .
- (3) For every unclustered vertex , for every .
Warming Up: Spanners with and .
To illustrate the essence of our constructions, we start by showing an algorithm for computing an spanner with and . As we will see, with this approach, a stretch of is the best that one can get for . Obtaining the ultimate stretch of for every will require additional ideas, and a considerably more delicate analysis.
The first phase of the algorithm applies a truncated variant of the Baswana-Sen algorithm in which only the first clustering steps (out of the many steps) are applied. As a result, we get a cluster collection containing clusters (in expectation), and a subgraph that takes care of all edges incident to the non-clustered vertices (i.e., vertices not in ) by Property (3).
In the second phase, the algorithm computes a cluster-graph whose nodes correspond to the clusters of . Every two clusters are connected by an edge in iff where are the centers of the clusters respectively. The algorithm then computes a multiplicative spanner on this cluster graph. The edges of this spanner are translated to -edges as follows. For each edge in the spanner , the shortest path in between and is added to the spanner . By property (3), contains nodes, and thus contains edges in expectation. Overall, this step adds edges to the spanner.
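The second phase can be sketched as follows. This is a hedged illustration only: the distance threshold and the spanner stretch are left as parameters (`threshold`, `stretch`) because their values did not survive in the text above, and the greedy algorithm stands in for whichever multiplicative spanner construction is used on the cluster graph. The function names and the toy cycle are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs(adj, source):
    """BFS from `source`; returns (dist, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent


def greedy_spanner(nodes, edges, stretch):
    """Greedy multiplicative spanner: keep an edge only if its endpoints are
    currently farther apart than `stretch` in the edges kept so far."""
    kept = {v: set() for v in nodes}
    for a, b in edges:
        if bfs(kept, a)[0].get(b, float("inf")) > stretch:
            kept[a].add(b)
            kept[b].add(a)
    return kept


def cluster_graph_phase(adj, centers, threshold, stretch):
    """Build the cluster graph on `centers`, sparsify it, and translate each
    surviving cluster-graph edge into a shortest path of the input graph."""
    centers = sorted(centers)
    trees = {c: bfs(adj, c) for c in centers}
    cluster_edges = [(a, b) for a, b in combinations(centers, 2)
                     if trees[a][0].get(b, float("inf")) <= threshold]
    spanner = greedy_spanner(centers, cluster_edges, stretch)
    added = set()
    for a in centers:
        for b in spanner[a]:
            if a < b:
                v = b
                # Walk the BFS tree of `a` from `b` back to `a`.
                while trees[a][1][v] is not None:
                    added.add(frozenset((v, trees[a][1][v])))
                    v = trees[a][1][v]
    return spanner, added


# Toy instance (illustrative): a 6-cycle with centers 0, 2 and 4.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
spanner, added = cluster_graph_phase(cycle, [0, 2, 4], threshold=2, stretch=3)
```

In the toy run the cluster graph is a triangle; the greedy spanner drops its third edge, so only two center-to-center shortest paths are added to the output.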
The stretch argument: For the sake of this intuitive explanation, we will show that the spanner provides a stretch of for vertex pairs at distance . Fix such pair and let be their shortest path in . If has at most one clustered vertex (i.e., vertex appearing in the clusters of ), then all the edges in are incident to non-clustered vertices, and thus by property (3), for each edge , .
Otherwise, let and be the far-most clustered vertices on the path , where (respectively, ) is the closest clustered vertex to (respectively, ). By property (3) again, all edges on the path segments and enjoy a stretch of at most in the spanner. It remains to consider the segment . Let be the clusters of respectively in . Since the radius of these clusters is at most (by property (2)), we have that and thus and are neighbors in . Since is a spanner of , we have . Finally, as each edge in is translated into a path of length in , we have that and are connected in by a path of length in , concluding that as desired. The complete algorithm appears in Sec. 2.
The challenge in obtaining a multiplicative stretch .
We note that this algorithm, as is, cannot be extended to provide an improved stretch of for distances for the following reason. Let be a parameter that determines the number of the Baswana-Sen steps applied in the first phase of the algorithm. Then, after applying steps of the Baswana-Sen algorithm, we end with clusters in . By property (3), the stretch obtained for all unclustered vertices (i.e., not in ) is bounded by . In the second phase, a cluster graph with nodes is defined. Each node of corresponds to a cluster in , and two clusters are connected in , if their distance in is at most . To keep the number of edges in the final spanner small (any standard -spanner on vertices contains edges), the algorithm can only afford the computation of -multiplicative spanner . Each edge in this spanner is translated to a path of length in the final spanner. Thus the algorithm adds at most edges in this phase. Overall, we get a stretch of on all the edges incident to the non-clustered vertices (those that do not appear in ), and a stretch of between every pair of clustered vertices at distance at most in . The optimal stretch is therefore achieved for .
In the next paragraph, we explain how to bypass this obstacle by adding a crucial intermediate phase to the algorithm.
A New Three Stage Approach for Spanners.
We next explain the high level ideas to obtain an spanner with a multiplicative stretch and an additive stretch for some constant and any . By taking , it provides a constant multiplicative stretch for all pairs at distance . By setting , we get a multiplicative stretch of for all pairs at distance . In the following, we zoom into a fixed distance value and describe the high-level construction of an -spanner with . The same procedure will be repeated for every (in fact, it will be sufficient to repeat it for every class of distances ). The algorithm has three phases. The initial clustering stage applies a truncated Baswana-Sen algorithm, running only its first steps. This results in a clustering containing clusters in expectation of radius , and a subset of edges . Property (3) of the Baswana-Sen algorithm guarantees a stretch of on all edges incident to the unclustered vertices (i.e., vertices not in ). Note that at this point the number of clusters is too large to be able to compute a -spanner on the cluster graph, and terminate. The purpose of the next stage is to rapidly reduce the number of clusters to the ultimate number of clusters while keeping the radius of this clustering bounded by .
The intermediate superclustering stage is the most delicate part of the algorithm. It consists of phases of superclustering. This step is similar in flavor to the spanner construction by Elkin-Neiman [EN19a], with several key differences that we explicitly state.
A supercluster is a collection of vertex-disjoint clusters. Every supercluster has a special vertex , that is denoted as the supercluster’s center. We denote by the set of vertices in the supercluster, that is . A superclustering is a collection of vertex-disjoint superclusters. The radius of a supercluster is defined by . Each phase of the superclustering procedure starts with a clustering with clusters of radius . The output of the phase is a clustering with clusters and radius , as well as a collection of edges added to the spanner that takes care of the vertices that stopped being clustered at that phase (i.e., appearing in but not in ). We now describe the high level structure of this phase.
The phase has steps of superclustering. Starting with the superclustering , each step gets as input a superclustering , where the radius of each supercluster is bounded by . Initially, is simply the radius of the clusters in . It then outputs a new superclustering by applying the following sequence of operations:
- Augmentation: We set an augmentation parameter . Each vertex at distance at most from a center of some supercluster in the current superclustering adds its shortest path to its closest center to the spanner (without being added to that supercluster).
- Sub-sampling: Each supercluster gets sampled into with probability of where is the number of clusters in .
- New Superclustering: Each cluster (belonging to any supercluster ) at (center) distance at most from a center of a sampled supercluster joins its closest sampled supercluster and adds the shortest path between their centers to the spanner. The center of the sampled supercluster maintains its role.
- Handling Lost Clusters: Each other cluster (that is too far from the sampled superclusters) adds to the spanner a shortest path from its center to the center of every supercluster, provided that their distance is at most . This cluster will no longer appear in the superclustering.
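The four operations above can be sketched as a single step on an unweighted graph. This is a simplified illustration under stated assumptions: the radii `aug_radius`, `join_radius` and `lost_radius` and the sampling probability `prob` are placeholder parameters, since their precise settings are not recoverable from the text above; each cluster is represented only by its center; and the set of sampled superclusters is passed in explicitly so that the step is deterministic. All function names are hypothetical.

```python
import math
import random
from collections import deque


def bfs(adj, source):
    """BFS from `source`; returns (dist, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent


def path_edges(parent, target):
    """Edges of the BFS-tree path from `target` back to the tree root."""
    edges = set()
    while parent[target] is not None:
        edges.add(frozenset((target, parent[target])))
        target = parent[target]
    return edges


def sample_superclusters(superclusters, prob, rng):
    """Sub-sampling: keep each supercluster independently with probability `prob`."""
    return {c for c in superclusters if rng.random() < prob}


def supercluster_step(adj, superclusters, sampled, aug_radius, join_radius, lost_radius):
    """One step; `superclusters` maps a supercluster center to its cluster centers."""
    if not superclusters:
        return {}, set()
    trees = {c: bfs(adj, c) for c in superclusters}

    def dist(c, v):
        return trees[c][0].get(v, math.inf)

    spanner = set()
    # Augmentation: nearby vertices add a shortest path to their closest center.
    for v in adj:
        d, c = min((dist(c, v), c) for c in superclusters)
        if d <= aug_radius:
            spanner |= path_edges(trees[c][1], v)
    # New superclustering and lost clusters.
    new = {c: set(superclusters[c]) for c in sampled}
    for c, clusters in superclusters.items():
        if c in sampled:
            continue
        for cl in clusters:
            near = [(dist(s, cl), s) for s in sampled if dist(s, cl) <= join_radius]
            if near:
                _, s = min(near)  # closest sampled supercluster
                new[s].add(cl)
                spanner |= path_edges(trees[s][1], cl)
            else:
                # Lost cluster: connect to every supercluster center in range.
                for s in superclusters:
                    if dist(s, cl) <= lost_radius:
                        spanner |= path_edges(trees[s][1], cl)
    return new, spanner


# Toy instance (illustrative): a path on 9 vertices with three singleton superclusters.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 8} for i in range(9)}
supers = {0: {0}, 2: {2}, 8: {8}}
new_supers, step_edges = supercluster_step(path, supers, {0}, 0, 3, 6)
```

In the toy run, the cluster centered at 2 joins the sampled supercluster of 0, while the cluster centered at 8 is lost and connects to the center 2, which is within `lost_radius`.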
The augmentation parameters , as well as the precise number of external phases , and internal superclustering steps are all set in a very delicate manner. We note that this additional augmentation step is not applied in the Elkin-Neiman algorithm [EN16a], and we find it to be quite useful in our stretch analysis. The output clustering of the phase consists of a cluster for every supercluster where is the last superclustering of that phase.
At the end of all these phases, we are left with a clustering with clusters. The augmentation parameters are set in a way that guarantees that the radius of this clustering is bounded by . Before starting the final phase, every unclustered vertex at distance at most from some cluster center adds to the spanner its shortest path to its closest center.
The final clustering-graph phase computes a cluster graph where each cluster in corresponds to a node in that graph. Two clusters are connected in if their center-distance is at most . The algorithm then computes a spanner containing at most edges. Finally, these spanner edges are translated into edges in , by adding to the final spanner the shortest path between the centers and for each edge . It is easy to see that this step adds edges to the spanner.
The analysis is based on providing distinct stretch guarantees depending on the precise step in which the vertex stopped being clustered (the step is the ’th step of the ’th phase). Formally, a vertex is said to be -unclustered if it belongs to the superclusters of but does not belong to the superclusters of . The analysis shows that the later a vertex stopped being clustered, the stronger its stretch guarantee, in the following sense. For every -unclustered vertex , the edges added to the spanner in phase guarantee that for every vertex at distance of from , where grows with both and . The key point is that this stretch bound holds even if is unclustered. We complete the argument by considering any - shortest path (of any length), and dividing it into consecutive disjoint segments of possibly varying lengths. The length of each segment depends on the step in which certain vertices along the path stopped being clustered. We then show that for each segment (except perhaps the last one), the spanner provides a multiplicative stretch of between the endpoints of this segment. A detailed algorithm description appears in Sec. 3.
1.2.2 New Hopsets
Our hopset constructions bear similarities to the spanner algorithms, but include several modifications. First, our hopset is defined for weighted graphs whereas in the spanners the graph is required to be unweighted. The key difference will be in the way that we handle the sparse vertices. In the spanners above, we applied a truncated variant of the Baswana-Sen algorithm. Here, we will use the classic hopsets that follow from the distance oracle of Thorup and Zwick [TZ05]. To explain the ideas in their cleanest and most simplified form, in the high-level description below we restrict attention to unweighted graphs and mainly explain the first non-trivial construction of hopsets. We start with an overview of the construction of hopsets, and highlight their key properties.
A Short Exposition of Hopsets.
The construction of the distance oracle by Thorup and Zwick [TZ05] is based on a hierarchical collection of centers . Each is obtained by sampling each independently with probability of . The pivot of every vertex denoted by is the closest vertex in to . For every , its bunch contains all vertices in that are closer to than . Thorup and Zwick showed that for every , in expectation. As observed by Huang and Pettie [HP19], the collection of bunches translates into a hopset as follows: for every and , add to an edge of weight . Since each bunch is of size , the output hopset has edges. The key property that will be used by our algorithms is as follows. Fix a pair of vertices and define as the minimal index such that is in the bunch of or vice-versa. By construction, . By the stretch argument of the distance oracle of Thorup and Zwick [TZ05], one can show that the hopset contains a two-hop path between and that goes through the common th pivot, and the path has length at most . We next explain the construction of the hopset.
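The pivot and bunch construction can be sketched on an unweighted graph as follows. Two assumptions are made explicit here because the exact expressions did not survive in the text above: the sampling probability is taken to be $n^{-1/k}$, the standard Thorup-Zwick choice, and the level-$i$ bunch of $v$ is taken to be the vertices of the $i$-th center set that are strictly closer to $v$ than the nearest vertex of the next center set. The names `sample_hierarchy` and `tz_hopset` are illustrative.

```python
import math
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def sample_hierarchy(vertices, k, rng):
    """Center sets A_0 = V, A_1, ..., A_{k-1}, A_k = {} (assumed prob. n^(-1/k))."""
    vertices = list(vertices)
    prob = len(vertices) ** (-1.0 / k)
    levels = [set(vertices)]
    for _ in range(1, k):
        levels.append({v for v in levels[-1] if rng.random() < prob})
    levels.append(set())  # the last level is empty
    return levels


def tz_hopset(adj, levels):
    """Return (pivots, hopset): pivots[(i, v)] is the closest level-i center to v,
    and hopset maps each bunch edge (u, v) to the distance between u and v."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    pivots, hopset = {}, {}
    for v in adj:
        level_dist = []
        for i, level in enumerate(levels):
            reachable = [(dist[v][u], u) for u in level if u in dist[v]]
            d, p = min(reachable) if reachable else (math.inf, None)
            level_dist.append(d)
            pivots[(i, v)] = p
        for i in range(len(levels) - 1):
            for u in levels[i]:
                d = dist[v].get(u, math.inf)
                # u is in the level-i bunch of v if it beats the next pivot.
                if u != v and d < level_dist[i + 1]:
                    hopset[(min(u, v), max(u, v))] = d
    return pivots, hopset


# Toy instance (illustrative): a path 0-1-2-3-4 with a hand-picked hierarchy.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 4} for i in range(5)}
pivots, hopset = tz_hopset(path, [set(range(5)), {2}, set()])
```

In the toy run, vertices 0 and 4 both hold a hop edge to the common pivot 2, which yields the two-hop path 0, 2, 4 of length 4.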
Warming Up: Hopset with .
As common in hopset constructions, we fix a distance range and describe a construction that provides the desired stretch and hop-bound for all pairs at distance . The same procedure is then applied for each of the distance ranges, where is the aspect ratio of the graph.
The hopset algorithm has two phases. First it computes the hopsets based on the Thorup-Zwick distance oracle. In fact, for our purposes it will be sufficient to apply only the first steps of the construction, i.e., computing the center sets , and adding hops to the hopset based on the bunches for every and every .
In the second phase, the algorithm computes a clustering centered at the vertices of as follows. Each vertex at distance from joins the cluster of its closest center in . This defines a cluster collection of (in expectation) of vertex-disjoint clusters of radius at most . In the hopset, we add a hop between each clustered vertex to its cluster center .
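The clustering of the second phase amounts to a radius-bounded multi-source search from the centers. The sketch below assumes an unweighted graph (matching the simplified description) and leaves the radius as a parameter since its value is not recoverable from the text above; `cluster_around_centers` is a hypothetical name.

```python
from collections import deque


def cluster_around_centers(adj, centers, radius):
    """Multi-source BFS: each vertex within `radius` of some center joins its
    closest center. Returns (owner, hops), where hops maps (vertex, center)
    to the weight of the hop edge added between them."""
    owner = {c: c for c in sorted(centers)}
    dist = {c: 0 for c in owner}
    queue = deque(owner)
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not grow clusters beyond the radius
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w], owner[w] = dist[u] + 1, owner[u]
                queue.append(w)
    hops = {(v, c): dist[v] for v, c in owner.items() if v != c}
    return owner, hops


# Toy instance (illustrative): a path 0-...-6 with centers at both ends.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
owner, hops = cluster_around_centers(path, {0, 6}, radius=2)
```

In the toy run the middle vertex 3 is at distance 3 from both centers and therefore stays unclustered; every clustered vertex receives a single hop edge to its center.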
Now, the algorithm computes the cluster-graph in which each cluster in corresponds to a node, and two clusters are adjacent iff the distance between their centers is at most . We note that the graph will be unweighted even when the graph is weighted, as each edge in corresponds to a path of length at most in . Letting be a multiplicative spanner of , for every edge , the algorithm adds the hop to the hopset, where are the centers of respectively.
Why does it work? First we bound the size of the hopset by . The first phase adds a TZ-hopset with edges. In the second phase, we add one hop for every edge in the spanner . Since this spanner has edges, it adds hops overall. In addition, each clustered vertex is connected with a hop to its cluster center, and as the clusters are vertex disjoint, it adds hops.
Next, we consider the stretch and hop-bound argument for a fixed pair and at distance in . Let be their shortest path. A vertex is called clustered if it belongs to the clusters of . By definition, a vertex is clustered iff . First, consider the case where at most one vertex on is clustered. The argument goes by partitioning into disjoint consecutive segments of length . For each such segment we show that in the TZ-hopset there is a two-hop path between and of length at most . To see this, consider an unclustered vertex and let be the minimum index satisfying that . By the properties of the TZ-hopsets, . Since , and , we get that . We therefore have segments on , for each the TZ-hopset provides a two-hop path of length . In total, we get a - path of hops, and of total length as required. This establishes the stretch and hop-bound argument for a path containing at most one clustered vertex.
The other case follows by the second phase of the algorithm. We consider the farthest clustered vertex pair and on the - shortest path. The stretch and hop-bound argument for the subpaths and follows by the argument above, and thus it remains to consider the path . Let and be the clusters of and respectively. Since , we get that the clusters and are adjacent in , and therefore the hopset contains a path of at most hops between the centers of and . As each such hop has weight of , overall the hopset contains a path with hops connecting and , of total length as required. This completes the high-level idea of the construction; see Sec. 5.1 for more details.
Three Stage Approach for Hopsets.
The computation of the hopsets for every is very similar to the high level description mentioned above. The complementary range of is considerably more involved. It also has a three stage structure in a very similar manner to the spanners. The first stage applies a truncated TZ hopset construction restricting to the first levels of clustering. Letting be the centers in the -level of the Thorup-Zwick algorithm, the second phase computes an initial clustering with centers of . The vertices that do not belong to the clusters of are called unclustered. For those vertices, the correctness will follow by the TZ-hopsets. The remaining clustered vertices are handled in two stages: a key superclustering stage, which rapidly reduces the number of clusters to , and a final stage in which the number of clusters is small enough to allow the computation of an -spanner on that cluster graph. A more detailed description appears in Sec. 5.2.
Open Problems.
The most important open problem left by this work concerns the existence of spanners with edges for and . This would provide a nearly optimal stretch, up to constants, for the entire range of distances. With our current constructions one can only get spanners with . Alternatively, for a multiplicative stretch of , we currently get . Note that the lower-bound constructions of Abboud, Bodwin and Pettie [ABP18a] are only tight for constant values of , and hence it might still be possible to even obtain spanners with edges. Another interesting open problem concerns the tightness of our hopsets constructions. We present a new family of hopsets with edges for any constant . The most critical question is whether any hopset with edges must satisfy that .
1.3 Preliminaries
Graph Notations and Definitions.
We consider an undirected -vertex graph , where is the set of vertices and is the edge-set. Let be the neighbors of in . When is clear from the context, we may simply write . Unless specified otherwise we assume to be unweighted. For we denote by the distance from to in . Similarly, for any subgraph we denote by the distance between the vertices in . For any vertex and integer , we denote by the set of vertices at distance at most from in , that is . When the context is clear we might simply write . By we denote the set of vertices at distance exactly from , that is . For a weighted graph , the aspect ratio denoted by is the ratio between the largest and smallest vertex-pair distances in . Unless stated otherwise, in our constructions, shortest path ties are broken in a consistent manner.
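As an illustration of the ball and ring notation above, the following is a minimal sketch for unweighted graphs using a truncated BFS. The function names `bfs_distances`, `ball`, and `ring`, and the adjacency-dictionary representation, are assumptions made for this example and do not appear in the paper.

```python
from collections import deque

def bfs_distances(adj, source, max_dist=None):
    """Hop distances from source in an unweighted graph, truncated at max_dist."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if max_dist is not None and dist[u] == max_dist:
            continue  # do not explore beyond the requested radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def ball(adj, u, r):
    """Vertices at distance at most r from u."""
    return set(bfs_distances(adj, u, r))

def ring(adj, u, r):
    """Vertices at distance exactly r from u."""
    return {v for v, d in bfs_distances(adj, u, r).items() if d == r}

# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On the example path, the ball of radius 1 around vertex 2 is `{1, 2, 3}` and the ring at distance 2 is `{0, 4}`.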
Hopsets.
Hopsets are fundamental graph structures introduced by Cohen [Coh00]. Let be an undirected weighted graph and be a set of edges called the hopset. In the graph the weight function is defined by letting for every and . Define the -limited distance in , denoted , to be the length of the shortest path from to that uses at most edges in . We call a -hopset, where , if, for any , we have .
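The hop-limited distance in the definition above can be computed by running a bounded number of Bellman-Ford rounds. The sketch below is illustrative only: the function name `hop_limited_distance` and the edge-list representation are assumptions, and the example hop is given a weight equal to the true distance between its endpoints, as is done for hops throughout this paper.

```python
import math

def hop_limited_distance(n, weighted_edges, source, beta):
    """Shortest-path length from source to each vertex using at most beta edges.

    weighted_edges is a list of undirected edges (u, v, w) over vertices 0..n-1.
    """
    dist = [math.inf] * n
    dist[source] = 0
    for _ in range(beta):
        new_dist = dist[:]  # copy so that each round adds at most one edge
        for u, v, w in weighted_edges:
            if dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
            if dist[v] + w < new_dist[u]:
                new_dist[u] = dist[v] + w
        dist = new_dist
    return dist

# Hypothetical example: a unit-weight path 0 - 1 - 2 - 3 and a single hop (0, 3).
graph_edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
hopset_edges = [(0, 3, 3)]
```

With one hop allowed, vertex 3 is unreachable from vertex 0 in the graph alone, while the graph together with the hop gives the exact distance 3.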
Clusters and Superclusters.
A cluster is a subset of vertices with a small weak diameter in . Every cluster has a special vertex that is denoted as the cluster center. A clustering is a collection of disjoint clusters, where the vertices of the clustering are denoted by . Note that is not necessarily .
In our algorithms we measure the distance between clusters by the distance between their centers respectively. Formally, define . In the same manner, for a vertex and a cluster , define . For a collection of clusters and a cluster , let . In the same manner, for a vertex and a cluster collection , let . Throughout, we break shortest-path ties based on IDs. For instance, the closest cluster in to a given vertex is the minimum-ID cluster in that satisfies that . The radius of a cluster is defined by .
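The center-based distance and the ID-based tie-breaking rule can be sketched as follows. The representation (a dictionary from cluster ID to center vertex, and a distance callback `dist`) and the function names are assumptions made for this illustration.

```python
def center_distance(cluster_a, cluster_b, centers, dist):
    """Distance between two clusters, measured between their centers."""
    return dist(centers[cluster_a], centers[cluster_b])

def closest_cluster(vertex, centers, dist):
    """Closest cluster to vertex by center-distance; ties go to the minimum ID."""
    return min(centers, key=lambda cid: (dist(vertex, centers[cid]), cid))

# Hypothetical example on the path 0 - 1 - 2 - 3 - 4, where dist(u, w) = |u - w|.
example_centers = {10: 0, 20: 4}  # cluster ID -> center vertex

def path_dist(u, w):
    return abs(u - w)
```

Vertex 2 is equidistant from both centers, so the rule selects cluster 10, the one with the smaller ID.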
A supercluster is a set of disjoint clusters, with one special vertex, namely the center of the supercluster, denoted by . Let be the vertices of the supercluster. A superclustering is a collection of vertex disjoint superclusters. Let denote the vertices of the superclustering . Similarly to clusters, for a pair of superclusters define . For a cluster and a supercluster , define . For a cluster and a superclustering , define . Similarly, for a vertex and a supercluster let , and for a superclustering , define . The radius of a supercluster is defined by .
1.4 Algorithmic Tools
Multiplicative Spanners of Baswana and Sen [BS07].
Our algorithms are based on a truncated variant of the Baswana-Sen algorithm, namely Procedure . The procedure gets as an input a graph , a stretch parameter and integer that determines the number of clustering steps. The algorithm begins with singleton clusters . Then, at every step , a clustering is defined based on the given clustering . Every cluster in is sampled with probability to . Vertices that are not adjacent to sampled clusters are called unclustered, and they do not appear in the following clusters. (We say that a vertex is adjacent to a cluster if contains at least one neighbor of .) For each unclustered vertex the procedure adds to the spanner one edge to each of its adjacent clusters in . Any other vertex joins its closest sampled cluster. Finally, the algorithm also adds to the output subgraph the spanning trees of each cluster (rooted at the cluster center) for every .
Fact 1.
*[Theorems 4.1, 4.2 and Lemma 4.1 in [BS07]]
(1) For every , , (2) the radius of each cluster is at most , (3) the total number of edges, in expectation, added to is , and (4) for every edge satisfying that at least one of the endpoints is not clustered in , it holds that .*
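To make the clustering step described above concrete, here is a minimal sketch of a single step on an unweighted graph. It is a simplification under stated assumptions: the set of sampled clusters is passed in explicitly rather than drawn at random, spanning-tree edges of earlier clusters are not tracked, and the names `clustering_step`, `cluster_of`, and `sampled` are hypothetical.

```python
def clustering_step(adj, cluster_of, sampled):
    """One Baswana-Sen style clustering step (illustrative sketch).

    adj: vertex -> list of neighbors.
    cluster_of: vertex -> cluster ID, for currently clustered vertices only.
    sampled: set of cluster IDs that survive to the next clustering.
    Returns the next clustering and the edges added to the spanner.
    """
    new_cluster_of = {}
    spanner_edges = set()
    for v in sorted(cluster_of):
        if cluster_of[v] in sampled:
            new_cluster_of[v] = cluster_of[v]  # sampled clusters keep their vertices
            continue
        # Pick one representative neighbor for each adjacent cluster.
        rep = {}
        for w in sorted(adj[v]):
            if w in cluster_of:
                rep.setdefault(cluster_of[w], w)
        near_sampled = sorted(cid for cid in rep if cid in sampled)
        if near_sampled:
            cid = near_sampled[0]  # join an adjacent sampled cluster
            new_cluster_of[v] = cid
            spanner_edges.add((min(v, rep[cid]), max(v, rep[cid])))
        else:
            # v becomes unclustered: keep one edge to each adjacent cluster.
            for w in rep.values():
                spanner_edges.add((min(v, w), max(v, w)))
    return new_cluster_of, spanner_edges

# Hypothetical example: path 0 - 1 - 2 - 3 with singleton clusters, cluster 0 sampled.
step_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
step_clusters = {v: v for v in step_adj}
```

In the example, vertex 1 joins the sampled cluster of vertex 0, while vertices 2 and 3 become unclustered and keep one edge to each adjacent cluster.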
Distance Oracles and Hopsets of Thorup and Zwick [TZ05].
Our hopset constructions build on the hopsets derived from the distance oracles of Thorup and Zwick [TZ05]. The construction of the hopset is based on defining a hierarchical collection of centers , where each is obtained by sampling each independently with probability of and let . The pivot of every vertex denoted by is the closest vertex in to . For every , its bunch contains all vertices in that are strictly closer to than . That is, , and let . Note that we add a hop from each vertex to all vertices in . Thorup and Zwick showed that for every , in expectation. The collection of bunches translates into a hopset as follows: for every and , add to an edge of weight . In our applications, we sometimes use a truncated version of the Thorup and Zwick construction for a given input parameter . Algorithm gets as input a graph , stretch parameter and an integer . The output of the algorithm is the TZ-hopset along with the subset , that is, level- centers that contains vertices in expectation.
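The pivots and bunches described above can be sketched as follows. This is an illustration under assumptions: the sampling probability `p` is left as a parameter, distances are supplied through a callback `dist`, and the function names are hypothetical. The bunch rule implemented is the one stated in the text: a vertex of one level enters the bunch if it is strictly closer than the nearest center of the next level.

```python
import math
import random

def sample_levels(vertices, p, k, rng):
    """Nested center sets: level 0 is all vertices, the final level is empty."""
    levels = [set(vertices)]
    for _ in range(1, k):
        levels.append({v for v in sorted(levels[-1]) if rng.random() < p})
    levels.append(set())
    return levels

def pivots_and_bunch(v, levels, dist):
    """Pivot of v at each level (None for an empty level) and the bunch of v."""
    pivots = [min(A, key=lambda w: (dist(v, w), w)) if A else None for A in levels]
    bunch = set()
    for i, A in enumerate(levels):
        nxt = levels[i + 1] if i + 1 < len(levels) else set()
        next_dist = min((dist(v, w) for w in nxt), default=math.inf)
        # Keep level-i vertices strictly closer than the next-level pivot.
        bunch |= {w for w in A - nxt if dist(v, w) < next_dist}
    return pivots, bunch

# Hypothetical example on the path 0 - 1 - 2 - 3 - 4 with dist(u, w) = |u - w|.
example_levels = [{0, 1, 2, 3, 4}, {0, 4}, set()]

def line_dist(u, w):
    return abs(u - w)
```

For vertex 1 in the example, the bunch is `{0, 1, 4}`, so the hops `(1, 0)` and `(1, 4)` would be added with weights 1 and 3. Passing a fixed list of levels instead of sampling keeps the example deterministic.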
Fact 2.
Fix a vertex pair and . Let be the minimal index such that , and similarly let be the minimal index such that . Let . (i) For every , it holds that , and (ii) the hopset satisfies that . Furthermore (iii), it holds that in expectation.
Proof.
(i) By induction on . For this is clear since . Assume the claim is true for , thus and . For , since , it follows that
[TABLE]
(ii) W.l.o.g., let . Then, . Then by (i) it follows that . ∎
Throughout, in all our hopset constructions, whenever an edge is added to the hopset it is given a weight of . To avoid confusion, the edges not in are referred to as hops.
Roadmap.
In Sec. 2, we present the first spanner construction that provides a stretch of for all vertices at distance at most . Then in Sec. 3, we present the key construction of spanners. Sec. 4 presents a simplified construction for spanners. Sec. 5 describes the new hopset constructions. Finally, in Appendix 6, we describe the implementation details and applications, and in Appendix B, we show improved constructions of spanners and hopsets.
2 Improved Spanners for Close Vertex Pairs
This section is devoted to showing Theorem 2. We consider unweighted graphs, and describe the construction of a spanner that provides a stretch of for every pair of vertices and at distance at most in , provided that . In the language of -spanner, this spanner can be viewed as an spanner. For simplicity, we fix a distance value and explain how to provide a stretch of using a subgraph of expected size . The same procedure will be repeated for every .
Description of Algorithm .
The algorithm consists of two key steps. The first step applies a truncated version of Baswana-Sen algorithm, applying only the first clustering steps. This results in a clustering of expected size , as well as a subset of edges added to the spanner (i.e., that takes care of the vertices that are not in the clusters of ). In the second step, the algorithm computes a cluster-graph as follows. The vertices of the cluster-graphs, denoted as super-nodes, are the clusters of . Every two clusters are connected in iff the distance between their centers in is at most . That is, . The algorithm then computes a spanner on this cluster graph, by using any standard multiplicative spanner procedure (e.g., the greedy spanner). Finally, the edges of this spanner are translated into -edges as follows: for every , add to the shortest path in between the centers of and . This completes the description of the algorithm. See Alg. 1 for a pseudocode.
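The second step of the algorithm (cluster graph, spanner on the cluster graph, translation back into paths of the input graph) can be sketched as below. This is a sketch under assumptions, not the paper's pseudocode: clusters are identified with their centers, the distance threshold and the stretch of the cluster-graph spanner are passed as parameters `threshold` and `stretch`, and the greedy spanner is used as the multiplicative spanner procedure.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, s, t):
    """A BFS shortest path from s to t as a vertex list, or None if unreachable."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if t not in parent:
        return None
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return path[::-1]

def greedy_spanner(nodes, edges, stretch):
    """Keep an edge only if its endpoints are farther than stretch in the spanner so far."""
    h = {x: set() for x in nodes}
    kept = []
    for a, b in edges:
        p = shortest_path(h, a, b)
        if p is None or len(p) - 1 > stretch:
            h[a].add(b)
            h[b].add(a)
            kept.append((a, b))
    return kept

def cluster_graph_spanner_edges(adj, centers, threshold, stretch):
    """Spanner edges of the cluster graph and the graph edges they translate into."""
    paths = {}
    for a, b in combinations(sorted(centers), 2):
        p = shortest_path(adj, a, b)
        if p is not None and len(p) - 1 <= threshold:
            paths[(a, b)] = p  # clusters a and b are adjacent in the cluster graph
    kept = greedy_spanner(centers, sorted(paths), stretch)
    graph_edges = set()
    for pair in kept:
        p = paths[pair]
        graph_edges |= {(min(x, y), max(x, y)) for x, y in zip(p, p[1:])}
    return kept, graph_edges

# Hypothetical example: a 6-cycle with cluster centers 0, 2 and 4.
cycle_adj = {i: [(i + 1) % 6, (i - 1) % 6] for i in range(6)}
```

On the 6-cycle with threshold 2 and stretch 3, the cluster graph is a triangle, the greedy spanner keeps two of its three edges, and these translate into four edges of the cycle.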
We next analyze Algorithm and prove Thm. 2.
Proof.
Stretch: Fix a pair of vertices and at distance in for , and let be their shortest path in . We consider two cases. First, assume that no edge on has both its endpoints in the clusters of . Then, by Fact 1(4), we have that for every edge in . This gives a - path in of total length . Next consider the complementary case where contains at least one edge with both its endpoints clustered. Let be the leftmost and rightmost clustered vertices on . Let us define the following subpaths , and , such that . For every vertex , let be its closest cluster in with respect to the distance to the centers. Since , it holds that , thus by the properties of the -spanner , it holds that . As each edge in corresponds to a shortest path of length at most in , we have:
[TABLE]
Finally, again by Fact 1(4), we have that and . We therefore conclude that
[TABLE]
Size: We show that for each fixed distance value , the algorithm adds at most edges to the spanner. By Fact 1(3), the first steps of the Baswana-Sen clustering add edges in expectation. By Fact 1(1), in expectation, contains clusters. Thus, the cluster-graph has super-nodes, and the size of its -spanner is in expectation. Since each edge in corresponds to a path of length at most in , we get that this step adds edges to the spanner. The size argument follows. ∎
3 New Spanners
In this section, we consider Theorem 3 and show the construction of spanners that provide a nearly optimal stretch for all pairs at distance and . For the sake of simplicity, we consider a fixed distance value , and prove the following lemma:
Lemma 4.
For any -vertex unweighted graph , any and integers , there is a subgraph of expected size such that for every , at distance in , it holds that .
We later show how Theorem 3 follows by applying the construction of Lemma 4 for any . We now turn to describe our three stage procedure for computing the spanners of Lemma 4.
Algorithm .
The algorithm works in three stages. In the first stage it calls Procedure for clustering steps. This results in a clustering of clusters in expectation, each with radius at most .
In the second stage, the number of clusters is dramatically reduced to while keeping the radius of each cluster to . In the last stage, a cluster graph is computed on the collection of clusters, and a spanner is computed on that graph. Each edge in will be translated into a -path that is added to the final spanner.
Preliminary Stage: Truncated Baswana-Sen Algorithm.
The algorithm starts by applying the first steps of Alg. . This results in a clustering and a subgraph . By the properties of the Baswana-Sen algorithm, has edges in expectation and consists of clusters in expectation with radius at most . By Fact 1(4), we have:
Claim 1.
For every unclustered vertex and every vertex , .
Middle Stage: Superclustering.
For clarity of presentation, throughout we assume that divides ; up to factor in the final stretch, this assumption can be made without loss of generality. The middle step consists of applications of Procedure . We refer to each application of this procedure as a phase. For clarity of presentation, we also assume to be an integer, and in Sec. A we describe how to remove this assumption. In each phase , the input to Procedure is a clustering of radius . The output of the phase is a clustering of radius , and a subgraph that takes care of all vertices that became unclustered in that phase. This output clustering is obtained by applying steps of supercluster growing. As we will see, the superclustering procedure will be very similar to Proc. , only that we will now treat each cluster as a node.
Starting with the trivial superclustering whose radius is bounded by , in the step of phase for , the algorithm is given a superclustering . The radius of these superclusters will be bounded by . The algorithm defines a superclustering along with a subgraph as follows.
1. Each unclustered vertex satisfying that
[TABLE]
adds to the shortest paths to the closest center in , where
[TABLE]
2. Let be the collection of superclusters obtained by sampling each supercluster independently with probability of , where .
3. Set . Each cluster at center-distance at most from joins the supercluster of its closest center in , by adding the shortest path between the centers to .
4. The superclustering consists of all sampled superclusters in augmented by their nearby clusters (i.e., at center-distance at most ). The center of each sampled supercluster in maintains its role in the augmented supercluster. The radius of this supercluster will be shown to be bounded by .
5. Each cluster at center-distance larger than from , adds to the shortest path from its center to the center of any supercluster in at center-distance at most .
A vertex is said to be -unclustered if belongs to the superclusters of but does not belong to the superclusters of .
The parameter is set in a way that guarantees that for every -unclustered vertex and every , the subgraph contains a - path of length at most (hence providing a stretch of for every ).
Let be the output superclustering after the last step in the phase. Then the output clustering of the phase is given by . That is, all the clusters in a given supercluster in form a single merged cluster in . Finally, let . This completes the description of the phase. After phases, the output clustering is shown to contain at most clusters in expectation. Let be the current spanner.
Finalizing Stage: Spanner on the Cluster Graph.
Given the collection of clusters in , the algorithm first adds a shortest path from each unclustered vertex to its closest cluster in up to center-distance , if such a cluster exists. Next, a cluster graph is defined by connecting two clusters if their center-distance in is at most . That is, . Let be a -spanner of for . For every edge , the shortest path between the centers of and is added to the spanner . This completes the description of the algorithm.
Stretch Analysis.
For the rest of the analysis, let thus , , , for , and let be the radius of the clusters in the last clustering at the end of the middle stage. For the sake of the stretch and size analysis, we will need the following two claims, bounding and , respectively. Both claims follow by simple inductive arguments.
Claim 2.
For every , .
Proof.
For the definition of see Eq. (8). We first show by induction on that
[TABLE]
The base case, is trivial since . Assuming that the claim holds for , letting , we have:
[TABLE]
Therefore, we have:
[TABLE]
∎
We next turn to bound the radii of the superclusters in each phase of the algorithm. Note that in the step, since we add to each sampled supercluster, all clusters at center-distance at most , the radius of the new supercluster is increased by an additive term of at most .
Claim 3.
If then for each it holds that . In particular, for the final radius of the clustering it holds that .
Proof.
We prove the claim by induction on . The base case, , is trivial. Assuming the claim is true for we show the correctness for . Phase begins with clusters in with radii , and finishes after steps of Procedure with clusters of radius at most . We therefore bound . At step of we start with the superclustering of radius , and we add to the superclusters, clusters of radius at center-distance at most . Thus at the th step we increase the radii of the superclusters by an additive factor of at most . Combining this with the bound on of Claim 2:
[TABLE]
Thus by plugging , we get:
[TABLE]
The third inequality holds when , the fourth inequality follows as , and the fifth inequality follows as . Finally, by plugging we get that:
[TABLE]
where the fourth inequality holds for . ∎
Definition 1 (Clustered and Unclustered Vertices).
A vertex is 0-unclustered if . A vertex is called -unclustered if . Finally, a vertex is called clustered if , i.e., it belongs to the last level of clustering.
In the following claims we show that the spanner provides a low-stretch path from to any vertex at some fixed distance from in . The case of 0-unclustered vertices follows by Claim 1, so it remains to consider -unclustered vertices for .
Claim 4.
For , for each -unclustered vertex , for any .
Proof.
Let . In the beginning of the step we add to the shortest path from any vertex at center-distance at most from to its closest center in the superclustering. In particular, since and , we add the shortest path from to the center of some supercluster . Since gets unclustered in this step, we add to the shortest path from its cluster, , to any supercluster which is in and is at center-distance at most . Since , we add the shortest path from the center of to the center of . Consequently, there is a path from to that goes through the centers of of length at most:
[TABLE]
where in the second inequality, the bound on follows by Eq. (3). We next show by induction on that
[TABLE]
The base case of holds vacuously. Assuming the correctness up to , by Eq. (8):
[TABLE]
Plugging Eq. (5) in Eq. (4) we get that: . ∎
Claim 5.
Let be vertices at distance in , and let be a shortest path between them in . If there is some clustered vertex , then .
Proof.
Let be the cluster to which belongs. In the beginning of the third stage of the algorithm we add shortest paths from unclustered vertices at center-distance at most to their closest cluster center in . Thus since and , it holds that both add their shortest paths to the centers of some clusters to the spanner. Observe that,
[TABLE]
Therefore, it holds that , thus . Since each edge translates into a path in of length at most , we have that:
[TABLE]
∎
We next complete the proof of Lemma 4.
Proof of Lemma 4.
Stretch. Let be vertices at distance in , and let be some shortest path between them in . First observe that if there is some clustered vertex the claim follows from Claim 5, so we assume there is no such vertex. Partition the path into consecutive segments in the following way: denote and inductively define to be the vertex on at distance on the segment , where:
[TABLE]
Let be the index of the last segment, thus . For any , if is 0-unclustered then by Claim 1 it holds that . If is -unclustered then since , by Claim 4 it holds that . Thus except for at most the last segment , the spanner provides a multiplicative stretch of to each of the other segments. For the last segment , if is 0-unclustered then we are done by Claim 1. Otherwise, there exist such that is -unclustered, and by Eq. (3) and Eq. (3):
[TABLE]
Therefore by summing over all these at most segments, and plugging the bound on from Claim 3 we get that:
[TABLE]
Size Analysis.
By Fact 1(3), in expectation. Consider now the second stage. For any , step starts by adding shortest paths from unclustered vertices to their closest centers in the superclustering. Since each vertex adds its shortest path to its closest center, and since we break ties in a consistent manner, this step adds at most edges. The number of shortest paths added between unclustered clusters in each step can be bounded as follows.
Claim 6.
Fix a phase . For any , the algorithm adds in step a collection of shortest paths in expectation.
Each shortest path is of length at most . Therefore, by combining with Claim 6, in expectation. By summing over all phases, we get a total of edges.
For the size analysis of last stage we will need the following claim, which follows by a simple induction.
Claim 7.
For any , the expected number of superclusters in is , thus , in expectation.
From the claims above, we get that , in expectation. Consequently, (in expectation). Since each edge in translates into a path of length at most , this step contributes edges to the spanner. Overall, , in expectation. ∎ Theorem 3 now follows by noting that:
Observation 1.
Let be a subgraph satisfying that for every at distance in , then is a -spanner of .
4 New Spanner
In this section, we show an optimized variant of our spanner for the case where . For the purpose of efficient implementation, we settle for a slightly worse value of . In Sec. B.1, we show an improved construction that achieves the bounds of Lemma 1. Our main result is:
Lemma 5.
For any -vertex unweighted graph , integer and , one can compute a spanner with , and expected size .
We note that unlike the constructions in earlier sections, this construction works for all distances , and there is no need to consider each distance class separately.
Algorithm Description.
For simplicity, we assume throughout that is an integer. The algorithm contains clustering phases. Starting with the trivial clustering of radius 0, in each phase , given is a clustering of expected size (except for where ) with radius at most . The output of the phase is a clustering of expected size and radius , and a subgraph that takes care of the unclustered vertices in .
We now zoom into the phase and explain the construction of the clustering and the subgraph . The phase is governed by two key parameters: the sampling probability of each cluster to join the clustering , and an augmentation radius . Let and for every , define
[TABLE]
The description of the phase for is as follows:
1. Each unclustered vertex with , adds to the shortest path to its closest center in .
2. Let be the collection of clusters obtained by sampling each cluster independently with probability of .
3. Each cluster such that , joins the sampled cluster in with minimal distance between their centers, and adds the shortest path between their centers into .
4. The clustering consists of all the sampled clusters in augmented by their nearby clusters in (i.e., at distance between their centers). That is, each cluster in is made of a star of clusters, where the head of the star is a sampled cluster whose center is connected to the centers of a subset of clusters in .
5. Each cluster such that , adds to the shortest path between its center to any center of with .
This completes the description of the phase. Let and let be the output clustering of the last phase . In the analysis section we show that in expectation consists of at most a single cluster . In the latter case, the algorithm adds to the output spanner , a BFS tree rooted at the centers of the clusters of .
Stretch Analysis.
We start by bounding the radius of the clustering for every . We first make a simple observation:
Observation 2.
For every , . In particular, the radius of cluster in the final clustering is .
Proof.
The claim is shown by induction on . For the base case, where , the 0-level clusters are simply singletons, and each node joins its closest sampled vertex at distance at most . Therefore the clusters in have radius of . Assume that the claim holds up to , and consider the clustering which is defined in the phase based on the clustering . Each cluster in is formed by a star: the head of the star is the sampled cluster that is connected to all clusters with center-distance at most from . The radius of this cluster in is bounded by
[TABLE]
where the second inequality follows by plugging the bound on obtained from the induction assumption and the bound on from Eq. (11). ∎
Observation 3.
For , therefore after phases, there is one cluster in in expectation.
Proof.
By induction on . For the base case , since , the number of sampled clusters is in expectation. Assume that the claim holds up to , and consider the clustering. In the th phase, each cluster in is sampled with probability of , therefore, the number of sampled clusters in expectation is . By induction assumption, , and thus . By plugging , we get that in expectation . ∎
Our stretch argument is based on the following definition of clustered vertices.
Definition 2.
For every , a vertex is called -unclustered. In addition, every is called clustered (i.e., belongs to the final clustering).
For every -unclustered vertex we provide a stretch guarantee as a function of . Specifically, the earlier that the vertex stops being clustered, the smaller is the ball around for which the stretch guarantee is provided.
Claim 8 (-unclustered).
For any -unclustered vertex , it holds that for all .
Proof.
For the first phase in step , every -unclustered vertex adds to the spanner its edges to any vertex at distance , the claim follows. ∎
Claim 9 (-unclustered).
For any -unclustered vertex , for every , and every , it holds that .
Proof.
Consider an -unclustered vertex and . By definition, , thus . Let be the closest cluster to with respect to the measure, and let be the cluster of in . Then, the algorithm adds to a shortest path from to the center of in step (1). We have:
[TABLE]
Since becomes unclustered in phase , in step the algorithm adds to the shortest path from the center of to the center of . We therefore have:
[TABLE]
∎
In particular, for , by Eq. (11) it holds that . Since the algorithm adds to the spanner, a BFS tree w.r.t the centers of the clusters in , we have that:
Claim 10.
Let and let be the shortest path between them in . If there is a clustered vertex then
[TABLE]
Proof.
Since is clustered, its distance from the cluster center is at most . As the algorithm adds the BFS tree of to the spanner, we have:
[TABLE]
the claim follows by plugging the bound on from Obs. 2. ∎
We are now ready to prove Lemma 5.
Proof of Lemma 5.
Let be vertices at some distance in , and let be the shortest path between them in . First observe that by Claim 10, if there is some clustered vertex in then we are done. Assume from now on that no vertex on the path is clustered. We define a sequence of vertices between and in an iterative manner: let , and for every given that , define where:
[TABLE]
Let be the minimal index such that , thus we have defined segments for . By Claim 8, the first case in the definition of causes no stretch. By Claim 9 the second case causes a multiplicative stretch of , unless it ends in . The latter case can happen only for the last segment, and in this case by Claim 9 we will get an extra additive stretch of , which by Observation 2 is at most . In other words, for every segment such that , we have that . For the last segment we have that . Therefore,
[TABLE]
Size Analysis. Fix a phase . In step each vertex adds to a path of length at most to its closest cluster center in . This adds edges in total by breaking shortest path ties in a consistent manner. In step we add to the paths that connect clusters in to the sampled clusters at distance at most . This also consumes at most edges by breaking shortest path ties in a consistent manner. In step we add to paths from clusters that were not sampled and did not join other clusters, to nearby clusters at distance at most . To bound the number of edges added to the spanner in this step, we will need the following:
Lemma 6.
The number of shortest paths added to in step is at most in expectation.
Proof.
Let and let be the clusters with center-distance at most from . Note that the centers of each pair of these clusters are at distance at most , thus if one of these clusters is sampled, then all the others will join some sampled cluster. That is, none will take part in the fifth step of the algorithm. Thus the number of shortest paths from a given cluster added in this step is in expectation. ∎
We now bound the total number of edges added in step of phase . In phase there are clusters and each adds to the spanner a shortest path of length at most . Thus by Lemma 6 this adds at most edges in expectation. By plugging the values of as defined in Eq. (11), it follows that for , we get a total of edges. For , the total number of edges is bounded by . By plugging the values of of Observation 2 we get that the expected number of edges in all the subgraphs is bounded by
[TABLE]
The last step adds a constant number of BFS trees, thus contributing edges. Lemma 5 follows. ∎
5 A New Family of Hopsets
In this section we present a new construction of hopsets with edges and . The structure of the section is as follows. First, in Subsec. 5.1 we show the construction of hopsets in the simpler regime of . Then, in Subsec. 5.2 we show the high level construction of hopsets for the complementary regime of . Finally, in Sec. 5.3 we also show the construction of hopsets.
5.1 Hopsets for .
This subsection is devoted to proving Theorem 4 of . We show the following:
Theorem 5.
For any -vertex weighted graph , integer and such that , one can compute an hopset for and where edges in expectation.
For simplicity we focus on a fixed distance class by considering all vertex pairs at distance . The algorithm is then applied for each of the distance classes. Throughout, when adding edges to the hopset, we set the weight of these edges to . In addition, to distinguish between real -edges and -edges, we refer to the latter as hops. The algorithm has a similar structure to Alg. , and the key difference is that we use here the Thorup-Zwick hopsets to handle the sparse case, rather than applying the truncated Baswana-Sen algorithm. Specifically, the first step of the algorithm computes a hopset by applying the algorithm of Thorup and Zwick. Let be the th level of centers, and define as the clusters of weighted radius
[TABLE]
centered at the vertices of . Specifically, every vertex of distance at most from joins the cluster of its closest center in , this vertex is now clustered. In the hopset, we connect each clustered vertex to the center of its cluster.
In the second step, a cluster graph is defined on the clusters of in the exact same manner as in Alg. . That is, any two clusters in are neighbors in if their -center-distance is at most . Note that as before, the cluster graph is unweighted. Letting be the -spanner on , then for every edge , the algorithm adds to the hopset a hop between the centers of the clusters . This completes the description of the algorithm.
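The clustering around the sampled centers, together with the hops from clustered vertices to their centers, can be sketched with a radius-bounded multi-source Dijkstra. The sketch is illustrative only: the radius is passed as a parameter, edge weights are assumed positive, and the names `cluster_hops` and `weighted_adj` are hypothetical. Ties between equally close centers are resolved toward the smaller center ID, matching the tie-breaking convention of the preliminaries.

```python
import heapq

def cluster_hops(weighted_adj, centers, radius):
    """Assign each vertex within the radius to its closest center and emit hops.

    weighted_adj: vertex -> list of (neighbor, weight), with positive weights.
    Returns (cluster_of, hops), where hops maps (vertex, center) to the hop weight,
    which equals the graph distance between the two endpoints.
    """
    best = {}
    heap = [(0, c, c) for c in sorted(centers)]  # entries are (distance, center, vertex)
    heapq.heapify(heap)
    while heap:
        d, c, v = heapq.heappop(heap)
        if v in best:
            continue  # v was already claimed by a closer (or smaller-ID) center
        best[v] = (d, c)
        for w, wt in weighted_adj[v]:
            if w not in best and d + wt <= radius:
                heapq.heappush(heap, (d + wt, c, w))
    cluster_of = {v: c for v, (d, c) in best.items()}
    hops = {(v, c): d for v, (d, c) in best.items() if v != c}
    return cluster_of, hops

# Hypothetical example: weighted path 0 -1- 1 -2- 2 -2- 3 -1- 4 with centers 0 and 4.
weighted_path = {
    0: [(1, 1)],
    1: [(0, 1), (2, 2)],
    2: [(1, 2), (3, 2)],
    3: [(2, 2), (4, 1)],
    4: [(3, 1)],
}
```

With radius 2, vertices 1 and 3 are clustered and receive one hop each, while vertex 2 is at distance 3 from both centers and remains unclustered.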
We are now ready to complete the proof of Thm. 5. Throughout, a vertex is clustered if , and unclustered otherwise.
Claim 11.
For every unclustered vertex and every vertex , it holds that:
[TABLE]
Proof.
Since is unclustered, we have that . That is, . Let be the minimal index such that , define analogously and let . Assume towards contradiction that . First assume that , then by Fact 2 (i) we have that
[TABLE]
thus leading to a contradiction as is unclustered. Otherwise, assume that , then again by Fact 2(i) we have that
[TABLE]
contradiction again. Thus, we have that and by Fact 2 (ii), as desired.∎
We next complete the proof of Theorem 5.
Proof.
Stretch and Hop-Bound. Fix a pair at distance in , and let be their shortest-path in . First, assume that at most one vertex in is clustered. Partition the path into consecutive segments from to in the following way: denote , and for every inductively define to be the farthest vertex on at distance (shortest-path distance, not hop-distance) at most from . Observe that it might be the case that is simply , in the case where the edge incident to on the segment has weight larger than . Also, let be the vertex incident to on the segment . In the case where , also let .
Observe that for every , if (which happens when ) then it must hold that . Therefore the path is partitioned into at most segments, where the segment is and , where is the index of the last segment that reaches .
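The greedy partition of the shortest path can be sketched as follows. This is an illustrative Python sketch only: the path is represented by its list of edge weights, and `bound` is a hypothetical stand-in for the per-segment distance threshold used in the proof.

```python
def partition_path(weights, bound):
    """Greedily partition a weighted path into segments.

    The path has vertices 0..len(weights), and weights[i] is the weight
    of the edge (i, i + 1). Starting from v = 0, u is the farthest vertex
    at path-distance at most `bound` from v (possibly u == v when the
    next edge is heavier than `bound`), and the next segment starts at
    the vertex following u. Returns the list of (v, u) index pairs.
    """
    last = len(weights)
    segments = []
    v = 0
    while True:
        u = v
        length = 0.0
        while u < last and length + weights[u] <= bound:
            length += weights[u]
            u += 1
        segments.append((v, u))
        if u == last:
            break
        v = u + 1  # the edge (u, u + 1) connects consecutive segments
    return segments
```

For example, with edge weights `[1, 1, 1, 5, 1]` and `bound = 2`, the heavy edge of weight 5 forces a degenerate segment `(3, 3)`. Any two consecutive segments, together with the edge joining them, cover path-length more than `bound`, which is what limits the number of segments.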
For any , and (except maybe for ) . For , by the assumption that has at most one clustered vertex, it holds that at least one of the vertices is unclustered. Thus by Claim 11, it holds that , and consequently using the edge , . In the last segment we might have that , and thus since , by Claim 11 again, we get that . Finally,
[TABLE]
Next, assume that contains at least two clustered vertices. Let be the leftmost and rightmost clustered vertices on , and let such that . Let be the clusters of respectively, and let and be the centers of these clusters. Thus the hopset contains a hop from to and a hop from to . Since it holds that . This in turn implies that . Thus in the hopset we have a path of at most hops from to : one hop from to , then hops from to , and the last hop from to . Since all the clusters in have radius , and since we connect clusters in the cluster-graph if , we have that the distance in between the centers of adjacent clusters in is at most . Therefore, each edge in has a weight at most . Overall since , we have:
[TABLE]
Finally, since each of the segments and contains at most one clustered vertex, by the argument of the first case, there is a path of at most hops from to , and from to that provides a multiplicative stretch of a for each of the segments and . Therefore, we have that
[TABLE]
where the last inequality holds as . Since we assume , the final stretch is bounded by stretch. See Figure 5 for an illustration.
Size. We show that the total number of edges added to is bounded by in expectation. Step (I) adds at most edges by Fact 2(iii). Step (II) adds edges, between possibly each vertex to its closest cluster in . Finally, in Step (III) we add edges to the hopset. Since contains clusters, the spanner contains at most edges. ∎
5.2 Hopsets for .
Finally, we consider hopsets for the complementary regime of , which will complete the proof of Theorem 4. This regime is considerably more involved and bears similarities with the spanner construction of Sec. 3. Specifically, we show:
Theorem 6**.**
For any -vertex weighted graph , integer (the statement can work for any , at the cost of larger constants), and , one can compute an hopset for and , of expected size .
To prove the theorem, we will show the following key lemma which restricts attention to a fixed distance class .
Lemma 7**.**
For any -vertex weighted graph , integers and , there is a hopset of expected size such that for every , at distance in , it holds that .
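To make the hop-bounded distance in such guarantees concrete, the following Python sketch computes the shortest distance that uses at most `beta` edges, via a round-limited Bellman-Ford. It is an illustration for checking a candidate hopset on small inputs; the function name and the edge-list representation are assumptions, not part of the paper.

```python
def hop_bounded_distance(n, edges, source, beta):
    """Shortest distances from `source` using at most `beta` edges.

    n is the number of vertices (labeled 0..n-1) and edges is a list of
    undirected (u, v, weight) triples, which may include hopset edges.
    """
    inf = float("inf")
    dist = [inf] * n
    dist[source] = 0.0
    for _ in range(beta):
        new = dist[:]  # relax from the previous round only
        for u, v, w in edges:
            if dist[u] + w < new[v]:
                new[v] = dist[u] + w
            if dist[v] + w < new[u]:
                new[u] = dist[v] + w
        dist = new
    return dist
```

Running it once on the edges of the graph together with the hops and once with an unbounded number of rounds on the graph alone gives the two quantities that a stretch bound compares.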
Algorithm .
Fix a distance range . The same procedure will be repeated for every distance range. The algorithm works in three stages. In the first stage it calls Procedure for steps and radius parameter where (see Algorithm 5). This results in a partial hopset , and a clustering of clusters in expectation, each with radius . We say that a vertex is [math]-unclustered if it is not in . By the end of this stage, we will have the guarantee that for every [math]-unclustered vertex and every vertex , the current hopset contains a -hop path from to of length at most .
The second stage contains phases of superclustering. In each phase , the procedure runs for steps. We refer to the step of the phase, by step . Step begins with a superclustering . The output of that step is a superclustering along with a collection of hop edges to be added to the hopset.
We say that a vertex is -unclustered if it is in . The analysis shows that for each -unclustered vertex , the edges added to the hopset in the step provide a -hop - path of length to any vertex . The parameter grows at each step but it is bounded by .
At the beginning of the third stage we have a clustering with clusters in expectation, of (weighted) radius at most . First, the algorithm connects each vertex , satisfying that , to its closest center. Next, a cluster graph is defined by letting . Let be an spanner of . For every edge , the algorithm adds an hop between and to the hopset . This completes the high-level description of the algorithm.
First Stage: Initial Clustering.
The algorithm starts by applying Procedure for steps with radius parameter where . This results in a clustering and a partial hopset . By the properties of Procedure and the chosen parameters, the clustering has clusters, in expectation, of radius at most . The partial hopset contains edges, in expectation. By the exact same argument as in Claim 11, we have that:
Claim 12**.**
For every [math]-unclustered vertex and every , it holds that:
Middle Stage: superclustering.
For clarity of presentation, throughout we assume that divides ; up to a factor of in the final stretch, this assumption can be made without loss of generality. The middle stage consists of applications of Procedure . For clarity of presentation we assume that is an integer, and in Sec. A we describe how to remove this assumption. We refer to each application of this procedure as a phase. In each phase , the input to Procedure is a clustering of radius where . The output of the phase is a clustering of radius , and a hopset that takes care of all vertices that became unclustered in that phase. This output clustering is obtained by applying steps of supercluster growing. As we will see, the clustering procedure will be very similar to Proc. from Sec. 3, only that we add hops rather than shortest paths.
Starting with the trivial superclustering of radius , in the step of phase for , the algorithm is given a superclustering of radius . The algorithm then outputs a superclustering along with a hopset by taking the following steps.
1. Each unclustered vertex at center-distance at most from centers of adds to a weighted hop to its closest center in , where and for ,
[TABLE]
2. Let be the collection of superclusters obtained by sampling each supercluster independently with probability of , where .
3. Each cluster at center-distance at most from joins the supercluster of its closest center. All the vertices in add to a hop to the center of their new supercluster.
4. The superclustering consists of all sampled superclusters augmented by their nearby clusters (i.e., at center-distance ). The center of each new supercluster is the center of the sampled supercluster.
5. Each cluster at center-distance larger than from , adds to a weighted hop from its center to the center of any supercluster with .
The parameter is set in a way that guarantees that for every -unclustered vertex and every , the hopset contains a hop path - of length at most .
Let be the output superclustering after steps, then the output clustering is formed by merging all clusters in the supercluster to a single cluster in for every . That is, . Let . This completes the description of the phase. After phases, the output clustering is shown to contain at most clusters in expectation. Let be the current hopset after these phases.
Finalizing Stage: Spanner on Cluster Graph.
Given the collection of clusters in , the algorithm first adds a weighted hop from each unclustered vertex to the center of its closest cluster in up to center-distance , if such exists. Next, a cluster graph is defined by connecting two clusters if their center-distance in is at most . That is, . Let be a -spanner of for . For edge , a weighted hop between the center of and the center of is added to the hopset . This completes the description of the algorithm.
The analysis of this algorithm is very similar to the analysis of the spanner construction from Section 3.
Stretch Analysis.
Recall that , . For let and let be the radius of the clusters in the last clustering at the end of the middle stage. For the sake of the stretch and size analysis, we will need the following two claims, bounding and , respectively. The next claim is the analog of Claim 2 in Section 3:
Claim 13**.**
For , .
Proof.
We show by induction on that
[TABLE]
The base case, is trivial since . By Eq. (8) and the induction assumption,
[TABLE]
By Eq. (9) we then have:
[TABLE]
∎
We next turn to bound the radius of the clustering for every . The next claim is the analog of Claim 3 in Section 3:
Claim 14**.**
For each it holds that , thus .
Proof.
We show the correctness of the claim by induction on . The base case, for follows by the definition of . Assuming the claim holds up to we show the correctness for . Phase begins with clusters with radius at most , and terminates after superclustering steps of Procedure with clusters with radius . We therefore bound . At step of Proc. , we have superclusters of radius . The algorithm then connects clusters of radius , at distance at most from the sampled supercluster. In the step, the radius of the new supercluster is then increased by an additive factor of . Combining this with the bound on of Claim 2, we have:
[TABLE]
Thus plugging :
[TABLE]
The third inequality is valid when . Finally, the final radius is bounded by:
[TABLE]
where the fourth inequality holds for . ∎
Definition 3** (Clustered and Unclustered Vertices).**
A vertex is [math]-unclustered if . A vertex is -unclustered if . A vertex is clustered if .
By Claim 11 we have that for every [math]-unclustered vertex and any . We now consider the remaining vertices.
Claim 15**.**
For , for each -unclustered vertex , for any .
Proof.
Let . In the beginning of the step, the algorithm adds to hops from any vertex at center-distance at most from to the center of its closest supercluster. In particular, since , we add a hop from to the center of some supercluster , and the weight of this hop is at most . Since is unclustered in this step, the algorithm adds to a hop from the center of its cluster to the center of any supercluster which is in and is at center-distance at most from . Since , thus the algorithm adds a hop from (the center of ) to (the center of the supercluster ) of weight at most
[TABLE]
and consequently, there is a -hop from to through and of length at most:
[TABLE]
where the second inequality follows from Eq. (5.2). We show by induction on that The base case holds trivially. Assuming the correctness of the claim up to we have:
[TABLE]
We therefore have that,
[TABLE]
Figure 3, though illustrating claims regarding spanners, can be used to illustrate the proofs of Claims 15 and 16 as well. ∎
Claim 16**.**
Let be vertices at distance in , and let be a shortest path between them. If there is some clustered vertex , then .
Proof.
Let be the cluster to which belongs. In the beginning of the third stage of the algorithm we add hops from unclustered vertices at center-distance at most to their closest cluster center. Thus since and , it holds that both have hops to centers of some clusters . Since
[TABLE]
it holds that , thus . Since each edge translates into a hop in between the centers of of weight at most , we have the following,
[TABLE]
where the last inequality follows by plugging the bound on the final radius from Claim 14. See Figure LABEL:fig:SecondHopsetAnalysis(b) for an illustration. ∎
Proof of Theorem 6.
Stretch and Hop-Bound. Let be vertices at distance in , and let be some shortest path between them in . First observe that if there is some clustered vertex the claim follows from Claim 16. Assume from now on that there is no such vertex. Partition the path into consecutive segments in the following way: denote and inductively, given define to be the furthest vertex on at distance at most from , where:
[TABLE]
Note that might be equal to if the incident edge to on has weight larger than . In addition, for each let be the consecutive neighbor of on . When , simply let . Let be the minimal index such that . This defines a partition of to segments by setting for all , and is the last segment that reaches . Note that for every segment (except at most the last one) . If is [math]-unclustered this is clear, and if is -unclustered, then by Eq. (8) it holds that . Thus we have that . For any , if is [math]-unclustered, by Claim 11, it holds that:
[TABLE]
If is -unclustered then since , by Claim 15, it holds that . There are two cases to consider. First assume that . In such a case . Next, assume that . By the definition of the segment, in such a case it must hold that . Thus by proof of Claim 15 and Eq. (5.2),
[TABLE]
Therefore by summing over all segments, we get that
[TABLE]
Size Analysis.
By Fact 2(iii), it holds that . In the same manner as shown in Claim 6 it holds that for any , in step we add hops. It follows that for all , in expectation . Since there are phases we have hops in total. As shown in Claim 7, in expectation. Consequently, in expectation. Since each edge in translates into a single hop, this step contributes hops. Overall, , in expectation. ∎ Theorem 6 follows by applying the algorithm for each of the distance classes.
5.3 New Hopset
In this subsection we show a considerably simplified construction of hopsets. For example for , we get a hopset. For the sake of the efficient implementation in Sec. 6.2, we settle for a slightly worse value of . In Appendix B.2, we show an improved construction that achieves the bounds of Lemma 2. Our main result in this section is as follows:
Lemma 8**.**
For any -vertex weighted graph , integer and , one can compute a hopset where of expected size .
Algorithm Description.
For simplicity we fix a distance range . Lemma 8 follows by taking care of all ranges. Furthermore, for simplicity we assume throughout that is an integer. The algorithm has two stages. In the first stage it calls Procedure (from Sec. 5.1) for a single iteration with a radius parameter
[TABLE]
For completeness, and due to its simplicity we add a complete description of this single iteration. The procedure samples each vertex into a subset with probability . For each let its closest vertex in . For each sampled vertex define its cluster
[TABLE]
The [math]-level clustering is given by . Finally, add to the hopset , the hops for every . In addition, for every unclustered vertex add to , the hop to each vertex satisfying that . The procedure outputs the clustering with clusters, and a partial hopset of size , in expectation.
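The single clustering iteration can be sketched in Python as below. This is a hedged illustration: `p` and `radius` are hypothetical inputs standing in for the sampling probability and the radius parameter, and the rule that an unclustered vertex adds hops to every vertex within `radius` is an assumption about the elided condition, based on the way these hops are used in the analysis.

```python
import heapq
import random

def truncated_dijkstra(adj, sources, radius):
    """Multi-source Dijkstra that stops at distance `radius`.

    Returns (dist, origin): for every vertex within `radius` of some
    source, its distance to the closest source and that source.
    """
    dist, origin = {}, {}
    heap = [(0.0, s, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u, src = heapq.heappop(heap)
        if u in dist:
            continue  # already settled with a smaller distance
        dist[u], origin[u] = d, src
        for v, w in adj[u]:
            if v not in dist and d + w <= radius:
                heapq.heappush(heap, (d + w, v, src))
    return dist, origin

def initial_clustering(adj, p, radius, rng):
    """One sampling iteration: returns (clusters, hops)."""
    sampled = [v for v in adj if rng.random() < p]
    dist, origin = truncated_dijkstra(adj, sampled, radius)
    # Clustered vertices add a hop to their closest sampled vertex.
    hops = [(v, origin[v], dist[v]) for v in dist if v != origin[v]]
    # Unclustered vertices add hops to every vertex in their ball.
    for v in adj:
        if v not in dist:
            ball, _ = truncated_dijkstra(adj, [v], radius)
            hops.extend((v, u, d) for u, d in ball.items() if u != v)
    clusters = {}
    for v, c in origin.items():
        clusters.setdefault(c, []).append(v)
    return clusters, hops

# Example call on a unit-weight path with a seeded generator.
path = {i: [] for i in range(5)}
for i in range(4):
    path[i].append((i + 1, 1.0))
    path[i + 1].append((i, 1.0))
clusters, hops = initial_clustering(path, 0.5, 1.0, random.Random(0))
```

The two extreme settings are easy to check by hand: with `p = 1` every vertex is its own cluster and no hops are added, while with `p = 0` every vertex is unclustered and adds hops to its whole ball.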
The second stage has clustering phases. Starting with , in each phase , given is a clustering of expected size and with radius at most
[TABLE]
The output of the phase is a clustering of expected size , and a hopset that takes care of the unclustered vertices in .
We now zoom into phase and explain the construction of the clustering and the hopset . The phase is governed by two key parameters: the sampling probability of each cluster and the augmentation radius . Similarly to the algorithm of Section 4, for every , define
[TABLE]
The description of the phase for every is as follows:
1. Each unclustered vertex with adds to a hop to its closest cluster center in .
2. Let be the collection of clusters obtained by sampling each cluster independently with probability of .
3. Each cluster such that , joins its closest cluster in (based on the c-dist measure). All the vertices in add to , a hop to the center of their new cluster.
4. The clustering consists of all sampled clusters in augmented by their nearby clusters in (i.e., at center-distance at most ).
5. The center of each cluster such that , adds to a hop to the center of any cluster with .
This completes the description of the phase. Let and let be the output clustering of the last phase . In the analysis section we show that contains at most one cluster , in expectation. We then add to the output hopset a hop from each vertex at center-distance at most from to the center of the cluster. (For the sake of the efficient implementation of Sec. 6.2, we connect only vertices up to center-distance to the center of the last cluster, even though with respect to the size of the hopset, we can afford to connect the center of to all nodes.)
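The cluster-level part of a single phase (sampling, joining, and the hops added by clusters that become unclustered) can be sketched as follows. This is a simplified Python illustration under stated assumptions: `cdist` is an assumed center-distance oracle, `p` and `delta` are hypothetical inputs for the sampling probability and the augmentation radius, a single radius is used for both joining and hop insertion, and the hops of individual vertices are omitted.

```python
import random

def clustering_phase(centers, cdist, p, delta, rng):
    """One clustering phase at the level of cluster centers.

    Returns (sampled, membership, hops), where membership maps each
    center that joins a sampled cluster to that cluster's center, and
    hops is a list of (from_center, to_center, weight) triples.
    """
    sampled = [c for c in centers if rng.random() < p]
    membership = {}
    hops = []
    for c in centers:
        near = [(cdist(c, s), s) for s in sampled if cdist(c, s) <= delta]
        if near:
            # c joins the closest sampled cluster and hops to its center.
            d, s = min(near)
            membership[c] = s
            if c != s:
                hops.append((c, s, d))
        else:
            # c becomes unclustered: hop to every nearby cluster center.
            for other in centers:
                if other != c and cdist(c, other) <= delta:
                    hops.append((c, other, cdist(c, other)))
    return sampled, membership, hops

# Example call with centers on a line and a seeded generator.
result = clustering_phase([0, 5, 20], lambda a, b: abs(a - b), 0.5, 10,
                          random.Random(0))
```

A cluster adds many hops only when no sampled cluster lies within `delta` of it, which is the event whose probability the size analysis controls.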
Stretch Analysis.
For the sake of the stretch analysis, we use the following definition:
Definition 4**.**
A vertex is [math]-unclustered if . For any , a vertex is -unclustered if it is in . Finally, a vertex is clustered if it is in the last cluster.
We first make a simple observation:
Observation 4**.**
For every , . In particular, the radius of cluster in the final clustering is .
Proof.
The claim is shown by induction on . The base case is trivial. Assume that the claim holds up to , and consider the where the clustering is defined based on the clustering . Each cluster in is formed by a star: the head of the star is the sampled cluster , connected to other clusters with center-distance at most . The radius of this star of clusters is bounded by:
[TABLE]
where the second inequality follows by plugging the bound on obtained from the induction assumption, and the bound on from Eq. (11). ∎
Observation 5**.**
For , in expectation; therefore, after phases, there is at most one cluster in , in expectation.
Proof.
By induction on . For the base case consider . In the first stage the algorithm samples each vertex with probability , thus we have clusters in expectation. Assuming that the claim holds up to , in phase , each cluster in is sampled independently with probability . Thus, in expectation. ∎
Claim 17** ([math]-unclustered).**
For any [math]-unclustered vertex , it holds that for all
Proof.
If is an [math]-unclustered vertex, then . Since we add hops from to any vertex at distance at most , it holds that contains a hop to any . ∎
Claim 18** (-unclustered).**
For any -unclustered vertex for and every , it holds that .
Proof.
Consider an -unclustered vertex and . Let be the cluster of . Thus . Let be the cluster with the closest center to in , then the algorithm adds to a hop from to the center of . We have that:
[TABLE]
Since becomes unclustered in phase , the algorithm adds to a hop from the center of , to the center of . Overall, we have the following -hop - path: , where are the centers of respectively. The length of this path is bounded by:
[TABLE]
∎
Claim 19**.**
Let be a pair of vertices at distance in , and let be their shortest path in . If there is a clustered vertex then .
Proof.
Since is clustered, it follows that in the last step we add hops from and to the center of the last cluster, therefore:
[TABLE]
where the last inequality follows by Observation 4. ∎
We are now ready to complete the stretch argument and show Lemma 8.
Proof of Lemma 8.
Let be vertices at distance in , and let be the shortest path between them in . First observe that by Claim 19, if there is some clustered vertex in then we are done, thus we assume there is none. We define a sequence of vertices between and in the following way: let , and iteratively, given , set as the furthest vertex from on the segment , at distance at most from , where:
[TABLE]
Furthermore, for each , set to be the next vertex on , or just in case . By Claim 17, if is [math]-unclustered then . If is -unclustered for , then there are two cases. First consider the case where . In this case by Claim 18, it holds that , thus since , we have that,
[TABLE]
Now, assume that , that is . By Claim 18 it holds that . By Observation 4, , and therefore, we get that
[TABLE]
Since the path is partitioned into at most segments, we have:
[TABLE]
∎
Lemma 8 follows by using , and repeating the algorithm for each of the distance ranges.
Size Analysis.
We next bound the total number of hops added to the hopset. The first step contributes edges, in expectation. This follows by Fact 2(iii). At each phase the algorithm adds the following hops to . Each vertex at center-distance at most adds a hop to its closest cluster center in . This adds at most hops. In addition, in step (3), we add at most hops from each vertex to its new cluster center.
By a similar argument to the proof of Lemma 6, it follows that in step (5) of the phase, the algorithm adds hops for each cluster in expectation, where are defined in Eq. 11. Summing over all clusters, this adds hops by Eq. (11), in expectation. Overall in phase , edges are added to the hopset , and summing over all phases gives a total of edges in expectation. Finally, the last step adds at most edges, which completes the proof of Lemma 8.
6 Efficient Computation of Spanners, Hopsets, and Applications
6.1 Efficient Constructions of Spanners and Applications
In this section, we provide efficient implementation of spanners in various computational settings, and show their applications to shortest path computation.
A Modified Meta-Algorithm.
The algorithm is similar to the algorithm of Section 4 up to modifying the number of phases and the sampling probability of Eq. (11). As in [EN19a], we introduce an efficiency parameter that determines the trade-off between the value, the number of edges in the spanner, and the construction time. For a given parameter , define:
[TABLE]
The modified algorithm applies phases instead of phases. The sampling probabilities are modified as follows: for every , let be as defined in Eq. (11), and for every , set . We first show the correctness of this modified algorithm and then analyze its implementation in several computational settings.
Observation 6**.**
For all it holds that .
Proof.
For each , by Eq. (11) we have , thus for all it holds that . By Observation 3 it holds that in expectation , thus . ∎
Lemma 9**.**
In each phase consider the collection of BFS trees of depth rooted at the centers of the clusters of that are unclustered in . Then each vertex appears in such trees w.h.p.
Proof.
Fix a vertex , and let be the clusters in with center-distance at most from . Note that for every where is the center of the cluster . This implies that if one of these clusters is sampled, then all these clusters will be part of the clustering and will not take part in step (5) of this phase. Therefore, the expected number of BFS traversals that reach in step (5) of the phase is at most . Since by Observation 6 for all , , we have that each vertex is traversed by BFS traversals in expectation, and by the Chernoff bound, w.h.p., each vertex is traversed by at most traversals. ∎
Lemma 10**.**
After phases, the number of remaining clusters is at most w.h.p.
Proof.
Until the phase the algorithm runs similarly to the algorithm in Section 4, thus by Observation 3, in expectation . In each of the remaining phases we sample with probability which by Observation 6 is at least , thus after such sampling steps, we will have in expectation. By Chernoff it follows that w.h.p . ∎
Observation 7**.**
The modified algorithm computes a spanner with and edges w.h.p.
Proof.
The final radius after phases is given by plugging in Observation 2:
[TABLE]
Since we only modified the sampling probabilities of the spanner of Section 4, it follows that the stretch arguments are unchanged, except that the final radius is larger . Thus by Claim 10, we get that for every it holds that .
We now bound the size of the spanner. The first phase, adds edges, in expectation, and in any subsequent phase, the algorithm adds shortest paths in expectation, each of length at most . Thus overall, this adds edges. The last step adds edges due to adding a constant number of BFS trees. ∎
6.1.1 The Centralized Setting
The trade-off between the value, the spanner size and the running time of the algorithm is summarized below:
Lemma 11**.**
For any graph , integer and any , one can compute a spanner for with edges and time.
Proof.
The algorithm has phases. In each phase there are five steps that are implemented as follows. In step (1), we add shortest paths of length from unclustered vertices to their closest center. Since each vertex is connected only to its closest center, with ties broken in a consistent manner, this can be done in time. In the same manner, step (3) can also be implemented in time. Finally, we consider the fifth step, where we grow BFS trees from all centers that did not join the clustering . By Lemma 9, each vertex appears in at most trees, and thus these trees can be computed in time. The overall running time is then bounded by as desired. ∎
By computing the spanners efficiently, we also get the following fast computation of the distances. The next is Corollary 19 of [EN19a] while enjoying a better tradeoff at the expense of increasing the multiplicative stretch from to :
Corollary 1**.**
There exists an algorithm that computes for any graph , integer , any parameters and any vertex set , a approximate shortest paths for , for in time.
6.1.2 The Distributed Setting
The LOCAL Model.
We now consider the implementation details of our spanner construction in the standard LOCAL model [Pel00]. In this model, the algorithm’s execution proceeds in synchronous rounds, and in every round, each node can send a message (possibly of unbounded size) to each of its neighbors. Each node holds a processor with a unique and arbitrary ID of bits.
One of the key effects of improving the value of the in our spanner is that we can compute our spanner in rounds, hence for , in rounds. This should be compared against the local computation of spanners in rounds.
Lemma 12**.**
For any graph , integer and , one can compute in the model a spanner for in rounds w.h.p.
The implementation is exactly as in Section 4 with two modifications. First, we will now make all the arguments hold with high probability of for some constant , rather than in expectation. Specifically, in the last step of the algorithm there are now clusters in , w.h.p. Instead of adding a BFS tree w.r.t. each center, we will add a truncated BFS tree up to depth from each of the centers. We now show that this slightly increases the stretch of the spanner, by proving the analogue of Cl. 10:
Claim 20**.**
Fix a pair and let be their shortest path in . If there is a clustered vertex , then:
[TABLE]
Proof.
First assume that . Let be some clustered vertex on , and let be the center of the cluster to which belongs. In this case since , it holds that , hence and . Consequently, . Since we changed only the last step, it implies that for any , we have that for vertices at distance in , . For with it holds that:
[TABLE]
∎
Since all the steps of the algorithm are now restricted to the -ball of each vertex, Lemma 12 follows.
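The truncated BFS tree used in the last step can be sketched in a few lines of Python. This is a centralized illustration of the object being built, not a distributed implementation; the function name and the adjacency-dictionary input are assumptions.

```python
from collections import deque

def truncated_bfs_tree(adj, root, depth):
    """BFS tree from `root` restricted to vertices at hop-distance <= depth.

    adj maps each vertex to a list of neighbors (unweighted graph).
    Returns (parent, level): the tree as parent pointers, and the
    hop-distance of every reached vertex from the root.
    """
    parent = {root: None}
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if level[u] == depth:
            continue  # do not expand beyond the truncation depth
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                level[v] = level[u] + 1
                queue.append(v)
    return parent, level
```

Since the traversal never leaves the depth-`depth` ball of the root, each vertex only needs information from a bounded neighborhood, which is the property the round bound relies on.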
The CONGEST Model.
We next consider the implementation details of our spanner construction in the standard CONGEST model [Pel00]. This model is exactly as the LOCAL model, except that in each round, a vertex is limited to sending bits on each of its incident edges.
The implementation in the CONGEST model follows the same lines as the meta-algorithm, except that in the last step we build a truncated BFS tree up to depth as in the LOCAL implementation. We have:
Lemma 13**.**
For any graph , integer and any , one can compute in the model a spanner for in rounds w.h.p.
Proof.
There are phases. We will fix phase and show it can be implemented in rounds. Steps (1) and (3) are based on congestion-free BFS computation up to depth . Step (5) builds a collection of BFS trees up to depth . By Lemma 9, each vertex is traversed by trees. Computing a collection of BFS trees up to depth with edge congestion can be done in rounds w.h.p. using the random delay approach. Therefore by summing over all phases we get rounds.
Finally, in the last step we have centers in w.h.p. Computing trees of depth from each center can be done in rounds. The time analysis follows. ∎
6.1.3 The Multi-Pass Streaming Setting
Model.
In the streaming model the input graph is presented to the algorithm edge by edge as a stream without repetitions and the goal is to solve the problem while minimizing the number of passes and space. For graph algorithms, the usual assumption is that the edges of the input graph are presented to the algorithm in arbitrary order. The next is Corollary 20 of [EN19a] while enjoying a better tradeoff at the expense of increasing the multiplicative stretch from to :
Lemma 14**.**
For any -vertex unweighted graph , integer and and , one can compute in the multi-pass streaming model a spanner for in space w.h.p and passes, or with space in expectation and passes w.h.p.
Proof.
We will use two alternative implementations in the streaming model, in a very similar way to Theorem 5 in [EN19a]. In both implementations, for the BFS traversals the algorithm keeps for each traversed vertex the ID of its parent, the ID of the root of the BFS tree, and its distance to the root. The first implementation is very similar to the implementation that we described for the model. In this implementation, we compute only truncated BFS trees up to depth , rather than a complete BFS tree. A truncated BFS traversal up to depth can be implemented in passes, thus overall the algorithm can be implemented in passes. Since each vertex is visited by BFS trees w.h.p., the total space used is bounded by . We now consider the alternative implementation. To reduce the BFS congestion of in the fifth step, this step is divided into sub-steps. In each sub-step, we sample each of the remaining centers (from which we would like to compute the BFS traversal) independently with probability . We then compute the truncated BFS traversal only from the centers that got sampled in this sub-step. By the Chernoff bound, w.h.p., each vertex will be visited by traversals, hence a space of plus the space of the spanner is sufficient for the implementation. After sub-steps, w.h.p., the algorithm has computed the truncated BFS traversal from each of the cluster centers. The total number of passes is bounded by . ∎
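The sub-step sampling of the second implementation can be illustrated by the following Python sketch. It only models the scheduling of the BFS roots, under the assumption that a center whose traversal was already computed is not sampled again; `q` is a hypothetical name for the per-sub-step sampling probability.

```python
import random

def schedule_bfs_substeps(centers, q, rng):
    """Split the BFS roots into batches, one batch per sub-step.

    In every sub-step each remaining center is sampled independently
    with probability q; the traversals of a batch would be computed
    together, and the sampled centers are then removed.
    """
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    remaining = sorted(centers)
    batches = []
    while remaining:
        batch = [c for c in remaining if rng.random() < q]
        batches.append(batch)
        chosen = set(batch)
        remaining = [c for c in remaining if c not in chosen]
    return batches

# Example call with a seeded generator.
batches = schedule_bfs_substeps(range(20), 0.3, random.Random(1))
```

Smaller values of `q` give smaller batches, hence fewer simultaneous traversals through any vertex, at the price of more sub-steps and therefore more passes.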
Lemma 3 follows immediately by Lemma 14.
6.2 Efficient Constructions of Hopsets
In this section we show an efficient construction of hopsets. We use the following fact that follows from the proof of Theorem 1.1 in [TZ05]:
Fact 3**.**
Each of the clustering steps in the distance oracle algorithm by Thorup and Zwick can be implemented in centralized time.
The Meta Algorithm.
The algorithm is similar to the algorithm in the proof of Lemma 8 up to modifying the number of phases and the sampling probability of Eq. (11) in the exact same manner as in Section 6.1. In addition, since we are now working with weighted graphs, we will be using the Dijkstra algorithm to compute shortest path trees, instead of BFS traversals. Fix a distance class and define . We then slightly change the initial radius of the clustering to be . By similar arguments as in the proof of Lemma 8 and Observation 7, the final radius after phases is:
[TABLE]
The remaining details are almost identical to the meta-algorithm for spanners so we only state the properties of this construction:
Observation 8**.**
The modified algorithm computes a hopset with and edges in expectation.
Efficient Implementation in the Centralized Setting.
The tradeoff between the value, the hopset size and the running time of the algorithm is summarized below:
Lemma 15**.**
For any graph , integer and any , one can compute a hopset with with edges and time in expectation.
Proof.
The stretch and size arguments are almost identical to those in Section 6.1, hence we restrict attention to the running time. Our algorithm begins with a single clustering step of Thorup and Zwick's algorithm (i.e., computing the first bunch for each vertex ). The output of this step is a clustering . Since clusters are vertex-disjoint, one can compute them in time. Thus, using Fact 3, the entire first part of the algorithm can be implemented in centralized time. From this point onward the analysis is similar to the analysis of the centralized implementation of spanners, thus requiring centralized time. ∎
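To illustrate how a vertex-disjoint clustering around sampled centers can be computed with Dijkstra's algorithm in a weighted graph, the following is a minimal sketch. It is a simplified stand-in, not the Thorup and Zwick clustering step itself: it only assigns every vertex to its nearest center by a single multi-source Dijkstra run. The function name `nearest_center_clustering` and the adjacency-list format are assumptions made for this example.

```python
import heapq


def nearest_center_clustering(adj, centers):
    """Assign each reachable vertex to its nearest center.

    adj: dict vertex -> list of (neighbor, nonnegative weight).
    Returns a dict: vertex -> (center, distance to that center).
    """
    best = {}
    # Every center starts at distance 0 from itself (multi-source Dijkstra).
    heap = [(0, c, c) for c in sorted(centers)]
    heapq.heapify(heap)
    while heap:
        d, v, c = heapq.heappop(heap)
        if v in best:  # already settled with a distance at most d
            continue
        best[v] = (c, d)
        for w, wt in adj[v]:
            if w not in best:
                heapq.heappush(heap, (d + wt, w, c))
    return best
```

Because each vertex is settled exactly once, the resulting clusters are vertex-disjoint, and one run over the whole graph handles all centers simultaneously.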
Appendix A Complete Proofs of Theorems 3 and 6
Recall that and . We will now handle the case where is not an integer as assumed in Sections 3 and 5.2, thus completing the proofs of Theorems 3 and 6. We will focus on the spanner case of Theorem 3. The exact same argument holds for the hopsets for Theorem 6. By Claim 3, when is an integer we have a final clustering with radius . Let and . We divide the treatment of the fractional case into two possible cases. In the first case, assume that . In this case, the middle stage of Alg. simply contains phases, each of steps. In this case we have:
Claim 21.
For , it holds that the final clustering has clusters, in expectation, and the final radius .
Proof.
Since it follows from Claim 7 that the final clustering has clusters, in expectation. In Claim 3 we bound the radius of the final clustering in the case where is an integer by , thus if , by the same claim, we get that the final radius after phases is at most:
[TABLE]
∎
In the complementary case where , the adaptation of Alg. is as follows: In the second stage of the algorithm, it applies clustering phases as usual (i.e. each with steps), and the last phase will consist of only steps. We will denote this last phase as a fractional phase. Again, we will show that after these phases, it holds that the number of clusters in the final clustering is in expectation and that the radius of this clustering is . We begin by bounding the number of clusters in the final clustering :
Claim 22.
After phases of steps and one last phase of steps, the final clustering has clusters, in expectation.
Proof.
By Claim 7 after phases,
[TABLE]
in expectation. By the same claim it follows that after additional steps of Proc. , the expected number of clusters in the final clustering is:
[TABLE]
∎
We continue to bound the radius of the final clustering :
Claim 23.
.
Proof.
By Claim 3 it holds that after phases the radius of the clustering is bounded by . By the same claim after additional steps of Procedure the radius is bounded by:
[TABLE]
where the third inequality follows since for any :
[TABLE]
and in particular since by assumption , we have:
[TABLE]
where the last inequality follows since . We conclude . ∎
We proceed with providing stretch and size arguments.
Claim 24.
For any fixed distance , it holds that Algorithm outputs a subgraph such that for any with it holds that .
Proof.
Fix a distance . By Claim 1 we have that for any -unclustered vertex and any it holds that .
Furthermore, since the definition of the parameters is unchanged, by Claim 4, for any -unclustered vertex and any it holds that . In particular, . Thus by the exact same argument as in the proof of Lemma 4 only with instead of , it holds that for any with , if all the vertices on the shortest path between and in are unclustered, then .
It remains to prove the analogue of Claim 5, that is, providing the stretch argument for the case where the - shortest path contains at least one clustered vertex . By similar arguments as in Claim 5, we have that in this case . Thus plugging the value of from Claims 21 and 23, we get that
[TABLE]
∎
Claim 25.
For any fixed distance it holds that the algorithm outputs a subgraph of expected size .
Proof.
First note that up to the last phases of Procedure , the algorithm is similar to the integral case, thus by the analysis of Lemma 4, until this step edges are added to the spanner, in expectation.
We now bound the number of remaining edges added to the spanner in the last phase. By Claim 6 it holds that in each step of the last phase we add shortest paths in expectation. In step of the last phase the added shortest paths are of length at most , and by Eq. (3) it holds that . In the first case, where , by Claim 21 it holds that we add an extra of edges in expectation. In the complementary case we add edges in expectation. Thus, overall, the middle stage in the fractional case adds an extra of edges in expectation.
In the final stage we construct a spanner of the cluster graph. By Claims 21 and 22, the expected size of the final clustering is , thus as in the proof of Lemma 4 it holds that the size of the spanner of the cluster graph is . Since each edge in this spanner is translated into a path of length in the final spanner, it holds that this stage adds edges to the spanner. We conclude that the construction in the fractional case is of size . ∎
Finally by Observation 1, we conclude that taking the output subgraphs of Algorithm for every yields a -spanner of of expected size at most . This completes the proof of the fractional case of Theorem 3.
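The final step, taking the union of the subgraphs built for the individual distance classes, and the guarantee it is supposed to satisfy can be checked on small instances with a brute-force sketch. It assumes the standard definition of an $(\alpha,\beta)$-spanner of an unweighted graph, namely $\mathrm{dist}_H(u,v) \le \alpha \cdot \mathrm{dist}_G(u,v) + \beta$ for every pair $u,v$. All function names are hypothetical, and the check runs a BFS from every vertex, so it is meant for testing only.

```python
from collections import deque


def to_adj(vertices, edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_dist(adj, s):
    """Unweighted single-source distances from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def union_subgraphs(edge_sets):
    """Union of several undirected edge sets, with normalized endpoints."""
    out = set()
    for edges in edge_sets:
        out |= {tuple(sorted(e)) for e in edges}
    return out


def is_alpha_beta_spanner(vertices, g_edges, h_edges, alpha, beta):
    """Check dist_H(u, v) <= alpha * dist_G(u, v) + beta for all pairs."""
    g = to_adj(vertices, g_edges)
    h = to_adj(vertices, h_edges)
    for s in vertices:
        dg = bfs_dist(g, s)
        dh = bfs_dist(h, s)
        for t, d in dg.items():
            if dh.get(t, float("inf")) > alpha * d + beta:
                return False
    return True
```

For instance, on a 4-cycle, the union of the edge sets `[(0, 1), (1, 2)]` and `[(2, 3)]` is a path that passes the check for $(\alpha,\beta) = (1,2)$ but not for $(1,1)$.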
Hopsets.
As in the spanner case, we divide the treatment of the fractional case into two cases as explained above. In the first case, where , the algorithm applies standard phases of Procedure , each with steps. In the complementary case, the algorithm applies standard phases of the procedure, and a final fractional phase of steps. In Claim 14 we show that in the case where is an integer, the final radius is bounded by . By claims similar to Claims 21 and 23, it holds that the final radius in the fractional case is at most . Since we have some slackness in the stretch arguments of Claim 16 and in the proof of Thm. 6, the same bound holds when plugging the bound on the final radius instead of .
For the size analysis, the algorithm is the same as that in Section 5.2 until the end of the th phase. Therefore, by the proof of Thm. 6 until this last phase, are added to the hopset. By a similar argument to Claim 6, the last phase with steps adds hops to the hopset, in expectation. By the exact same argument as in Claim 22, in the third stage of the algorithm there are clusters, in expectation. Consequently, as in the proof of Thm. 6, the spanner of the cluster graph is of size , thus this stage contributes hops to the hopset. We conclude that altogether the hopset in the fractional case is of size . Lemma 7 follows.
Appendix B Improved Spanners and Hopsets
B.1 Spanners
In the following subsection we state and sketch the construction of a -spanner with an improved and provide the proof for Lemma 1. For example, we show a spanner for . The algorithm is similar to the algorithm of Section 4, with the only differences that we set , and in step (3) of the phase of the algorithm, the sampled clusters add to paths between their centers to centers at distance at most (instead of ). This change affects the radii of the clusters in the following way:
Observation 9.
For every , . In particular, the radius of cluster in the final clustering is .
Proof.
The proof is similar to the proof of Observation 2, with the difference that , and at each phase each new cluster is a star of a cluster that connects to other clusters with center-distance at most , thus the radius of the new cluster is at most . ∎
As in the proof of Lemma 5, the size of is bounded by:
[TABLE]
Following the proof of Lemma 5, we have that is a -spanner with , and Lemma 1 follows.
B.2 Hopsets
In the following subsection we state and sketch the construction of a -hopset with an improved and provide the proof for Lemma 2. For example, we show a hopset for . The algorithm is similar to the algorithm of Section 5.3, with the only differences that we set and in step (3) of the algorithm, the sampled clusters add to hops to cluster centers at distance (instead of ). This change affects only the radius of the clusters, in the following way:
Observation 10.
For every , . In particular, the radius of cluster in the final clustering is .
Proof.
We set , and the rest follows by the same argument as in Observation 9. ∎
As in the proof of Lemma 8, the size of a hopset for a fixed distance range is bounded by
[TABLE]
Thus the expected size of the hopset is . Furthermore, as in the proof of Lemma 8, and , by plugging , Lemma 2 follows.
- [AB17] Amir Abboud and Greg Bodwin. The 4/3 additive spanner exponent is tight. Journal of the ACM (JACM), 64(4):28, 2017.
- [ABP18a] Amir Abboud, Greg Bodwin, and Seth Pettie. A hierarchy of lower bounds for sublinear additive spanners. SIAM Journal on Computing, 47(6):2203–2236, 2018.
- [ABP18b] Amir Abboud, Greg Bodwin, and Seth Pettie. A hierarchy of lower bounds for sublinear additive spanners. SIAM J. Comput., 47(6):2203–2236, 2018.
- [ACIM99] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM Journal on Computing, 28(4):1167–1181, 1999.
- [ADD+93a] Ingo Althöfer, Gautam Das, David Dobkin, Deborah Joseph, and José Soares. On sparse spanners of weighted graphs. Discrete & Computational Geometry, 9(1):81–100, 1993.
- [ADD+93b] Ingo Althöfer, Gautam Das, David P. Dobkin, Deborah Joseph, and José Soares. On sparse spanners of weighted graphs. Discrete & Computational Geometry, 9:81–100, 1993.
- [BKMP05] Surender Baswana, Telikepalli Kavitha, Kurt Mehlhorn, and Seth Pettie. New constructions of $(\alpha,\beta)$-spanners and purely additive spanners. In Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, pages 672–681. Society for Industrial and Applied Mathematics, 2005.
- [BS07] Surender Baswana and Sandeep Sen. A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Struct. Algorithms, 30(4):532–563, 2007.
