Additive Spanners and Distance Oracles in Quadratic Time
Mathias Bæk Tejs Knudsen

TL;DR
This paper presents a deterministic algorithm for constructing small additive spanners and improved distance oracles in quadratic time, enhancing previous randomized methods with more efficient and reliable solutions for approximate shortest path computations.
Contribution
It introduces a deterministic, quadratic-time algorithm for additive spanners and improves the construction of approximate distance oracles, reducing randomness and complexity compared to prior work.
Findings
Deterministic construction of an additive O(1)-spanner with O(n^{4/3}) edges in O(n^2) time.
Development of a Las Vegas (2,1)-distance oracle of size O(n^{5/3}) in O(n^2) time.
Enhanced algorithms for approximate all-pairs shortest paths with improved efficiency.
Research partly supported by Advanced Grant DFF-0602-02499B from the Danish Council for Independent Research under the Sapere Aude research career programme and by the FNU project AlgoDisc - Discrete Mathematics, Algorithms, and Data Structures University of Copenhagen,
Abstract
Let $G$ be an unweighted, undirected graph. An additive $k$-spanner of $G$ is a subgraph $H$ that approximates all distances between pairs of nodes up to an additive error of $+k$, that is, it satisfies $d_H(u,v) \le d_G(u,v)+k$ for all nodes $u,v$, where $d$ is the shortest path distance. We give a deterministic algorithm that constructs an additive $O(1)$-spanner with $O(n^{4/3})$ edges in $O(n^2)$ time. This should be compared with the randomized Monte Carlo algorithm by Woodruff [ICALP 2010] giving an additive $6$-spanner with $O(n^{4/3}\log^3 n)$ edges in expected time $O(n^2\log^2 n)$.
An $(\alpha,\beta)$-approximate distance oracle for $G$ is a data structure that supports the following distance queries between pairs of nodes in $G$. Given two nodes $u,v$, it can in constant time compute a distance estimate $\tilde{d}$ that satisfies $d \le \tilde{d} \le \alpha d + \beta$, where $d$ is the distance between $u$ and $v$ in $G$. Sommer [ICALP 2016] gave a randomized Monte Carlo $(2,1)$-distance oracle of size $O(n^{5/3}\,\mathrm{polylog}\, n)$ in expected time $O(n^2\,\mathrm{polylog}\, n)$. As an application of the additive $O(1)$-spanner we improve the construction by Sommer [ICALP 2016] and give a Las Vegas $(2,1)$-distance oracle of size $O(n^{5/3})$ in time $O(n^2)$. This also implies an algorithm that in $O(n^2)$ time gives approximate distances for all pairs of nodes in $G$, improving on the $O(n^2\log n)$ algorithm by Baswana and Kavitha [SICOMP 2010].
1 Introduction
Let $G = (V,E)$ be an unweighted, undirected graph on $n$ nodes and $m$ edges. A subgraph $H$ of $G$ is an additive $k$-spanner if the following holds for every pair of nodes $u,v$ in $G$:
\[ d_G(u,v) \le d_H(u,v) \le d_G(u,v) + k, \]
where $d_G(u,v)$ and $d_H(u,v)$ are the distances between $u$ and $v$ in $G$ and $H$ respectively. This paper will only consider additive spanners and not multiplicative or mixed spanners, so we will simply say that $H$ is a $k$-spanner when we mean that $H$ is an additive $k$-spanner.
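The definition can be checked directly on small instances. The following Python sketch is illustrative only and is not part of the paper's construction. It assumes graphs are given as dictionaries mapping each node to a set of neighbours, and the helper names `make_adj`, `bfs_distances` and `is_additive_spanner` are hypothetical. It runs one BFS per node in both graphs, so it is only meant for testing, and it does not verify that the second graph is a subgraph of the first.

```python
from collections import deque


def make_adj(n, edges):
    """Build an undirected adjacency map on nodes 0..n-1 (hypothetical helper)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_additive_spanner(adj_g, adj_h, k):
    """Check d_H(u, v) <= d_G(u, v) + k for every pair connected in G."""
    for u in adj_g:
        dist_g = bfs_distances(adj_g, u)
        dist_h = bfs_distances(adj_h, u)
        for v, d in dist_g.items():
            # A pair that is disconnected in H counts as infinite distance.
            if dist_h.get(v, float("inf")) > d + k:
                return False
    return True


# Example: G is a 6-cycle, H drops the edge (5, 0) and is therefore a path.
cycle = make_adj(6, [(i, (i + 1) % 6) for i in range(6)])
path = make_adj(6, [(i, i + 1) for i in range(5)])
print(is_additive_spanner(cycle, path, 4))  # True
print(is_additive_spanner(cycle, path, 3))  # False
```

In this example the worst pair is (0, 5), which is at distance 1 in the cycle and 5 in the path, so the path is a 4-spanner of the cycle but not a 3-spanner.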
In this paper we consider algorithms constructing -spanners, and there are therefore three interesting parameters: The distortion , the running time of the algorithm, and the size of the spanner created. Elkin and Peleg [19] showed how to construct -spanners with edges in time, and Baswana et al [9] gave an algorithm that constructs -spanners with edges in time.
The running time of these algorithms can be improved if we allow the -spanners to be larger by a factor. Dor, Halperin and Zwick [18] showed that we can construct -spanners with edges in time, and Woodruff [33] gave an algorithm to construct -spanners with edges in time. The construction of Woodruff is furthermore randomized Monte Carlo. These results are summarized in Table 1.
These improvements to the running time fit into the following paradigm: For a fixed the authors find algorithms that produce spanners that are almost as small as the best known construction of -spanners and have near-quadratic running time. We reverse this way of looking at the problem. We are now trying to find algorithms that yield -spanners that are exactly as small as the best known constructions for any , i.e. , and at the same time we want the algorithm to run as fast as possible. All known algorithms for creating -spanners that have close to optimal size run in time . (Footnote 1: For instance the algorithm by Baswana et al [9] gives a -spanner with edges and is therefore only interesting when , in which case the running time is .) So a natural question is to ask if there exists a and an algorithm that constructs a -spanner with edges in time. In fact Sommer [28] mentioned in his talk at ICALP 2016 that the main obstacle towards getting a better running time for constructing the distance oracle he presented is the lack of such an algorithm. In his case the distortion is only factored into the running time and not the distortion of the oracle. Therefore, it does not matter what is as long as it is constant.
We show that it is possible to attain this goal by giving an algorithm that constructs $O(1)$-spanners deterministically with $O(n^{4/3})$ edges in $O(n^2)$ time. Comparing this with the algorithm by Woodruff [33] this gets rid of the $\log^3 n$ factor on the number of edges and a factor of $\log^2 n$ in the running time. Furthermore, the algorithm is deterministic and not randomized Monte Carlo. The price of these improvements is that the distortion is larger than $6$. We note that there are no lower bounds ruling out the possibility of a -spanner with edges. For the application to the distance oracle by Sommer [28], the distortion is unimportant as long as it is constant. We also show how to construct -spanners with edges in time. For a comparison to previous work see Table 1.
Related work
Elkin and Peleg [19] showed that any graph on nodes has a -spanner with edges (Footnote 2: Aingworth et al [5] earlier showed the same result up to logarithmic factors on the size of the spanner), Chechik [15] showed that it has a -spanner with edges, and Baswana et al [9] showed that it has a -spanner with edges. These results are complemented by a negative result of Abboud and Bodwin [1]. A consequence of their result is that for any there exists a graph on nodes such that any -spanner of this graph has at least edges.
Another negative result comes from Erdős's girth conjecture [20]. It states that for any constant $k$ there exist graphs with $n$ nodes and $\Omega(n^{1+1/k})$ edges where the girth is $2k+2$. This conjecture has been proved for $k = 1, 2, 3, 5$ [31, 11]. In particular if the conjecture is true this implies that there exist graphs for which any $(2k-1)$-spanner must have at least $\Omega(n^{1+1/k})$ edges. Woodruff [32] proved that whether the conjecture is true or not, there exists a graph on $n$ nodes such that any $(2k-1)$-spanner of the graph has at least $\Omega(k^{-1} n^{1+1/k})$ edges.
There are also upper and lower bounds when we allow the distortion to depend on , see [14, 13, 15, 22]. In this paper, however, we are only interested in the case where . The upper and lower bounds for are summarized in Table 2.
Techniques
Previous algorithms that construct -spanners in time all relied on constructing a hitting set for some set of neighbourhoods. In [18] this is done deterministically via a dominating set algorithm, and in [33] this is done via sampling. This approach will inherently come with the cost of a factor. Furthermore, in the construction of -spanners by Woodruff [33] the number of neighbourhoods that need to be hit is so large that it seems impossible with current techniques to modify the algorithm to be Las Vegas. To avoid this we instead use a clustering approach described in Section 2. The algorithm in Theorem 2 is obtained using this clustering and a careful modification of the path-buying algorithm of [9].
Approximate Distance Oracles and All Pairs Almost Shortest Paths
Given an undirected and unweighted graph $G$, an $(\alpha,\beta)$-approximate distance oracle for $G$ is a data structure that supports the following query. Given two nodes $u,v$, it can compute a distance estimate $\tilde{d}$ that satisfies $d \le \tilde{d} \le \alpha d + \beta$, where $d$ is the distance between $u$ and $v$ in $G$. For work on approximate distance oracles see e.g. [2, 3, 4, 6, 7, 8, 10, 12, 16, 17, 23, 24, 26, 27, 29, 30, 34]. Sommer [28] gave a randomized Monte Carlo $(2,1)$-distance oracle that can be constructed in $O(n^2\,\mathrm{polylog}\, n)$ time, has size $O(n^{5/3}\,\mathrm{polylog}\, n)$ and can answer queries in $O(1)$ time. We improve the construction time and the size to $O(n^2)$ and $O(n^{5/3})$ respectively, and our construction is randomized Las Vegas. As a corollary we can compute an estimate $\tilde{d}(u,v)$ for all pairs of nodes $u,v$ in $G$ satisfying $d_G(u,v) \le \tilde{d}(u,v) \le 2 d_G(u,v) + 1$ in time $O(n^2)$. This improves upon the algorithm by Baswana and Kavitha [8].
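As a small worked example of the approximation guarantee, the sketch below assumes the usual meaning of an (alpha, beta)-approximate estimate, namely that it lies between the true distance d and alpha * d + beta. The function name `within_stretch` is hypothetical and only illustrates the inequality; it says nothing about how an oracle is built.

```python
def within_stretch(true_dist, estimate, alpha, beta):
    """Return True if true_dist <= estimate <= alpha * true_dist + beta."""
    return true_dist <= estimate <= alpha * true_dist + beta


# For a (2, 1)-oracle and a pair at distance 3, exactly the estimates
# from 3 up to 2 * 3 + 1 = 7 are acceptable.
valid = [est for est in range(10) if within_stretch(3, est, 2, 1)]
print(valid)  # [3, 4, 5, 6, 7]
```

Note that the estimate is never allowed to underestimate the true distance, which is why 0, 1 and 2 are rejected.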
Preliminaries
For a graph $G$ and two nodes $u,v$ we denote the distance from $u$ to $v$ in $G$ by $d_G(u,v)$. All graphs considered in this paper are unweighted, and unless otherwise specified they are undirected as well. For an undirected graph $G$ and a node $u$ the neighbourhood of $u$ is the set of nodes adjacent to $u$ and is denoted by .
Overview
In Section 2 we introduce the clustering we use when constructing the spanners. In Section 3 we show how to create an $O(1)$-spanner with $O(n^{4/3})$ edges in $O(n^2)$ time and thereby prove Theorem 2. In Section 4 we provide the details on how to give an improved $(2,1)$-distance oracle.
2 Clustering
Our construction of additive spanners uses standard clustering techniques. We present our clustering framework below. Let be a graph with vertices and edges. We let be a parameter that can depend on . For a sequence of nodes we define the clusters by
[TABLE]
Furthermore we also define graphs in the following way. We let , and for we let be the subgraph of that contains an edge if not both and are contained in . From each node we let be a BFS tree in rooted at .
Definition 1.
A sequence is called a -clustering if the following requirements are satisfied.
- The node maximizes .
- Every cluster contains at least nodes.
- For every node we have .
We say that a node is clustered if and unclustered otherwise. We note that since every cluster contains at least nodes and the clusters are disjoint we have .
Lemma 1.
Let be a -clustering. Then the number of edges in is at most .
Proof.
The number of edges in is bounded by the sum , which is clearly less than . ∎
Lemma 2.
Let be a -clustering of and let be a pair of nodes. Assume that some shortest path from to in is not contained in from Lemma 1. Then there exists an index such that
[TABLE]
Proof.
Consider a shortest path from to that is not contained in and let be a clustered node on such that . We choose such that is smallest possible. By choosing smallest possible is contained in . Furthermore since the distance from to is at most we see that
[TABLE]
Since is a shortest path tree in the conclusion follows. ∎
Lemma 3.
Given a graph and a parameter we can construct a -clustering , the corresponding BFS trees and in time.
Proof.
The algorithm will work by finding the nodes consecutively, i.e. first , then and so on. The algorithm will maintain a graph . In the beginning of the algorithm we have , and after we add we will alter such that . The total cost of altering all will be .
We find by looking at all nodes in and count the number of neighbours not in . Since has at most edges this takes time. Then the algorithm finds a BFS tree from in in time. Hence the total time used by the algorithm is:
[TABLE]
∎
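The greedy selection used in this proof can be sketched as follows. This is a simplified illustration and not the paper's implementation. It assumes that the cluster of a selected node consists of its neighbours that are not yet in any earlier cluster, and that selection stops once no node has at least `delta` such neighbours. It rescans all neighbourhoods in every round instead of maintaining a shrinking graph, it does not build the BFS trees, and it therefore makes no claim to match the time bound of Lemma 3. The name `greedy_clustering` and the dictionary-of-sets graph representation are hypothetical.

```python
def greedy_clustering(adj, delta):
    """Greedily pick centers whose unclustered neighbourhoods have size >= delta.

    adj maps each node to the set of its neighbours (assumed representation).
    Returns the list of centers and the list of pairwise disjoint clusters.
    """
    clustered = set()
    centers, clusters = [], []
    while True:
        best, best_free = None, set()
        for v in adj:
            # Neighbours of v that no earlier cluster has claimed.
            free = adj[v] - clustered
            if len(free) > len(best_free):
                best, best_free = v, free
        # Stop when every node has fewer than delta unclustered neighbours.
        if best is None or len(best_free) < delta:
            break
        centers.append(best)
        clusters.append(best_free)
        clustered |= best_free
    return centers, clusters


# Example: a star with center 0 and leaves 1..4, using delta = 2.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
centers, clusters = greedy_clustering(star, 2)
print(centers)   # [0]
print(clusters)  # [{1, 2, 3, 4}]
```

Because every cluster has at least `delta` nodes and the clusters are disjoint, the number of rounds in this sketch is at most n / delta, which mirrors the bound on the number of clusters noted after Definition 1.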
3 Constructing -Spanners
In this section we present our construction of an $O(1)$-spanner with $O(n^{4/3})$ edges in $O(n^2)$ time. As a warmup we show how we can use the clustering from Section 2 to give a -spanner with edges in time.
Theorem 1.
There exists an algorithm that given a graph with nodes constructs a -spanner of with edges in time.
Proof.
Let and construct a -clustering with Lemma 3. Let . The number of edges in is at most by Lemma 1 and the fact that .
Now we just need to prove that is a -spanner. Let be arbitrary nodes and let be a shortest path from to in . We wish to prove that
[TABLE]
If is contained in then (1) is obviously true. Otherwise there exists an index such that by Lemma 2, and (1) is true since . ∎
Next we turn to showing how to create an $O(1)$-spanner with $O(n^{4/3})$ edges in $O(n^2)$ time. The idea is the following. We start by creating a -clustering with and . Using the BFS trees along with Lemma 2 we can then get an additive -approximation of for all pairs of indices , which we will call . The calculation of the BFS trees in time relies on an idea similar to one in [5]. The BFS trees also give us a path from to that is at most longer than the shortest path. If we add all these shortest paths to our spanner along with and the neighbours in of each we will get a -spanner. Unfortunately, adding a path could require adding up to edges, and since there are pairs we can only guarantee that the spanner has edges, which is if . (We only need to add edges on the path that are not already in .) Instead we use an argument similar to the path-buying argument from [9] and the construction from [21]. We add the path from to unless we can guarantee that there is already an additive -approximation of this path in the spanner. We do this by maintaining an upper bound on the distance from to in the spanner . We then argue that if we add a path of with edges not already in the spanner, then there are pairs for which the upper bound is improved. Then, this will imply that at most edges are added giving an upper bound of on the number of edges in .
After this informal discussion of the construction we turn to the details. The algorithm is given a graph with nodes and edges, and will return a spanner . Initially and we will add edges to such that becomes a -spanner of . The algorithm starts by creating a -clustering with using Lemma 3 in time. Since we have . Then we add edges from to all nodes in to for all . We add at most edges this way. Then we add all edges from to . This adds at most edges to .
We give each node a color . If is unclustered then has color . Otherwise where is the unique index such that . For each pair of indices we define by:
[TABLE]
We first note that for a choice of we can calculate the right hand side of (2) in time since we are taking the minimum over different values. So in time the algorithm calculates for all pairs of indices . Since this is within the time bound. As a consequence of Lemma 2 we get that is a good approximation of , more precisely:
[TABLE]
We now define to be the tree obtained from by contracting each edge in . Since an edge is contained in iff at least one of its endpoints is unclustered we can construct from in time. The algorithm does so for all in time. We note that the shortest path between two nodes in contains exactly the edges on the shortest path between in excluding the edges that are contained in .
The algorithm initializes for all pairs of indices with and let for all . We will maintain that is an upper bound on throughout the algorithm. Now the algorithm goes through all pairs and adds an almost-shortest path between the nodes if needed. Specifically, we do the following:
Let be an upper bound on the number of nodes of the path from to in on line 6. Then Algorithm 1 can be implemented in time. Hence we just need to prove that in order to conclude that it can be implemented in time. This follows from the fact that is an almost shortest path and the following reasoning. If contained nodes for some sufficiently large constant it would contain more than nodes of the same color. Since nodes of the same color have distance at most in this would imply that there was a much shorter path from to in contradicting (3) if was chosen large enough. The details with are given in the following lemma:
Lemma 4.
The path contains no nodes of color [math], and at most nodes of each color .
Proof.
Obviously does not contain a node with color [math], since all its incident edges would be contained in and hence not in . Now assume for the sake of contradiction that contains nodes of some color . When traversing from to let and be the first and the last node of color respectively. The distance from to when following must be at least by assumption. On the other hand and have distance at most in . So there exists a path in from to that is at least edges shorter than . This contradicts (3). Hence the assumption was wrong and contains at most nodes of each color . ∎
Since there are different colors the path contains at most nodes and the running time of Algorithm 1 is . So now we just need to prove that is an -spanner and that has at most edges. We start by proving that is an -spanner. Here we will utilize that the is an upper bound on the distance from to in . Furthermore, Algorithm 1 guarantees that . Together with (3) this gives that
[TABLE]
Lemma 5.
The subgraph of is an additive -spanner of .
Proof.
Assume for the sake of contradiction that is not an additive -spanner and let be a pair of nodes with shortest possible distance in such that:
[TABLE]
Say that and let be a shortest path from to in where and . Since the pair has the smallest possible distance in such that (5) holds and we have . In particular the edge is not in as it would contradict (5). Hence cannot be unclustered, as all the edges incident to an unclustered node are contained in and therefore . With the same reasoning we conclude that is clustered. Let the colors of and be and respectively. The distances from and to and respectively are at most . Combining this insight with (4) we get:
[TABLE]
But this contradicts the assumption (5). Hence the assumption was wrong and is an additive -spanner of . ∎
Lastly, we need to prove that contains no more than edges. Informally, we argue the following way. Whenever the edges of are added to on line 7 of Algorithm 1 there are different colors on . For each color on we then argue that either or are made smaller on line 11 or 12 of Algorithm 1. Lastly, we argue that can only be updated times, and since there are variables this implies that Algorithm 1 only adds edges to . This intuition is formalized in Lemma 6 below:
Lemma 6.
Algorithm 1 adds no more than edges to .
Proof.
Say that the algorithm adds the edges from the path on line 7 of Algorithm 1 where . First we note that since by (3) we have that for every , where we consider to be a function of defined by as on line 10. Now fix and let . Then there is an edge between and and therefore , i.e. . So if Algorithm 1 decreases on line 11 we have after it is decreased. Since is an upper bound on and therefore also an upper bound on we see that can be decreased at most times for each choice of . By symmetry we see that we can also decrease on line 12 at most times. Since there are pairs of indices the algorithm can change the values of or on line 11 and 12 of Algorithm 1 at most times.
Let be a color on . After the execution of lines 9-12 we have
[TABLE]
Due to the execution of lines 2 and 3 this was not the case before. Hence either or were updated. By Lemma 4 there are at least colors on , so if the algorithm adds edges in total it makes at least updates of upper bounds or . Since there can be at most such updates we conclude that and that Algorithm 1 adds no more than edges. ∎
To summarize, the algorithm presented in this section runs in $O(n^2)$ time and gives an additive $O(1)$-spanner with no more than $O(n^{4/3})$ edges. We have made no attempt to optimize the constant in the $O$-notation. Hence we get:
Theorem 2.
There exists an algorithm that given a graph $G$ with $n$ nodes constructs an $O(1)$-spanner of $G$ with $O(n^{4/3})$ edges in $O(n^2)$ time.
4 Distance Oracles
In the following we show how to modify the construction by Sommer [28] to obtain a $(2,1)$-distance oracle of size $O(n^{5/3})$ that can be constructed in expected $O(n^2)$ time.
Let be a given graph, and an -spanner of constructed by Theorem 2. is constructed in time and has edges. During the construction we use only space.
Let be a -clustering of . Using Lemma 3 we obtain and in time. For each node we define four portals . We define , where is chosen such that the distance between and in is minimized. In case of ties we choose the node with the lowest index . The node for is chosen depending on . If we let . Otherwise for some index . We let where is chosen among such that the distance between and in is minimized. In case of ties we choose the node with the lowest index . The portals for all nodes can be found in time.
We will use the following lemma by Pǎtraşcu and Roditty [23] to construct a -distance oracle for , that uses space .
Lemma 7 ([23]).
For any unweighted, undirected graph, there exists a distance oracle of size $O(n^{5/3})$ that, given any nodes $u$ and $v$ at distance $d$, returns a distance of at most $2d+1$ in constant time. The distance oracle can be constructed in expected time .
In the proof in [23] they only claim a running time of , however, this can be fixed to give the correct running time of [25]. By [23, Claim 9] it is easy to see how to get a running time of which suffices for our purposes.
We are now ready to define the distance oracle. For each we store the distances and for all nodes . The distances can be calculated using a BFS in time . For each node we store its portals . We augment this distance oracle with the Pǎtraşcu-Roditty distance oracle from Lemma 7 for .
We now show how to use the distance oracle to obtain approximate distances for a query . We let be the approximate distance in returned by the Pǎtraşcu-Roditty distance oracle. We define in the following way. Let . Then . The distance returned by the distance oracle is the minimum of , and for .
We will now argue that if the distance between $u$ and $v$ is $d$, then the distance oracle returns a distance between $d$ and $2d+1$. The distance returned is obviously at least $d$, so we just need to show that it is at most $2d+1$. Consider a shortest path between $u$ and $v$ in $G$. If there is at most one node on the shortest path which is incident to a node in the clustering then the shortest path is contained in , and therefore:
[TABLE]
So assume that there exists an edge on the shortest path not in . Let be the smallest index such that there is an edge on the shortest path with . Say that is closer to than to in . Assume that and for some index (the case where and is handled symmetrically). Since the shortest path is contained in and we have that and therefore:
[TABLE]
Assume that . The other case is handled similarly. Say that for . First assume that for all . Then we conclude that . The distance returned by the distance oracle is at most
[TABLE]
Now assume that for some and let be the smallest index such that . By definition we have that . Furthermore the shortest path is contained in and therefore . The distance returned is at most
[TABLE]
We conclude that the distance returned by the distance oracle is always between $d$ and $2d+1$. The result is summarized in Theorem 3.
Theorem 3.
For any unweighted, undirected graph, there exists a distance oracle of size $O(n^{5/3})$ that, given any nodes $u$ and $v$ at distance $d$, returns a distance of at most $2d+1$ in constant time. The distance oracle can be constructed in expected time $O(n^2)$.
Acknowledgements.
The author would like to thank Christian Sommer for helpful discussions on the application of the $O(1)$-spanner to his construction of distance oracles.
- [1] Amir Abboud and Greg Bodwin. The 4/3 additive spanner exponent is tight. In Proc. 48th ACM Symposium on Theory of Computing (STOC), pages 351--361, 2016.
- [2] Ittai Abraham and Cyril Gavoille. On approximate distance labels and routing schemes with affine stretch. In Distributed Computing - 25th International Symposium, DISC 2011, Rome, Italy, September 20-22, 2011. Proceedings, pages 404--415, 2011.
- [3] Rachit Agarwal. The space-stretch-time tradeoff in distance oracles. In Algorithms - ESA 2014 - 22th Annual European Symposium, Wroclaw, Poland, September 8-10, 2014. Proceedings, pages 49--60, 2014.
- [4] Rachit Agarwal and Philip Brighten Godfrey. Brief announcement: a simple stretch 2 distance oracle. In ACM Symposium on Principles of Distributed Computing, PODC '13, Montreal, QC, Canada, July 22-24, 2013, pages 110--112, 2013.
- [5] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM J. Comput., 28(4):1167--1181, 1999. See also SODA'96.
- [6] Surender Baswana, Akshay Gaur, Sandeep Sen, and Jayant Upadhyay. Distance oracles for unweighted graphs: Breaking the quadratic barrier with constant additive error. In Automata, Languages and Programming, 35th International Colloquium, ICALP 2008, Reykjavik, Iceland, July 7-11, 2008, Proceedings, Part I: Track A: Algorithms, Automata, Complexity, and Games, pages 609--621, 2008.
- [7] Surender Baswana, Vishrut Goyal, and Sandeep Sen. All-pairs nearly 2-approximate shortest paths in $O(n^2\,\mathrm{polylog}\, n)$ time. Theor. Comput. Sci., 410(1):84--93, 2009.
- [8] Surender Baswana and Telikepalli Kavitha. Faster algorithms for all-pairs approximate shortest paths in undirected graphs. SIAM Journal on Computing, 39(7):2865--2896, 2010.
