Additive Spanners and Distance Oracles in Quadratic Time

Mathias Bæk Tejs Knudsen

arXiv:1704.04473·cs.DS·April 17, 2017

TL;DR

This paper presents a deterministic algorithm that constructs an additive 8-spanner with O(n^{4/3}) edges in O(n^2) time, and uses it to build a Las Vegas (2,1)-distance oracle of size O(n^{5/3}) in expected O(n^2) time. Both results remove the polylogarithmic factors and the Monte Carlo randomness of the earlier constructions by Woodruff and Sommer.

Contribution

It gives a deterministic O(n^2)-time algorithm for additive O(1)-spanners with O(n^{4/3}) edges, and improves Sommer's (2,1)-distance oracle by replacing a Monte Carlo construction with a Las Vegas one and removing polylogarithmic factors from its size and construction time.

Findings

01

Deterministic construction of an additive O(1)-spanner with O(n^{4/3}) edges in O(n^2) time.

02

Development of a Las Vegas (2,1)-distance oracle of size O(n^{5/3}) in O(n^2) time.

03

An O(n^2)-time algorithm that computes (2,1)-approximate distances for all pairs of nodes, improving on the O(n^2 log n) algorithm by Baswana and Kavitha.

Abstract

Let $G$ be an unweighted, undirected graph. An additive $k$-spanner of $G$ is a subgraph $H$ that approximates all distances between pairs of nodes up to an additive error of $+k$, that is, it satisfies $d_{H}(u,v)\leq d_{G}(u,v)+k$ for all nodes $u,v$, where $d$ is the shortest path distance. We give a deterministic algorithm that constructs an additive $O(1)$-spanner with $O(n^{4/3})$ edges in $O(n^{2})$ time. This should be compared with the randomized Monte Carlo algorithm by Woodruff [ICALP 2010] giving an additive $6$-spanner with $O(n^{4/3}\log^{3}n)$ edges in expected time $O(n^{2}\log^{2}n)$. An $(\alpha,\beta)$-approximate distance oracle for $G$ is a data structure that supports the following distance queries between pairs of nodes in $G$. Given two nodes $u$, $v$ it can in constant time compute a distance estimate…

Tables

Table 1: A summary of the performance of selected algorithms that create a $k$-spanner $H$ from a graph on $n$ nodes. It shows the additive distortion, $k$, and an upper bound on the number of edges in $H$ as well as the running time of the algorithm that constructs $H$.

$k$	Number of Edges	Running Time	Comment	Reference
$2$	$O (n^{3 / 2})$	$O (n^{5 / 2})$	Deterministic	[19]
$2$	$O (n^{3 / 2} \log^{1 / 2} n)$	$O (n^{2} \log^{2} n)$	Deterministic	[18]
$2$	$O (n^{3 / 2})$	$O (n^{2})$	Deterministic	Theorem 1
$6$	$O (n^{4 / 3})$	$O (n^{2 / 3} m)$	Deterministic	[9]
$6$	$O (n^{4 / 3} \log^{3} n)$	$O (n^{2} \log^{2} n)$	Randomized Monte Carlo	[33]
$8$	$O (n^{4 / 3})$	$O (n^{2})$	Deterministic	Theorem 2

Table 2: For a given $k$ an upper bound of $f(n)$ is a proof that any graph on $n$ nodes has a $k$-spanner with no more than $f(n)$ edges. A lower bound of $g(n)$ is a proof that there exists a graph on $n$ nodes for which any $k$-spanner must have at least $g(n)$ edges.

$k$	Upper Bound	Lower Bound	Reference
$2$ & $3$	$O (n^{3 / 2})$	$\Omega (n^{3 / 2})$	[19]/[31]
$4$ & $5$	$O (n^{7 / 5} \log^{1 / 5} n)$	$\Omega (n^{4 / 3})$	[15]/[11]
$\geq 6$	$O (n^{4 / 3})$	$n^{4 / 3 - o (1)}$	[9]/[1]

Equations

$d_{H}(u,v)\leq d_{G}(u,v)+k,$

$C_{i}=\left(\Gamma_{G}(u_{i})\cup\left\{u_{i}\right\}\right)\setminus\left(C_{1}\cup\ldots\cup C_{i-1}\right).$

$d_{T_{i}}(u_{i},u)+d_{T_{i}}(u_{i},v)\leq d_{G}(u,v)+2.$

$d_{G_{i-1}}(u_{i},u)+d_{G_{i-1}}(u_{i},v)\leq d_{G_{i-1}}(w,u)+d_{G_{i-1}}(w,v)+2=d_{G}(u,v)+2.$

$O\!\left(m+\sum_{i=1}^{\ell}n\left|C_{i}\right|\right)=O\!\left(n^{2}\right).$

$d_{H}(u,v)\leq d_{G}(u,v)+2.$

$\delta_{i,j}=\min_{k\in\left\{1,2,\ldots,\ell\right\}}\left\{d_{T_{k}}(u_{k},u_{i})+d_{T_{k}}(u_{k},u_{j})\right\}.$

$d_{G}(u_{i},u_{j})\leq\delta_{i,j}\leq d_{G}(u_{i},u_{j})+2.$

$d_{H}(u_{i},u_{j})\leq d_{G}(u_{i},u_{j})+4.$

$d_{H}(u,v)>d_{G}(u,v)+8.$

$d_{H}(u,v)\leq d_{H}(u_{i},u_{j})+2\leq d_{G}(u_{i},u_{j})+6\leq d_{G}(u,v)+8.$

$\Delta_{i,r}+\Delta_{r,j}\leq\delta_{i,j}+2.$

$\delta_{PR}(u,v)\leq 2d_{G_{\ell}}(u,v)+1=2d+1.$

$\min\left\{d_{T_{i}}(u_{i},u),d_{T_{i}}(u_{i},v)\right\}\leq\frac{d+1}{2}.$

$\delta_{1}(u,v)\leq 2d_{T_{k_{1}}}(p_{1}(u),u)+d_{H}(u,v)\leq 2\left(d_{T_{i}}(u_{i},u)-4\right)+d+8\leq 2d+1.$

$\delta_{j}(u,v)\leq 2d_{T_{k_{j}}}(p_{j}(u),u)+d\leq 2d_{T_{i}}(u_{i},u)+d\leq 2d+1.$

Full text

University of Copenhagen

Research partly supported by Advanced Grant DFF-0602-02499B from the Danish Council for Independent Research under the Sapere Aude research career programme and by the FNU project AlgoDisc - Discrete Mathematics, Algorithms, and Data Structures.


Abstract

Let $G$ be an unweighted, undirected graph. An additive $k$ -spanner of $G$ is a subgraph $H$ that approximates all distances between pairs of nodes up to an additive error of $+k$ , that is, it satisfies $d_{H}(u,v)\leq d_{G}(u,v)+k$ for all nodes $u,v$ , where $d$ is the shortest path distance. We give a deterministic algorithm that constructs an additive $O\!\left(1\right)$ -spanner with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2}\right)$ time. This should be compared with the randomized Monte Carlo algorithm by Woodruff [ICALP 2010] giving an additive $6$ -spanner with $O\!\left(n^{4/3}\log^{3}n\right)$ edges in expected time $O\!\left(n^{2}\log^{2}n\right)$ .

An $(\alpha,\beta)$-approximate distance oracle for $G$ is a data structure that supports the following distance queries between pairs of nodes in $G$. Given two nodes $u$, $v$ it can in constant time compute a distance estimate $\tilde{d}$ that satisfies $d\leq\tilde{d}\leq\alpha d+\beta$ where $d$ is the distance between $u$ and $v$ in $G$. Sommer [ICALP 2016] gave a randomized Monte Carlo $(2,1)$-distance oracle of size $O\!\left(n^{5/3}\operatorname{poly}\log n\right)$ in expected time $O\!\left(n^{2}\operatorname{poly}\log n\right)$. As an application of the additive $O\!\left(1\right)$-spanner we improve the construction by Sommer [ICALP 2016] and give a Las Vegas $(2,1)$-distance oracle of size $O\!\left(n^{5/3}\right)$ in time $O\!\left(n^{2}\right)$. This also implies an algorithm that in $O\!\left(n^{2}\right)$ time gives approximate distances for all pairs of nodes in $G$, improving on the $O\!\left(n^{2}\log n\right)$ algorithm by Baswana and Kavitha [SICOMP 2010].

1 Introduction

Let $G=(V,E)$ be an unweighted, undirected graph on $n$ nodes and $m$ edges. A subgraph $H$ of $G$ is an additive $k$ -spanner if the following holds for every pair $u,v$ of nodes in $G$ :

$$d_{H}(u,v)\leq d_{G}(u,v)+k,$$

where $d_{H}(u,v)$ and $d_{G}(u,v)$ are the distances between $u$ and $v$ in $H$ and $G$ respectively. This paper will only consider additive spanners and not multiplicative or mixed spanners, so we will simply say that $H$ is a $k$-spanner when we mean that $H$ is an additive $k$-spanner.
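The definition can be checked directly on small instances. The following is a minimal sketch, assuming graphs are given as adjacency dictionaries; the helper names `bfs_distances` and `is_additive_spanner` are illustrative and not from the paper. It runs one BFS per node in both graphs and does not check that $H$ is a subgraph of $G$.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return a dict of hop distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def is_additive_spanner(adj_g, adj_h, k):
    """Check d_H(u, v) <= d_G(u, v) + k for every pair connected in G."""
    for u in adj_g:
        dist_g = bfs_distances(adj_g, u)
        dist_h = bfs_distances(adj_h, u)
        for v, d in dist_g.items():
            if dist_h.get(v, float("inf")) > d + k:
                return False
    return True

# Example: G is a 4-cycle and H is the path obtained by dropping edge (3, 0).
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(is_additive_spanner(cycle, path, 2))  # the pair (0, 3) has error exactly 2
print(is_additive_spanner(cycle, path, 1))
```

In the example the only stretched pair is $(0,3)$, whose distance grows from $1$ to $3$, so the path is a $2$-spanner of the cycle but not a $1$-spanner.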

In this paper we consider algorithms constructing $k$ -spanners, and there are therefore three interesting parameters: The distortion $k$ , the running time of the algorithm, and the size of the spanner created. Elkin and Peleg [19] showed how to construct $2$ -spanners with $O\!\left(n^{3/2}\right)$ edges in $O\!\left(n^{5/2}\right)$ time, and Baswana et al [9] gave an algorithm that constructs $6$ -spanners with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2/3}m\right)$ time.

The running time of these algorithms can be improved if we allow the $k$ -spanners to be larger by a $\operatorname{poly}\log n$ factor. Dor, Halperin and Zwick [18] showed that we can construct $2$ -spanners with $O\!\left(n^{3/2}\log^{1/2}n\right)$ edges in $O\!\left(n^{2}\log^{2}n\right)$ time, and Woodruff [33] gave an algorithm to construct $6$ -spanners with $O\!\left(n^{4/3}\log^{3}n\right)$ edges in $O\!\left(n^{2}\log^{2}n\right)$ time. The construction of Woodruff is furthermore randomized Monte Carlo. These results are summarized in Table 1.

These improvements to the running time fit into the following paradigm: For a fixed $k$ the authors find algorithms that produce spanners that are almost as small as the best known construction of $k$-spanners and have near-quadratic running time. We reverse this way of looking at the problem. We try to find algorithms that yield $k$-spanners that are exactly as small as the best known constructions for any $k=O(1)$, i.e. $O(n^{4/3})$, and at the same time we want the algorithm to run as fast as possible. All known algorithms for creating $O(1)$-spanners that have close to optimal size run in time $\Omega(n^{2})$. (For instance, the algorithm by Baswana et al. [9] gives a $6$-spanner with $O\!\left(n^{4/3}\right)$ edges and is therefore only interesting when $m=\Omega\!\left(n^{4/3}\right)$, in which case the running time is $\Theta\!\left(n^{2/3}m\right)=\Omega\!\left(n^{2}\right)$.) So a natural question is whether there exist a $k=O(1)$ and an algorithm that constructs a $k$-spanner with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2}\right)$ time. In fact Sommer [28] mentioned in his talk at ICALP 2016 that the main obstacle towards getting a better running time for constructing the distance oracle he presented is the lack of such an algorithm. In his case the distortion $k=O(1)$ only factors into the running time and not into the distortion of the oracle. Therefore, it does not matter what $k$ is as long as it is constant.

We show that it is possible to attain this goal by giving an algorithm that constructs $8$-spanners deterministically with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2}\right)$ time. Compared with the algorithm by Woodruff [33], this gets rid of the $\log^{3}n$ factor on the number of edges and a factor of $\log^{2}n$ in the running time. Furthermore, the algorithm is deterministic and not randomized Monte Carlo. The price of these improvements is that the distortion is larger than $6$. We note that there are no lower bounds ruling out the possibility of a $4$-spanner with $O\!\left(n^{4/3}\right)$ edges. For the application to the distance oracle by Sommer [28], the distortion is unimportant as long as it is constant. We also show how to construct $2$-spanners with $O\!\left(n^{3/2}\right)$ edges in $O\!\left(n^{2}\right)$ time. For a comparison to previous work see Table 1.

Related work

Elkin and Peleg [19] showed that any graph on $n$ nodes has a $2$-spanner with $O(n^{3/2})$ edges (Aingworth et al. [5] earlier showed the same result up to logarithmic factors on the size of the spanner), Chechik [15] showed that it has a $4$-spanner with $O\!\left(n^{7/5}\log^{1/5}n\right)$ edges, and Baswana et al. [9] showed that it has a $6$-spanner with $O(n^{4/3})$ edges. These results are complemented by a negative result of Abboud and Bodwin [1]. A consequence of their result is that for any $k=O(1)$ there exists a graph on $n$ nodes such that any $k$-spanner of this graph has at least $n^{4/3-o(1)}$ edges.

Another negative result comes from Erdős’s girth conjecture [20]. It states that for any constant $k$ there exist graphs with $n$ nodes and $\Omega\!\left(n^{1+1/k}\right)$ edges where the girth is $2k+2$. This conjecture has been proved for $k=2,3,5$ [31, 11]. In particular, if the conjecture is true this implies that there exist graphs for which any $(2k-1)$-spanner must have at least $\Omega\!\left(n^{1+1/k}\right)$ edges. Woodruff [32] proved that whether the conjecture is true or not, there exists a graph on $n$ nodes such that any $(2k-1)$-spanner of the graph has at least $\Omega\!\left(k^{-1}n^{1+1/k}\right)$ edges.

There are also upper and lower bounds when we allow the distortion $k$ to depend on $n$ , see [14, 13, 15, 22]. In this paper, however, we are only interested in the case where $k=O(1)$ . The upper and lower bounds for $k=O(1)$ are summarized in Table 2.

Techniques

Previous algorithms that construct $k$-spanners in $\tilde{O}\!\left(n^{2}\right)$ time all relied on constructing a hitting set for some set of neighbourhoods. In [18] this is done deterministically via a dominating set algorithm, and in [33] this is done via sampling. This approach will inherently come with the cost of a $\operatorname{poly}\log n$ factor. Furthermore, in the construction of $6$-spanners by Woodruff [33] the number of neighbourhoods that need to be hit is so large that it seems impossible with current techniques to modify the algorithm to be Las Vegas. To avoid this we instead use a clustering approach described in Section 2. The algorithm in Theorem 2 is obtained using this clustering and a careful modification of the path-buying algorithm of [9].

Approximate Distance Oracles and All Pairs Almost Shortest Paths

Given an undirected and unweighted graph $G$, an $(\alpha,\beta)$-approximate distance oracle for $G$ is a data structure that supports the following query. Given two nodes $u$, $v$ it can compute a distance estimate $\tilde{d}$ that satisfies $d\leq\tilde{d}\leq\alpha d+\beta$ where $d$ is the distance between $u$ and $v$ in $G$. For work on approximate distance oracles see e.g. [2, 3, 4, 6, 7, 8, 10, 12, 16, 17, 23, 24, 26, 27, 29, 30, 34]. Sommer [28] gave a randomized Monte Carlo $(2,1)$-distance oracle that can be constructed in $O\!\left(n^{2}\operatorname{poly}\log n\right)$ time, has size $O\!\left(n^{5/3}\operatorname{poly}\log n\right)$ and can answer queries in $O\!\left(1\right)$ time. We improve the construction time and the size to $O\!\left(n^{2}\right)$ and $O\!\left(n^{5/3}\right)$ respectively, and our construction is randomized Las Vegas. As a corollary we can compute an estimate $\tilde{d}(u,v)$ for all pairs of nodes in $G$ satisfying $d_{G}(u,v)\leq\tilde{d}(u,v)\leq 2d_{G}(u,v)+1$ in time $O\!\left(n^{2}\right)$. This improves upon the $O\!\left(n^{2}\log n\right)$ algorithm by Baswana and Kavitha [8].

Preliminaries

For a graph $G$ and two nodes $u,v$ we denote the distance from $u$ to $v$ in $G$ by $d_{G}(u,v)$. All graphs considered in this paper are unweighted, and unless otherwise specified they are undirected as well. For an undirected graph $G$ and a node $u$ the neighbourhood of $u$ is the set of nodes adjacent to $u$ and is denoted by $\Gamma_{G}(u)$.

Overview

In Section 2 we introduce the clustering we use when constructing the spanners. In Section 3 we show how to create an $8$ -spanner with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2}\right)$ time and thereby prove Theorem 2. In Section 4 we provide the details on how to give an improved $(2,1)$ -distance oracle.

2 Clustering

Our construction of additive spanners uses standard clustering techniques. We present our clustering framework below. Let $G=(V,E)$ be a graph with $n$ vertices and $m$ edges. We let $t$ be a parameter that can depend on $G$ . For a sequence $u_{1},\ldots,u_{\ell}$ of nodes we define the clusters $C_{i},i\in\left\{1,\ldots,\ell\right\}$ by

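$$C_{i}=\left(\Gamma_{G}(u_{i})\cup\left\{u_{i}\right\}\right)\setminus\left(C_{1}\cup\ldots\cup C_{i-1}\right).$$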

Furthermore we also define graphs $G_{0},G_{1},\ldots,G_{\ell}$ in the following way. We let $G_{0}=G$, and for $i>0$ we let $G_{i}$ be the subgraph of $G$ that contains every edge $(u,v)$ of $G$ except those where both $u$ and $v$ are contained in $C_{1}\cup\ldots\cup C_{i}$. For each node $u_{i}$ we let $T_{i}$ be a BFS tree in $G_{i-1}$ rooted at $u_{i}$.

Definition 1.

A sequence $u_{1},\ldots,u_{\ell}$ is called a $t$ -clustering if the following requirements are satisfied.

• The node $u_{i}$ maximizes $\left|\left(\Gamma_{G}(u_{i})\cup\left\{u_{i}\right\}\right)\setminus\left(C_{1}\cup\ldots\cup C_{i-1}\right)\right|$.

• Every cluster $C_{i}$ contains at least $t$ nodes.

• For every node $v$ we have $\left|\left(\Gamma_{G}(v)\cup\left\{v\right\}\right)\setminus\left(C_{1}\cup\ldots\cup C_{\ell}\right)\right|<t$.

We say that a node $v$ is clustered if $v\in C_{1}\cup\ldots\cup C_{\ell}$ and unclustered otherwise. We note that since every cluster $C_{i}$ contains at least $t$ nodes and the clusters are disjoint we have $\ell\leq\frac{n}{t}$ .

Lemma 1.

Let $u_{1},\ldots,u_{\ell}$ be a $t$ -clustering. Then the number of edges in $G_{\ell}$ is at most $nt$ .

Proof.

The number of edges in $G_{\ell}$ is bounded by the sum $\sum_{v\in V}\left|\left(\Gamma_{G}(v)\right)\setminus\left(C_{1}\cup\ldots\cup C_{\ell}\right)\right|$ , which is clearly less than $nt$ . ∎

Lemma 2.

Let $u_{1},\ldots,u_{\ell}$ be a $t$ -clustering of $G=(V,E)$ and let $u,v\in V$ be a pair of nodes. Assume that some shortest path from $u$ to $v$ in $G$ is not contained in $G_{\ell}$ from Lemma 1. Then there exists an index $i\in\left\{1,2,\ldots,\ell\right\}$ such that

$$d_{T_{i}}(u_{i},u)+d_{T_{i}}(u_{i},v)\leq d_{G}(u,v)+2.$$

Proof.

Consider a shortest path $p$ from $u$ to $v$ that is not contained in $G_{\ell}$ and let $w$ be a clustered node on $p$ such that $w\in C_{i}$ . We choose $w$ such that $i$ is smallest possible. By choosing $i$ smallest possible $p$ is contained in $G_{i-1}$ . Furthermore since the distance from $w$ to $u_{i}$ is at most $1$ we see that

$$d_{G_{i-1}}(u_{i},u)+d_{G_{i-1}}(u_{i},v)\leq d_{G_{i-1}}(w,u)+d_{G_{i-1}}(w,v)+2=d_{G}(u,v)+2.$$

Since $T_{i}$ is a shortest path tree in $G_{i-1}$ the conclusion follows. ∎

Lemma 3.

Given a graph $G$ and a parameter $t>0$ we can construct a $t$ -clustering $u_{1},\ldots,u_{\ell}$ , the corresponding BFS trees $T_{1},\ldots,T_{\ell}$ and $G_{\ell}$ in $O(n^{2})$ time.

Proof.

The algorithm will work by finding the nodes $u_{1},\ldots,u_{\ell}$ consecutively, i.e. first $u_{1}$ , then $u_{2}$ and so on. The algorithm will maintain a graph $G^{\prime}$ . In the beginning of the algorithm we have $G^{\prime}=G_{0}$ , and after we add $u_{i}$ we will alter $G^{\prime}$ such that $G^{\prime}=G_{i}$ . The total cost of altering all $G^{\prime}$ will be $O(m)=O(n^{2})$ .

We find $u_{i}$ by looking at all nodes in $G^{\prime}=G_{i-1}$ and counting, for each node, the number of neighbours not in $C_{1}\cup\ldots\cup C_{i-1}$. Since $G_{i-1}$ has at most $n\left|C_{i}\right|$ edges this takes $O(n\left|C_{i}\right|)$ time. Then the algorithm finds a BFS tree from $u_{i}$ in $G_{i-1}$ in $O(n\left|C_{i}\right|)$ time. Hence the total time used by the algorithm is:

$$O\!\left(m+\sum_{i=1}^{\ell}n\left|C_{i}\right|\right)=O\!\left(n^{2}\right).$$

∎
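The greedy procedure from the proof can be sketched as follows. This is a minimal illustration, assuming a simple graph stored as an adjacency dictionary; the names `t_clustering` and `remaining_edges` are illustrative. For brevity it rescans every closed neighbourhood in each round, so it does not attempt to match the $O(n^{2})$ bound of Lemma 3.

```python
from collections import deque

def t_clustering(adj, t):
    """Greedy t-clustering sketch for a simple graph and a parameter t > 0.

    Returns (centers, clusters, tree_dist, tree_parent, color), where
    color[v] is the index of the cluster containing v, or None.
    """
    color = {v: None for v in adj}
    centers, clusters, tree_dist, tree_parent = [], [], [], []

    def fresh(v):
        # Nodes of the closed neighbourhood of v not yet in any cluster.
        return [w for w in (v, *adj[v]) if color[w] is None]

    while True:
        u = max(adj, key=lambda v: len(fresh(v)))
        if len(fresh(u)) < t:
            break
        # BFS tree T_i in G_{i-1}: an edge is usable unless both of its
        # endpoints already lie in earlier clusters.
        dist, parent = {u: 0}, {u: None}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist and (color[x] is None or color[y] is None):
                    dist[y] = dist[x] + 1
                    parent[y] = x
                    queue.append(y)
        members = fresh(u)
        for w in members:
            color[w] = len(centers)
        centers.append(u)
        clusters.append(members)
        tree_dist.append(dist)
        tree_parent.append(parent)
    return centers, clusters, tree_dist, tree_parent, color

def remaining_edges(adj, color):
    """Edges of G_ell: at least one endpoint is unclustered."""
    return {frozenset((x, y)) for x in adj for y in adj[x]
            if color[x] is None or color[y] is None}

# Example: a path on six nodes with t = 2 (indices are 0-based here).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
centers, clusters, tree_dist, tree_parent, color = t_clustering(path, 2)
print(centers, clusters)
```

On the six-node path the sketch picks node 1 and then node 4 as centers. The second BFS tree stops at node 2, because the edge between nodes 1 and 2 has both endpoints in the first cluster and is therefore absent from $G_{1}$.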

3 Constructing $O(1)$ -Spanners

In this section we present our construction of an $8$ -spanner with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2}\right)$ time. As a warmup we show how we can use the clustering from Section 2 to give a $2$ -spanner with $O\!\left(n^{3/2}\right)$ edges in $O\!\left(n^{2}\right)$ time.

Theorem 1.

There exists an algorithm that given a graph $G$ with $n$ nodes constructs a $2$ -spanner of $G$ with $\leq 2n^{3/2}$ edges in $O\!\left(n^{2}\right)$ time.

Proof.

Let $t=\sqrt{n}$ and construct a $t$ -clustering $u_{1},\ldots,u_{\ell}$ with Lemma 3. Let $H=T_{1}\cup\ldots\cup T_{\ell}\cup G_{\ell}$ . The number of edges in $H$ is at most $n\ell+nt\leq 2n\sqrt{n}$ by Lemma 1 and the fact that $\ell\leq\frac{n}{t}$ .

Now we just need to prove that $H$ is a $2$ -spanner. Let $u,v$ be arbitrary nodes and let $p$ be a shortest path from $u$ to $v$ in $G$ . We wish to prove that

$$d_{H}(u,v)\leq d_{G}(u,v)+2.\qquad(1)$$

If $p$ is contained in $G_{\ell}$ then (1) is obviously true. Otherwise there exists an index $i$ such that $d_{T_{i}}(u,v)\leq d_{G}(u,v)+2$ by Lemma 2, and (1) is true since $T_{i}\subseteq H$ . ∎
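The construction in the proof of Theorem 1 can be tried on small graphs. The sketch below is self-contained and assumes a simple, non-empty graph given as an adjacency dictionary; the names `bfs`, `two_spanner` and `is_additive_spanner` are illustrative. The threshold $t=\sqrt{n}$ is tested as `size ** 2 < n` to stay in integer arithmetic, and the clustering loop is written for clarity rather than for the $O(n^{2})$ running time.

```python
from collections import deque

def bfs(adj, root, usable=lambda x, y: True):
    """BFS from root over usable edges; returns (dist, parent) dicts."""
    dist, parent = {root: 0}, {root: None}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist and usable(x, y):
                dist[y], parent[y] = dist[x] + 1, x
                queue.append(y)
    return dist, parent

def two_spanner(adj):
    """Edge set of H = T_1 + ... + T_ell + G_ell for a sqrt(n)-clustering."""
    n = len(adj)
    clustered, edges = set(), set()

    def fresh(v):
        return [w for w in (v, *adj[v]) if w not in clustered]

    def usable(x, y):
        # Edge survives unless both endpoints are already clustered.
        return x not in clustered or y not in clustered

    while True:
        u = max(adj, key=lambda v: len(fresh(v)))
        if len(fresh(u)) ** 2 < n:       # cluster would have fewer than sqrt(n) nodes
            break
        _, parent = bfs(adj, u, usable)  # T_i is a BFS tree in G_{i-1}
        edges.update(frozenset((v, p)) for v, p in parent.items() if p is not None)
        clustered.update(fresh(u))
    edges.update(frozenset((x, y)) for x in adj for y in adj[x] if usable(x, y))
    return edges

def is_additive_spanner(adj, edges, k):
    """Check d_H(u, v) <= d_G(u, v) + k for all pairs connected in G."""
    adj_h = {v: [] for v in adj}
    for e in edges:
        x, y = tuple(e)
        adj_h[x].append(y)
        adj_h[y].append(x)
    for u in adj:
        dist_g, _ = bfs(adj, u)
        dist_h, _ = bfs(adj_h, u)
        if any(dist_h.get(v, float("inf")) > d + k for v, d in dist_g.items()):
            return False
    return True

# Example: the complete graph on 9 nodes collapses to a star.
complete = {v: [w for w in range(9) if w != v] for v in range(9)}
star = two_spanner(complete)
print(len(star), is_additive_spanner(complete, star, 2))
```

For the complete graph the first cluster already covers every node, so $G_{\ell}$ is empty and $H$ is the single BFS tree $T_{1}$, a star with $8$ edges.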

Next we turn to showing how to create an $8$-spanner $H$ with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2}\right)$ time. The idea is the following. We start by creating a $t$-clustering $u_{1},\ldots,u_{\ell}$ with $t=n^{1/3}$ and $\ell\leq n^{2/3}$. Using the BFS trees $T_{1},\ldots,T_{\ell}$ along with Lemma 2 we can then get an additive $2$-approximation of $d_{G}(u_{i},u_{j})$ for all pairs of indices $i,j$, which we will call $\delta_{i,j}$. The calculation of the BFS trees in $O\!\left(n^{2}\right)$ time relies on an idea similar to one in [5]. The BFS trees also give us a path from $u_{i}$ to $u_{j}$ that is at most $2$ longer than the shortest path. If we add all these paths to our spanner along with $G_{\ell}$ and the neighbours in $C_{i}$ of each $u_{i}$ we will get a $6$-spanner. Unfortunately, adding a path could require adding up to $\Omega(\ell)$ edges, and since there are $\ell^{2}$ pairs we can only guarantee that the spanner has $O\!\left(\ell^{3}\right)$ edges, which is $O\!\left(n^{2}\right)$ if $\ell\approx n^{2/3}$. (We only need to add edges on the path that are not already in $G_{\ell}$.) Instead we use an argument similar to the path-buying argument from [9] and the construction from [21]. We add the path from $u_{i}$ to $u_{j}$ unless we can guarantee that there is already an additive $2$-approximation of this path in the spanner. We do this by maintaining an upper bound $\Delta_{i,j}$ on the distance from $u_{i}$ to $u_{j}$ in the spanner $H$. We then argue that if we add a path with $k$ edges not already in the spanner, then there are $\Omega(k)$ pairs $u_{i^{\prime}},u_{j^{\prime}}$ for which the upper bound $\Delta_{i^{\prime},j^{\prime}}$ is improved. This will imply that at most $O\!\left(\ell^{2}\right)$ edges are added, giving an upper bound of $O\!\left(n^{4/3}\right)$ on the number of edges in $H$.

After this informal discussion of the construction we turn to the details. The algorithm is given a graph $G=(V,E)$ with $n$ nodes and $m$ edges, and will return a spanner $H=(V,F)$. Initially $F=\emptyset$ and we will add edges to $H$ such that $H$ becomes an $8$-spanner of $G$. The algorithm starts by creating a $t$-clustering $u_{1},\ldots,u_{\ell}$ with $t=n^{1/3}$ using Lemma 3 in $O\!\left(n^{2}\right)$ time. Since $\ell\leq\frac{n}{t}$ we have $\ell\leq n^{2/3}$. Then we add edges from $u_{i}$ to all nodes in $C_{i}\setminus\left\{u_{i}\right\}$ to $H$ for all $i\in\left\{1,2,\ldots,\ell\right\}$. We add at most $n$ edges this way. Then we add all edges from $G_{\ell}$ to $H$. This adds at most $nt=n^{4/3}$ edges to $H$.

We give each node $u\in V$ a color $c(u)\in\left\{0,1,2,\ldots,\ell\right\}$ . If $u$ is unclustered then $u$ has color $c(u)=0$ . Otherwise $c(u)=i$ where $i$ is the unique index such that $u\in C_{i}$ . For each pair of indices $i,j\in\left\{1,2,\ldots,\ell\right\}$ we define $\delta_{i,j}$ by:

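$$\delta_{i,j}=\min_{k\in\left\{1,2,\ldots,\ell\right\}}\left\{d_{T_{k}}(u_{k},u_{i})+d_{T_{k}}(u_{k},u_{j})\right\}.\qquad(2)$$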

We first note that for a choice of $i,j$ we can calculate the right hand side of (2) in $O\!\left(\ell\right)$ time since we are taking the minimum over $\ell$ different values. So in $O\!\left(\ell^{3}\right)$ time the algorithm calculates $\delta_{i,j}$ for all pairs of indices $i,j$ . Since $\ell\leq n^{2/3}$ this is within the $O\!\left(n^{2}\right)$ time bound. As a consequence of Lemma 2 we get that $\delta_{i,j}$ is a good approximation of $d_{G}(u_{i},u_{j})$ , more precisely:

$$d_{G}(u_{i},u_{j})\leq\delta_{i,j}\leq d_{G}(u_{i},u_{j})+2.\qquad(3)$$
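The $O\!\left(\ell^{3}\right)$ computation of all $\delta_{i,j}$ is a triple loop over the stored BFS distances. A minimal sketch, assuming each tree $T_{k}$ is represented by a dictionary of distances from its root (the name `center_distance_estimates` and this representation are assumptions, and indices are 0-based). It also records which tree attains the minimum, since that tree supplies the almost-shortest path.

```python
from math import inf

def center_distance_estimates(centers, tree_dist):
    """delta[i][j] = min over k of d_{T_k}(u_k, u_i) + d_{T_k}(u_k, u_j).

    tree_dist[k] maps each node reached by T_k to its distance from u_k;
    nodes missing from the dict count as unreachable.
    """
    ell = len(centers)
    delta = [[inf] * ell for _ in range(ell)]
    best = [[None] * ell for _ in range(ell)]
    for i in range(ell):
        for j in range(ell):
            for k in range(ell):
                d = (tree_dist[k].get(centers[i], inf)
                     + tree_dist[k].get(centers[j], inf))
                if d < delta[i][j]:
                    delta[i][j], best[i][j] = d, k
    return delta, best

# Example data: a six-node path 0-1-2-3-4-5 with centers 1 and 4.
# T_0 spans the whole path; T_1 cannot cross the edge between nodes 1 and 2.
centers = [1, 4]
tree_dist = [
    {1: 0, 0: 1, 2: 1, 3: 2, 4: 3, 5: 4},
    {4: 0, 3: 1, 5: 1, 2: 2},
]
delta, best = center_distance_estimates(centers, tree_dist)
print(delta, best)
```

In this example the estimate between the two centers is $3$, which equals their true distance, and it is attained in the first tree only, because the second tree does not reach node 1.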

We now define $T_{i}^{\prime}$ to be the tree obtained from $T_{i}$ by contracting each edge in $G_{\ell}$ . Since an edge is contained in $G_{\ell}$ iff at least one of its endpoints is unclustered we can construct $T_{i}^{\prime}$ from $T_{i}$ in $O\!\left(n\right)$ time. The algorithm does so for all $i\in\left\{1,2,\ldots,\ell\right\}$ in $O\!\left(n\ell\right)=O\!\left(n^{5/3}\right)$ time. We note that the shortest path between two nodes $u,v$ in $T_{i}^{\prime}$ contains exactly the edges on the shortest path between $u,v$ in $T_{i}$ excluding the edges that are contained in $G_{\ell}$ .

The algorithm initializes $\Delta_{i,j}=\infty$ for all pairs of indices $i,j$ with $i\neq j$ and lets $\Delta_{i,i}=0$ for all $i$. We will maintain that $\Delta_{i,j}$ is an upper bound on $d_{H}(u_{i},u_{j})$ throughout the algorithm. Now the algorithm goes through all pairs $u_{i},u_{j}$ and adds an almost-shortest path between the nodes if needed. Specifically, we do the following:
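The listing of Algorithm 1 is not reproduced in this text. The following Python sketch is a reconstruction from the surrounding prose, not the original listing, so the line numbers cited in the text do not correspond to lines of the sketch. Several details are assumptions: the function names, the 0-based indices, the use of `None` as the colour of unclustered nodes, the explicit update of the bound for the pair $(i,j)$ itself, and walking the path in $T_{k}$ directly instead of in the contracted tree $T_{k}^{\prime}$ (edges of $G_{\ell}$ are simply re-added, which changes nothing since they are already in $H$).

```python
from math import inf

def tree_path(parent, a, b):
    """Nodes on the tree path from a to b, given child -> parent pointers."""
    up_a, x = [], a
    while x is not None:
        up_a.append(x)
        x = parent[x]
    position = {v: idx for idx, v in enumerate(up_a)}
    up_b, x = [], b
    while x not in position:      # climb from b to the lowest common ancestor
        up_b.append(x)
        x = parent[x]
    return up_a[:position[x] + 1] + up_b[::-1]

def buy_paths(centers, color, delta, best, tree_parent, edges):
    """Add almost-shortest paths between centers to `edges` when needed.

    Returns the matrix of upper bounds (Delta in the text).
    """
    ell = len(centers)
    bound = [[0 if i == j else inf for j in range(ell)] for i in range(ell)]
    for i in range(ell):
        for j in range(ell):
            if i == j or delta[i][j] == inf:
                continue
            via = min(bound[i][r] + bound[r][j] for r in range(ell))
            if via <= delta[i][j] + 2:
                bound[i][j] = via          # a good enough route already exists
                continue
            path = tree_path(tree_parent[best[i][j]], centers[i], centers[j])
            edges.update(frozenset(e) for e in zip(path, path[1:]))
            length = len(path) - 1
            bound[i][j] = min(bound[i][j], length)
            for y, w in enumerate(path):   # y = tree distance from u_i to w
                r = color[w]
                if r is None:              # unclustered nodes carry no colour
                    continue
                bound[i][r] = min(bound[i][r], y + 1)
                bound[r][j] = min(bound[r][j], length - y + 1)
    return bound

# Example: six-node path 0-1-2-3-4-5, centers 1 and 4, both clusters of size 3.
centers = [1, 4]
color = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
delta = [[0, 3], [3, 0]]
best = [[0, 0], [0, 1]]
tree_parent = [{1: None, 0: 1, 2: 1, 3: 2, 4: 3, 5: 4},
               {4: None, 3: 4, 5: 4, 2: 3}]
edges = {frozenset(e) for e in [(1, 0), (1, 2), (4, 3), (4, 5)]}  # star edges
bound = buy_paths(centers, color, delta, best, tree_parent, edges)
print(sorted(tuple(sorted(e)) for e in edges), bound)
```

The `+ 1` in the two updates accounts for the star edge between a node on the path and the center of its cluster, which was added to $H$ before the path-buying step.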

Let $L$ be an upper bound on the number of nodes of the path $p$ from $u_{i}$ to $u_{j}$ in $T_{k}^{\prime}$ on line 6. Then Algorithm 1 can be implemented in $O\!\left(\ell^{3}+\ell^{2}L\right)$ time. Hence we just need to prove that $L=O\!\left(\ell\right)$ in order to conclude that it can be implemented in $O\!\left(\ell^{3}\right)=O\!\left(n^{2}\right)$ time. This follows from the fact that $p$ is an almost shortest path and the following reasoning. If $p$ contained $>C\ell$ nodes for some sufficiently large constant $C$ it would contain more than $C$ nodes of the same color. Since nodes of the same color have distance at most $2$ in $G$ this would imply that there was a much shorter path from $u_{i}$ to $u_{j}$ in $G$, contradicting (3) if $C$ was chosen large enough. The details with $C=5$ are given in the following lemma:

Lemma 4.

The path $p$ contains no nodes of color $0$, and at most $5$ nodes of each color $\neq 0$.

Proof.

Obviously $p$ does not contain a node with color $0$, since all its incident edges would be contained in $G_{\ell}$ and hence not in $T_{k}^{\prime}$. Now assume for the sake of contradiction that $p$ contains $6$ nodes of some color $r\neq 0$. When traversing $p$ from $u_{i}$ to $u_{j}$ let $\alpha$ and $\beta$ be the first and the last node of color $r$ respectively. The distance from $\alpha$ to $\beta$ when following $p$ must be at least $5$ by assumption. On the other hand $\alpha$ and $\beta$ have distance at most $2$ in $G$. So there exists a path in $G$ from $u_{i}$ to $u_{j}$ that is at least $3$ edges shorter than $p$. This contradicts (3). Hence the assumption was wrong and $p$ contains at most $5$ nodes of each color $\neq 0$. ∎

Since there are $\ell$ different colors $\neq 0$ the path $p$ contains at most $5\ell$ nodes and the running time of Algorithm 1 is $O\!\left(n^{2}\right)$ . So now we just need to prove that $H$ is an $8$ -spanner and that $H$ has at most $O\!\left(n^{4/3}\right)$ edges. We start by proving that $H$ is an $8$ -spanner. Here we will utilize that the $\Delta_{i,j}$ is an upper bound on the distance from $u_{i}$ to $u_{j}$ in $H$ . Furthermore, Algorithm 1 guarantees that $\Delta_{i,j}\leq\delta_{i,j}+2$ . Together with (3) this gives that

$$d_{H}(u_{i},u_{j})\leq d_{G}(u_{i},u_{j})+4.\qquad(4)$$

Lemma 5.

The subgraph $H$ of $G$ is an additive $8$ -spanner of $G$ .

Proof.

Assume for the sake of contradiction that $H$ is not an additive $8$ -spanner and let $u,v$ be a pair of nodes with shortest possible distance in $G$ such that:

$$d_{H}(u,v)>d_{G}(u,v)+8.\qquad(5)$$

Say that $d_{G}(u,v)=D$ and let $p=(w_{0},w_{1},\ldots,w_{D})$ be a shortest path from $u$ to $v$ in $G$ where $w_{0}=u$ and $w_{D}=v$. Since the pair $(u,v)$ has the smallest possible distance in $G$ such that (5) holds and $d_{G}(w_{1},v)=D-1$ we have $d_{H}(w_{1},v)\leq(D-1)+8$. In particular the edge $(u,w_{1})$ is not in $H$ as it would contradict (5). Hence $u$ cannot be unclustered, as all the edges incident to an unclustered node are contained in $G_{\ell}$ and therefore in $H$. With the same reasoning we conclude that $v$ is clustered. Let the colors of $u$ and $v$ be $i$ and $j$ respectively. The distances from $u$ and $v$ to $u_{i}$ and $u_{j}$ respectively are at most $1$. Combining this insight with (4) we get:

$$d_{H}(u,v)\leq d_{H}(u_{i},u_{j})+2\leq d_{G}(u_{i},u_{j})+6\leq d_{G}(u,v)+8.$$

But this contradicts the assumption (5). Hence the assumption was wrong and $H$ is an additive $8$ -spanner of $G$ . ∎

Lastly, we need to prove that $H$ contains no more than $O\!\left(n^{4/3}\right)$ edges. Informally, we argue the following way. Whenever the $s-1$ edges of $p$ are added to $H$ on line 7 of Algorithm 1 there are $\Omega(s)$ different colors on $p$. For each color $r$ on $p$ we then argue that either $\Delta_{i,r}$ or $\Delta_{r,j}$ is made smaller on line 11 or 12 of Algorithm 1. Lastly, we argue that $\Delta_{i,j}$ can only be updated $O\!\left(1\right)$ times, and since there are $\ell^{2}\leq n^{4/3}$ variables $\Delta_{i,j}$ this implies that Algorithm 1 only adds $O\!\left(n^{4/3}\right)$ edges to $H$. This intuition is formalized in Lemma 6 below:

Lemma 6.

Algorithm 1 adds no more than $25\ell^{2}$ edges to $H$ .

Proof.

Say that the algorithm adds the edges from the path $p=(w_{0},w_{1},\ldots,w_{s-1})$ on line 7 of Algorithm 1 where $w_{0}=u_{i},w_{s-1}=u_{j}$ . First we note that since $d_{G}(u_{i},u_{j})\geq\delta_{i,j}-2$ by (3) we have that $d_{G}(u_{i},w_{x})\geq y-2$ for every $x\in\left\{0,1,\ldots,s-1\right\}$ , where we consider $y$ to be a function of $x$ defined by $y=d_{T_{k}}(u_{i},w_{x})$ as on line 10. Now fix $x$ and let $r=c(w_{x})$ . Then there is an edge between $w_{x}$ and $u_{r}$ and therefore $d_{G}(u_{i},u_{r})\geq y-3$ , i.e. $y+1\leq d_{G}(u_{i},u_{r})+4$ . So if Algorithm 1 decreases $\Delta_{i,r}$ on line 11 we have $\Delta_{i,r}\leq d_{G}(u_{i},u_{r})+4$ after it is decreased. Since $\Delta_{i,r}$ is an upper bound on $d_{H}(u_{i},u_{r})$ and therefore also an upper bound on $d_{G}(u_{i},u_{r})$ we see that $\Delta_{i,r}$ can be decreased at most $5$ times for each choice of $i,r$ . By symmetry we see that we can also decrease $\Delta_{r,j}$ on line 12 at most $5$ times. Since there are $\ell^{2}$ pairs of indices the algorithm can change the values of $\Delta_{i,r}$ or $\Delta_{r,j}$ on line 11 and 12 of Algorithm 1 at most $5\ell^{2}$ times.

Let $r$ be a color on $p$ . After the execution of lines 9-12 we have

$$\Delta_{i,r}+\Delta_{r,j}\leq\delta_{i,j}+2.$$

Due to the execution of lines 2 and 3 this was not the case before. Hence either $\Delta_{i,r}$ or $\Delta_{r,j}$ was updated. By Lemma 4 there are at least $\frac{s}{5}$ colors on $p$, so if the algorithm adds $A$ edges in total it makes at least $\frac{A}{5}$ updates of upper bounds $\Delta_{i,r}$ or $\Delta_{r,j}$. Since there can be at most $5\ell^{2}$ such updates we conclude that $\frac{A}{5}\leq 5\ell^{2}$ and that Algorithm 1 adds no more than $25\ell^{2}$ edges. ∎

To summarize, the algorithm presented in this section runs in $O\!\left(n^{2}\right)$ time and gives an additive $8$ -spanner with no more than $26n^{4/3}+n=O\!\left(n^{4/3}\right)$ edges. We have made no attempt to optimize the constant in the $O$ -notation. Hence we get:
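The constant $26$ can be traced term by term. A short tally, using only the bounds stated above ($t=n^{1/3}$, $\ell\leq n^{2/3}$, Lemma 1 and Lemma 6):

```latex
\begin{align*}
|F| &\leq \underbrace{n}_{\text{edges from } u_i \text{ to } C_i\setminus\{u_i\}}
      + \underbrace{nt}_{\text{edges of } G_\ell \text{ (Lemma 1)}}
      + \underbrace{25\ell^{2}}_{\text{Algorithm 1 (Lemma 6)}} \\
    &\leq n + n^{4/3} + 25\,n^{4/3} \;=\; 26\,n^{4/3} + n .
\end{align*}
```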

Theorem 2.

There exists an algorithm that given a graph $G$ with $n$ nodes constructs an $8$-spanner of $G$ with $O\!\left(n^{4/3}\right)$ edges in $O\!\left(n^{2}\right)$ time.

4 Distance Oracles

In the following we show how to modify the construction by Sommer [28] to obtain a $(2,1)$ -distance oracle of size $O\!\left(n^{5/3}\right)$ that can be constructed in expected $O\!\left(n^{2}\right)$ time.

Let $G$ be a given graph, and $H$ an $8$ -spanner of $G$ constructed by Theorem 2. $H$ is constructed in $O\!\left(n^{2}\right)$ time and has $O\!\left(n^{4/3}\right)$ edges. During the construction we use only $O\!\left(n^{5/3}\right)$ space.

Let $u_{1},u_{2},\ldots,u_{\ell}$ be an $n^{1/3}$-clustering of $G$. Using Lemma 3 we obtain $T_{1},\ldots,T_{\ell}$ and $G_{\ell}$ in $O\!\left(n^{2}\right)$ time. For each node $v$ we define four portals $p_{1}(v),p_{2}(v),p_{3}(v),p_{4}(v)$. We define $p_{1}(v)=u_{i}$, where $u_{i}$ is chosen such that the distance between $v$ and $u_{i}$ in $T_{i}$ is minimized. In case of ties we choose the node $u_{i}$ with the lowest index $i$. The node $p_{j+1}(v)$ for $j=1,2,3$ is chosen depending on $p_{j}(v)$. If $p_{j}(v)=u_{1}$ we let $p_{j+1}(v)=u_{1}$. Otherwise $p_{j}(v)=u_{i}$ for some index $i>1$. We let $p_{j+1}(v)=u_{i^{\prime}}$ where $u_{i^{\prime}}$ is chosen among $u_{1},u_{2},\ldots,u_{i-1}$ such that the distance between $u_{i^{\prime}}$ and $v$ in $T_{i^{\prime}}$ is minimized. In case of ties we choose the node $u_{i^{\prime}}$ with the lowest index $i^{\prime}$. The portals for all nodes can be found in $O\!\left(n^{5/3}\right)$ time.
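The portal selection for one node is a sequence of at most four scans over shrinking index ranges, which gives $O(\ell)$ time per node. A minimal sketch, assuming each tree $T_{i}$ is stored as a dictionary of distances from $u_{i}$ and that there is at least one cluster; the name `portal_indices` and the 0-based indices (so index 0 plays the role of $u_{1}$) are assumptions. A node that is missing from a tree is treated as being at infinite distance in it.

```python
from math import inf

def portal_indices(v, tree_dist, count=4):
    """Return the 0-based center indices of the portals p_1(v), ..., p_count(v).

    Portal j+1 is chosen among centers with a smaller index than portal j,
    minimizing the tree distance to v and breaking ties by lowest index.
    """
    result = []
    limit = len(tree_dist)     # candidates are the indices 0 .. limit-1
    for _ in range(count):
        if limit > 0:
            k = min(range(limit), key=lambda i: (tree_dist[i].get(v, inf), i))
            limit = k
        # When limit is 0 the previous portal was the first center: repeat it.
        result.append(k)
    return result

# Example: distances from four centers to a node named "v".
tree_dist = [{"v": 3}, {"v": 1}, {"v": 2}, {"v": 1}]
print(portal_indices("v", tree_dist))  # → [1, 0, 0, 0]
```

In the example two centers are at distance $1$ from the node; the tie is broken towards the lower index, after which only the first center remains as a candidate.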

We will use the following lemma by Pǎtraşcu and Roditty [23] to construct a $(2,1)$ -distance oracle for $G_{\ell}$ that uses space $O\!\left(n^{5/3}\right)$ .

Lemma 7 ([23]).

For any unweighted, undirected graph, there exists a distance oracle of size $O\!\left(n^{5/3}\right)$ that, given any nodes $u$ and $v$ at distance $d$ , returns a distance of at most $2d+1$ in constant time. The distance oracle can be constructed in expected time $O\!\left(mn^{2/3}\right)$ .

The proof in [23] only claims a running time of $O\!\left(mn^{2/3}+n^{7/3}\right)$ ; however, this can be fixed to give the correct running time of $O\!\left(mn^{2/3}\right)$ [25]. By [23, Claim 9] it is easy to see how to get a running time of $O\!\left(mn^{2/3}+n^{2}\right)$ , which suffices for our purposes.

We are now ready to define the distance oracle. For each $i=1,2,\ldots,\ell$ we store the distances $d_{T_{i}}(u_{i},v)$ and $d_{H}(u_{i},v)$ for all nodes $v$ . The distances $d_{H}(u_{i},v)$ can be calculated using a BFS in $H$ from each $u_{i}$ , in total time $O\!\left(\ell n^{4/3}\right)=O\!\left(n^{2}\right)$ . For each node $v$ we store its portals $p_{j}(v),j=1,2,3,4$ . We augment this distance oracle with the Pǎtraşcu-Roditty distance oracle from Lemma 7 for $G_{\ell}$ .
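As an illustration of the BFS step, the following Python sketch computes the table of distances $d_{H}(u_{i},v)$ with one breadth-first search per cluster center. The adjacency-list representation `adj_h` of $H$ and the names `bfs_distances` and `center_distances` are assumptions made for the example; they do not come from the paper.

```python
from collections import deque
from typing import List, Sequence

INF = float("inf")


def bfs_distances(adj: Sequence[Sequence[int]], source: int) -> List[float]:
    """Unweighted single-source distances; unreachable nodes get INF."""
    dist: List[float] = [INF] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if dist[y] == INF:  # first visit gives the shortest distance
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def center_distances(adj_h: Sequence[Sequence[int]], centers: Sequence[int]) -> List[List[float]]:
    """Row i holds the distances in H from centers[i] to every node."""
    return [bfs_distances(adj_h, c) for c in centers]
```

Each BFS takes time linear in the size of $H$ , so $\ell$ searches cost $O\!\left(\ell n^{4/3}\right)$ in total, and the resulting table has $\ell n$ entries.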

We now show how to use the distance oracle to obtain approximate distances for a query $u,v$ . We let $\delta_{PR}(u,v)$ be the approximate distance in $G_{\ell}$ returned by the Pǎtraşcu-Roditty distance oracle. We define $\delta_{j}(u,v)$ in the following way. Let $p_{j}(u)=u_{i}$ . Then $\delta_{j}(u,v)=d_{T_{i}}(u_{i},u)+\min\left\{d_{T_{i}}(u_{i},v),d_{H}(u_{i},v)\right\}$ . The distance returned by the distance oracle is the minimum of $\delta_{PR}(u,v)$ , $\delta_{j}(u,v)$ and $\delta_{j}(v,u)$ for $j=1,2,3,4$ .
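The query rule can be sketched as follows. This is a hedged illustration only: the table layout (`d_tree[i][x]` for $d_{T_{i}}(u_{i},x)$ , `d_spanner[i][x]` for $d_{H}(u_{i},x)$ , `portals[x]` for the four portal indices of $x$ , all 0-based) and the callable `delta_pr`, which stands in for the Pǎtraşcu-Roditty oracle on $G_{\ell}$ , are hypothetical names chosen for the example.

```python
from typing import Callable, Sequence


def query(
    u: int,
    v: int,
    portals: Sequence[Sequence[int]],
    d_tree: Sequence[Sequence[float]],
    d_spanner: Sequence[Sequence[float]],
    delta_pr: Callable[[int, int], float],
) -> float:
    """Minimum of delta_PR(u, v), delta_j(u, v) and delta_j(v, u), j = 1..4."""
    best = delta_pr(u, v)
    for a, b in ((u, v), (v, u)):
        for i in portals[a]:
            # delta_j(a, b): walk from a to its portal in the tree, then
            # continue to b along the tree or the spanner, whichever is shorter.
            estimate = d_tree[i][a] + min(d_tree[i][b], d_spanner[i][b])
            best = min(best, estimate)
    return best
```

The loop performs a fixed number of table lookups (eight portal estimates plus one call to `delta_pr`), which is why the query time is constant provided the stand-in oracle also answers in constant time.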

We will now argue that if the distance between $u$ and $v$ is $d$ , then the distance oracle returns a distance between $d$ and $2d+1$ . The distance returned is obviously at least $d$ , so we just need to show that it is at most $2d+1$ . Consider a shortest path between $u$ and $v$ in $G$ . If there is at most one node on the shortest path which is incident to a node $u_{i}$ in the clustering then the shortest path is contained in $G_{\ell}$ , and therefore:

$$\delta_{PR}(u,v)\leq 2d_{G_{\ell}}(u,v)+1=2d+1.$$

So assume that there exists an edge on the shortest path not in $G_{\ell}$ . Let $i$ be the smallest index such that there is an edge $(z,t)$ on the shortest path with $z,t\in C_{1}\cup\ldots\cup C_{i}$ . Say that $z$ is closer to $u$ than to $v$ in $G$ . Assume that $z\in C_{i}$ and $t\in C_{i^{\prime}}$ for some index $i^{\prime}\leq i$ (the case where $z\in C_{i^{\prime}}$ and $t\in C_{i}$ is handled symmetrically). Since the shortest path is contained in $G_{i-1}$ and $G_{i^{\prime}-1}$ we have that $d_{T_{i}}(u_{i},u)+d_{T_{i^{\prime}}}(u_{i^{\prime}},v)\leq d+1$ and therefore:

$$\min\left\{d_{T_{i}}(u_{i},u),d_{T_{i^{\prime}}}(u_{i^{\prime}},v)\right\}\leq\frac{d+1}{2}.$$

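Assume that $d_{T_{i}}(u_{i},u)\leq\frac{d+1}{2}$ . The other case is handled similarly. Say that $p_{j}(u)=u_{k_{j}}$ for $j=1,2,3,4$ . First assume that $k_{j}>i$ for all $j=1,2,3,4$ . Since ties are broken in favor of the lowest index and $u_{i}$ was a candidate for every portal, the distances $d_{T_{k_{j}}}(p_{j}(u),u)$ are strictly increasing in $j$ and all strictly smaller than $d_{T_{i}}(u_{i},u)$ . Then we conclude that $d_{T_{k_{1}}}(p_{1}(u),u)\leq d_{T_{i}}(u_{i},u)-4$ . The distance returned by the distance oracle is at most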

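$$\delta_{1}(u,v)\leq d_{T_{k_{1}}}(p_{1}(u),u)+d_{H}(p_{1}(u),v)\leq 2d_{T_{k_{1}}}(p_{1}(u),u)+d+8\leq 2d_{T_{i}}(u_{i},u)+d\leq 2d+1.$$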

Now assume that $k_{j}\leq i$ for some $j\in\left\{1,2,3,4\right\}$ and let $j$ be the smallest index such that $k_{j}\leq i$ . By definition we have that $d_{T_{k_{j}}}(p_{j}(u),u)\leq d_{T_{i}}(u_{i},u)$ . Furthermore the shortest path is contained in $G_{k_{j}-1}$ and therefore $d_{T_{k_{j}}}(p_{j}(u),v)\leq d_{T_{k_{j}}}(p_{j}(u),u)+d_{G}(u,v)$ . The distance returned is at most

$$\delta_{j}(u,v)\leq 2d_{T_{k_{j}}}(p_{j}(u),u)+d\leq 2d_{T_{i}}(u_{i},u)+d\leq 2d+1.$$

We conclude that the distance returned by the distance oracle is always between $d$ and $2d+1$ . The result is summarized in Theorem 3.

Theorem 3.

For any unweighted, undirected graph, there exists a distance oracle of size $O\!\left(n^{5/3}\right)$ that, given any nodes $u$ and $v$ at distance $d$ , returns a distance of at most $2d+1$ in constant time. The distance oracle can be constructed in expected time $O\!\left(n^{2}\right)$ .

Acknowledgements.

The author would like to thank Christian Sommer for helpful discussions on the application of the $8$ -spanner to his construction of distance oracles.

Bibliography


[1] Amir Abboud and Greg Bodwin. The 4/3 additive spanner exponent is tight. In Proc. 48th ACM Symposium on Theory of Computing (STOC), pages 351--361, 2016.
[2] Ittai Abraham and Cyril Gavoille. On approximate distance labels and routing schemes with affine stretch. In Distributed Computing - 25th International Symposium, DISC 2011, Rome, Italy, September 20-22, 2011. Proceedings, pages 404--415, 2011.
[3] Rachit Agarwal. The space-stretch-time tradeoff in distance oracles. In Algorithms - ESA 2014 - 22th Annual European Symposium, Wroclaw, Poland, September 8-10, 2014. Proceedings, pages 49--60, 2014.
[4] Rachit Agarwal and Philip Brighten Godfrey. Brief announcement: a simple stretch 2 distance oracle. In ACM Symposium on Principles of Distributed Computing, PODC ’13, Montreal, QC, Canada, July 22-24, 2013, pages 110--112, 2013.
[5] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM J. Comput., 28(4):1167--1181, 1999. See also SODA’96.
[6] Surender Baswana, Akshay Gaur, Sandeep Sen, and Jayant Upadhyay. Distance oracles for unweighted graphs: Breaking the quadratic barrier with constant additive error. In Automata, Languages and Programming, 35th International Colloquium, ICALP 2008, Reykjavik, Iceland, July 7-11, 2008, Proceedings, Part I: Track A: Algorithms, Automata, Complexity, and Games, pages 609--621, 2008.
[7] Surender Baswana, Vishrut Goyal, and Sandeep Sen. All-pairs nearly 2-approximate shortest paths in $O(n^{2}\,\mathrm{polylog}\,n)$ time. Theor. Comput. Sci., 410(1):84--93, 2009.
[8] Surender Baswana and Telikepalli Kavitha. Faster algorithms for all-pairs approximate shortest paths in undirected graphs. SIAM Journal on Computing, 39(7):2865--2896, 2010.
