Sensitive Distance and Reachability Oracles for Large Batch Updates
Jan van den Brand, Thatchaphol Saranurak

Jan van den Brand, KTH Royal Institute of Technology.
Thatchaphol Saranurak, Toyota Technological Institute at Chicago. Work mostly done while at KTH Royal Institute of Technology.
Abstract
In the sensitive distance oracle problem, there are three phases. We first preprocess a given directed graph G with n nodes and integer weights from [-W, W]. Second, given a single batch of f edge insertions and deletions, we update the data structure. Third, given a query pair of nodes (u, v), return the distance from u to v. In the easier problem called sensitive reachability oracle problem, we only ask if there exists a directed path from u to v.
Our first result is a sensitive distance oracle with Õ(W n^{ω+(3-ω)μ}) preprocessing time, Õ(W n^{2-μ} f^2 + W n f^ω) update time, and Õ(W n^{2-μ} f + W n f^2) query time where the parameter μ ∈ [0, 1] can be chosen. The data-structure requires O(W n^{2+μ} log n) bits of memory. This is the first algorithm that can handle f ≥ log n updates. Previous results (e.g. [Demetrescu et al. SICOMP’08; Bernstein and Karger SODA’08 and FOCS’09; Duan and Pettie SODA’09; Grandoni and Williams FOCS’12]) can handle at most 2 updates. When 3 ≤ f ≤ log n, the only non-trivial algorithm was by [Weimann and Yuster FOCS’10]. When W = Õ(1), our algorithm simultaneously improves their preprocessing time, update time, and query time. In particular, when f = ω(1), their update and query time is Ω(n^{2-o(1)}), while our update and query time are *truly subquadratic* in n, i.e., ours is faster by a polynomial factor of n. To highlight the technique, ours is the first graph algorithm that exploits the kernel basis decomposition of polynomial matrices by [Jeannerod and Villard J.Comp’05; Zhou, Labahn and Storjohann J.Comp’15] developed in the symbolic computation community.
As an easy observation from our technique, we obtain the first sensitive reachability oracle that can handle f ≥ log n updates. Our algorithm has O(n^ω) preprocessing time, O(f^ω) update time, and O(f^2) query time. This data-structure requires O(n^2 log n) bits of memory. Efficient sensitive reachability oracles were asked for in [Chechik, Cohen, Fiat, and Kaplan SODA’17]. Our algorithm can handle any constant number of updates in constant time. Previous algorithms with constant update and query time can handle only at most f ≤ 2 updates. Otherwise, there are non-trivial results for f ≤ log n, though, with query time Ω(n) by adapting [Baswana, Choudhary and Roditty STOC’16].
1 Introduction
In the sensitive distance oracle problem (also referred to in the literature as distance sensitivity oracle or emergency algorithm), there are three phases. First, we preprocess a given directed graph G with n nodes, m edges, and integer weights from [-W, W]. Second, given a single batch of f edge insertions and deletions, we update the data structure. Third, given a pair of nodes (u, v), return the distance from u to v. The time taken in each phase is called preprocessing time, update time, and query time respectively. In an easier problem called sensitive reachability oracle problem, the setting is the same except that we only ask if there exists a directed path from u to v.
Although both problems are well-studied (as will be discussed below), all existing non-trivial algorithms only handle updates. In contrast, in the analogous problems in undirected graphs, algorithms that handle updates of any size are known. For example, there are sensitive connectivity oracles [PT07, DP10, DP17, HN16] (i.e. reachability oracles for undirected graphs) with preprocessing time and update and query time for *every* . For sensitive distance oracles in undirected graphs (studied in [BK13, CLPR12, CCFK17]), Chechik, Langberg, Peleg, and Roditty [CLPR12] show that this is possible as well if approximate distance is allowed. It is interesting whether there is any inherent barrier for in directed graphs. In this paper, we show that there is no such barrier.
Sensitive distance oracle.
All previous works on this problem handle edge deletions only (node deletions can be handled as well using a simple reduction). The first result is in the case when f = 1 by Demetrescu et al. [DTCR08] (published in 2008, but previously announced at SODA and ISAAC 2002). Their algorithm has preprocessing time and update and query time. This preprocessing time is improved by Bernstein and Karger to [BK08] and finally to [BK09]. The bound is optimal up to a sub-polynomial factor unless there is a truly subcubic algorithm for the all-pairs shortest path problem. Later, Grandoni and Williams [GW12] showed a new trade-off. For example, they can obtain preprocessing time and update and query time.
It was stated as an open question in [DTCR08, BK08, BK09] whether there is a non-trivial algorithm for handling more than one deletion. Duan and Pettie [DP09] give an affirmative answer for f = 2. Their algorithm has preprocessing time and update and query time. They however stated that “it is practically infeasible” to extend their algorithm to handle three deletions. Weimann and Yuster [WY13] later showed an algorithm for handling up to deletions. For any parameter , their algorithm has preprocessing time and update and query time. This algorithm remains the only non-trivial algorithm when .
To summarize, no previous algorithm can handle f ≥ log n updates. Another drawback of all previous algorithms is that they inherently cannot handle changes of edge weights. (One restricted and inefficient approach for handling weight increases via edge deletions is as follows. For each edge , we need to know all possible weights for , say . Then, we add multi-edges with the same endpoints as , and the weight of is . To increase the weight from to , we delete all edges . However, all previous algorithms handle only a small number of deletions.) In the setting where we allow changes of edge weights, it is quite natural to ask for algorithms that handle somewhat large batches of updates. For example, in a road network, there usually are not too many completely blocked road sections (i.e. deleted edges), but during rush-hour there can be many roads with increased congestion (i.e. edges with increased weight).
In this work, we show the first sensitive distance oracle that can handle any number of updates. Moreover, we allow both edge insertions, deletions, and changes of weights. Ours is the fastest oracle for handling updates when :
Theorem 1.1.
For any parameter , there is a Monte Carlo data structure that works in three phases as follows:
1. Preprocess a directed graph with nodes and integer weights from in time and store bits.
2. Given updates to any set of edges, update the data structure in time and store additional bits.
3. Given a pair of nodes , return the distance from to or report that there is a negative cycle in time.
The algorithm is Monte Carlo randomized and is correct with high probability, i.e. the algorithm may return a larger distance with probability at most for any constant . We note that, any algorithm that can handle edge deletions can also handle node deletions by using a standard reduction.
Let us compare this result to the algorithm by Weimann and Yuster [WY13]. First, instead of edge deletions only, we allow edge insertions, deletions, and changes of weight. The second point is efficiency. For any , we improve the preprocessing time from to . Both our update and query times are bounded above by . When , this improves their bound of for any by a polynomial factor. In particular, when and , their bound is , while ours is *truly subquadratic* in . The last and most conceptually important point is that we remove the constraint that . While their technique is inherently for small , our approach is different and can handle any number of updates.
To highlight our technique, this is the first graph algorithm that exploits the powerful kernel basis decomposition of polynomial matrices by Jeannerod and Villard [JV05] and Zhou, Labahn and Storjohann [ZLS15] from the symbolic computation community. We explain the overview of our algebraic techniques in Section 1.1. We then compare our techniques to the previous related works in Section 1.2.
Sensitive reachability oracle.
It was asked as an open problem by Chechik et al. [CCFK17] whether there is a much more efficient data structure if the query is only about reachability between nodes and not distance. As an easy observation from our technique, we give a strong affirmative answer to this question:
Theorem 1.2.
There is a Monte Carlo sensitive reachability oracle that preprocesses an -node graph in time and stores bits. Then, given a set of edge insertions/deletions and node deletions, it updates the data structure in time and stores additional bits. Then, given a query , it returns whether there is a directed path from to in time.
Previously, there are only algorithms that handle edge deletions based on dominator trees [LT79, BGK+08, GT12, FGMT14] or edge deletions [Cho16]. These algorithms have update and query time. Another approach based on fault-tolerant subgraphs (where the goal is to find sparse subgraphs preserving reachability information of the original graph even after some edges are deleted) gives algorithms with query time at least [BCR15, BCR16] and becomes trivial when . When , our algorithm has update and query time like the algorithms using the first approach. Moreover, ours is the first which handles updates of any size.
It was shown in [AW14, HLNW17] that, assuming the Boolean Matrix Multiplication conjecture, there cannot be any constant and a “combinatorial” algorithm for Theorem 1.2 which has preprocessing time, and can handle edge insertions using update and query time. Our result does not refute the conjecture as we use fast matrix multiplication.
1.1 Technical overview
Set the stage.
The first step of all our results is to reduce the problem on graphs to algebraic problems using the following known reduction (see Lemma 2.1 for a more detailed statement). Let be an -node graph with integer weights from . Let be a finite field of size at least . We construct a polynomial matrix such that where is the weight of edge and is a random element from . Then, with high probability, we can read off the distance from to in from the entry of the adjoint matrix of , for all pairs . That is, it suffices to build a data structure on the above polynomial matrix that can handle updates and can return an entry of its adjoint . Updating edges in corresponds to adding with where has non-zero entries. Further we have with high probability that , so it is enough to focus on algorithms that work on non-singular matrices. From now, we let be an arbitrary field and we use the number of field operations as complexity measure.
Warm up: slow preprocessing.
To illustrate the basic idea how to maintain the adjoint, we will prove the following:
Lemma 1.3.
Let be a polynomial matrix of degree with , then there exists an algorithm that preprocesses in operations. Then, for any with non-zero entries of degree at most , we can query any entry of in operations, if .
If , then the preprocessing and query time are and respectively.
This immediately implies a weaker statement of Theorem 1.1 when and the edge updates and the pair to be queried are given at the same time. We remark that, from the simple proof below, this already gives us the first non-trivial sensitive distance oracle which can handle any number of updates. Previous techniques inherently require . As the reachability problem can be considered a shortest path problem, where every edge has weight zero, we also obtain a result similar to Theorem 1.2 from Lemma 1.3 for .
Corollary 1.4.
Let be some directed graph with integer weights in , then there exists an algorithm that preprocesses in time. Then, given edge updates to and a query pair , it returns the distance from to in the updated graph in time.
Corollary 1.5.
Let be some directed graph, then there exists an algorithm that preprocesses in time. Then, given edge updates to and a query pair , it returns the reachability from to in the updated graph in time.
To prove Lemma 1.3, we use the key equality below based on the Sherman-Morrison-Woodbury formula. The proof is deferred to Appendix A.
Lemma 1.6.
Let be an matrix and be matrices, such that . Define the matrix . Then, we have
[TABLE]
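For orientation, one adjoint form of the Sherman-Morrison-Woodbury identity that is consistent with the five steps described in this section can be written as follows. The definition of B used here is an assumption of this sketch (chosen so that the only division is by a power of det(M)); the lemma's own statement may be normalized differently.

```latex
% Assumption of this sketch: B := \det(M)\, I + V^\top \operatorname{adj}(M)\, U  (an f x f matrix).
\operatorname{adj}(M + UV^\top)
  = \frac{\operatorname{adj}(M)\,\det(B) \;-\; \operatorname{adj}(M)\,U\,\operatorname{adj}(B)\,V^\top \operatorname{adj}(M)}
         {\det(M)^{f}}
% Derivation: combine
%   (M + UV^\top)^{-1} = M^{-1} - M^{-1} U (I + V^\top M^{-1} U)^{-1} V^\top M^{-1},
%   \det(M + UV^\top)  = \det(M)\,\det(I + V^\top M^{-1} U),
% with B = \det(M)\,(I + V^\top M^{-1} U), so that
%   \det(B) = \det(M)^{f} \det(I + V^\top M^{-1} U)  and  \operatorname{adj}(B) = \det(B)\, B^{-1}.
```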
The algorithm for Lemma 1.3 is as follows: We preprocess by computing and . As and have degree at most , this takes field operations [BCS97, Chapter 1] (or just if ). Next, we write where have only one non-zero entry of degree per column. To compute , we simply compute
[TABLE]
where is the -th standard unit vector. This computation can be separated into the following steps:
1. Compute , and . Note that because of the sparsity of and , and are essentially just vectors of elements of , each multiplied by a non-zero element of and . Thus this step can be summarized as obtaining elements of .
2. Compute which are likewise just elements of , each multiplied by a non-zero element of and . So this time we have to obtain entries of .
3. Compute the adjoint and determinant .
4. Compute .
5. Compute and vector-matrix-vector product and divide it by . Then subtract the two values and we obtain .
Steps 1 and 2 require only field operations as we just have to read entries of and multiply them by some small -degree polynomials from and . In step 3 we have to compute the adjoint and determinant of a matrix of degree . This takes operations. Step 4 computes where is of degree , which takes operations. Step 5 takes because is a degree matrix of dimension . The total number of operations is thus . The algorithm does not require the upper bound of as in [WY13].
For the reachability case, when , all entries of the matrices and vectors are just field elements, so steps 1 and 2 need only operations. Step 3 needs operations, while 4 can be done in just operations, and the last step 5 requires only operations.
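The five steps can be sketched for the degree-zero case as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it works over the integers instead of a finite field, it assumes the adjoint identity adj(M + UV^T) = (adj(M) det(B) - adj(M) U adj(B) V^T adj(M)) / det(M)^f with B = det(M) I + V^T adj(M) U, and the names `query_entry` and `updates` are hypothetical. Each update `(r, c, delta)` adds `delta` to entry (r, c) of M, which corresponds to column k of U being delta * e_r and column k of V being e_c.

```python
def minor(M, r, c):
    """Copy of M with row r and column c removed."""
    return [[M[i][j] for j in range(len(M)) if j != c]
            for i in range(len(M)) if i != r]

def det(M):
    """Determinant by cofactor expansion (only suitable for tiny examples)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def adjoint(M):
    """adj(M)[i][j] = (-1)^(i+j) * det(M without row j and column i)."""
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(n)]
            for i in range(n)]

def query_entry(adj_M, det_M, updates, i, j):
    """Entry (i, j) of adj(M + U V^T), reading only a few entries of adj(M)."""
    f = len(updates)
    # Step 1: e_i^T adj(M) U and V^T adj(M) e_j, i.e. f entries of adj(M) each.
    left = [adj_M[i][r] * delta for (r, c, delta) in updates]
    right = [adj_M[c][j] for (r, c, delta) in updates]
    # Step 2: B = det(M) * I + V^T adj(M) U, i.e. f^2 entries of adj(M).
    B = [[adj_M[ck][rl] * dl + (det_M if k == l else 0)
          for l, (rl, _, dl) in enumerate(updates)]
         for k, (_, ck, _) in enumerate(updates)]
    # Step 3: adjoint and determinant of the small f x f matrix B.
    adj_B, det_B = adjoint(B), det(B)
    # Step 4: row vector (e_i^T adj(M) U) adj(B).
    row = [sum(left[k] * adj_B[k][l] for k in range(f)) for l in range(f)]
    # Step 5: combine both terms and divide by det(M)^f (exact for integers).
    numerator = adj_M[i][j] * det_B - sum(row[l] * right[l] for l in range(f))
    return numerator // det_M ** f

# Usage: preprocess M once, then answer a query for a batch of two updates.
M = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
adj_M, det_M = adjoint(M), det(M)
updates = [(0, 1, 5), (2, 0, -3)]
print(query_entry(adj_M, det_M, updates, 0, 2))
```

Steps 2 and 3 depend only on the batch of updates, not on the queried pair, so in a three-phase oracle they would naturally belong to the update phase, while steps 1, 4 and 5 belong to the query.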
Key technique: kernel basis decomposition.
The biggest bottleneck in Lemma 1.3 is explicitly computing and . For it already takes operations in the preprocessing step just to write down the entries of each of which has degree up to . What we need is an adjoint oracle, i.e. a data structure on with fast preprocessing that can still quickly answer queries about entries of .
By replacing this data structure in the five steps of the proof of Lemma 1.3, this immediately gives distance oracles in the sensitive setting which dominate previous results when . (This is how we obtain Theorem 1.1).
The key contribution of this paper is to realize that the technique in [JV05, ZLS15] actually gives the desired adjoint oracle. This technique, which we call the kernel basis decomposition, is introduced by Jeannerod and Villard [JV05] and then improved by Zhou, Labahn, and Storjohann [ZLS15]. It is originally used for inverting a polynomial matrix of degree in operations. However, the following adjoint oracle is implicit in Section 5.3 of [ZLS15] (that section discusses computing , where the result is given by a vector with entries of the form where are polynomials of degree at most ; since can be computed in [Sto03, LNZ17] and , we get Theorem 1.7):
Theorem 1.7.
There is a data-structure that preprocesses where and in operations. Then, given any where , it can compute in operations.
However, to get query time as in Theorem 1.1, Theorem 1.7 is not enough. Fortunately, by modifying the technique from [ZLS15] in a white-box manner (see Section 3 for details), we can obtain the following trade-off which is essential for Theorem 1.1. The result essentially interpolates the exponents of the following two extremes: preprocessing and query time when computing the adjoint explicitly, or preprocessing and query time when using Theorem 1.7.
Theorem 1.8.
For any , there is a data-structure that preprocesses where and in operations. Then, given any pair , it returns in operations.
To see the main idea, we give a slightly oversimplified description of the oracle in Theorem 1.7 which allows us to show how to modify the technique to obtain Theorem 1.8. Below, we write a number for each matrix entry to indicate a bound on the degree, e.g. when we write then we mean a 3-dimensional vector with entries of degree at most 4.
Suppose that we are now working with an matrix of degree . Then [ZLS15] (and [JV05] for a special type of matrices) is able to find a full-rank matrix of degree in field operations, such that
[TABLE]
Here the empty sections of the matrices represent zeros in the matrix and the s represent entries of degree at most . This means the left part of is a kernel-base of the lower part of and likewise the right part of is a kernel-base of the lower part of .
This procedure can now be repeated on the two smaller matrices of degree . After such iterations we have:
[TABLE]
We call this chain the kernel basis decomposition of . Here, , and each consists of block matrices on the diagonal of dimension and degree . So while the degree of these blocks doubles, the dimension is halved, which implies that all these can be computed in just operations.
Observe that the inverse can be written as . Also, is a diagonal matrix, and so is easily invertible, i.e. we can write the entries of the inverse in the form of rationals where both and are of degree . Therefore, we can represent the adjoint via
To compute for any degree vector in operations, we must compute from left to right. Each vector matrix product with some has degree but at the same time the dimension of the diagonal blocks is only , hence each product requires only field operations. Scaling by and dividing by the entries of also requires only as their degrees are bounded by . This gives us Theorem 1.7.
The idea of Theorem 1.8 is to explicitly precompute a prefix from the factors of . This increases the preprocessing time but at the same time allows us to compute faster.
1.2 Comparison with previous works
Previous dynamic matrix algorithms.
In contrast to algorithms for sensitive oracles that handle a single batch of updates, dynamic algorithms must handle an (infinite) sequence of updates. The techniques we use for our sensitive distance/reachability oracles are motivated by techniques developed for dynamic algorithms, which we discuss below.
There is a line of work initiated by Sankowski [San04, San07, vdBNS19] on maintaining inverse or adjoint of a dynamic matrix whose entries are field elements, not polynomials as in our setting. Let us call such a matrix a non-polynomial matrix. By reductions similar to those used for obtaining applications on weighted graphs in this paper, dynamic non-polynomial matrix algorithms imply solutions to many dynamic algorithms on unweighted graphs.
Despite the similarity of the results and applications, there is a sharp difference between the core techniques of our algorithm for polynomial matrices and the previous algorithms for non-polynomial matrices. The key to all our results is fast preprocessing time. By using the kernel basis decomposition [ZLS15], we do not need to explicitly write down the adjoint of a polynomial matrix, which takes operations if the matrix has size and degree . This technique is specific to polynomial matrices and does not have a meaningful counterpart for a non-polynomial matrix. On the contrary, algorithms for non-polynomial matrices from [San04, San07, vdBNS19] just preprocess a matrix in a trivial way. That is, they compute the inverse and/or adjoint explicitly in operations. Their key contribution is how to handle updates in operations.
We remark that Sankowski [San05b] did obtain a dynamic polynomial matrix algorithm by extending previous dynamic non-polynomial matrix algorithms. However, there are two limitations to this approach. First, the algorithm requires a matrix of the form where . This restriction excludes some applications including distances on graphs with zero or negative weights because we cannot use the reduction Lemma 2.1. Second, the cost of the algorithm is multiplied by the degree of the adjoint matrix which is if has degree . Hence, just to update one entry, this takes operations (the current best algorithm [vdBNS19] takes operations to update one entry of a non-polynomial matrix). This is already slower than the time for computing from scratch an entry of adjoint/inverse using static algorithms. The first limitation explains why there is only one application in [San05b], which is to maintain distances on unweighted graphs. To bypass the second limitation, Sankowski [San05b] “forces” the degrees to be small by executing all arithmetic operations under modulo for some small . A lot of information about the adjoint is lost from doing this. However, for his specific application, he can still return the queried distances by combining with other graph-theoretic techniques.
Previous sensitive distance oracles.
Previous sensitive distance oracles such as [WY13, GW12] also use fast matrix-multiplication, but only use it for computing a fast min-plus matrix product in a black box manner. All further techniques used by these algorithms are graph theoretic.
Our shift from graph-theoretic techniques to a purely algebraic algorithm is the key that enables us to support large sets of updates. Let us explain why previous techniques can inherently handle only a small number of deletions. Their main idea is to sample many smaller subgraphs in the preprocessing. To answer a query in the updated graph, their algorithms simply look for a subgraph where (i) all deleted edges were not even in from the beginning, and (ii) all edges in the new shortest path are in . To argue that exists with a good probability, the number of deletions cannot be more than where is the number of nodes. That is, these algorithms do not really re-compute the new shortest paths, instead they pre-compute subgraphs that “avoid” the updates.
Purely algebraic algorithms such as ours (and also [San05b, vdBNS19, vdBN19]) can overcome the limit on deletions naturally. For an intuitive explanation consider the following simplified example for unweighted graphs: Let be the adjacency matrix of an unweighted graph, then the polynomial matrix has the following inverse when considering the field of formal power series: (this can be seen by multiplying both sides with ). This means the coefficient of of is exactly the number of walks of length from to . So the entry does not just tell us the distance between and , the entry actually encodes all possible walks from to . Thus finding a replacement path, when some edge is removed, becomes very simple because the information of the replacement path is already contained in entry . The only thing we are left to do is to remove all paths from that use any of the removed edges. This is done via cancellations caused by applying the Sherman-Morrison formula.
Our algorithm exploits the adjoint instead of the inverse, but the interpretation is similar since for invertible matrices the adjoint is just a scaled inverse: . We also do not perform the computations over , but to bound the required bit-length to represent the coefficients.
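The walk-counting interpretation can be checked on a toy graph. The sketch below assumes the standard power series (I - XA)^{-1} = I + XA + X^2 A^2 + ..., so the coefficient of X^k in entry (i, j) is the (i, j) entry of A^k; it works over the integers (no finite field), and the function names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square integer matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def walk_counts(A, max_len):
    """Return [A^0, A^1, ..., A^max_len]; A^k[i][j] counts walks of length k."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    series = [power]
    for _ in range(max_len):
        power = mat_mul(power, A)
        series.append(power)
    return series

def distance(series, i, j):
    """Smallest k whose coefficient is non-zero, or None if there is none."""
    for k, power in enumerate(series):
        if power[i][j] != 0:
            return k
    return None

# Toy graph with edges 0->1, 1->2, 0->2, 2->3.
A = [[0, 1, 1, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
series = walk_counts(A, 3)
print(distance(series, 0, 3))   # length of a shortest walk from 0 to 3
print(series[3][0][3])          # number of walks of length 3 from 0 to 3
```

The entry for the pair (0, 3) holds both the walk 0->2->3 and the longer walk 0->1->2->3. This is the sense in which a replacement path is already encoded before any edge is removed.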
1.3 Organization
We first introduce relevant notations, definitions, and some known reductions in Section 2. We construct the adjoint oracles from Theorems 1.7 and 1.8 based on kernel basis decomposition in Section 3. Finally, we show our algorithms for maintaining adjoint of polynomial matrices in Section 4.1, where we will also apply the reductions to get our distance and reachability oracles Theorems 1.1 and 1.2.
2 Preliminaries
Complexity Measures
Most of our algorithms work over any field and their complexity is measured in the number of arithmetic operations performed over , i.e. the arithmetic complexity. This does not necessarily equal the time complexity of the algorithm as one arithmetic operation could require more than time, e.g. very large rational numbers could require many bits for their representation. This is why our algebraic lemmas and theorems will always state “in operations” instead of “in time”.
For the graph applications however, when having an node graph, we will typically use the field for some prime of order for some constant . This means each field element requires only bits to be represented and all field operations can be performed in time in the standard model (or bit-operations).
Notation: Identity and Submatrices
The identity matrix is denoted by .
Let and be a matrix, then the term denotes the submatrix of consisting of the rows and columns . For some we may also just use the index instead of . The term thus refers to the th column of .
Matrix Multiplication
We denote with the arithmetic complexity of multiplying two matrices. Currently the best bound is [Gal14, Wil12].
Polynomial operations
Given two polynomials with , we can add and subtract the two polynomials in operations in . We can multiply the two polynomials in using fast Fourier transforms; likewise, dividing two polynomials can be done in as well [AHU74, Section 8.3]. Since we typically hide polylog factors in the notation, all operations using degree polynomials from can be performed in operations in .
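As a small illustration of FFT-based multiplication, the following sketch multiplies two integer-coefficient polynomials with a recursive radix-2 transform. It is only meant to show the idea: it uses floating-point complex arithmetic and rounding, which is an assumption suitable for small integer coefficients, whereas arithmetic over a finite field would need a different transform. The function names are hypothetical.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 transform; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def poly_mul_fft(p, q):
    """Multiply coefficient lists (lowest degree first) with integer entries."""
    result_len = len(p) + len(q) - 1
    size = 1
    while size < result_len:
        size *= 2
    fp = fft(list(p) + [0] * (size - len(p)))
    fq = fft(list(q) + [0] * (size - len(q)))
    prod = fft([x * y for x, y in zip(fp, fq)], invert=True)
    # The inverse transform needs a division by the transform size.
    return [round((c / size).real) for c in prod[:result_len]]

# (1 + 2X + 3X^2) * (4 + 5X)
print(poly_mul_fft([1, 2, 3], [4, 5]))
```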
Polynomial Matrices
We will work with polynomial matrices/vectors, so matrices and vectors whose entries are polynomials. We define for the degree . Note that a polynomial matrix with might not have an inverse in as is a ring. However, the inverse does exist in where is the field of rational functions.
Adjoint of a Matrix
The adjoint of an matrix is defined as . In the case that has non-zero determinant, we have . Note that in the case of being a degree polynomial matrix, we have and .
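The definition can be made concrete for a small polynomial matrix. The sketch below assumes the cofactor definition adj(M)[i][j] = (-1)^(i+j) det(M without row j and column i) and represents polynomials as integer coefficient lists (lowest degree first). It also illustrates the elementary degree bounds deg det(M) <= n d and deg adj(M) <= (n-1) d for an n x n matrix of degree d. All names are hypothetical and the cofactor expansion is only suitable for tiny inputs.

```python
def p_add(a, b):
    """Sum of two coefficient lists."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def p_scale(a, s):
    """Multiply every coefficient by the integer s."""
    return [s * c for c in a]

def p_mul(a, b):
    """Schoolbook product of two coefficient lists."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def degree(a):
    """Degree of a coefficient list, -1 for the zero polynomial."""
    nonzero = [i for i, c in enumerate(a) if c != 0]
    return max(nonzero) if nonzero else -1

def minor(M, r, c):
    return [[M[i][j] for j in range(len(M)) if j != c]
            for i in range(len(M)) if i != r]

def det(M):
    """Determinant of a polynomial matrix by cofactor expansion."""
    if not M:
        return [1]
    total = []
    for j in range(len(M)):
        term = p_mul(M[0][j], det(minor(M, 0, j)))
        total = p_add(total, p_scale(term, (-1) ** j))
    return total

def adjoint(M):
    n = len(M)
    return [[p_scale(det(minor(M, j, i)), (-1) ** (i + j)) for j in range(n)]
            for i in range(n)]

# M = [[1 + X, 2], [X, 3 + X]] has degree d = 1 and dimension n = 2.
M = [[[1, 1], [2]], [[0, 1], [3, 1]]]
print(det(M), degree(det(M)))
print(adjoint(M))
```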
Graph properties from polynomial matrices
Polynomial matrices can be used to obtain graph properties such as the distance between any pair of nodes:
Lemma 2.1 ([San05a, Theorem 5 and Theorem 7]).
Let be a field of size for some constant and let be a graph with nodes and integer edge weights .
Let be a polynomial matrix, where and and each is chosen independently and uniformly at random.
- If contains no negative cycle, then the smallest degree of the non-zero monomials of minus is the length of the shortest path from to in with probability at least .
- Additionally with probability at least , the graph has a negative cycle, if and only if has a monomial of degree less than .
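The reduction can be tried on a toy graph. The sketch below fills in concrete choices that are assumptions of this illustration: diagonal entries X^W, entry a_{uv} X^{W + c_{uv}} for an edge (u, v) of weight c_{uv} with a random non-zero a_{uv} modulo a prime, and the offset W(n-1) subtracted from the smallest degree. Polynomials are dictionaries mapping degree to coefficient, the determinant is expanded over all permutations (only feasible for tiny n), and all names are hypothetical.

```python
import random
from itertools import permutations

P = 2_147_483_647  # the prime 2^31 - 1, standing in for the field size

def poly_mul(a, b):
    """Product of two sparse polynomials {degree: coefficient} modulo P."""
    out = {}
    for da, ca in a.items():
        for db, cb in b.items():
            out[da + db] = (out.get(da + db, 0) + ca * cb) % P
    return {d: c for d, c in out.items() if c}

def poly_det(M):
    """Determinant via the Leibniz formula with signed permutations."""
    n = len(M)
    total = {}
    for perm in permutations(range(n)):
        inversions = sum(perm[a] > perm[b]
                         for a in range(n) for b in range(a + 1, n))
        term = {0: 1}
        for r in range(n):
            term = poly_mul(term, M[r][perm[r]])
            if not term:
                break
        sign = -1 if inversions % 2 else 1
        for d, c in term.items():
            total[d] = (total.get(d, 0) + sign * c) % P
    return {d: c for d, c in total.items() if c}

def build_matrix(n, W, edges, rng):
    """Assumed form: X^W on the diagonal, a_uv * X^(W + c_uv) per edge."""
    A = [[{} for _ in range(n)] for _ in range(n)]
    for u in range(n):
        A[u][u] = {W: 1}
    for (u, v), c in edges.items():
        A[u][v] = {W + c: rng.randrange(1, P)}
    return A

def distance(A, W, u, v):
    """Smallest degree of adj(A)[u][v] minus W(n-1); None if the entry is 0."""
    n = len(A)
    # adj(A)[u][v] equals, up to sign, det(A without row v and column u).
    sub = [[A[r][c] for c in range(n) if c != u] for r in range(n) if r != v]
    entry = poly_det(sub)
    return min(entry) - W * (n - 1) if entry else None

edges = {(0, 1): 2, (1, 2): -1, (0, 2): 2, (2, 3): 1}  # weights in [-2, 2]
A = build_matrix(4, 2, edges, random.Random(0))
print(distance(A, 2, 0, 2), distance(A, 2, 0, 3), distance(A, 2, 3, 0))
```

In this toy graph every shortest path is unique, so no cancellation can occur and the result does not depend on the random coefficients. In general the random coefficients are what make cancellations unlikely.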
3 Adjoint Oracle
In this section we will outline how the adjoint oracle Theorem 1.7 by [ZLS15] can be extended to our Theorem 1.8.
Unfortunately this new result is not a blackbox reduction, instead we have to fully understand and exploit the properties of the algorithm presented in [ZLS15]. This is why a formally correct proof of Theorem 1.8 requires us to repeat many definitions and lemmas from [ZLS15]. Such a formally correct proof can be found in subsection 3.2. We will start with a high level description based on the high level idea of Theorem 1.7 presented in Section 1.1.
3.1 Extending the Oracle to Element Queries
We will now outline how the data-structure of kernel-bases, presented in Section 1.1, can be used for faster element queries to . Remember that Theorem 1.7 was based on representing , where each consists of diagonal blocks of size and is of degree .
The idea for Theorem 1.8 is very simple: Choose some and such that , then during the pre-processing compute the kernel-base decomposition and pre-compute the product explicitly. When an entry of is required, we only have to compute .
Complexity of the Algorithm
Let be the matrix obtained when setting all of to 0, except for the diagonal block that includes the th column. We will now argue, that . This equality can be seen by computing the product from right to left:
- •
Consider the right-most product . The vector is non-zero only in the th row, so only the th column of matters, hence .
- •
Consider the product . The matrix has few non-zero rows, so most columns of will be multiplied by zero, and thus most entries of do not matter for computing the product. Note that all entries of that do matter (i.e. are multiplied with non-zero entries of ) are inside the block , because of the recursive structure of the matrices (i.e. the blocks of are obtained by splitting the blocks of ), see for instance Figure 1. This leads to .
By induction we now have .
The complexity of computing this product is very low, when multiplied from left to right. Consider the first products . The degree of matrix and are both bounded by . The matrix is 0 except for a block on the diagonal. Hence, this first product requires field operations.
All products after this require fewer operations: On one hand the degree of vector and matrix double after each product, on the other hand the dimension of the non-zero block of is halved. Since the complexity of the vector matrix product scales linearly in the degree but quadratic in the dimension, the complexity is bounded by the initial product . The query complexity is thus .
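The restriction argument can be checked numerically on plain integer matrices. The sketch below ignores polynomial degrees altogether and only models the nested block structure: the k-th factor is block diagonal with blocks of size n / 2^(k-1), the first two factors are multiplied out as the precomputed prefix, and the remaining factors are restricted to the diagonal block containing the queried column j. The block sizes, the random entries, and the function names are assumptions made for illustration.

```python
import random

def random_block_diagonal(n, block, rng):
    """n x n integer matrix that is non-zero only inside diagonal blocks."""
    return [[rng.randint(1, 9) if r // block == c // block else 0
             for c in range(n)] for r in range(n)]

def restrict_to_block(A, block, j):
    """Zero out everything except the diagonal block containing column j."""
    b = j // block
    return [[A[r][c] if r // block == b and c // block == b else 0
             for c in range(len(A))] for r in range(len(A))]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def row_times_matrix(row, A):
    n = len(A)
    return [sum(row[k] * A[k][c] for k in range(n)) for c in range(n)]

def entry_via_chain(prefix_row, suffix, blocks, j, restrict):
    """Entry j of prefix_row * suffix[0] * suffix[1] * ..., left to right."""
    row = prefix_row
    for A, block in zip(suffix, blocks):
        factor = restrict_to_block(A, block, j) if restrict else A
        row = row_times_matrix(row, factor)
    return row[j]

rng = random.Random(1)
n = 8
A1, A2, A3, A4 = (random_block_diagonal(n, b, rng) for b in (8, 4, 2, 1))
prefix = mat_mul(A1, A2)  # precomputed explicitly during preprocessing
i, j = 3, 6
print(entry_via_chain(prefix[i], [A3, A4], [2, 1], j, restrict=True),
      entry_via_chain(prefix[i], [A3, A4], [2, 1], j, restrict=False))
```

The code verifies equality only and does not exploit the sparsity. A real implementation would multiply the row only with the small non-zero blocks, which is where the savings in the query come from.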
This is only a rough simplification of how the algorithm works. For instance the degrees of the are not simple powers of 2, instead only the average degree is bounded by a power of 2. Likewise the dimension and the size of the diagonal blocks do not have to be a power of two.
3.2 Formal Proof of the Adjoint Element Oracle
Before we can properly prove our Theorem 1.8, we first have to define/cite some terminology and lemmas from [ZLS15], as our Theorem is heavily based on their result.
First we will define the notation of shifted column degrees. Shifted column degrees can be used to formalize how the degree of a vector changes when multiplying it with a polynomial matrix.
Definition 3.1 ([ZLS15, Section 2.2]).
Let be some field, be some polynomial matrix and let be some vector.
Then the -shifted column degrees of is defined via:
[TABLE]
Lemma 3.2.
Let be some field, be some polynomial matrix and let be some vector. Further let be a polynomial vector where for .
Then and can be computed in .
Proof.
Multiplying requires field operations. Hence the total cost becomes
[TABLE]
∎
We will now give a formal description of the data-structure constructed in [ZLS15]. The following definitions and properties hold throughout this entire section.
Let such that , so bounds the maximum degree in the th column of (also called the column degree of ). Let be the average column degree of .
In [ZLS15] they construct in field operations a chain of matrices and a diagonal matrix such that .
Here the matrices are block matrices, each consisting of diagonal blocks, i.e.
[TABLE]
The number of rows/columns of each is up to a factor of 2. (Note that refers to the th block on the diagonal of , not to be confused with our earlier definition of in Subsection 3.1.)
Remember from the overview (Section 1.1) that each consists of two kernel bases, so each of these diagonal block matrices consists in turn of two matrices (kernel bases)
[TABLE]
Here and are not variables but denote the left and right submatrix.
We also write for the partial product . Each can be decomposed into
[TABLE]
where each has rows and the number of columns in corresponds to the number of columns in . We can compute as follows:
[TABLE]
We have the following properties for the degrees of these matrices:
Lemma 3.3 (Lemma 10 in [ZLS15]).
Let be the dimension of , let and be the number of columns in and respectively and let , then
- , and
- and (which also implies )
Lemma 3.4 (Lemma 11 in [ZLS15]).
For a given and the matrix multiplications in (2) can be done in field operations.
We have now defined all the required lemmas and notation from [ZLS15] and can start proving Theorem 1.8.
The following lemma is analogous to Lemma 3.4, though now we want to compute only one row of product (2). This lemma will bound the complexity of the query operation in Theorem 1.8.
Lemma 3.5.
Let be some row of , then we can compute for both in .
Proof.
The matrices and form the matrix , so we instead just compute the product . The matrix is of size (up to a factor of 2). Let then by Lemma 3.3 we know and .
By Lemma 3.2 the cost of computing is , which given the degree bounds can be simplified to . ∎
The following lemma will bound the complexity for the pre-processing of Theorem 1.8.
Lemma 3.6.
If we already know the matrix for , then for any we can compute in field operations.
Proof.
To compute , we have to compute all many for . Assume we have already computed the matrix ; then we can compute for via Lemma 3.4. Inductively, the total cost we obtain is:
[TABLE]
∎
The last lemma we require for the proof of Theorem 1.8 is that we can compute the determinant of in field operations.
Lemma 3.7 ([Sto03, LNZ17]).
Let be a matrix of degree at most , then we can compute in field operations.
Proof of Theorem 1.8.
The claim is that for any we can, after pre-processing of , compute any entry in operations.
Pre-processing
We first compute the determinant via Lemma 3.7 and construct the chain of matrices as in [ZLS15] in , then we compute in using Lemma 3.6.
Queries
When answering a query for we compute one entry , multiply it with and divide it by , because .
Here the expensive part is to compute the entry of , which is done by computing one row of for some appropriate . Via (2), we know
[TABLE]
for some sequence , .
So we only have to compute the product of the th row of with a sequence of matrices. Computing this product from left to right means we compute the following intermediate results
[TABLE]
for every . So each intermediate result is just the th row of some matrix . This means such a vector-matrix product can be computed via Lemma 3.5 in
[TABLE]
Here the first product for is the most expensive and the total cost for all many vector-matrix products becomes .
∎
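The key point of the query is that one row of a product of matrices can be obtained by a sequence of vector-matrix products, without ever forming a matrix-matrix product. The following is a minimal sketch over small integer matrices; the actual algorithm works with polynomial block-diagonal matrices and the cost bound of Lemma 3.5, and the names `row_of_chain`, `T` and `chain` are illustrative assumptions.

```python
def vec_mat(v, A):
    """Row vector times matrix."""
    return [sum(v[k] * A[k][j] for k in range(len(v))) for j in range(len(A[0]))]


def mat_mul(A, B):
    """Full matrix product, used here only as a reference for comparison."""
    return [vec_mat(row, B) for row in A]


def row_of_chain(i, first, chain):
    """Row i of first * chain[0] * chain[1] * ..., evaluated left to right."""
    row = list(first[i])
    for A in chain:
        row = vec_mat(row, A)  # each intermediate result is again a single row
    return row


T = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
chain = [
    [[1, 0, 2], [0, 1, 0], [1, 1, 1]],
    [[2, 1, 0], [0, 1, 0], [0, 0, 1]],
]

# Reference: the full product, which the query deliberately avoids computing.
full = T
for A in chain:
    full = mat_mul(full, A)
```

Each call to `row_of_chain` touches one row per step, which mirrors how each intermediate result in the proof is a single row of a partial product.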
4 Sensitive Distance and Reachability Oracles
In this section we will use the adjoint oracle from Section 3 to obtain the results presented in Section 1. A high-level description of that algorithm was already outlined in Section 1.1.
This section is split into two parts. First, in Section 4.1, we describe our results for maintaining the adjoint of a polynomial matrix, concluding with the proof of Theorem 1.1. Section 4.2 then explains how to obtain the sensitive reachability oracle of Theorem 1.2. These graph-theoretic results are stated more precisely than in the overview by adding trade-off parameters and memory requirements.
All proofs in this section assume that the matrix stays non-singular throughout all updates, which is the case w.h.p for matrices constructed via the reduction of Lemma 2.1.
4.1 Adjoint oracles
Theorem 4.1.
Let be some field and be a polynomial matrix with of degree . Then for any we can create a data-structure storing field elements in field operations, such that:
We can change columns of and update our data-structure in field operations storing additional field elements. This updated data-structure allows for querying entries of in field operations and queries to the determinant of the new in operations.
Proof.
We start with the high-level idea: We will express the change of entries of via the rank update , where and both matrices have only one non-zero entry per column. Via Lemma 1.6 we have for that
[TABLE]
Our algorithm is as follows:
Initialization
During the initialization use Theorem 1.8 on matrix in operations, which will allow us later to query entries in operations. We also compute via Lemma 3.7 in operations.
Updating the data-structure
The first task is to compute the matrix . Note that for element updates to , the matrices and have only one non-zero entry per column, so is just an submatrix of where each entry is multiplied by one non-zero entry of and . Thus we can compute in operations, thanks to the pre-processing of . Next, we pre-process the matrix using Theorem 1.7, which requires as the degree of is bounded by . We also compute in operations.
Thus the update complexity is bounded by operations.
Querying entries
To query an entry of we have to compute
[TABLE]
Here can be computed in operations. The vectors and are just entries of where each entry is multiplied by one non-zero entry of or , so we can compute them in operations. Multiplying one of these vectors with requires operations because of the pre-processing of via Theorem 1.7. The product of with can be computed in operations. Subtracting from the product and dividing by can be done in operations.
The query complexity is thus operations.
Maintaining the determinant
We have (as can be seen in the proof of Appendix A). The division requires only operations.
∎
In this section we will state the result from Section 1 in a more formal way. We will state trade-off parameters, memory consumption and we will separate the sensitive setting into two phases: update and query.
Theorem 4.2 (Corresponds to Theorem 1.1).
Let be a graph with nodes and edge weights in . For any we can create in time a Monte Carlo data-structure that requires bits of memory, such that:
We can then change edges (additions and removals) and update our data-structure in time using additional bits of memory. This updated data-structure allows for querying the distance between any pair of nodes in time, or report that there exists a negative cycle in time.
Proof of Theorem 4.2.
We construct the matrix as specified in Lemma 2.1. Since the field size is polynomial in , the arithmetic operations can be executed in time in the standard model and saving one field element requires bits.
The determinant of the matrix is computed during the update, where we will also check for negative cycles by checking if the determinant has a non-zero monomial of degree less than . This way we can answer the existence of a negative cycle in time, when a query to a distance is performed.
Theorem 4.2 is now implied by Theorem 4.1.
∎
4.2 Sensitive reachability oracle
So far we have only proven the distance applications. Using the same techniques the result can also be extended to a sensitive reachability oracle.
Reachability can be formulated as a distance problem where every edge has cost 0. In that case the matrix constructed in Lemma 2.1 has degree 0, so the matrix is in and we do not have to bother with polynomials anymore. The algorithm for the following result is essentially the same as Theorem 4.1, but since we no longer have polynomial matrices, we no longer require more sophisticated adjoint oracles and can instead use simpler subroutines.
Theorem 4.3 (Corresponds to Theorem 1.2).
Let be a graph with nodes. We can construct in time a Monte Carlo data-structure that requires bits of memory, such that:
We can change any edges (additions and removals) and update our data-structure in time using additional bits of memory. This updated data-structure allows for querying the reachability between any pair of nodes in time.
Proof.
We will now construct a data-structure that, after some initial pre-processing of some matrix , allows us to quickly query elements of for any sparse where and have at most one non-zero entry per column.
Pre-processing
We compute and explicitly in field operations.
Update
We receive matrices , where each column has only one non-zero entry. We compute in operations because of the sparsity of and . This means we can now compute for the matrix and in operations.
Query
When querying an entry of we have to compute:
[TABLE]
The vectors and can be computed in operations as they are just entries of the adjoint, each multiplied by one non-zero entry of or . The product can be computed in operations.
Thus the entry of can be computed in operations.
∎
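The three phases of this proof can be sketched in code. The sketch below is not the paper's implementation: it works over the rationals (via `fractions.Fraction`) instead of a finite field, uses plain Gauss-Jordan elimination instead of fast matrix multiplication, and assumes that both the original and the updated matrix are non-singular. It also assumes the notation B := det(M) I + V^T adj(M) U and the identity adj(M + U V^T) = (adj(M) det(B) - adj(M) U adj(B) V^T adj(M)) / det(M)^f, which follows from the Sherman-Morrison-Woodbury formula. All function names are hypothetical.

```python
from fractions import Fraction


def det_and_adj(A):
    """Return (det(A), adj(A)) of a non-singular matrix via Gauss-Jordan elimination."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    det = Fraction(1)
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)  # fails if singular
        if pivot != c:
            aug[c], aug[pivot] = aug[pivot], aug[c]
            det = -det
        det *= aug[c][c]
        inv_p = 1 / aug[c][c]
        aug[c] = [x * inv_p for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return det, [[det * x for x in row[n:]] for row in aug]  # adj = det * inverse


def preprocess(M):
    det_M, adj_M = det_and_adj(M)
    return {"det_M": det_M, "adj_M": adj_M}


def update(state, U_cols, V_cols):
    """U_cols[a] = (row, value) is the single non-zero entry of column a of U; same for V."""
    det_M, adj_M = state["det_M"], state["adj_M"]
    f = len(U_cols)
    # B = det(M) * I + V^T adj(M) U needs only f^2 entries of adj(M).
    B = [[(det_M if a == b else 0)
          + V_cols[a][1] * adj_M[V_cols[a][0]][U_cols[b][0]] * U_cols[b][1]
          for b in range(f)] for a in range(f)]
    det_B, adj_B = det_and_adj(B)
    state.update(U=U_cols, V=V_cols, det_B=det_B, adj_B=adj_B)
    return state


def query(state, i, j):
    """Entry (i, j) of adj(M + U V^T), using O(f^2) arithmetic operations."""
    adj_M, adj_B, U, V = state["adj_M"], state["adj_B"], state["U"], state["V"]
    f = len(U)
    left = [adj_M[i][r] * val for r, val in U]     # row i of adj(M) U
    right = [val * adj_M[r][j] for r, val in V]    # column j of V^T adj(M)
    corr = sum(left[a] * adj_B[a][b] * right[b] for a in range(f) for b in range(f))
    return (adj_M[i][j] * state["det_B"] - corr) / state["det_M"] ** f


M = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
# Two entry updates: add 5 to M[0][2] and add -1 to M[2][1].
state = update(preprocess(M), U_cols=[(0, 1), (2, 1)], V_cols=[(2, 5), (1, -1)])
M_new = [[2, 1, 5], [1, 3, 1], [0, 0, 4]]
det_new, adj_new = det_and_adj(M_new)  # recomputed from scratch, for comparison only
```

The update touches only an f × f matrix and the query only two length-f vectors, which is where the dependence on f rather than on the full dimension comes from.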
5 Open Problems
We present the first graph application of the kernel basis decomposition by [JV05, ZLS15], and we are interested in seeing if there are further uses of that technique outside of the symbolic computation area.
Our sensitive distance oracle has subcubic preprocessing time and subquadratic query time. When restricting to only a single edge deletion, Grandoni and V. Williams were able to improve this to subcubic preprocessing and sublinear query time [GW12]. An interesting open question is whether a similar result can be obtained for multiple edge deletions. Even supporting just two edge deletions with subcubic preprocessing and sublinear query time would be a good first step. Alternatively, disproving the existence of such a data-structure would also be interesting. Currently the best (conditional) lower bound by V. Williams and Williams [WW18, HLNW17] refutes such an algorithm, if its complexity has a dependency on the largest edge weight . For algorithms with dependency no such lower bound is known.
All our data-structures maintain only the distance or, in the case of reachability, return a boolean answer. Another open problem is therefore to find a data-structure that returns not just the distance but also the actual path.
Finally, our data-structures are randomized Monte Carlo. We wonder if there is a deterministic equivalent. In the case of the previous result by Weimann and Yuster [WY13], a deterministic equivalent was found by Alon et al. [ACC19].
Acknowledgment
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 715672. This work was mostly done while Thatchaphol Saranurak was at KTH Royal Institute of Technology.
Appendix A Proof of Lemma 1.6
Proof of Lemma 1.6.
Via the Sherman-Morrison-Woodbury formula we have
[TABLE]
and via the Sylvester determinant identity we have
[TABLE]
This allows us to write the determinant of as follows:
[TABLE]
This yields because . Likewise we obtain:
[TABLE]
Thus we obtain Lemma 1.6 by multiplying the Sherman-Morrison-Woodbury identity with . ∎
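The chain of identities used in this proof can be written out as follows. This is a sketch under assumed notation: M is a non-singular n × n matrix, U and V are n × f matrices, and B := det(M) I + V^T adj(M) U. The symbol names are assumptions; the identities themselves are the standard Sherman-Morrison-Woodbury formula and the Sylvester determinant identity.

```latex
% Assumed notation: M is n x n with det(M) != 0, U and V are n x f,
% and B := \det(M)\, I + V^\top \operatorname{adj}(M)\, U  (an f x f matrix).
\begin{align*}
(M + UV^\top)^{-1}
  &= M^{-1} - M^{-1} U \,(I + V^\top M^{-1} U)^{-1}\, V^\top M^{-1}, \\
\det(M + UV^\top)
  &= \det(M)\,\det(I + V^\top M^{-1} U), \\
B = \det(M)\,(I + V^\top M^{-1} U)
  &\;\Longrightarrow\; \det(B) = \det(M)^{f}\,\det(I + V^\top M^{-1} U), \\
\det(M + UV^\top)
  &= \frac{\det(B)}{\det(M)^{f-1}}, \\
(I + V^\top M^{-1} U)^{-1}
  &= \det(M)\, B^{-1} = \frac{\det(M)\,\operatorname{adj}(B)}{\det(B)}, \\
\operatorname{adj}(M + UV^\top)
  &= \det(M + UV^\top)\,(M + UV^\top)^{-1} \\
  &= \frac{\operatorname{adj}(M)\,\det(B)
        - \operatorname{adj}(M)\,U\,\operatorname{adj}(B)\,V^\top \operatorname{adj}(M)}
       {\det(M)^{f}} .
\end{align*}
```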
References
- [ACC19] Noga Alon, Shiri Chechik, and Sarel Cohen. Deterministic combinatorial replacement paths and distance sensitivity oracles. In ICALP, volume 132 of LIPIcs, pages 12:1–12:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2019.
- [AHU74] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.
- [AW14] Amir Abboud and Virginia Vassilevska Williams. Popular conjectures imply strong lower bounds for dynamic problems. In FOCS, pages 434–443. IEEE Computer Society, 2014.
- [BCR15] Surender Baswana, Keerti Choudhary, and Liam Roditty. Fault tolerant reachability for directed graphs. In DISC, volume 9363 of Lecture Notes in Computer Science, pages 528–543. Springer, 2015.
- [BCR16] Surender Baswana, Keerti Choudhary, and Liam Roditty. Fault tolerant subgraph for single source reachability: generic and optimal. In STOC, pages 509–518. ACM, 2016.
- [BCS97] Peter Bürgisser, Michael Clausen, and Mohammad Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren der mathematischen Wissenschaften. Springer, 1997.
- [BGK+08] Adam L. Buchsbaum, Loukas Georgiadis, Haim Kaplan, Anne Rogers, Robert Endre Tarjan, and Jeffery Westbrook. Linear-time algorithms for dominators and other path-evaluation problems. SIAM J. Comput., 38(4):1533–1573, 2008.
- [BK08] Aaron Bernstein and David R. Karger. Improved distance sensitivity oracles via random sampling. In SODA, pages 34–43. SIAM, 2008.
