Near-Optimal Compression for the Planar Graph Metric
Amir Abboud, Pawel Gawrychowski, Shay Mozes, Oren Weimann

TL;DR
This paper introduces near-optimal compression schemes for planar graph metrics, significantly advancing understanding of distance encoding and revealing the complexity added by weights in planar graphs.
Contribution
It presents a new compression method for planar graph metrics that is optimal up to log factors and challenges existing lower bounds, especially regarding weighted graphs.
Findings
Achieves a compression bound of $\tilde{O}(\min\{k^2, \sqrt{k \cdot n}\})$ bits, nearly matching the lower bounds.
Breaks previous lower bounds for compression using minors and weighted planar graphs.
Designs a Subset Distance Oracle with $\tilde{O}(\sqrt{k \cdot n})$ space and $\tilde{O}(n^{3/4})$ query time.
Supported in part by Israel Science Foundation grant 794/13.
Amir Abboud: Stanford University, Department of Computer Science.
Paweł Gawrychowski: University of Haifa, Department of Computer Science.
Shay Mozes: Interdisciplinary Center Herzliya, Efi Arazi School of Computer Science.
Oren Weimann: University of Haifa, Department of Computer Science.
Abstract
The Planar Graph Metric Compression Problem is to compactly encode the distances among $k$ nodes in a planar graph of size $n$. Two naïve solutions are to store the graph using $O(n)$ bits, or to explicitly store the distance matrix with $O(k^2 \log n)$ bits. The only lower bounds are from the seminal work of Gavoille, Peleg, Pérennes, and Raz [SODA’01], who rule out compressions into a polynomially smaller number of bits, for weighted planar graphs, but leave a large gap for unweighted planar graphs. For example, when $k = \sqrt{n}$, the upper bound is $O(n)$ and their constructions imply an $\Omega(n^{3/4})$ lower bound. This gap is directly related to other major open questions in labelling schemes, dynamic algorithms, and compact routing.
Our main result is a new compression of the planar graph metric into $\tilde{O}(\min\{k^2, \sqrt{k \cdot n}\})$ bits, which is optimal up to log factors. Our data structure breaks an $\Omega(k^2)$ lower bound of Krauthgamer, Nguyen, and Zondiner [SICOMP’14] for compression using minors, and the lower bound of Gavoille et al. for compression of weighted planar graphs. This is an unexpected and decisive proof that weights can make planar graphs inherently more complex. Moreover, we design a new Subset Distance Oracle for planar graphs with $\tilde{O}(\sqrt{k \cdot n})$ space, and $\tilde{O}(n^{3/4})$ query time.
Our work carries strong messages to related fields. In particular, the famous $O(n^{1/2})$ vs. $\Omega(n^{1/3})$ gap for distance labelling schemes in planar graphs cannot be resolved with the current lower bound techniques.
1 Introduction
The shortest path metric of planar graphs is one of the most popular and well-studied metrics in Computer Science. Countless papers, surveys, and textbooks address the computational challenges that arise when dealing with it. In this paper, we address a core problem about this metric that has remained poorly understood. We ask: How compressible is it? That is, how many bits do we need, information theoretically, in order to describe a set of distances in a planar graph?
As we discuss shortly, a better understanding of this core question is crucial to making progress on some of the biggest open problems in other well-studied subjects such as Sparsification, Labeling Schemes, Dynamic Algorithms, and Compact Routing Schemes.
First, let us define our problem more formally. In the Metric Compression problem, we are given a set of $k$ points in some metric space with distance function $d$, such as the metric of distances in an $n$-node planar graph, and the goal is to find an encoding that is as short as possible, yet still allows us to compute $d(u,v)$ for any two points $u, v$.
Definition 1.1 (The Planar Graph Metric Compression Problem).
Given an unweighted, undirected planar graph $G$ on $n$ nodes, and a subset $S$ of $k$ distinguished nodes in $G$, compute a bit string that encodes the distances between all pairs of nodes in $S$. That is, there is a decoding function that given the encoding and any two nodes $s, t \in S$ returns the $s$-to-$t$ distance in $G$ (i.e., $d_G(s,t)$).
There are two naïve ways to solve this problem. First, we can store all the distances explicitly as a matrix in the encoding. The distance in a graph on $n$ nodes is some number in $\{0, \ldots, n-1\}$, and so this matrix can be encoded using $O(k^2 \log n)$ bits. The second option, which is better whenever $k \geq \sqrt{n}$, is to let the encoding be the graph itself. Naïvely, this is $O(n \log n)$ bits, and more sophisticated encodings give $O(n)$ [93, 78, 34, 23]. Using the $\tilde{O}(\cdot)$ notation to hide polylog factors, we get a naïve upper bound of $\tilde{O}(\min\{k^2, n\})$ for our problem. Is this the best possible, or could there be a much better compression into or even bits?
For context, let us look at other metrics. One example of a metric that admits an ultra-efficient compression into bits is the metric of trees or bounded treewidth graphs [32, 48]. For most metrics of interest, however, the exact or lossless version of the compression problem is too difficult and no non-trivial upper bounds, beyond log-factor improvements, are possible. For example, in general (non-planar) graphs there is a simple lower bound: in any compression, each of the possible graphs on nodes must be encoded differently. Instead, it is popular to seek the optimal lossy compression from which the metric can be recovered approximately, e.g. up to a multiplicative error. For example, the classical Johnson-Lindenstrauss [60, 10] embedding allows one to compress a set of points in Euclidean -dimensional space into roughly bits, so that the distances between the points can be recovered up to a factor, and a recent breakthrough of Indyk and Wagner [58] reduced the bound to roughly which is tight up to a factor.
Indeed, if we are willing to pay a error, there are ingenious compressions of the planar graph metric into bits [88, 70, 66]. But do we have to pay this error, or are planar graphs restricted enough to allow for non-trivial compression?
Open Question 1.
Can we beat $\tilde{O}(\min\{k^2, n\})$ bits for planar graph metric compression?
There are some lower bounds in our way. From the seminal work of Gavoille, Peleg, Pérennes, and Raz [48] we know that the metric of weighted planar graphs, where the edge weights are polynomially bounded, does not admit any non-trivial compression. The authors show that any Boolean matrix can be “encoded” using the distances among a set of nodes in a weighted planar graph on nodes, where the edge weights are in . Since we cannot compress an arbitrary matrix into bits, we get a nearly-tight lower bound of for weighted planar graphs. For unweighted planar graphs, Gavoille et al. simply subdivide the edges in their construction and the number of nodes in the encoding grows to , which leads to a much weaker lower bound of (see Section 4 for more details). For example, when , the upper bound is and the lower bound is . This subdivision of edges is rather naïve, and the overall lower bound construction does not seem to capture the full power of the planar graph metric. In fact, it can be simulated by a grid graph [4]. This naturally suggests the following intriguing challenge of finding a more clever encoding of matrices into planar graphs, which would lead to a negative resolution to Open Question 1.
Challenge 1.
Can we encode an arbitrary Boolean matrix using the distances among a subset of nodes in an unweighted planar graph with vertices, so that we can determine by only looking at the distance between and in our graph?
Before presenting our results, let us discuss the state of the art on questions that are closely related to ours, in which we are interested in data structures that are not only as succinct as possible, but also have other desirable features. Along the way, we give further reasons to be pessimistic about the possibility of a non-trivial compression.
Sparsification.
A natural way to compress a graph is by deleting or contracting some of its edges and nodes. Finding small subgraphs or minors that preserve or approximate the distances among a given subset of nodes has been studied for planar graphs [56, 31, 42, 22, 43, 72, 33, 53, 52, 73] and for general graphs [26, 36, 96, 83, 38, 65, 80, 61, 64, 25, 2, 1, 24]. Such compressions are appealing algorithmically, since we can readily feed them into our usual graph algorithms, and recent research suggests that, in many settings, near-optimal compression bounds can be achieved using such sparsifiers (e.g. when compressing general graphs with additive error [1, 3]). A discouraging lower bound of Krauthgamer, Nguyen, and Zondiner [72] shows that even in the case of unweighted grid graphs, it is impossible to beat the naïve bound using a (possibly weighted) minor. Thus, a positive answer to Open Question 1 will have to involve a more complicated data structure.
Labelling Schemes.
An appealing way to represent graphs is to assign a label to each node, so that by looking at the labels of two nodes we can infer certain properties such as the distance between them. Finding so-called distance labelling schemes in which the labels are as short as possible is a classical subject of study [54, 62, 48, 82]. Such labels are used for efficient algorithms both in theory [89, 8] and practice [40]. An inspiring lecture by Stephen Alstrup at HALG 2016 surveys breakthroughs [16, 19, 17] achieved in this field in the last few years, all of which involve shaving constants or logarithmic factors. A famous open question is to close the rare polynomial gap in the bounds for planar graphs that has been embarrassingly open since the work of Gavoille et al. [48]: the upper bound is $O(\sqrt{n})$ bits per label (due to [51] who shaved a log factor over [48]), and the lower bound is $\Omega(n^{1/3})$. The only known technique to prove polynomial lower bounds (see Footnote 1) is to argue that labelling schemes are one way to compress graphs, and then use facts about the limits of graph compression. For example, the $\Omega(n^{1/3})$ lower bound for distance labelling of planar graphs [48] follows because labels of size $L$ can be used to solve the metric compression problem using $k \cdot L$ bits, which contradicts the lower bounds above. In fact, the tight lower bound for metric compression of weighted planar graphs leads to a tight lower bound for labelling schemes [48, 4]. Thus, to prove a tight lower bound of $\Omega(\sqrt{n})$ for labelling schemes in unweighted planar graphs, the only approach we have with current techniques is to negatively resolve Open Question 1, e.g. by accomplishing Challenge 1.
Footnote 1: The only results that somewhat deviate from this technique are the lower bound for nearest common ancestors in trees [18] and the lower bound for distance in trees [48, 17]. The gist of both of them is being able to argue about how much information can be shared by labels of two nodes. If the graph is not a tree, this seems very challenging.
Routing and Dynamic Algorithms.
A compact routing scheme assigns names and tables to the nodes of a graph, so that each node can find out the first edge on the shortest path (or some approximate path) to any target node only using the name of and the local table stored at . There is a vast literature on the topic, seeking the best possible tradeoff between sizes of the tables and the stretch in many different graph families (we refer the reader to Peleg’s book [81] and the extensive surveys [46, 47]). For planar graphs, Abraham, Gavoille, and Malkhi [9] write: “Surprisingly, for stretch 1, the complexity of the size of the routing tables is not known.” A simple upper bound is total table size, and an adaptation of the same Gavoille et al. construction gives a lower bound of [9]. It is likely that accomplishing Challenge 1 would resolve this gap as well. Yet another problem with similar state-of-the-art is the All Pairs Shortest Paths problem in dynamic planar graphs. Here, the goal is to have a data structure that supports efficient updates to the graph (edge additions or removals), and can answer shortest path queries efficiently. The breakthrough algorithm of Fakcharoenphol and Rao [45], and the later optimizations [70, 59, 63, 49], achieve time for updates and queries. The only framework for showing polynomial lower bound was recently proposed by Abboud and Dahlgaard [4] who proved a lower bound of under the popular APSP Conjecture [84, 95, 6, 5, 7, 85, 39]. Using their framework, accomplishing Challenge 1 directly leads to a higher lower bound of , as is known in the weighted case.
History suggests that weighted planar graph metrics might be harder to work with, but they are never truly harder. In so many cases, a new algorithm for the unweighted case is followed by an almost-as-good algorithm for the weighted case, a few years later. For example, a PTAS for the Travelling Salesman Problem in the unweighted planar metric was found in 1995 [55], and then for the weighted case in 1998 [21]. Perhaps it is only a matter of time until our lower bounds for the unweighted metric match the weighted.
1.1 Our Results
Our first result is a new compression scheme for the planar graph metric, which achieves the information theoretically best possible bit complexity, up to log-factors. We give a positive resolution to Open Question 1, deem Challenge 1 to be infeasible, and show that unweighted planar graphs are inherently less complex than weighted ones; in fact, they admit a polynomially more efficient metric compression.
Theorem 1.2.
Given an unweighted undirected planar graph $G$ on $n$ nodes and a subset $S$ of $k$ nodes, we can return a binary encoding of length $\tilde{O}(\min\{k^2, \sqrt{k \cdot n}\})$ from which all pairwise distances in $S$ can be recovered exactly.
This shrinks the gap in our understanding of the planar metric compression problem from polynomial to polylogarithmic (removing this polylogarithmic gap remains an open question). For comparison, when $k = \sqrt{n}$, we show that $\tilde{\Theta}(n^{3/4})$ bits are necessary and sufficient, while in the weighted case the bound is $\tilde{\Theta}(n)$. Our encoding breaks the $\Omega(k^2)$ lower bound of Krauthgamer et al. [72] for compressions using minors, and raises the question whether it can be matched via other forms of sparsification or graphical compressions.
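As a quick sanity check of the trade-off in Theorem 1.2, the following worked calculation (our own arithmetic, not a statement taken from the paper) locates the crossover between the two terms of the bound $\tilde{O}(\min\{k^2, \sqrt{k \cdot n}\})$ and evaluates both terms at $k = \sqrt{n}$, ignoring polylogarithmic factors.

```latex
\begin{align*}
k^2 \le \sqrt{k\,n}
  &\iff k^4 \le k\,n
   \iff k \le n^{1/3}, \\
k = n^{1/2}:\quad
  \sqrt{k\,n} &= \sqrt{n^{3/2}} = n^{3/4},
  \qquad k^2 = n .
\end{align*}
```

So the explicit distance matrix is the smaller term only for $k \le n^{1/3}$, and at $k = \sqrt{n}$ the new bound is $n^{3/4}$ against the naïve $n$.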
It is unclear whether our new compression scheme will lead to improved upper bounds for labeling, routing, or dynamic algorithms. In Section 6, we discuss the difficulty in turning it into a labelling scheme. Still, it certainly shakes our beliefs about the right bounds for those problems. Even if better upper bounds are not possible, it is no longer a mere puzzle as in Challenge 1 that is standing in the way of higher lower bounds – substantially new techniques and frameworks must be developed.
Distance Oracles.
Our first result was a mathematical advance in the understanding of the planar graph metric. Next, we use it algorithmically to achieve a new Subset Distance Oracle that could be an appealing choice in many applications.
A distance oracle is an encoding of a graph from which a pairwise distance can be queried efficiently. Since the seminal paper of Thorup and Zwick [90], a central subject of study in Graph Algorithms has been to understand the inherent tradeoff between the parameters of these distance oracles (see the survey by Sommer [87]): The size of the compression, the query time for returning a distance, the error in the answers, the preprocessing time to construct the compression, and so on.
Many exact distance oracles for planar graphs have been proposed [41, 20, 29, 45, 97, 79], mostly focusing on the tradeoff between space and query time, and giving space and query time [76]. Cohen-Addad, Dahlgaard, and Wulff-Nilsen [35] show that the technique of abstract Voronoi diagrams recently introduced into the field of planar graphs by Cabello [30] leads to an oracle with space and query time, suggesting that a better tradeoff is possible.
To get even better tradeoffs we might allow error [88, 70, 66]: we can achieve very small space and fast query time. Note that space is impossible in this setting, no matter what query time we allow. However, another natural way to get better tradeoffs is to restrict our attention to a subset of the nodes. A Subset Distance Oracle is a small space data structure that can efficiently return the distance between any pair of nodes from a set of nodes. Here, for any , e.g. , our new compression scheme suggests that a distance oracle might have space.
Subset distance oracles arise naturally. In typical applications of distance oracles, one can predict that all queries will be among a subset of nodes. Space efficiency is often a high priority. For example, if our graph is the national road network, one might be interested in a mobile app that can return the distance between any pair of bus stops.
Our second result is the first subset distance oracle with non-trivial space bounds. Notably, all previous distance oracles in the literature work equally well for weighted and unweighted graphs, while ours uses new techniques that are provably impossible for weighted graphs.
Theorem 1.3.
Given an unweighted undirected planar graph on nodes and a subset of nodes, there is a polynomial-time algorithm that returns a data structure of size such that given and any pair of nodes in we can return the exact distance in time.
The main open question left by our work is whether our query time can be improved, perhaps all the way down to . This would be an essentially optimal distance oracle. But even as it is, our query time is sublinear, and our space is sublinear for any , making it an appealing choice in applications with strict space constraints.
Finally, an intriguing and wide open question is to extend (any of) our upper bounds to directed planar graphs. Can we accomplish Challenge 1 if we allow directed edges? Our tools heavily rely on the graph being undirected, yet it remains unclear if a higher lower bound can be proven for directed graphs.
1.2 Technical Overview
We exhibit the first use of the Unit-Monge property in the algorithmic study of planar graphs. It is well known that distances in a planar graph enjoy this property, due to the non-crossing nature of shortest paths in the plane, but prior to our work, only the (non-unit) Monge property was known to be algorithmically exploitable for planar graphs. For the past few decades, it has been heavily utilized in numerous algorithms for problems related to shortest paths or minimum cuts in planar graphs (e.g. [45, 29, 30, 76, 63, 59, 50, 69, 27, 77, 28, 74, 75]), and beyond, in dozens of papers on computational geometry (e.g. [63, 50, 13, 14, 15, 11, 12, 67]) and pattern matching (e.g. [86, 37, 57, 92]). Meanwhile, the stronger Unit-Monge property has only been exploited for algorithms on sequences where it has already led to several breakthroughs. We refer the reader to the 159-page monograph of Tiskin [91] for an exposition of these applications.
Recall that we want to encode the distances among nodes in a planar graph. Let us assume that we are lucky and all the nodes lie on a single face of the graph. Denote the nodes appearing on the face in order , and for simplicity assume that we only want to encode -to- distances. Let be the matrix of distances so that . This matrix has the Monge property: For any we have that . This is because the -to- shortest path and the -to- shortest path must cross. Moreover, it is Unit-Monge, that is, . This is because there is an edge between and and so the distances involving these nodes are always at most apart.
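The two properties described above can be checked numerically on a small instance. The sketch below is illustrative only: the grid graph, the helper names, and the indexing convention are our assumptions, not the paper's. It takes the outer face of an unweighted grid in cyclic order, splits it into two consecutive blocks `A` and `B`, sets `M[i][j]` to the distance from the `i`-th vertex of `A` to the `j`-th vertex of `B`, and tests the Monge inequality (written in the direction that matches this ordering) together with the unit condition between consecutive rows.

```python
from collections import deque

def grid_graph(w, h):
    """Unweighted w-by-h grid; its outer face is the boundary cycle."""
    adj = {(x, y): [] for x in range(w) for y in range(h)}
    for (x, y) in list(adj):
        for nb in ((x + 1, y), (x, y + 1)):
            if nb in adj:
                adj[(x, y)].append(nb)
                adj[nb].append((x, y))
    return adj

def outer_face(w, h):
    """Vertices of the boundary cycle in cyclic order (requires w, h >= 2)."""
    walk = [(x, 0) for x in range(w)]
    walk += [(w - 1, y) for y in range(1, h)]
    walk += [(x, h - 1) for x in range(w - 2, -1, -1)]
    walk += [(0, y) for y in range(h - 2, 0, -1)]
    return walk

def bfs(adj, src):
    """Single-source distances in an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_matrix(adj, rows, cols):
    out = []
    for a in rows:
        dist = bfs(adj, a)
        out.append([dist[b] for b in cols])
    return out

def is_monge(M):
    """Crossing inequality for rows and columns taken in cyclic face order."""
    return all(M[i][j] + M[i + 1][j + 1] >= M[i][j + 1] + M[i + 1][j]
               for i in range(len(M) - 1) for j in range(len(M[0]) - 1))

def has_unit_rows(M):
    """Consecutive rows differ by at most 1 entrywise (adjacent row vertices)."""
    return all(abs(lo - up) <= 1
               for r1, r2 in zip(M, M[1:]) for up, lo in zip(r1, r2))

adj = grid_graph(5, 4)
face = outer_face(5, 4)
A, B = face[:7], face[7:]
M = distance_matrix(adj, A, B)
M_sub = distance_matrix(adj, A, B[::2])  # columns restricted to a subset of B
```

Restricting the columns to a subset of `B` keeps both checks true, since only adjacency between consecutive row vertices is used for the unit condition.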
Our gains come from the fact that Unit-Monge matrices are compressible into $\tilde{O}(k)$ bits! For non-unit Monge matrices, the construction of Gavoille et al. implies an $\Omega(k^2)$ lower bound. Another striking example for the extra power of the Unit-Monge property is the fact that (a compact representation of) the distance product of two such matrices can be computed in $O(k \log k)$ time [92], while for non-unit Monge matrices only $O(k^2)$ algorithms are possible.
The main issue for us, and in general when exploiting Monge properties, is that the nodes we care about do not necessarily lie on a cycle. The simple solution is to add a cycle connecting our nodes and assign weight $\infty$ to the new edges so that they do not change the distances, or more formally, to triangulate the graph. After we do this, we have the Monge property, but because of the infinite weight edges, we do not have the unit-Monge property. This solution is common to all the algorithms cited above that use the Monge property, and is quite reasonable when the graph is weighted to begin with. For unweighted graphs, on the other hand, our work proves that it is too lossy and a more involved solution leads to much better results.
At a very high-level, our approach is to use a Baker-like decomposition into slices (vertices at consecutive levels of some specific BFS tree) whose boundaries are cycles, and to store distances to the slice boundaries. Observe that when we argued above that the unit Monge property holds because there is an edge between and , we did not require that there is also an edge between and . In our solution there is an edge between consecutive vertices on the boundary cycle of each slice. Therefore, even if we triangulate each slice using infinite weight edges, we can still exploit the unit Monge property when storing distances between certain vertices in a slice and the slice boundary.
The decomposition into slices is such that, after triangulation, the slices have small cycle separators. We recursively separate the vertices of the set within each slice using small cycle separators. We store distances between separators and the slice boundary (using the unit Monge property) and between vertices of and separators (using the fact that separators are small). Significant technical issues arise with the nesting structure of slices. This gives rise to so-called holes in a slice. Dealing with multiple holes requires a detailed study of additional structural properties, and a more complicated recursive solution based on these properties. In essence, we show that whenever a naïve solution does not work in the presence of multiple holes, there is one hole that can be handled efficiently using a different approach.
We believe it is very likely that other problems in unweighted planar graphs can be solved by exploiting the Unit-Monge property. Our near-optimal metric compression serves as a proof of concept that this is possible. However, technical challenges might have to be overcome in each specific application. In particular, the fast distance product algorithm for unit-Monge matrices [92] appears to be a strong and relevant technique that we are so far unable to exploit for solving problems in planar graphs.
2 Preliminaries
We assume basic familiarity with planar graphs and planar graph duality. We denote the primal graph by and the dual graph by . For a spanning tree of , we use to denote the spanning tree of . It is well known [94] that the set of edges of not in form a spanning tree of . We often refer to as the cotree of [44]. For a spanning tree of and an edge of not in , the fundamental cycle of with respect to in is the simple cycle consisting of and the unique simple path in between the endpoints of .
Given an assignment of nonnegative weights to the faces of , we say that a simple cycle is a balanced separator if the total weight of faces strictly enclosed by and the total weight of faces not enclosed by are each at most of the total weight (see Footnote 2). We often assign weights to vertices rather than to faces. Finding a balanced separator with respect to vertex weights reduces to the case of face weights (for each vertex, simply remove its weight and add it to an incident face). It is well known (see, e.g., [71]) that in triangulated planar graphs there exists a balanced separator that is a fundamental cycle assuming that no face has more than of the total weight (in fact, this is true for any planar graph such that has maximum degree 3). For vertex weights, if no vertex has more than 1/2 of the total weight, the graph is triangulated, and there are no self loops, then by evenly transferring the weights to faces we obtain that no face receives more than 1/2 of the total weight (because every node is incident to at least two faces), and we can invoke the face-weights version of the balanced separator. Many planar graph algorithms triangulate the graph by adding edges to ensure that short balanced cycle separators exist. The lengths of the added edges are set to be sufficiently large so as not to change distances in the graph. This is clearly not possible in unweighted planar graphs, and is one of the obstacles we will need to overcome.
Footnote 2: It is more usual to require that the total weight is at most either or . However, in our particular application turns out to be more convenient.
2.1 The Monge and Unit-Monge properties
One of the main tools we use for succinct representation of distances in unweighted undirected planar graphs is the unit Monge property. We next prove a sequence of lemmas that utilize this property to efficiently store distances between vertices on cycles. We begin with encoding distances between disjoint sets of vertices on a single face (Lemma 2.1), then encoding all-pair distances between the vertices on a single face (Lemma 2.2), then encoding all-pair distances between the vertices on a single simple cycle (Lemma 2.3), and finally, encoding the distances between the vertices of two faces (Lemma 2.4).
Lemma 2.1.
Let be the cyclic walk of a face of a planar graph partitioned into two parts and . Then, for any subset of , all distances between vertices of and vertices of can be encoded in bits.
Proof.
Let . We define an matrix such that equals the distance in between and . The matrix is Monge, that is
$$M[i,j] + M[i+1,j+1] \;\geq\; M[i,j+1] + M[i+1,j]$$
for any and . This is because the shortest -to- and -to- paths must necessarily cross. Furthermore, the matrix is unit-Monge, that is
$$\bigl|\, M[i+1,j] - M[i,j] \,\bigr| \;\leq\; 1$$
for any and , because there is an edge . Consequently, for any , the sequence of differences is nondecreasing and contains only values from , so can be encoded by storing the positions of the first 0 and the first 1. Storing these positions for every takes bits. To encode the whole , we additionally store for every using bits. ∎
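The encoding idea in this proof can be sketched in a few lines. The code below is a minimal illustration under our own conventions (the function names and the row/column orientation are assumptions, not the paper's notation): it stores the first row of the matrix and, for each pair of consecutive rows, the position of the first non-negative difference and the position of the first difference equal to 1. Because each difference sequence is nondecreasing with values in {-1, 0, 1}, those two positions determine it completely. The sample matrix holds distances on the outer face of a 3-by-3 grid.

```python
def encode(M):
    """Return (first row, per-row breakpoints) for a matrix whose
    consecutive-row differences are nondecreasing and lie in {-1, 0, 1}."""
    cols = len(M[0])
    breaks = []
    for upper, lower in zip(M, M[1:]):
        diffs = [b - a for a, b in zip(upper, lower)]
        assert all(d in (-1, 0, 1) for d in diffs) and diffs == sorted(diffs)
        # Position of the first entry that is no longer -1.
        first_zero = next((j for j, d in enumerate(diffs) if d >= 0), cols)
        # Position of the first entry equal to +1.
        first_one = next((j for j, d in enumerate(diffs) if d == 1), cols)
        breaks.append((first_zero, first_one))
    return M[0][:], breaks

def decode(first_row, breaks):
    """Rebuild the full matrix from the first row and the breakpoints."""
    M = [first_row[:]]
    for first_zero, first_one in breaks:
        prev = M[-1]
        row = [prev[j] + (-1 if j < first_zero else 0 if j < first_one else 1)
               for j in range(len(prev))]
        M.append(row)
    return M

# Rows: (0,0),(1,0),(2,0),(2,1); columns: (2,2),(1,2),(0,2),(0,1) on a 3x3 grid.
M = [[4, 3, 2, 1],
     [3, 2, 3, 2],
     [2, 3, 4, 3],
     [1, 2, 3, 2]]
first_row, breaks = encode(M)
```

Each row after the first costs two column indices, and the first row costs one distance per column, which is where the near-linear bit count comes from.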
Lemma 2.2.
Let be the cyclic walk of a face of a planar graph. Then, all distances between vertices of can be encoded in bits.
Proof.
We recursively encode all distances between vertices from a contiguous fragment of using Lemma 2.1. We start with the whole . To encode the distances between all vertices , where , we set and proceed as follows:
1. Recursively encode the distances between all vertices .
2. Recursively encode the distances between all vertices .
3. Apply Lemma 2.1 with and .
The total size of the encoding is described by the recurrence , hence solves to . ∎
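The recursion above can be checked for coverage with a short sketch. We assume, as an illustration, that a contiguous fragment with index range `lo..hi` is split at its midpoint (the exact split point is not recoverable from the text here, so this is our assumption). The code records, for each recursive call, the pair of blocks that would be handed to Lemma 2.1, then verifies that every pair of positions is covered exactly once and sums the block sizes over all calls.

```python
from itertools import combinations

def block_pairs(lo, hi, out):
    """Collect the (left block, right block) index ranges produced by
    recursively halving the fragment lo..hi (inclusive)."""
    if lo >= hi:
        return out
    mid = (lo + hi) // 2
    block_pairs(lo, mid, out)        # step 1: left half
    block_pairs(mid + 1, hi, out)    # step 2: right half
    out.append(((lo, mid), (mid + 1, hi)))  # step 3: left-versus-right distances
    return out

def covered_pairs(blocks):
    """Count how often each pair (i, j) with i < j is handled by some call."""
    count = {}
    for (a_lo, a_hi), (b_lo, b_hi) in blocks:
        for i in range(a_lo, a_hi + 1):
            for j in range(b_lo, b_hi + 1):
                count[(i, j)] = count.get((i, j), 0) + 1
    return count

def total_block_size(blocks):
    """Sum of fragment lengths over all calls; grows like m log m."""
    return sum(b_hi - a_lo + 1 for (a_lo, _), (_, b_hi) in blocks)

blocks13 = block_pairs(0, 12, [])
blocks16 = block_pairs(0, 15, [])
```

For a fragment of 16 vertices the fragment lengths sum to 16 * 4 = 64, matching the halving recurrence up to the per-call logarithmic cost.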
Lemma 2.3.
Let be a simple cycle in a planar graph. Then, all distances between vertices of can be encoded in bits.
Proof.
Consider the two planar graphs () obtained by removing all vertices enclosed (not enclosed) by . is the cyclic walk of a face in and , hence we can apply Lemma 2.2 to store the distances in and in between vertices of . This is enough to encode the distances in between vertices of , as any such shortest path can be partitioned into shortest paths between two vertices of such that each of these paths exists in either or . ∎
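The decoding step implied by this proof, recovering distances in the whole graph from the two stored tables, can be sketched as a shortest-path closure over the cycle vertices. The toy instance below is hypothetical: a 12-cycle with one chord standing in for the part inside the cycle and one chord standing in for the part outside. The helper names are ours, and Floyd-Warshall is used only for simplicity.

```python
from collections import deque
from itertools import product

def all_pairs(n, edges):
    """All-pairs distances of a connected unweighted graph via BFS."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    D = []
    for s in range(n):
        dist = [None] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] is None:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        D.append(dist)
    return D

def combine(D_a, D_b):
    """Distances in the union graph, restricted to the cycle vertices:
    closure of the entrywise minimum of the two tables."""
    n = len(D_a)
    D = [[min(D_a[i][j], D_b[i][j]) for j in range(n)] for i in range(n)]
    for m, i, j in product(range(n), repeat=3):  # m is the outermost loop
        if D[i][m] + D[m][j] < D[i][j]:
            D[i][j] = D[i][m] + D[m][j]
    return D

n = 12
cycle = [(i, (i + 1) % n) for i in range(n)]
inside, outside = [(0, 4)], [(4, 8)]          # hypothetical shortcuts
D_keep_inside = all_pairs(n, cycle + inside)   # outside part removed
D_keep_outside = all_pairs(n, cycle + outside) # inside part removed
D_full = all_pairs(n, cycle + inside + outside)
D_combined = combine(D_keep_inside, D_keep_outside)
```

Here the pair (0, 8) has distance 4 in each of the two partial graphs but distance 2 in the full graph, because its shortest path switches sides at vertex 4.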
Lemma 2.4.
Let and be the cyclic walk of two faces of a planar graph. Then, all distances between a prefix of and any subset of can be encoded in bits.
Proof.
We first choose a shortest path between and and let and be its endpoints. We make an incision along and apply Lemma 2.1 to encode the distances between and corresponding to shortest paths that do not cross using bits of space. It remains to encode distances corresponding to shortest paths that do cross . Without loss of generality connects and . We orient and so that after making an incision along the vertices and are adjacent to the endpoints of .
Consider a shortest path from to crossing , see Figure 1. Because both and are shortest paths, can be assumed to cross exactly once. Similarly, a shortest path from to crossing can be assumed to cross exactly once. We claim that must cross . Otherwise, by considering an incision along we can conclude that crosses an even number of times but this is a contradiction. Therefore, any such and must cross. This means that the matrix , where is set to be the length of a shortest path between and crossing once, is Monge. That is,
$$M[i,j] + M[i+1,j+1] \;\geq\; M[i,j+1] + M[i+1,j]$$
Additionally, because is an edge. We can hence apply the reasoning from Lemma 2.1 to encode using bits. ∎
3 The Encoding
Our encoding is based on decomposing the input graph into slices. To define the slices, recall the face-vertex incidence graph of a planar graph : It has a vertex for every vertex of and a vertex for every face of , and if a vertex of is incident to a face of then there is an edge between their corresponding vertices in .
We run a breadth-first search in , starting from the node representing the infinite face of . After every even number of steps, the yet unexplored part of the graph can be decomposed into a number of connected components, the boundary of each being a simple cycle. More formally, we assume that the infinite face of is a triangle by enclosing the whole graph in a triangle, which is connected to one of the original vertices with a single edge. We define the level of a face or a vertex of to be its depth in the BFS tree of . Thus, e.g., the level of the infinite face of is zero, and the level of the vertices incident to the infinite face of is 1. For each even integer , we define to be the set of connected components of the subgraph of induced by the faces with level at least . We call each component a level- component. We use a tree called the component tree of to capture the nesting of level components. The nodes of are the level components of . A level component is an ancestor of a level component in if the set of faces in contains the set of faces in . Since we assume that the infinite face of is a simple triangle, is indeed a tree whose root is the component corresponding to the set of all faces of except for the infinite face.
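The level computation described above is a plain breadth-first search on a bipartite graph of faces and vertices. The following sketch assumes the embedding is given as a dictionary from face names to their incident vertices; this input format, the function names, and the tiny example (an outer triangle with one inner vertex joined to all three corners) are our assumptions.

```python
from collections import deque

def levels(faces, infinite_face):
    """BFS depth of every face and vertex in the face-vertex incidence graph,
    starting from the infinite face. Keys are ("face", f) or ("vertex", v)."""
    incident = {}
    for f, verts in faces.items():
        for v in verts:
            incident.setdefault(v, []).append(f)
    start = ("face", infinite_face)
    level = {start: 0}
    queue = deque([start])
    while queue:
        kind, x = queue.popleft()
        if kind == "face":
            nbrs = [("vertex", v) for v in faces[x]]
        else:
            nbrs = [("face", f) for f in incident[x]]
        for nb in nbrs:
            if nb not in level:
                level[nb] = level[(kind, x)] + 1
                queue.append(nb)
    return level

def faces_with_level_at_least(level, ell):
    """Faces whose level is at least ell; their connected components are the
    level-ell components (component extraction is omitted here)."""
    return sorted(x for (kind, x), d in level.items() if kind == "face" and d >= ell)

faces = {
    "outer": ["a", "b", "c"],
    "f1": ["a", "b", "d"],
    "f2": ["b", "c", "d"],
    "f3": ["a", "c", "d"],
}
level = levels(faces, "outer")
```

Faces always receive even levels and vertices odd levels, which is why the components are defined only for even thresholds.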
The boundary of a component is the set of edges that are incident to a face in and to a face not in . It is not difficult to see that the boundary of each component is a simple cycle in , and that the boundaries of different components are edge-disjoint. See [71, 68] for these and other properties of components and the component tree. For a node , we associate with the boundary cycle of the level component represented by , and define the cost of denoted to be the length of . For example, for the root of we have that is a triangle and that .
Lemma 3.1.
For any , there exists such that the total cost of all nodes of at depth is .
Proof.
because cycles corresponding to the nodes of are pairwise edge disjoint. Let consist of all nodes of at depth . Then for and , so there exists such that as claimed. ∎
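The averaging argument in this proof can be illustrated as follows. We assume here, since the symbols are missing from the text, that the nodes of the component tree are grouped by depth modulo some period `r` and that the cheapest group is selected; this reading and all names below are our assumptions. Because the groups partition the tree, the cheapest one costs at most a `1/r` fraction of the total.

```python
def best_residue(cost_by_depth, r):
    """Given the total cost of component-tree nodes at each depth, pick the
    residue class of depths modulo r with the smallest total cost."""
    totals = [0] * r
    for depth, cost in enumerate(cost_by_depth):
        totals[depth % r] += cost
    i = min(range(r), key=totals.__getitem__)
    return i, totals[i]

# Hypothetical per-depth costs (sum of boundary cycle lengths at each depth).
cost_by_depth = [3, 10, 4, 7, 1, 9, 2]
residue, cost = best_residue(cost_by_depth, 3)
```

In this example the three classes cost 12, 11, and 13, so the class of depths congruent to 1 modulo 3 is chosen, and its cost of 11 is below the average of 12.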
To define the slices we apply Lemma 3.1 and call the nodes of at depth marked. The root of is also marked. Then, for every marked node , the slice of is the subgraph of enclosed by and not strictly enclosed by for any marked descendant of . The embedding of slices is inherited from the embedding of . Thus, the boundary of the infinite face of the slice of is . The cycle is also called the boundary of the slice . Each cycle corresponding to a marked descendant of such that there are no other marked nodes on the -to- path becomes a face in the slice . Such a face is called a hole of , and is called the boundary of the hole. Note that, by definition, is the boundary of the level component that is embedded in the hole . Because the total cost of all marked nodes is and the cost of the root is 3, the total size of all boundaries in all slices is . Additionally, by construction, for any slice , a breadth-first search of , the face-vertex incidence graph of , starting at the infinite face of , terminates after iterations and every hole is a leaf in the obtained breadth-first search tree.
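The relation between marked nodes, slices and holes can be sketched on the component tree alone. The sketch assumes the tree is given as a parent map and that the marked set contains the root; the names `slice_holes`, `parent` and `marked` are hypothetical. Each marked node other than the root is attached to its nearest marked proper ancestor, and becomes the boundary of a hole in that ancestor's slice.

```python
def slice_holes(parent, marked):
    """parent: dict node -> parent in the component tree (the root maps to None).
    marked: set of marked nodes, assumed to contain the root.
    Returns dict: marked node -> list of marked descendants bounding the holes of its slice."""
    holes = {x: [] for x in marked}
    for x in marked:
        p = parent[x]
        # Climb until the nearest marked proper ancestor (if any).
        while p is not None and p not in marked:
            p = parent[p]
        if p is not None:
            holes[p].append(x)
    return holes

# Component tree: 0 is the root, 1 and 2 are its children,
# 3 is a child of 1, 4 is a child of 3, and 5 is a child of 2.
tree_parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 3, 5: 2}
result = slice_holes(tree_parent, {0, 3, 4, 5})
```

Here the slice of the root has two holes (bounded by the cycles of nodes 3 and 5), the slice of node 3 has a single hole, and the slices of nodes 4 and 5 have none.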
By definition of slices, each slice contains faces and vertices at consecutive levels. We would like to use in our solution short (i.e., ) fundamental cycle separators within each slice. However, the diameter of a slice is not necessarily because face sizes may be large. To deal with this issue we triangulate the faces so that a BFS tree of a slice will have depth , and will be consistent with the BFS tree of .
Lemma 3.2.
We can triangulate all faces of a slice so that a BFS starting from the external face produces a spanning tree with the property that vertex is the parent of vertex in if and only if is the grandparent of in the BFS tree of .
Proof.
Let be the BFS tree of . If and are incident to the same face in , and is a grandparent of , and is not an edge in , we add as an artificial triangulation edge to . Adding these edges can be done consistently with the embedding of because the path in can be embedded on the same plane as such that and only intersect at vertices of . See Figure 2. We introduce an artificial vertex embedded in the infinite face of and triangulate the infinite face of by adding edges between and every vertex of the infinite face of . Similarly, we triangulate each hole of by introducing an artificial vertex , embedded in , and adding edges between and every vertex on the boundary of . Any remaining non-triangulated faces can be triangulated arbitrarily. Since for every grandparent to grandchild path in there is a corresponding edge in the triangulation of , there exists a BFS tree rooted at the artificial vertex that satisfies the statement of the lemma. Note that all the artificial vertices embedded in holes of are leaves of , and hence satisfy the statement of the lemma vacuously. ∎
Let be the graph obtained from the slice after applying Lemma 3.2. Let be the BFS tree of . Note that any fundamental cycle w.r.t. consists of two paths in , each consisting of vertices due to the triangulation. However, may use edges that are not original edges of (i.e., artificial triangulation edges). We do not want to consider such edges when dealing with distances, because distances in differ from distances in . To this end we use the notion of a Jordan curve. A Jordan curve in is an embedded curve that intersects the embedding of only at vertices of . Since the embedding of the triangulation is consistent with that of , each path in is a Jordan curve in . We say that is a Jordan tree in . In particular, any fundamental cycle w.r.t. is a Jordan cycle (closed Jordan curve) in . We next describe how the tree can be used to recursively decompose into subgraphs called regions.
A region is a subgraph of . The boundary of is defined as the set of vertices of that are incident (in ) to both an edge in and to an edge not in . Thus, for example, the boundary of the region consisting of the entire slice consists of the external boundary of and of the boundaries of all the holes of . Let be a region. Let be a fundamental cycle w.r.t. . The tree may contain edges that are not edges of (either because they are triangulation edges, or because they are edges of that do not belong to the region ). Since the embedding of is consistent with the embedding of any subgraph of , is a Jordan cycle in . The operation of separating using yields two subgraphs. One is the subgraph induced by the faces of strictly enclosed by and the other is the subgraph induced by the faces of not strictly enclosed by . This view of as a Jordan tree in any region allows us to reuse the same tree throughout the recursive decomposition.
This recursive process can be described by a binary tree . Each node of corresponds to a region (subgraph) of . The root of is the entire slice . Each non-leaf node of is associated with a (Jordan) fundamental cycle separator of , which we denote . The regions of the two children of are the regions obtained by separating with the Jordan cycle .
3.1 The simplified case of a single hole
We begin with the simplified case, in which we assume that each slice has a single hole. This is the case, for example, when the input planar graph is a grid (with possibly subdivided edges).
First we use Lemmas 2.3 and 2.4 to store, for each slice with external boundary , and a single hole with external boundary the following distances. The boundary-to-boundary distances: the distances (in ) among the vertices of , and the hole-to-boundary distances: the distances (in ) between the vertices of and the vertices of .
Boundary-to-boundary and hole-to-boundary distances encode distances “between slices”. We also need to encode distances “within slices”. We will use the fact that has a spanning tree of depth to decompose into regions, each containing a single distinguished node (i.e., node of ), and having a boundary that consists of vertices. Then we can afford to store, for each distinguished node, its distance to the boundary of its region, and, using the unit-Monge property, to also store the distances between the vertices on the boundary of each region to the vertices of and (i.e., the boundary of ) that belong to . These distances will suffice for reconstructing the distance between any pair of distinguished nodes.
Let denote the set of distinguished vertices in slice . We use fundamental (Jordan) cycle separators w.r.t. the tree to recursively divide into regions, until each region contains a single distinguished vertex. At each recursive step we separate a region into two subregions by choosing a fundamental cycle separator w.r.t. that balances the number of distinguished vertices in (i.e., assigning unit weight to each distinguished vertex in and zero weight to all other vertices). Note that, since we use balanced separators, the depth of the recursion tree is . Recall that the fundamental cycle separators w.r.t. do not cross each other, and, by construction of in Lemma 3.2, each fundamental cycle separator crosses each of the external boundary of and the hole of at most twice. Therefore, the boundary of each region corresponding to a node in the recursive decomposition tree contains vertex disjoint maximal subpaths of , and vertex disjoint maximal subpaths of .
At the step of the recursive decomposition corresponding to node with separator and two children , we store -to-separator distances: explicitly store the distances (in ) between every vertex of in and every vertex of , separator-to-boundary distances and separator-to-hole distances: for , the distances (in ) between every vertex of and every maximal subpath of or on the boundary of , using Lemma 2.1 or Lemma 2.4 (depending whether they lie on a single or two faces of ). Finally, for every leaf , we store -to-boundary distances and -to-hole distances: the distance between the unique distinguished vertex in to every vertex of or on the boundary of .
Analysis.
We first show that the total space is , and then show that the distances between any pair of vertices in can be recovered using just the information we stored. Since the total size of all slice boundaries is , storing the boundary-to-boundary distances and the hole-to-boundary distances takes using Lemmas 2.3 and 2.4. Since the depth of is , each vertex of belongs to regions in the decomposition of . Since, in addition, for every , the total space required for storing the -to-separator distances is . Consider a region of a slice . Recall that the vertices of () that belong to lie on vertex disjoint maximal subpaths of (). The endpoints of each such maximal subpath may belong to another region at the same depth in . Therefore, shares vertices of () with other regions at the same depth in . Finally, recall that the number of regions of is . Therefore, using Lemma 2.1 or Lemma 2.4, the total space for storing the separator-to-boundary and separator-to-hole distances is . In more detail, let be the total number of slice/hole boundary vertices in the -th slice. Then, in every slice every boundary/hole vertex that is not an endpoint of a maximal subpath contributes at most once at each level of recursion. At each level, we have at most recursive calls, so at most maximal subpaths and at most fundamental cycle separators. Therefore, the total space is . Storing -to-boundary distances and -to-hole distances at the leaves of the recursion tree requires total bits since each boundary or hole vertex belongs to exactly one leaf region, except for vertices (endpoints of maximal subpaths). Choosing proves the space bound.
Finally, we prove that the distances between any pair of vertices in can be recovered using just the information we stored. For any , if a shortest -to- path does not leave then , and the distance can be obtained using the -to-separator distances stored in the lowest common ancestor of the regions of and in . Otherwise, let be a shortest path between vertices and (where is either or, wlog, enclosed by the hole of ). Let denote the subpath of between vertices and . Let be the first vertex of that belongs to the boundary of or to the boundary of a hole of . If contains some vertex of a fundamental cycle separator used in processing , let be the last vertex of that belongs to the earliest such separator. If does not exist, then the length of is stored as an -to-boundary or an -to-hole distance. If exists then the length of is stored as a -to-separator distance, and the length of is stored as a separator-to-boundary or separator-to-hole distance. Let be the last vertex of that belongs to the boundary of . The length of can be computed from boundary-to-boundary and hole-to-boundary distances since can be decomposed into subpaths between boundary vertices of slices. The length of the suffix can be computed in a similar manner to that of the prefix .
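The simplest recovery step in this argument, combining two stored tables of distances to the same separator, can be sketched as follows. This is only an illustration on an explicit unweighted graph: the helper names (`bfs_distances`, `store_to_separator`, `recover_distance`) are hypothetical, the tables are stored as plain dictionaries rather than in the compressed form of Lemma 2.1, and the result is exact only under the assumption that some shortest path between the two vertices meets the separator.

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted single-source distances in a graph given as adjacency lists."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def store_to_separator(adj, vertex, separator):
    """The table kept for one distinguished vertex: its distance to each separator vertex."""
    d = bfs_distances(adj, vertex)
    return {x: d[x] for x in separator}

def recover_distance(table_u, table_v):
    """Minimum over separator vertices x of d(u, x) + d(x, v)."""
    return min(table_u[x] + table_v[x] for x in table_u)

# A 6-cycle; the separator {0, 3} splits it into the sides {1, 2} and {4, 5}.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
separator = [0, 3]
table_1 = store_to_separator(cycle, 1, separator)
table_5 = store_to_separator(cycle, 5, separator)
recovered = recover_distance(table_1, table_5)
```

Since vertices 1 and 5 lie on different sides, every path between them meets the separator, so the recovered value equals the true distance.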
3.2 The general case
A difficulty that arises in the presence of multiple holes is that since the number of holes is not bounded by a constant, we cannot afford to store distances involving holes. For example, storing hole-to-boundary distances between the external boundary of a slice and the boundary of each hole of requires bits per hole. Since the number of holes can be , the total space could be .
The role of storing distances involving boundaries of holes was to allow the recovery of distances to distinguished vertices enclosed in these holes. We modify our approach for processing a slice to take into account the distinguished vertices enclosed in holes of as well as the distinguished vertices in itself. As in the single hole case, the slice will be recursively divided using fundamental cycle separators. For any region encountered along the recursive process, let denote the subset of the distinguished vertices in , as well as those enclosed by any hole in . Thus, for example, is the set of all vertices in that are enclosed (in ) by the external boundary of slice . We say that a Jordan cycle separator of a region is good if it is balanced w.r.t. and does not go through any hole of . The problem with Jordan separators that go through some hole is that they partition the distinguished vertices enclosed by in an unspecified way since these distinguished vertices are not represented in . It is not hard to see that if a good separator always exists then we do not need to store any distances involving holes.
In reality we cannot always find a good separator. Consider, for example, the case where some hole of a region encloses most of the vertices of . Clearly, a separator that is balanced w.r.t. must go through . Thus, there is no good separator in such a case. We show, however, that we can always either find a good separator, or there exists some hole (which we call a disposable hole) that can be dealt with in a special way. This is reminiscent of recursive procedures based on heavy path decomposition, where heavy nodes (disposable holes in our case) are treated differently than light ones. We guarantee that, in either case, each resulting subregion contains only a constant fraction of , so the depth of the recursion is . We next explain the details.
Good separators and disposable holes.
Let be a region. We define the weight of each vertex of to be if is a distinguished vertex. For each hole of , we define the weight of the artificial vertex embedded in to be the number of distinguished vertices strictly enclosed (in the whole graph ) by the boundary of . All other vertices are assigned weight zero.
Recall that a cycle separator is good if it does not go through any hole. We would like to separate using a good fundamental cycle separator of some edge w.r.t. . If we can find such a separator where is not incident to for some hole , then is a good separator (since the vertices are leaves of the spanning tree ). Otherwise, we must separate with a fundamental cycle separator that goes through holes. We next define disposable holes, and then show that we can allow the fundamental cycles to go through such holes.
Let be a node (level component) in . Let be the boundary cycle of . Let be an edge of . Note that . This is because both endpoints have the same level, so, by Lemma 3.2, neither can be the parent of the other in . Let be the endpoints of in the dual graph, such that is a face in and a face not in . Since , is in the cotree . Consider breaking into two subtrees by deleting . We say that the edge is light if the subtree of that contains has weight at most where is the total weight of the vertices of . Note that we defined weights of primal vertices, whereas the vertices of are primal faces. To define face weights, evenly redistribute the weight of each vertex among all of its incident faces. There is an equivalent, primal view of light edges: The Jordan cycle partitions into two subgraphs, exactly one of which contains the faces of the level component corresponding to . We say is light if the weight of the subgraph that does not contain the level component is at most half the weight of . We say that a level component is disposable in region if there are boundary edges of in , and if every edge of the boundary of that is also in is light. Note that, in particular, this definition applies to holes (since holes are level components). See Figure 3.
Before showing why disposable holes exist and that they are useful, we first mention a simple property of and then use it to prove the existence of disposable holes.
Property 3.3.
The cotree enters each level component exactly once.
Proof.
The spanning tree is monotone with respect to node levels. Thus, if is an edge of the boundary of a level component , then one of the components of contains no other faces, vertices or edges of . See Figure 4 for an illustration. ∎
Lemma 3.4.
If a region contains more than one vertex with non-zero weight, then there exists either a good balanced fundamental cycle separator or a disposable hole in .
Proof.
Let be the total weight of vertices in . Consider the component tree . Let be a deepest disposable component in such that has an edge in . If is a hole of then we found a disposable hole, and we are done. Otherwise, we next show that there exists a good separator.
Let be the children of in (if there is no disposable component in , then define to be , to be the external boundary of , and let be the set of rootmost components in such that has an edge in ). Since none of the ’s is disposable, for each there exists exactly one boundary edge (here we wrote as a dual edge, and is the endpoint of that belongs to ), such that the subtree of that contains has weight at least . Consider the following two phase process (see Figure 5 for an illustration): Let . If contains more than a single face of some (in which case it must contain all faces of by Property 3.3), then is obtained from by rooting at and deleting all the strict descendants of in , so that becomes a leaf. The weight assigned to in is the total weight of all the vertices in the deleted subtree. Thus, the weight of remains , and, by definition of , the weight of is at most . The first phase terminates when contains at most one face () from each . In the second phase, while contains an edge of that is not a leaf edge of , then is obtained from by rooting at the endpoint of that belongs to , and deleting all the strict descendants of the other endpoint of in , so that becomes a leaf. Similarly to the first phase, the weight of in is set to the total weight of all the vertices in the deleted subtree. Since is disposable, the weight of is at most .
Let be the resulting tree. Since contains at most one face from each , contains no triangulation edges of a hole (both endpoints of a triangulation edge of a hole belong to the hole). Furthermore, the total weight of is , and every leaf of created during the two phase process has weight at most (by definition). For the remaining nodes of , the degree is at most 3 and the weight is also at most , because the original weights in are at most (otherwise, the node corresponds to a hole of weight at least that is, by definition, disposable, and we are done). Therefore, there exists an edge whose deletion from results in two trees, none of which weighs more than . By construction of the weights of , the balance of the fundamental cycle of w.r.t. is exactly the ratio of the weights of the subtrees obtained by deleting from . Therefore, the fundamental cycle of w.r.t. is a balanced Jordan cycle separator. Since no edge of is a triangulation edge of a hole, is a good separator. ∎
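The last step of this proof, selecting an edge of a weighted tree whose deletion leaves two balanced subtrees, can be sketched as a direct search. The sketch below is a generic routine on an abstract weighted tree, not the paper's procedure on the pruned cotree; the names `most_balanced_edge`, `adj` and `weight` are hypothetical, and no specific balance constant is claimed.

```python
def most_balanced_edge(adj, weight):
    """adj: dict node -> list of neighbours, describing a tree.
    weight: dict node -> non-negative weight (missing nodes count as 0).
    Returns (edge, heavier_side, total) for the edge minimising the heavier side."""
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    for x in order:  # the list grows while we scan it, giving a BFS order
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                order.append(y)

    # Accumulate subtree weights bottom-up (children appear after parents in order).
    sub = {x: weight.get(x, 0) for x in order}
    for x in reversed(order):
        if parent[x] is not None:
            sub[parent[x]] += sub[x]
    total = sub[root]

    best_edge, best_heavier = None, None
    for x in order[1:]:
        heavier = max(sub[x], total - sub[x])
        if best_heavier is None or heavier < best_heavier:
            best_edge, best_heavier = (parent[x], x), heavier
    return best_edge, best_heavier, total

# A path a - b - c - d with unit weights: the middle edge splits it evenly.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
edge, heavier, total = most_balanced_edge(path, {v: 1 for v in path})
```

In the proof, bounded degree and bounded node weights guarantee that the edge found this way is balanced; the sketch only performs the search and leaves that guarantee to the lemma.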
With this structural lemma we can now describe our oracle. Consider a slice and let be the subgraph of enclosed by the boundary of . The goal of processing slice is to store information (distances) so that the following distances (in ) can be recovered from the information stored for all slices contained in .
1. The distance between any two distinguished nodes in .
2. The distance between any distinguished node in and any vertex on the boundary of .
3. The distance between any two vertices on the boundary of .
Encoding this information for all slices guarantees that distances between the distinguished vertices in the whole graph are captured.
The encoding.
To process a slice , we first encode boundary-to-boundary distances: the distances (in ) between vertices on the boundary of using Lemma 2.3. We then triangulate and define its spanning tree using Lemma 3.2.
Next, we recursively separate using fundamental cycle separators. The initial region is the entire slice . Its boundary is the external boundary of . A region is separated into subregions obtained by cutting along some fundamental cycle separator w.r.t. . Since we only use fundamental cycle separators w.r.t. the same tree , the separators never cross. Hence, the boundary of each new region consists of the contiguous portion of that belongs to , and possibly portions of the boundary of . Since crosses at most twice (at most once for each of the two paths in the fundamental cycle ), the number of contiguous maximal fragments of in the boundary of is at most one plus the number of such fragments in the boundary of . Consequently, the number of contiguous maximal fragments of in the boundary of any region is bounded by the depth of the recursion, which we will show is .
We now explain how to choose the fundamental cycle separator with which we separate . This is achieved using two interleaving recursive processes. We refer to the first one as the outer recursion, and to the second one as the hole elimination recursion. In a step of the outer recursion we apply Lemma 3.4.
If we find a good balanced fundamental cycle separator , then we use it to separate the region . Every vertex in explicitly stores -to-separator distances: its distance (in ) to every vertex of . In addition, for each subregion , for each contiguous maximal fragment of in , we encode separator-to-boundary distances: the distances (in ) between and using Lemma 2.1 or Lemma 2.4 (depending on whether the vertices of the separator and the vertices of lie on a single or two faces of ). Then, we call the outer recursion recursively for each subregion . The outer recursion terminates when there is at most one vertex with positive weight in the current region . If the only remaining object is an artificial vertex , we apply Lemma 2.4 to encode hole-to-boundary distances: the distances (in ) between the boundary of and , for each contiguous maximal fragment of in . If the only remaining object is a distinguished vertex , we store -to-boundary distances: the distances (in ) from to every vertex of every . If the current region contains no vertices with positive weight, the outer recursion terminates.
If, on the other hand, we find a disposable hole , we store hole-to-boundary distances: distances between the boundary of and every contiguous maximal fragment of in . The weight of the artificial vertex is set to zero. This reflects the fact that for the rest of the processing of , distinguished vertices enclosed by the hole will not be treated individually and directly, but rather by encoding distances involving the vertices of . From this point on, vertices of inside are no longer considered vertices of . We then call the hole elimination process for the hole in region (see Figure 6). In a single step of the hole elimination recursion, a region is separated using a fundamental cycle separator w.r.t. that is balanced w.r.t. the number of vertices of in (i.e., weight 1 is assigned to each vertex of and 0 to all other vertices). Note that is necessarily a fundamental cycle w.r.t. of some triangulation edge that is incident to . The boundary of each of the two resulting regions contains a single contiguous portion of consisting of roughly half the vertices of in . Similarly to the single hole case, we store -to-separator distances: distances (in ) from every vertex of to every vertex of . For each subregion obtained by separating along , for each contiguous fragment of in , we encode separator-to-boundary distances: the distances (in ) between and using Lemma 2.1 or Lemma 2.4, and separator-to-hole distances: the distances (in ) between and the single contiguous fragment of that belongs to using Lemma 2.1. We then apply the hole elimination process recursively to each subregion . It terminates when the current region contains at most two consecutive vertices of , or when it contains at most one distinguished vertex. When this happens, we continue with the outer recursion on .
We next prove that the total depth of the entire recursive procedure is .
Analyzing the recursion depth.
We begin with the initial region being the entire slice. In a single step of the outer recursion, if we find a good separator then we use it to separate the current region thus decreasing the weight of each resulting region by a constant factor. If however we do not find a good separator, then we apply the hole elimination process on a disposable hole of the current region . Since , and since every recursive call to the hole elimination process decreases the number of nodes of by half, we get that after recursive calls the hole elimination process terminates, with each resulting region containing only two nodes of . Observe that these two nodes must be adjacent on (see Figure 6). Let be the edge between them and let be the fundamental cycle of w.r.t. . Since is disposable, the weight of the region obtained by separating using is at most half the weight of region . Since and since the weight of is zero this means that the weight of is at most half the weight of . We conclude that every consecutive recursive calls the total weight of a region decreases by a constant factor. This shows that the depth of the recursion is .
Correctness.
We next prove that the distance between any two distinguished vertices in can be recovered from our encoding.
Lemma 3.5.
The length of a shortest path in from any to any can be recovered from the encoding.
Proof.
If contains some vertex of a fundamental cycle separator used in processing , let be the last vertex of that belongs to the earliest such separator. By choice of the earliest separator, the -to- distance (in ) is stored (-to-separator distance). By choice of the last vertex on that belongs to that separator, the -to- distance (in the region of that contains ) is stored (separator-to-boundary distance). Thus, the length of can be recovered.
If contains no such vertex, then and are in the same region when the recursion terminates, so the -to- distance (in ) is stored as a -to-boundary distance. ∎
We extend the previous lemma and show that it applies also to distinguished vertices enclosed by holes of (i.e., for instead of ).
Lemma 3.6.
The length of a shortest path in from any to any can be recovered from the encoding.
Proof.
The proof is by induction on the nesting depth of slice . The base case follows from Lemma 3.5. For the inductive step, if we are done by Lemma 3.5, so assume is enclosed by some hole of .
If contains some vertex of a fundamental cycle separator used in processing before hole is eliminated, let be the last vertex of that belongs to the earliest such separator. By choice of the earliest separator, the -to- distance (in ) is stored (-to-separator distance), and by choice of the last vertex of that separator on , the -to- distance (in a region of that contains ) is stored (separator-to-boundary distance). Thus, the length of can be recovered.
If contains no such vertex, then the artificial vertex and are in the same region when either the recursion terminates, or the hole is eliminated. In either case, the -to- distances are stored (hole-to-boundary distance). Decompose into a maximal prefix enclosed by the slice whose boundary is , a maximal suffix enclosed by , and an infix . The length of the prefix is stored by the inductive hypothesis for . The length of the infix is represented by the boundary-to-boundary distances for . The length of the suffix is stored (hole-to-boundary distance). ∎
Finally, we extend the previous lemma and show that it applies to any two distinguished vertices.
Lemma 3.7.
The length of a shortest path in from any to any can be recovered from the encoding.
Proof.
Assume, wlog, that both and are enclosed in holes of (the other cases are similar and less general). If contains some vertex of a fundamental cycle separator used in processing before either hole is eliminated, then let be a vertex of that belongs to the earliest such separator. By choice of the earliest separator, both the -to- and the -to- distances (in ) are stored (-to-separator distance). Thus, the length of can be recovered.
Otherwise, the hole of and the hole of are in the same region when one of them, say the hole of , is eliminated. If intersects one of the fundamental cycle separators used during the elimination process of hole , then let be the last vertex on the earliest such separator (see Figure 7). By choice of earliest separator, the -to- distance is stored (-to-separator distance). By Lemma 3.6, the length of the maximal suffix enclosed in is also stored. Let be the first vertex of that belongs to either or ( exists because is enclosed by and is not).
- If belongs to then the length of is stored (separator-to-boundary distance). In this case, let be the last vertex of that belongs to . The length of is represented by boundary-to-boundary distances for . Let be the first vertex of that belongs to . The length of is represented as a hole-to-boundary distance (when is eliminated). Let be the last vertex of that belongs to . The length of is represented by boundary-to-boundary distances for , and the length of is represented by Lemma 3.6. See Figure 7 for an illustration.
- If belongs to then the length of is represented as a separator-to-hole distance. The representation of the suffix is then similar to the previous case.
Finally, we need to treat the case where does not intersect any fundamental cycle used in eliminating the hole . In this case can be decomposed into a -to- prefix, a -to- suffix, and subpaths of between vertices of . The prefix and suffix are represented by Lemma 3.6. The other subpaths are represented as hole-to-boundary or boundary-to-boundary distances as in the two cases above. ∎
We now show that the entire encoding requires only bits.
The encoding size.
The space required for the boundary-to-boundary distances for all slices is since the total boundary size is , and by Lemma 2.3.
We next bound the total space required for -to-separator distances for all slices. Whenever a distinguished vertex stores its distances to a path explicitly, the total weight of its region decreases by a constant factor within recursive steps (either immediately, if this happens in the outer recursion, or otherwise by the time the hole-elimination process ends). So this can happen times per distinguished vertex. Because (by the height of ), this sums up to a total of bits.
The analysis of the remaining distances is done for each slice separately. We have already argued that the depth of the recursive process to handle a slice is . Similarly to the analysis in Section 3.1 of the single hole case, the total space required for storing separator-to-boundary distances using Lemma 2.1 or Lemma 2.4 at all calls at the same recursive level is . For exactly the same reasons, the total space required for storing separator-to-hole distances using Lemma 2.1 at all calls at the same recursive level (this only happens in the hole-elimination recursion) is .
Hole-to-boundary distances are stored using Lemma 2.4 for at most one hole in each region along the recursion. Each invocation of Lemma 2.4 for hole and boundary fragment requires bits. For a single level of the recursion, this sums up to because the total size of all boundaries is and there are vertices that contribute in more than one region (endpoints of ’s). The bound for -to-boundary distances is for the same reasons.
To conclude, we showed that the total size of the entire encoding is bounded by , which is by choosing .
4 A Tight Lower Bound
Recall that Gavoille et al. [48] show how to construct, given a Boolean matrix , a planar grid containing vertices, such that can be recovered from the distances between distinguished vertices of . This shows that, for , encoding all distances between vertices of a planar graph requires bits. For , we consider Boolean matrices . For each of these matrices, we construct a planar grid containing vertices. The disjoint union of all these grids is a planar graph on vertices, such that all Boolean matrices can be recovered from the distances between the distinguished vertices. Hence, encoding all such distances requires bits. Setting we obtain that encoding all distances between the distinguished vertices of a planar graph on vertices requires bits.
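The parameter balancing behind this argument can be checked numerically. The concrete expressions are elided in this text, so the sketch below rests on an explicit assumption that is not stated here: a b-by-b Boolean matrix is encoded by a grid with on the order of b^3 vertices and on the order of b distinguished vertices. Constants are ignored, and the names `lower_bound_parameters`, `side` and `grids` are hypothetical.

```python
import math

def lower_bound_parameters(n, k):
    """Balance the disjoint-union construction for n vertices and k distinguished vertices.
    ASSUMPTION (not confirmed by this text): one b x b matrix costs about b**3 vertices
    and about b distinguished vertices, and contributes b**2 bits."""
    if k ** 3 <= n:
        # A single grid fits within n vertices; it carries about k**2 bits.
        return {"grids": 1, "side": k, "vertices": k ** 3,
                "distinguished": k, "bits": k * k}
    side = max(1, math.isqrt(n // k))   # b is about sqrt(n / k)
    grids = max(1, k // side)           # about k / b grids share the k distinguished vertices
    return {"grids": grids, "side": side, "vertices": grids * side ** 3,
            "distinguished": grids * side, "bits": grids * side * side}

params = lower_bound_parameters(10 ** 6, 10 ** 4)
```

Under the stated assumption, the example uses 1000 grids of side 10, which together account for 10^6 vertices, 10^4 distinguished vertices and 10^5 bits, the latter being the square root of the product of the two input parameters.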
5 Query Time
The goal of Section 3 was to guarantee that all distances between distinguished vertices are captured, but we were not concerned with the complexity of retrieving such a distance. In this section we explain how to augment the encoding to allow efficient extraction of the stored distances.
We start with reformulating our encoding using the notion of dense distance graphs. Vertices of a dense distance graph are listed explicitly, but its edges are described implicitly with unit Monge matrices. Each such matrix describes lengths of the edges between every and , for some subsets of nodes and . The matrix is represented using bits as described in Lemma 2.1. In particular, we may have and then the matrix simply stores the length of a single edge explicitly. The size of a dense distance graph is the total number of vertices plus the sum of over all matrices describing the lengths of the edges. By construction, our encoding described in Section 3 is based on defining a dense distance graph of size . Every distinguished node of the original graph is a vertex of the dense distance graph, and the distance between two distinguished nodes of the original (unweighted) graph is the same as the distance between their corresponding vertices in the (weighted) dense distance graph. Fakcharoenphol and Rao designed an efficient algorithm for computing shortest paths in such a graph, nicknamed FR-Dijkstra. (Footnote 3: In the original paper and most of the subsequent work, the dense distance graph is obtained from an r-division of a planar graph. The vertices are the boundary nodes, and distances between boundary nodes in the same region are represented with multiple Monge matrices. However, it is easy to see that their algorithm works for any dense distance graph as defined above.)
Lemma 5.1 ([45]).
The distance between any two vertices of a dense distance graph of size can be found in time.
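To make the notion of a dense distance graph concrete, the following is a minimal sketch of a shortest-path computation over such a graph. It is plain Dijkstra that scans every matrix entry touching a vertex, not FR-Dijkstra: it does not exploit the Monge property and therefore does not achieve the running time of Lemma 5.1. The representation of a matrix as a triple `(A, B, M)`, the function name `dijkstra_ddg`, and the choice to treat edges as undirected are assumptions made for illustration only.

```python
import heapq
from math import inf


def dijkstra_ddg(num_vertices, blocks, source):
    """Plain Dijkstra over a dense distance graph (illustrative, not FR-Dijkstra).

    blocks is a list of triples (A, B, M): A and B are lists of vertex ids and
    M[i][j] is the length of the edge between A[i] and B[j].
    Edges are treated as undirected (an assumption of this sketch).
    """
    # For every vertex, remember which edge lists it participates in.
    touch = [[] for _ in range(num_vertices)]
    for A, B, M in blocks:
        for i, a in enumerate(A):
            touch[a].append((B, M[i]))                    # row i of M
        for j, b in enumerate(B):
            touch[b].append((A, [row[j] for row in M]))   # column j of M

    dist = [inf] * num_vertices
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for others, lengths in touch[v]:
            for w, length in zip(others, lengths):
                if d + length < dist[w]:
                    dist[w] = d + length
                    heapq.heappush(heap, (dist[w], w))
    return dist


# A 2 x 2 matrix between {0, 1} and {2, 3}, plus a 1 x 1 matrix storing
# a single explicit edge between 3 and 4.
example_blocks = [
    ([0, 1], [2, 3], [[1, 2], [2, 3]]),
    ([3], [4], [[1]]),
]
print(dijkstra_ddg(5, example_blocks, 0))  # [0, 3, 1, 2, 3]
```

The sketch relaxes every entry of every matrix row or column it touches, so its cost is proportional to the number of explicit edges. The point of Lemma 5.1 is that the Monge structure lets this scanning be avoided.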
Applying Lemma 5.1 gives us an oracle of size answering queries in time. For very large , say , the query time is clearly not optimal, as there exists an oracle of size answering queries in [45] time. In the remaining part of this section we will describe how to construct an oracle of size answering queries in time.
To improve the query time, we apply the vanilla planar separator lemma.
Lemma 5.2.
For any planar graph on nodes, there exists a partition of the nodes of into sets , , and , such that , , and there are no edges between the nodes of and .
We recursively apply Lemma 5.2 to construct a hierarchical decomposition of the whole graph. The recursion is described by a binary tree , where every node corresponds to an induced subgraph of the original graph. We let and denote the number of nodes and distinguished nodes in , respectively. We terminate the recursion as soon as . If is a leaf, we define its set of distinguished nodes to consist of all the distinguished nodes of . Otherwise, consists of the following nodes:
1. the separator of ,
2. for every child of that is a leaf, all the distinguished nodes of ,
3. for every child of that is not a leaf, the separator of .
Then, we construct a dense distance graph of size capturing distances between any two nodes from in .
To calculate the distance between two distinguished nodes and in , we locate the deepest nodes and of , such that and . Then, we consider the union of all dense distance graphs constructed for the nodes of on the paths from and to the root. Note that the same node of might appear in more than one of these dense distance graphs, and we identify all of its copies. By construction, the obtained dense distance graph captures the sought distance. Furthermore, its size is bounded by
[TABLE]
Therefore, by Lemma 5.1 we can answer a query in time. It remains to bound the size of the resulting oracle.
Lemma 5.3.
The dense distance graph constructed for node is of size .
Proof.
To prove the lemma it is enough to bound by . If is a leaf, this is clear. Otherwise, and consists of the following nodes:
1. the separator of of size ,
2. for every child of that is a leaf, all distinguished nodes of ,
3. for every child of that is not a leaf, the separator of of size .
Node has at most two children, so indeed . ∎
To upper bound the size of the oracle, we need to upper bound the sum . To this end, we separately consider all nodes such that , for every . Fix and call these nodes . Then, no is a descendant of another , so every node of the original graph appears in at most one . Therefore, and . From the latter inequality and the lower bound on we obtain that . Now we want to upper bound the following sum:
[TABLE]
From the concavity of , the above sum is maximized when , so we obtain:
[TABLE]
To obtain an upper bound on , we only need to multiply the above bound by because for every there exists such that belongs to the appropriate interval, so the total size of the oracle is .
6 Labeling Schemes for Unit-Monge Matrices
A distance labeling scheme is a way to compress graphs that allows for distributed decoding. The goal is to assign a label for each node , so that by looking at the labels of two nodes (without access to the original graph) we can infer the distance between them . The main question one asks about such schemes is how small can the labels be? A famous open question is to close the gap between the upper bound [48, 51] and the [48] lower bound for planar graphs. The only known technique capable of proving a tight lower bound is via a lower bound for the metric compression problem: if you show that the metric cannot be compressed into bits, then you show that no labels of size are possible. Our work deems this approach impassable, since such compressions are indeed possible. Optimistically, it is natural to ask if our upper bound for compression could lead to a better upper bound for labeling. Our encoding assigns bits per node, but can we distribute these bits to the nodes while allowing any pair of nodes to deduce the distance from their local information? In this section, we discuss why this seems difficult.
The heart of our encoding is Lemma 2.1, which is repeatedly used to capture pairwise distances between a large subset of nodes of the graph using space proportional to the size of the subset. A key part in the proof of the lemma is an efficient encoding of an matrix into bits, as long as it has the unit-Monge property, that is:
[TABLE]
and every is a non-negative integer not exceeding . The corresponding labeling problem would be to assign a label to every row and every column of , such that can be computed from the label of the -th row and the -th column. We will show that the bits of the encoding cannot be distributed into bits per label. In any such labeling scheme, some labels must be of length . For completeness, we will also provide a matching upper bound of .
We start with recalling the following connection between unit-Monge matrices and permutation matrices. is a permutation matrix if every row and every column contains at most one 1 and 0s elsewhere. Then, it is straightforward to verify that, for any permutation matrix the matrix defined as is a unit-Monge matrix. In fact, essentially any unit-Monge matrix can be obtained through such a transformation. This is known, see e.g. Section 2 in [91], but we provide a proof for completeness.
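The connection between permutation matrices and dominance sums can be checked numerically. The sketch below builds, for a permutation, the matrix whose entry (i, j) counts the 1s at positions (r, c) with r ≥ i and c ≥ j, and verifies two properties: adjacent entries differ by at most 1, and the second difference of each 2 × 2 window equals the corresponding permutation entry, hence is 0 or 1. The corner used for the dominance sum, the resulting sign of the Monge inequality, and all function names are assumptions of this sketch and may differ from the convention in the paper.

```python
def dominance_matrix(perm):
    """perm[i] is the column of the single 1 in row i of a permutation matrix.

    Returns an (n+1) x (n+1) matrix M where M[i][j] is the number of 1s at
    positions (r, c) with r >= i and c >= j (assumed convention).
    """
    n = len(perm)
    M = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            cell = 1 if perm[i] == j else 0
            # Inclusion-exclusion over the three neighbouring suffix sums.
            M[i][j] = cell + M[i + 1][j] + M[i][j + 1] - M[i + 1][j + 1]
    return M


def second_differences(M):
    """Second difference of every 2 x 2 window of M."""
    n = len(M) - 1
    return [[M[i][j] - M[i + 1][j] - M[i][j + 1] + M[i + 1][j + 1]
             for j in range(n)] for i in range(n)]


def is_unit(M):
    """True if horizontally and vertically adjacent entries differ by <= 1."""
    size = len(M)
    rows_ok = all(abs(M[i][j] - M[i][j + 1]) <= 1
                  for i in range(size) for j in range(size - 1))
    cols_ok = all(abs(M[i][j] - M[i + 1][j]) <= 1
                  for i in range(size - 1) for j in range(size))
    return rows_ok and cols_ok


perm = [2, 0, 3, 1]
M = dominance_matrix(perm)
P = [[1 if perm[i] == j else 0 for j in range(len(perm))]
     for i in range(len(perm))]
print(second_differences(M) == P, is_unit(M))  # True True
```

Reading the computation backwards gives the idea behind the converse direction: taking second differences of a matrix of this kind recovers the underlying 0/1 pattern.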
Lemma 6.1.
For any unit-Monge matrix , there exists a permutation matrix , such that
[TABLE]
where and are vectors of length with non-negative entries bounded by .
Proof.
We define an matrix as follows:
[TABLE]
By Monge, clearly , and by unit . In fact, unit also implies that the sum in every row and every column of is at most 2. To see this for rows, consider . After telescoping, this is , so by unit at most 2 as claimed. Then, consider . After substituting the definition of and telescoping, this becomes . Hence, if we define and it holds that . Finally, we create an matrix , where every block corresponds to a single , that is, the sum of values in the block is equal to . It is always possible to define so that it is a permutation matrix. To see this, consider a row of . The values there sum up to at most 2, say for some . Then, should correspond to a 1 in the first row of its block and to a 1 in the second row of its block. If then in the corresponding block we create two 1s, one per row. Columns are chosen with a symmetric reasoning. ∎
Due to the above lemma, we can focus on assigning a label to every row and column of , such that given the label of the -th row and the -th column we can compute . We call this problem labeling unit-Monge matrices for dominance sum queries.
Lemma 6.2.
Labeling unit-Monge matrices for dominance sum queries can be done with bits.
Proof.
We can assume that there is exactly one 1 in every row and column of . Therefore, the input is fully described by a permutation . Any permutation on elements can be decomposed into up to increasing subsequences and up to decreasing subsequences . The label of every row and every column consists of bits stored for every such subsequence, thus bits in total. We think of every subsequence as a set of points and the bits corresponding to this subsequence in the label of the -th row and the -th column should be enough to determine the number of points such that and . We separately describe what should be stored for an increasing subsequence and then for a decreasing subsequence.
Consider an increasing subsequence consisting of points , such that and for every . Then, the label of the -th row stores the smallest such that , and similarly the label of the -th column stores the smallest such that . By taking the maximum of these two numbers we can determine the number of points such that and .
Now consider a decreasing subsequence consisting of points , such that and for every . Then, the label of the -th row stores the smallest such that . The label of the -th column stores the largest such that . Denoting the number stored for the -th row and the -th column by and , respectively, the number of points such that and can be calculated as . ∎
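The per-subsequence labels from the proof above can be sketched as follows. The code assumes that a dominance query at (x, y) counts the points (x_i, y_i) with x_i ≥ x and y_i ≥ y, and that the length of each subsequence is available when decoding; both are assumptions of this sketch. The decomposition uses a simple greedy strategy (repeatedly peel off a longest increasing subsequence while it has at least ⌊√n⌋ points, then split the rest into decreasing piles), which is one standard way to obtain few monotone subsequences and is not necessarily the construction the authors have in mind. All function names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from math import isqrt


def row_label(chain, x):
    """Smallest 0-based index i with x_i >= x (len(chain) if none)."""
    return bisect_left([p[0] for p in chain], x)


def col_label(chain, y, increasing):
    """Increasing chain: smallest index with y_i >= y.
    Decreasing chain: largest index with y_i >= y (-1 if none)."""
    ys = [p[1] for p in chain]
    if increasing:
        return bisect_left(ys, y)
    return bisect_right([-v for v in ys], -y) - 1


def chain_count(a, b, length, increasing):
    """Combine a row number a and a column number b into a dominance count."""
    return length - max(a, b) if increasing else max(0, b - a + 1)


def longest_increasing(points):
    """Indices of a longest y-increasing subsequence (points sorted by x)."""
    m = len(points)
    if m == 0:
        return []
    best, prev = [1] * m, [-1] * m
    for i in range(m):
        for j in range(i):
            if points[j][1] < points[i][1] and best[j] + 1 > best[i]:
                best[i], prev[i] = best[j] + 1, j
    i = max(range(m), key=best.__getitem__)
    out = []
    while i != -1:
        out.append(i)
        i = prev[i]
    return out[::-1]


def decompose(perm):
    """Split the points (x, perm[x]) into monotone chains, each sorted by x."""
    points = list(enumerate(perm))
    threshold = max(1, isqrt(len(perm)))
    chains = []
    while True:
        idx = longest_increasing(points)
        if len(idx) < threshold:
            break
        chains.append((True, [points[i] for i in idx]))
        chosen = set(idx)
        points = [p for i, p in enumerate(points) if i not in chosen]
    piles = []  # every pile is decreasing in y
    for p in points:
        for pile in piles:
            if pile[-1][1] > p[1]:
                pile.append(p)
                break
        else:
            piles.append([p])
    return chains + [(False, pile) for pile in piles]


def dominance_count(chains, x, y):
    """Sum the per-chain counts; each label depends on x only or on y only."""
    return sum(chain_count(row_label(c, x), col_label(c, y, inc), len(c), inc)
               for inc, c in chains)
```

For example, `dominance_count(decompose(perm), x, y)` should agree with a brute-force count of the points dominating (x, y). In an actual labeling scheme the numbers returned by `row_label` and `col_label` would be precomputed and stored in the labels, one per subsequence.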
Lemma 6.3.
Labeling unit-Monge matrices for dominance sum queries requires bits.
Proof.
We conceptually divide an matrix into blocks of size , thus creating an matrix , where every entry corresponds to a block of . For every block we choose one bit . We will show that then it is always possible to construct the matrix , such that all bits can be retrieved from the labels of rows of the form and columns of the form . Then it follows that we can encode bits of information in labels, hence one of these labels must consist of bits. It remains to construct .
We construct incrementally. We call a row or a column of active if there is no 1 there. We start with an empty and keep adding 1s there while ensuring that there is at most one 1 in every row and column. Given the labels of all rows and all columns of the form we can count 1s in every block of . The goal is to ensure that this count is equal to . Assume that this is already the case for every such that or and and consider . If we continue. Otherwise, we have to choose exactly one active row in the range and exactly one active column in the range , and set , thus making both and inactive. This clearly guarantees that there is exactly one 1 in the corresponding block of . The only problem is to guarantee that there is at least one active row and column in the appropriate ranges. However, we have deactivated fewer than rows in the range so far, and similarly fewer than columns in the range , so indeed there is at least one active row and column that we can use. ∎
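The incremental construction in this proof can be simulated directly. The sketch below assumes an s × s bit matrix, blocks of side s, and hence an s² × s² matrix with at most one 1 per row and column; these concrete sizes are an assumption chosen to match the counting argument. It also assumes that a dominance query at (r, c) counts the 1s at positions (a, b) with a ≥ r and b ≥ c. The function names are hypothetical.

```python
def build_partial_permutation(bits):
    """Place at most one 1 per row and column so that block (i, j) holds
    exactly bits[i][j] ones. Returns the list of (row, column) positions."""
    s = len(bits)
    n = s * s
    row_used = [False] * n
    col_used = [False] * n
    ones = []
    for i in range(s):
        for j in range(s):
            if not bits[i][j]:
                continue
            # Fewer than s rows of this row range (and fewer than s columns
            # of this column range) have been used, so an active one exists.
            r = next(r for r in range(i * s, (i + 1) * s) if not row_used[r])
            c = next(c for c in range(j * s, (j + 1) * s) if not col_used[c])
            row_used[r] = col_used[c] = True
            ones.append((r, c))
    return ones


def dominance(ones, r, c):
    """Number of 1s at positions (a, b) with a >= r and b >= c."""
    return sum(1 for a, b in ones if a >= r and b >= c)


def recover_bits(ones, s):
    """Recover every block count using only dominance queries whose row and
    column are multiples of s."""
    return [[dominance(ones, i * s, j * s)
             - dominance(ones, (i + 1) * s, j * s)
             - dominance(ones, i * s, (j + 1) * s)
             + dominance(ones, (i + 1) * s, (j + 1) * s)
             for j in range(s)] for i in range(s)]


bits = [[1, 0, 1],
        [1, 1, 1],
        [0, 1, 1]]
ones = build_partial_permutation(bits)
print(recover_bits(ones, 3) == bits)  # True
```

Only queries at block boundaries are needed to read back all the bits, which is what forces at least one of the corresponding labels to be long.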
References
- [1] A. Abboud and G. Bodwin. The 4/3 additive spanner exponent is tight. In 48th STOC, pages 351–361, 2016.
- [2] A. Abboud and G. Bodwin. Error amplification for pairwise spanner lower bounds. In 27th SODA, pages 841–854, 2016.
- [3] A. Abboud, G. Bodwin, and S. Pettie. A hierarchy of lower bounds for sublinear additive spanners. In 28th SODA, pages 568–576, 2017.
- [4] A. Abboud and S. Dahlgaard. Popular conjectures as a barrier for dynamic planar graph algorithms. In 57th FOCS, pages 477–486, 2016.
- [5] A. Abboud, F. Grandoni, and V. V. Williams. Subcubic equivalences between graph centrality problems, APSP and diameter. In 26th SODA, pages 1681–1697, 2015.
- [6] A. Abboud and V. V. Williams. Popular conjectures imply strong lower bounds for dynamic problems. In 55th FOCS, pages 434–443, 2014.
- [7] A. Abboud, V. V. Williams, and H. Yu. Matching triangles and basing hardness on an extremely popular conjecture. In 47th STOC, pages 41–50, 2015.
- [8] I. Abraham, S. Chechik, and C. Gavoille. Fully dynamic approximate distance oracles for planar graphs via forbidden-set distance labels. In 44th STOC, pages 1199–1218, 2012.
