Local Versus Global Distances for Zigzag Persistence Modules
Ellen Gasparovic, Maria Gommel, Emilie Purvine, Radmila Sazdanovic, Bei Wang, Yusu Wang, Lori Ziegelmeier

TL;DR
This paper explores the relationship between local and global distances in zigzag persistence modules, showing bounds on bottleneck distances and discussing implications for metric graph distances and multiparameter modules.
Contribution
It establishes explicit bounds connecting local and global persistence distances, with applications to metric graphs and multiparameter persistence modules.
Findings
Bottleneck distance between restricted and unrestricted modules is bounded.
Results have practical implications for metric graph analysis.
Extension to matching distance in multiparameter persistence modules.
Abstract
This short note establishes explicit and broadly applicable relationships between persistence-based distances computed locally and globally. In particular, we show that the bottleneck distance between two zigzag persistence modules restricted to an interval is always bounded above by the distance between the unrestricted versions. While this result is not surprising, it could have different practical implications. We give two related applications for metric graph distances, as well as an extension for the matching distance between multiparameter persistence modules.
Taxonomy
Topics: Topological and Geometric Data Analysis · Advanced Graph Neural Networks · Homotopy and Cohomology in Algebraic Topology
Ellen Gasparovic, Union College, Schenectady, NY
Maria Gommel, University of Iowa, Iowa City, IA
Emilie Purvine, Pacific Northwest National Laboratory, Seattle, WA
Radmila Sazdanovic, North Carolina State University, Raleigh, NC
Bei Wang, University of Utah, Salt Lake City, UT
Yusu Wang, Ohio State University, Columbus, OH
Lori Ziegelmeier, Macalester College, Saint Paul, MN
Keywords: zigzag persistent homology, level set zigzag, bottleneck distance, metric graphs
1 Introduction
Persistence modules and zigzag persistence
The theory of persistence modules is at the core of topological data analysis. The theory begins with the study of 1-parameter persistence modules over $\mathbb{R}$-valued functions. In the ordinary setting, given a diagram of topological spaces connected via inclusion maps,
$$X_1 \to X_2 \to \cdots \to X_n,$$
we apply the $p$-dimensional homology functor with coefficients in a field to obtain a diagram of vector spaces with linear maps,
$$V_1 \to V_2 \to \cdots \to V_n,$$
where $V_i = H_p(X_i)$. Such a diagram is called a 1-parameter persistence module [9]. Various persistence modules generalizing the 1-parameter setting have been studied in the literature, including generalized [7] (i.e., over posets), zigzag [6, 9] persistence modules, and multiparameter [22] (i.e., over $\mathbb{R}^k$-valued functions); see [8] for a description of their relationships.
We focus on zigzag persistence modules, which, in a nutshell, allow arrows to point in either direction [9]. Given a diagram of topological spaces connected by inclusion maps,
$$X_1 \leftrightarrow X_2 \leftrightarrow \cdots \leftrightarrow X_n,$$
we apply the homology functor as usual to obtain a sequence of vector spaces and linear maps,
$$V_1 \leftrightarrow V_2 \leftrightarrow \cdots \leftrightarrow V_n,$$
where each $\leftrightarrow$ represents either a forward or a backward map. Zigzag persistence modules generalize the classic 1-parameter setting and handle several situations which are not covered by the classic theory. Linearity allows a zigzag persistence module (similar to a 1-parameter persistence module) to be uniquely decomposed into elementary pieces (called indecomposable modules) which are intervals. The information encoded by these intervals can be combinatorially represented by the persistence diagram [19]. In the case of multiparameter persistence, such indecomposable modules are complex and no longer intervals. We are interested in zigzag persistence as it involves the most general type of linear module that still gives rise to classic persistence diagrams. Furthermore, a zigzag persistence module can be used to compute ordinary persistent homology with good space efficiency (see Section 3 for details).
To measure the distance between persistence modules, the interleaving distance [12], which captures the proximity between persistence modules, has been employed. For 1-parameter persistence modules, it has been shown that the interleaving distance is equal to the well-known bottleneck distance [14] between the persistence diagrams of the corresponding persistence modules [22]. In this paper, we prove a straightforward inequality involving the bottleneck distance between persistence diagrams [14] that is useful for data analysis.
Global versus local perspectives on persistence
We are motivated by the study of persistence modules from both global and local perspectives. A persistence module provides a global description of a complex dataset, and we are interested in quantifying the amount of information that is preserved when restricted to local neighborhoods or intervals.
For a first example, consider the question of determining or approximating graph motif counts. A graph motif is a subgraph on a small number of vertices contained within a larger, more complex graph. Graph motifs have proven useful for characterizing networks in domains like biology [25] and cyber security [20]. The standard problem of counting the number of small motifs or patterns within a graph is equivalent to the subgraph isomorphism problem, which is NP-complete. Since restricted persistence modules reveal information about the local structure of a space, we posit that the restricted modules for a metric graph (see Section 4) can be used similarly to how graph motifs are currently used, e.g., as inputs to classification algorithms or anomaly detection algorithms in time-varying data [20, 23].
For a second example, consider persistent local homology, which studies a multi-scale notion of homology within a local neighborhood of the data relative to its boundary. It has applications in road network analysis [2], local dimension estimation [16], data visualization [26], graph reconstruction [13, 1], clustering and stratification learning [5, 3]. Furthermore, persistent local homology extracts local geometric and topological information in data, which can be used as input to machine learning algorithms [4].
Our contributions
We show that the bottleneck distance between two zigzag persistence modules restricted over an interval of parameter values is always bounded by the distance between the unrestricted versions (Theorem 2) and state a corollary in the case of level set zigzag persistence (Corollary 4). We also establish two results involving distance inequalities in the special case of metric graphs (Corollary 5 and Corollary 6) and point out how our results can be extended to multiparameter persistence modules.
The results in this short paper have the potential for many diverse applications across different settings. For instance, if one wishes to compare the persistence profiles of two very large data sets but finds that it is prohibitively computationally expensive, one has the option to compute a restricted version of the bottleneck distance as an approximation to the global distance. As the interval size increases, the bottleneck distance between the restricted versions approaches the distance for the global versions.
Relatedly, it may be the case that two long zigzag sequences need to be compared on a local scale. The question may be: are there any local differences between the two zigzag sequences? One could do many local comparisons to answer this question. However, our result means that a small global distance between the two zigzag persistence diagrams implies small local distances. To save computation one could compute the global distance as a first step. Local distances only need to be computed if the global distance is large.
Restricted persistence modules may be helpful for analyzing time-varying systems. Given data at time (e.g., a graph, function, or point cloud), a zigzag persistence module can be constructed for the sequence
[TABLE]
where all of the maps are inclusion maps. A subinterval of this sequence corresponds to a time interval contained within the larger sample. Given two long time intervals, one could either compare them in full or compare smaller windows. Our result shows that the local differences contained in small time intervals are not “washed out” as one moves to larger intervals.
The rest of the paper is organized as follows. In Section 2, we recall the necessary concepts for zigzag persistence. Our main theorem is contained in Section 3, and we consider applications of the theorem in the metric graph setting and for multi-parameter persistence in Section 4. We conclude with a discussion of future work in Section 5.
2 Brief Background and Definitions
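Our treatment of zigzag persistence is brief; for more details, see [9] and [10]. A zigzag diagram of topological spaces is a sequence
$$X_1 \leftrightarrow X_2 \leftrightarrow \cdots \leftrightarrow X_n,$$
where each bidirectional arrow between two topological spaces represents a continuous function mapping either forwards or backwards. Applying the $p$-th homology functor with coefficients in a field $\mathbb{F}$ yields a zigzag diagram of vector spaces
$$H_p(X_1) \leftrightarrow H_p(X_2) \leftrightarrow \cdots \leftrightarrow H_p(X_n),$$
known as a zigzag module, denoted as $\mathbb{X}$, from which zigzag persistence may be computed. A zigzag module decomposes into intervals $\mathbb{X} \cong \bigoplus_j \mathbb{I}[b_j, d_j]$, where each $\mathbb{I}[b_j, d_j]$ is defined as
$$I_1 \leftrightarrow I_2 \leftrightarrow \cdots \leftrightarrow I_n,$$
with nonzero values $I_i = \mathbb{F}$ in the range $b_j \le i \le d_j$. We will use $\mathrm{Dgm}(\mathbb{X})$ to denote the resulting persistence diagram of a fixed homology dimension $p$. By Proposition 2.12 of [9], restricting the module $\mathbb{X}$ to the range $[\alpha, \beta]$ (denoted $\mathbb{X}[\alpha, \beta]$) yields a decomposition as the direct sum of the intervals in $\mathbb{X}$ restricted to $[\alpha, \beta]$; that is,
$$\mathbb{X}[\alpha, \beta] \cong \bigoplus_j \mathbb{I}\bigl([b_j, d_j] \cap [\alpha, \beta]\bigr). \tag{1}$$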
The bottleneck distance between two persistence diagrams is at most $\delta$ if there exists a matching between the points of the two diagrams (where points are allowed to be matched to diagonal elements) such that any pair of matched points are at distance at most $\delta$. Formally, for a fixed homology dimension, the bottleneck distance is given by
$$d_B\bigl(\mathrm{Dgm}(\mathbb{X}), \mathrm{Dgm}(\mathbb{Y})\bigr) = \inf_{\eta} \sup_{x \in \mathrm{Dgm}(\mathbb{X})} \lVert x - \eta(x) \rVert_\infty,$$
where $\eta$ ranges over all bijections between the two diagrams [18].
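As a concrete illustration of the definition above, the following minimal sketch computes the bottleneck distance between two very small diagrams by enumerating every matching. It is not an efficient algorithm and is not taken from the paper; the function names (`linf`, `to_diagonal`, `bottleneck_bruteforce`) are hypothetical, and diagrams are assumed to be lists of off-diagonal `(birth, death)` pairs. Each diagram is padded with diagonal slots so that any point may be matched to the diagonal.

```python
from itertools import permutations
from math import inf

def linf(p, q):
    """L-infinity distance between two points of the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def to_diagonal(p):
    """L-infinity distance from p to the nearest point of the diagonal."""
    return abs(p[1] - p[0]) / 2.0

def bottleneck_bruteforce(dgm_a, dgm_b):
    """Bottleneck distance by exhaustive search (tiny diagrams only)."""
    n_a, n_b = len(dgm_a), len(dgm_b)
    size = n_a + n_b  # each side is padded with diagonal slots
    if size == 0:
        return 0.0

    def cost(i, j):
        a_real, b_real = i < n_a, j < n_b
        if a_real and b_real:
            return linf(dgm_a[i], dgm_b[j])
        if a_real:
            return to_diagonal(dgm_a[i])  # point of A sent to the diagonal
        if b_real:
            return to_diagonal(dgm_b[j])  # point of B sent to the diagonal
        return 0.0  # diagonal slot matched to diagonal slot

    best = inf
    for perm in permutations(range(size)):
        worst = max(cost(i, perm[i]) for i in range(size))
        best = min(best, worst)
    return best

# One bar in each diagram, shifted by one unit in both coordinates.
print(bottleneck_bruteforce([(0.0, 4.0)], [(1.0, 5.0)]))  # 1.0
```

The search is factorial in the total number of points, so this is only meant for checking small examples by hand.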
We conclude this section by defining a projection map that keeps track of the points in the global persistence diagram that disappear in the restricted version. The validity of the projection map in the following definition is guaranteed by Proposition 2.12 of [9] which leads to equation (1).
Definition 1.
Given , we let denote the restriction of the persistence diagram to the interval defined via the following projection map:
[TABLE]
Typically, a persistence diagram is considered to be a set of points for which . In order to compute the bottleneck distance, one adds countably many copies of the diagonal , which may intuitively correspond to topological features that are born and simultaneously die (and thus, never really exist at all). This allows for a point in one persistence diagram to be matched to the diagonal if it is far away from any point in the other diagram, and also accounts for the fact that two persistence diagrams may have different numbers of off-diagonal points. Notice that points like and in the above figure correspond to features that are born and die outside of the interval (either completely before or completely after). The restriction result cited above from [9], defining , would not include points or in its diagram. But, since both and are on the diagonal, including them in does not change the bottleneck distance between two restricted diagrams.
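The restriction described in Definition 1 can be sketched in a few lines. The displayed formula for the projection map is not reproduced here, so this sketch assumes it amounts to clamping both coordinates of each point to the interval, which matches the description above: features lying entirely before or entirely after the interval land on the diagonal. The names `restrict_diagram` and `off_diagonal`, and the bounds `lo` and `hi`, are hypothetical.

```python
def restrict_diagram(dgm, lo, hi):
    """Clamp each (birth, death) point to the interval [lo, hi].

    Assumption: the projection map of Definition 1 acts coordinatewise,
    so points outside the interval are sent to (lo, lo) or (hi, hi).
    """
    def clamp(t):
        return min(max(t, lo), hi)
    return [(clamp(b), clamp(d)) for (b, d) in dgm]

def off_diagonal(dgm):
    """Drop diagonal points, which do not affect the bottleneck distance."""
    return [(b, d) for (b, d) in dgm if b != d]

dgm = [(0, 1), (1, 5), (3, 4), (3, 9), (8, 9)]
restricted = restrict_diagram(dgm, 2, 6)
print(restricted)                # [(2, 2), (2, 5), (3, 4), (3, 6), (6, 6)]
print(off_diagonal(restricted))  # [(2, 5), (3, 4), (3, 6)]
```

Keeping the diagonal points in the output mirrors the remark above that including them does not change the bottleneck distance between two restricted diagrams.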
3 Bottleneck Distance in the Local vs. Global Settings
In this section, we prove our main result relating the bottleneck distance between persistence diagrams with the bottleneck distance between their interval-restricted versions.
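Theorem 2.
Let $\mathbb{X}$ and $\mathbb{Y}$ be two sequences of topological spaces and continuous maps, and let $\mathrm{Dgm}(\mathbb{X})$ and $\mathrm{Dgm}(\mathbb{Y})$ be their corresponding zigzag persistence diagrams. Consider the interval $I = [\alpha, \beta]$ and let $\mathrm{Dgm}^I(\mathbb{X})$ and $\mathrm{Dgm}^I(\mathbb{Y})$ be the restrictions of these diagrams to $I$. Then
$$d_B\bigl(\mathrm{Dgm}^I(\mathbb{X}), \mathrm{Dgm}^I(\mathbb{Y})\bigr) \le d_B\bigl(\mathrm{Dgm}(\mathbb{X}), \mathrm{Dgm}(\mathbb{Y})\bigr).$$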
Proof.
Let $M \subseteq \mathrm{Dgm}(\mathbb{X}) \times \mathrm{Dgm}(\mathbb{Y})$ be a partial matching. For computation of the bottleneck distance, we say that any unpaired point in one of the persistence diagrams is matched to the nearest point (in the $L_\infty$ norm) on the diagonal $\Delta$.
Let $\pi$ denote the projection map of Definition 1. Consider $M^I$ defined such that, for each $(x, y) \in M$, we have $(\pi(x), \pi(y)) \in M^I$. We claim that this is a valid partial matching between the two restricted diagrams. A partial matching means that no two points $x_1', x_2'$ are matched to the same $y'$, and similarly the same $x'$ is not matched to two different points $y_1', y_2'$. We show the first case and remark that the second case is proved similarly. Assume, for the sake of contradiction, that $(x_1', y'), (x_2', y') \in M^I$. By definition of $M^I$ we must have $(x_1, y_1), (x_2, y_2) \in M$ such that $\pi(x_1) = x_1'$, $\pi(x_2) = x_2'$, and $\pi(y_1) = \pi(y_2) = y'$. Recall that persistence diagrams are multisets, so the fact that $\pi(y_1) = \pi(y_2) = y'$ would indicate that there are two copies of the point $y'$ in $\mathrm{Dgm}^I(\mathbb{Y})$, and that $x_1'$ is matched to one copy and $x_2'$ is matched to the other copy. The only case in which we wouldn't have two copies of $y'$ is if $y_1 = y_2$, but this would contradict $M$ being a partial matching. (Of course we could have $y_1 = y_2$ if they have the same coordinates, but if there are multiple copies of the same point, we count them as different points in the persistence diagram, and thus not equal.)
What is left to show is that the maximal distance between matched points for $M^I$ is no greater than that for $M$, a fact proved in the following lemma. Indeed, if $M$ is the matching that achieves the bottleneck distance between $\mathrm{Dgm}(\mathbb{X})$ and $\mathrm{Dgm}(\mathbb{Y})$ and the cost of $M^I$ is no larger, then the bottleneck distance between $\mathrm{Dgm}^I(\mathbb{X})$ and $\mathrm{Dgm}^I(\mathbb{Y})$ can only be smaller still. ∎
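The last step of this proof can be written as a short chain of inequalities, taking the inequality of Lemma 3 as given. The notation is an assumption made for this sketch: $M$ is a matching achieving the bottleneck distance between the unrestricted diagrams, $M^I$ is its image under the projection map, and $\mathrm{Dgm}^I$ denotes a diagram restricted to the interval $I$.

```latex
% Assumed notation: M is an optimal matching between the unrestricted
% diagrams and M^I is the induced matching between the restricted ones.
\begin{align*}
d_B\bigl(\mathrm{Dgm}^I(\mathbb{X}), \mathrm{Dgm}^I(\mathbb{Y})\bigr)
  &\le \max_{(x', y') \in M^I} \lVert x' - y' \rVert_\infty
     && \text{($M^I$ is one admissible matching)} \\
  &\le \max_{(x, y) \in M} \lVert x - y \rVert_\infty
     && \text{(Lemma 3)} \\
  &= d_B\bigl(\mathrm{Dgm}(\mathbb{X}), \mathrm{Dgm}(\mathbb{Y})\bigr)
     && \text{($M$ is optimal).}
\end{align*}
```

The first inequality holds because the bottleneck distance is an infimum over all matchings, so any particular matching gives an upper bound.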
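Lemma 3.
For the partial matching $M^I$,
$$\max_{(x', y') \in M^I} \lVert x' - y' \rVert_\infty \le \max_{(x, y) \in M} \lVert x - y \rVert_\infty.$$
Proof.
Consider a matched pair $(x, y) \in M$ whose projections achieve $\max_{(x', y') \in M^I} \lVert x' - y' \rVert_\infty$; it suffices to show that projecting does not increase the distance between $x$ and $y$. A case analysis of the 21 possible pairings of points will establish the lemma. First, observe that if either $x$ or $y$ is in Case A, then after projecting onto the restricted region, at least one point is unchanged and at most one point is moved closer, yielding the desired inequality.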
Next, we will consider the scenarios when one of the points, say (without loss of generality) , belongs to Case B, so that . If is also a Case B point, then the inequality holds because projecting the points does not change the horizontal distance and the vertical distance of the projection is 0. In the case that , we have and . Since and , the horizontal distances satisfy and the vertical distances satisfy , so that the inequality holds. If , then and , since the vertical distance between the projections is 0. This in turn is less than . Now, if , we have and . Since , this implies that the horizontal distance between the projections must be larger than the vertical distance. Therefore, . Finally, if , then and . Since , the horizontal distances satisfy and the vertical distances satisfy , yielding the desired inequality.
The case analysis for the remaining pairings proceeds in a similar manner. ∎
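The inequality of Lemma 3 can also be checked numerically. The sketch below assumes, as a simplification, that the projection map clamps each coordinate of a point to the interval; under that assumption the lemma reduces to the fact that clamping is 1-Lipschitz, i.e. $|c(s) - c(t)| \le |s - t|$ for the clamp $c$. The helper names (`project`, `matching_cost`, `projected_matching`) and the interval bounds are hypothetical, and the random pairs stand in for an arbitrary matching.

```python
import random

def linf(p, q):
    """L-infinity distance between two points of the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def project(p, lo, hi):
    """Assumed projection: clamp both coordinates to [lo, hi]."""
    return (min(max(p[0], lo), hi), min(max(p[1], lo), hi))

def matching_cost(pairs):
    """Largest L-infinity distance over the matched pairs."""
    return max((linf(x, y) for x, y in pairs), default=0.0)

def projected_matching(pairs, lo, hi):
    """Matching induced on the restricted diagrams."""
    return [(project(x, lo, hi), project(y, lo, hi)) for x, y in pairs]

def random_point(rng):
    """Random point on or above the diagonal."""
    birth = rng.uniform(0.0, 10.0)
    return (birth, birth + rng.uniform(0.0, 5.0))

rng = random.Random(0)
pairs = [(random_point(rng), random_point(rng)) for _ in range(1000)]
lo, hi = 3.0, 7.0

global_cost = matching_cost(pairs)
local_cost = matching_cost(projected_matching(pairs, lo, hi))
assert local_cost <= global_cost  # the cost never grows under projection
```

A check like this does not replace the case analysis, but it is a quick way to catch a mistake in an implementation of the restriction.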
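Given an $\mathbb{R}$-valued function, there is a natural construction of a level set zigzag (LZZ) persistence module [10] that sweeps its level sets from bottom to top [19]. Given a topological space $X$ and a continuous function $f: X \to \mathbb{R}$ of Morse type, let $X_a = f^{-1}(a)$ denote the level set of $f$ for any $a \in \mathbb{R}$ and $X_I = f^{-1}(I)$ denote the slice of $X$ which maps to the interval $I$. If $I = [a, b]$, we may denote this as $X_a^b$. Recall that $f$ is of Morse type if, for the finite set of critical values $a_1 < a_2 < \cdots < a_n$ of $f$, the open intervals $I_0 = (-\infty, a_1), I_1 = (a_1, a_2), \ldots, I_n = (a_n, \infty)$ are such that for each interval $I_i$, $f^{-1}(I_i)$ is homeomorphic to $Y_i \times I_i$ for some compact and locally connected space $Y_i$ with $f$ serving as the projection onto $I_i$ [10]. The homeomorphisms should extend to continuous functions on $Y_i \times \bar{I}_i$, where $\bar{I}_i$ is the closure of $I_i$ in $\mathbb{R}$, and each $X_t$ should also have finitely-generated homology. Then, given $f$ of Morse type with critical values as above, we choose arbitrary $s_0, \ldots, s_n$ satisfying
$$-\infty < s_0 < a_1 < s_1 < a_2 < \cdots < s_{n-1} < a_n < s_n < \infty.$$
The level set zigzag persistence of $f$ is defined to be the zigzag persistence for the sequence
$$X_{s_0}^{s_0} \to X_{s_0}^{s_1} \leftarrow X_{s_1}^{s_1} \to X_{s_1}^{s_2} \leftarrow \cdots \to X_{s_{n-1}}^{s_n} \leftarrow X_{s_n}^{s_n}.$$
We denote the persistence diagram by $\mathrm{Dgm}(f)$.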
The level set zigzag persistence can be used to compute the ordinary persistent homology of an $\mathbb{R}$-valued function with good space efficiency. In particular, the LZZ module is related to the ordinary (extended) persistence module via the Mayer-Vietoris pyramid [10, Figure 3], where the zigzag sequence and the ordinary sequence are shown to contain the same information in their persistent homology. Therefore, we could use the algorithm for zigzag persistent homology to compute extended persistence, while using space that depends only on the size of the largest level set instead of the entire domain [10, 24].
We now state a straightforward corollary to Theorem 2 which we will use in Section 4.
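Corollary 4.
Let $f$ and $g$ be Morse type functions defined on topological spaces $X$ and $Y$, and for an interval $I$, let $\mathrm{Dgm}^I(f)$ and $\mathrm{Dgm}^I(g)$ be the restrictions of the LZZ persistence diagrams $\mathrm{Dgm}(f)$ and $\mathrm{Dgm}(g)$ to the interval $I$. Then
$$d_B\bigl(\mathrm{Dgm}^I(f), \mathrm{Dgm}^I(g)\bigr) \le d_B\bigl(\mathrm{Dgm}(f), \mathrm{Dgm}(g)\bigr).$$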
4 Applications to Metric Graphs and $k$-Parameter Persistence
For uses of Corollary 4, we turn to the metric graph setting. Metric graphs commonly arise when studying road networks as well as biological or chemical structure graphs. Given a graph $G = (V, E)$ with a set of vertices and edges, a length function on the edges, and a geometric realization $|G|$ of the graph, one may specify a metric $d_G$ on $|G|$ by taking the minimum length of any path between any pair of points (not necessarily vertices) in the geometric realization. Given a base point $s \in |G|$, the geodesic distance function $f_s: |G| \to \mathbb{R}$ is given by $f_s(x) = d_G(s, x)$. Then $\mathrm{Dgm}_0(f_s)$ denotes the 0-dimensional LZZ persistence diagram induced by $f_s$. Equivalently, $\mathrm{Dgm}_0(f_s)$ is the union of the 0- and 1-dimensional extended persistence diagrams for $f_s$ (see [15] for the details of extended persistence). Corollary 4 can be used to compare local neighborhoods of two different metric graphs, $G_1$ and $G_2$, with base points $s \in |G_1|$ and $t \in |G_2|$. In particular, given $f_s$ and $g_t$, we have $d_B\bigl(\mathrm{Dgm}_0^I(f_s), \mathrm{Dgm}_0^I(g_t)\bigr) \le d_B\bigl(\mathrm{Dgm}_0(f_s), \mathrm{Dgm}_0(g_t)\bigr)$ for any real interval $I$. Typically, for comparing local neighborhoods, $I = [0, r]$. The following corollary gives a stability-type result for comparing two local neighborhoods within a single metric graph.
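Corollary 5.
Let $G$ be a metric graph with geometric realization $|G|$. For a fixed interval $I$ and points $s, t \in |G|$, we have $d_B\bigl(\mathrm{Dgm}_0^I(f_s), \mathrm{Dgm}_0^I(f_t)\bigr) \le d_G(s, t)$.
Proof.
By Corollary 4, $d_B\bigl(\mathrm{Dgm}_0^I(f_s), \mathrm{Dgm}_0^I(f_t)\bigr) \le d_B\bigl(\mathrm{Dgm}_0(f_s), \mathrm{Dgm}_0(f_t)\bigr)$. Since $f_s, f_t$ are two Morse type functions, $d_B\bigl(\mathrm{Dgm}_0(f_s), \mathrm{Dgm}_0(f_t)\bigr) \le \lVert f_s - f_t \rVert_\infty$ by the LZZ Stability Theorem of [10]. Furthermore, by the triangle inequality, for any $x \in |G|$, $|f_s(x) - f_t(x)| = |d_G(s, x) - d_G(t, x)| \le d_G(s, t)$, meaning that $\lVert f_s - f_t \rVert_\infty \le d_G(s, t)$. Putting everything together proves the claim. ∎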
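The triangle-inequality step of this proof is easy to check on a small example. The sketch below is a simplification: base points are assumed to be vertices, and the geodesic distance functions are evaluated only at vertices rather than on the full geometric realization. The graph, its edge lengths, and the names `geodesic_distances` and `adj` are hypothetical.

```python
import heapq

def geodesic_distances(adj, source):
    """Shortest-path distance from `source` to every vertex (Dijkstra)."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in adj[u]:
            candidate = d + length
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist

# A 4-cycle with edge lengths; each entry lists (neighbor, edge length).
adj = {
    "a": [("b", 1.0), ("d", 2.0)],
    "b": [("a", 1.0), ("c", 1.5)],
    "c": [("b", 1.5), ("d", 1.0)],
    "d": [("c", 1.0), ("a", 2.0)],
}

f_s = geodesic_distances(adj, "a")  # distance function based at s = a
f_t = geodesic_distances(adj, "c")  # distance function based at t = c

# Sup-norm gap between the two distance functions, over vertices only.
sup_gap = max(abs(f_s[v] - f_t[v]) for v in adj)
assert sup_gap <= f_s["c"] + 1e-12  # bounded by the distance between s and t
```

In this example the bound is attained: the gap at the base points themselves equals the distance between them.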
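Another application of Corollary 4 is as follows. Define $\Phi: |G| \to \mathrm{SpDg}$ by $\Phi(s) = \mathrm{Dgm}_0(f_s)$, where $\mathrm{SpDg}$ denotes the space of persistence diagrams. Given metric graphs $G_1$ and $G_2$, with corresponding maps $\Phi_1$ and $\Phi_2$, their persistence distortion distance [17] is
$$d_{PD}(G_1, G_2) := d_H\bigl(\Phi_1(|G_1|), \Phi_2(|G_2|)\bigr),$$
where $d_H$ denotes the Hausdorff distance. In other words,
$$d_{PD}(G_1, G_2) = \max\Bigl\{ \sup_{s \in |G_1|} \inf_{t \in |G_2|} d_B\bigl(\Phi_1(s), \Phi_2(t)\bigr),\ \sup_{t \in |G_2|} \inf_{s \in |G_1|} d_B\bigl(\Phi_1(s), \Phi_2(t)\bigr) \Bigr\}.$$
Note that the diagram $\mathrm{Dgm}_0(f_s)$ contains both 0- and 1-dimensional persistence points, but only points of the same dimension are matched under the bottleneck distance. A local version of the persistence distortion distance, which we will denote by $d_{PD}^r$, may be defined as follows: for each base point $s$, only consider the distance function to points within a fixed intrinsic radius $r$.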
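The Hausdorff distance used in this definition can be sketched for finite collections as follows. This is only an illustration under assumptions: the two collections are finite and non-empty (in practice, diagrams for a finite sample of base points), and `dist` is any distance between their elements, standing in for the bottleneck distance. The name `hausdorff` is hypothetical, and the demonstration uses real numbers with the absolute difference purely to keep the example small.

```python
def hausdorff(set_a, set_b, dist):
    """Hausdorff distance between two finite, non-empty collections."""
    def directed(src, dst):
        # Largest distance from an element of src to its nearest element of dst.
        return max(min(dist(x, y) for y in dst) for x in src)
    return max(directed(set_a, set_b), directed(set_b, set_a))

# Toy stand-in: numbers with the absolute difference as the distance.
print(hausdorff([0.0, 1.0], [0.0, 3.0], lambda x, y: abs(x - y)))  # 2.0
```

Replacing `dist` with a bottleneck distance routine and the numbers with diagrams gives a sampled approximation of the quantity defined above.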
Corollary 6.
If $r_1 \le r_2$, then $d_{PD}^{r_1}(G_1, G_2) \le d_{PD}^{r_2}(G_1, G_2)$.
Proof.
Let $D_1$ be the persistence diagram for some base point $s \in |G_1|$, where the geodesic distance function is computed in the interval $[0, r_1]$. Let $D_2$ be the persistence diagram for the same base point, but where the distance function is computed in the interval $[0, r_2]$. Define $D_1'$ and $D_2'$ similarly for some base point $t$ in $|G_2|$. By viewing $D_1$ as a restriction of $D_2$ for $r_1 \le r_2$ (and likewise $D_1'$ as a restriction of $D_2'$), we can apply Theorem 2 to show that $d_B(D_1, D_1') \le d_B(D_2, D_2')$. Since our choice of base points was arbitrary, this inequality holds for persistence diagrams across all choices of base points in $|G_1|$ and $|G_2|$. Therefore, using the definition of the local version of the persistence distortion distance, we can conclude that $d_{PD}^{r_1}(G_1, G_2) \le d_{PD}^{r_2}(G_1, G_2)$. ∎
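We end with a final remark on how Theorem 2 can be applied to a $k$-parameter persistence module on any topological space (not restricted to the level set or metric graph settings). A $k$-parameter persistence module is indexed by a $k$-dimensional family of vector spaces, $\{M_u\}_{u \in \mathbb{R}^k}$, together with a family of linear maps $\{\varphi_{u,v}: M_u \to M_v\}_{u \le v}$ such that for $u \le v \le w$, we have $\varphi_{u,w} = \varphi_{v,w} \circ \varphi_{u,v}$ and $\varphi_{u,u} = \mathrm{id}_{M_u}$ [11]. Here, $u \le v$ if and only if $u_i \le v_i$ for $i = 1, \ldots, k$. Any line $L$ in the set $\mathcal{L}$ of all lines of $\mathbb{R}^k$ with direction $m = (m_1, \ldots, m_k)$ such that $m^* = \min_i m_i$ is strictly positive gives a one-parameter slice of the $k$-parameter persistence module. Given two $k$-parameter persistence modules $\mathbb{X}$ and $\mathbb{Y}$, we define their matching distance [21] to be
$$d_{\mathrm{match}}(\mathbb{X}, \mathbb{Y}) = \sup_{L \in \mathcal{L}} m^* \, d_B\bigl(\mathrm{Dgm}(\mathbb{X}_L), \mathrm{Dgm}(\mathbb{Y}_L)\bigr),$$
where $\mathrm{Dgm}(\mathbb{X}_L)$ and $\mathrm{Dgm}(\mathbb{Y}_L)$ are the persistence diagrams of the $k$-parameter persistence modules $\mathbb{X}$ and $\mathbb{Y}$ restricted along line $L$. Our result extends naturally to this setting. Indeed, if we restrict both $k$-parameter persistence modules to a region $R = I_1 \times \cdots \times I_k$, where each $I_i$ is an interval of the real line, then Theorem 2 implies the following corollary.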
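Corollary 7.
$$d_{\mathrm{match}}^{R}(\mathbb{X}, \mathbb{Y}) \le d_{\mathrm{match}}(\mathbb{X}, \mathbb{Y}),$$
where $d_{\mathrm{match}}^{R}$ is computed by restricting $\mathrm{Dgm}(\mathbb{X}_L)$ and $\mathrm{Dgm}(\mathbb{Y}_L)$ to the subinterval of the line $L$ passing through the region $R$.
Proof.
For a fixed line $L$ with direction $m$ and weight $m^* = \min_i m_i$, consider the region $R$ restricted to $L$, denoted $R_L$. Recall that $\mathrm{Dgm}(\mathbb{X}_L)$ and $\mathrm{Dgm}(\mathbb{Y}_L)$ are the persistence diagrams of the $k$-parameter persistence modules $\mathbb{X}$ and $\mathbb{Y}$ restricted along the line $L$. Based on Theorem 2,
$$d_B\bigl(\mathrm{Dgm}^{R_L}(\mathbb{X}_L), \mathrm{Dgm}^{R_L}(\mathbb{Y}_L)\bigr) \le d_B\bigl(\mathrm{Dgm}(\mathbb{X}_L), \mathrm{Dgm}(\mathbb{Y}_L)\bigr). \tag{2}$$
From the definition of supremum, we know that for every $\varepsilon > 0$, there is a line $L$ such that
$$d_{\mathrm{match}}^{R}(\mathbb{X}, \mathbb{Y}) - \varepsilon < m^* \, d_B\bigl(\mathrm{Dgm}^{R_L}(\mathbb{X}_L), \mathrm{Dgm}^{R_L}(\mathbb{Y}_L)\bigr).$$
Using observation (2) above, we see that
$$d_{\mathrm{match}}^{R}(\mathbb{X}, \mathbb{Y}) - \varepsilon < m^* \, d_B\bigl(\mathrm{Dgm}^{R_L}(\mathbb{X}_L), \mathrm{Dgm}^{R_L}(\mathbb{Y}_L)\bigr) \le m^* \, d_B\bigl(\mathrm{Dgm}(\mathbb{X}_L), \mathrm{Dgm}(\mathbb{Y}_L)\bigr).$$
The right-hand side is, of course, at most the supremum over all lines $L$, the definition of $d_{\mathrm{match}}(\mathbb{X}, \mathbb{Y})$. Hence, for every $\varepsilon > 0$, we have $d_{\mathrm{match}}^{R}(\mathbb{X}, \mathbb{Y}) - \varepsilon < d_{\mathrm{match}}(\mathbb{X}, \mathbb{Y})$; in other words, $d_{\mathrm{match}}^{R}(\mathbb{X}, \mathbb{Y}) \le d_{\mathrm{match}}(\mathbb{X}, \mathbb{Y})$, as desired. ∎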
5 Discussion
Theorem 2 and its corollaries provide explicit relationships between distances computed locally and globally, and the resulting inequalities are very broadly applicable. For instance, the fact that the local bottleneck distance is bounded above by the global bottleneck distance allows for a single global computation to potentially rule out local differences if the global distance is low. If looking for local differences, starting with a global computation may save computational time if there are too many local comparisons to make. On the other hand, the global bottleneck distance being bounded below by the local version allows smaller computations to approach the global truth, while perhaps being more computationally tractable.
In future work, we would like to extend these ideas to generalized persistence, where instead of a linear sequence of topological spaces one considers topological spaces and transformations that form a poset. In contrast to zigzag persistence, this generalized persistence does not have the notion of a persistence diagram. Instead, we would need to restate our results in terms of the interleaving distance between persistence modules. Moreover, a notion of “local” would have to be defined in the poset setting.
References
- [1] M. Aanjaneya, F. Chazal, D. Chen, M. Glisse, L. Guibas, and D. Morozov. Metric graph reconstruction from noisy data. International Journal of Computational Geometry & Applications, 22(04):305–325, 2012.
- [2] M. Ahmed, B. T. Fasy, and C. Wenk. Local persistent homology based distance between maps. In Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, SIGSPATIAL ’14, pages 43–52, New York, NY, USA, 2014. ACM.
- [3] P. Bendich, D. Cohen-Steiner, H. Edelsbrunner, J. Harer, and D. Morozov. Inferring local homology from sampled stratified spaces. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS’07), pages 536–546, Oct 2007.
- [4] P. Bendich, E. Gasparovic, J. Harer, R. Izmailov, and L. Ness. Multi-scale local shape analysis and feature selection in machine learning applications. In 2015 International Joint Conference on Neural Networks (IJCNN), pages 1–8, July 2015.
- [5] P. Bendich, B. Wang, and S. Mukherjee. Local homology transfer and stratification learning. ACM-SIAM Symposium on Discrete Algorithms, pages 1355–1370, 2012.
- [6] M. B. Botnan and M. Lesnick. Algebraic stability of zigzag persistence modules. Algebraic & Geometric Topology, 18(6):3133–3204, 2018.
- [7] P. Bubenik, V. de Silva, and J. Scott. Metrics for generalized persistence modules. Foundations of Computational Mathematics, 15(6):1501–1531, 2015.
- [8] P. Bubenik and T. Vergili. Topological spaces of persistence modules and their properties. arXiv:1802.08117, 2018.
