Local Versus Global Distances for Zigzag Persistence Modules

Ellen Gasparovic; Maria Gommel; Emilie Purvine; Radmila Sazdanovic; Bei Wang; Yusu Wang; Lori Ziegelmeier

arXiv:1903.08298·math.AT·March 21, 2019

TL;DR

This paper explores the relationship between local and global distances in zigzag persistence modules, showing bounds on bottleneck distances and discussing implications for metric graph distances and multiparameter modules.

Contribution

It establishes explicit bounds connecting local and global persistence distances, with applications to metric graphs and multiparameter persistence modules.

Findings

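01 The bottleneck distance between two zigzag persistence modules restricted to an interval is bounded above by the bottleneck distance between the unrestricted modules.

02 The bound has practical implications for metric graph analysis.

03 The bound extends to the matching distance between multiparameter persistence modules.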

Abstract

This short note establishes explicit and broadly applicable relationships between persistence-based distances computed locally and globally. In particular, we show that the bottleneck distance between two zigzag persistence modules restricted to an interval is always bounded above by the distance between the unrestricted versions. While this result is not surprising, it could have different practical implications. We give two related applications for metric graph distances, as well as an extension for the matching distance between multiparameter persistence modules.

Equations

$${{\mathbb{X}}}_{1}\to{{\mathbb{X}}}_{2}\to\dots\to{{\mathbb{X}}}_{n},$$

$${{\mathbb{V}}}_{1}\to{{\mathbb{V}}}_{2}\to\dots\to{{\mathbb{V}}}_{n},$$

$${{\mathbb{X}}}_{1}\leftrightarrow{{\mathbb{X}}}_{2}\leftrightarrow\dots\leftrightarrow{{\mathbb{X}}}_{n},$$

$${{\mathbb{V}}}_{1}\leftrightarrow{{\mathbb{V}}}_{2}\leftrightarrow\dots\leftrightarrow{{\mathbb{V}}}_{n},$$

$${{\mathbb{X}}}_{1}\to{{\mathbb{X}}}_{1}\cup{{\mathbb{X}}}_{2}\leftarrow{{\mathbb{X}}}_{2}\to{{\mathbb{X}}}_{2}\cup{{\mathbb{X}}}_{3}\leftarrow\dots$$

$${{\mathbb{X}}}_{1}\leftrightarrow{{\mathbb{X}}}_{2}\leftrightarrow\dots\leftrightarrow{{\mathbb{X}}}_{n}$$

$${{\mathsf{H}}}_{p}({{\mathbb{X}}}_{1})\leftrightarrow{{\mathsf{H}}}_{p}({{\mathbb{X}}}_{2})\leftrightarrow\dots\leftrightarrow{{\mathsf{H}}}_{p}({{\mathbb{X}}}_{n}),$$

$$0\leftrightarrow\dots\leftrightarrow 0\leftrightarrow{{\mathbb{K}}}\leftrightarrow\dots\leftrightarrow{{\mathbb{K}}}\leftrightarrow 0\leftrightarrow\dots\leftrightarrow 0$$

$${{\textbf{X}}}[r_{1},r_{2}]\cong\bigoplus_{j\in J}{{\mathbb{I}}}([b_{j},d_{j}]\cap[r_{1},r_{2}]).$$

$$d_{B}(\mathrm{Dg}{{\textbf{X}}},\mathrm{Dg}{{\textbf{Y}}})=\inf_{\mu}\sup_{x}||x-\mu(x)||_{\infty},$$

$$\Pi:\mathrm{Dg}{{\textbf{X}}}\to\mathrm{Dg}{{\textbf{X}}}^{I},\qquad(b,d)\mapsto\Pi(b,d)$$

$$-\infty<s_{0}<a_{1}<s_{1}<a_{2}<\dots<s_{n-1}<a_{n}<s_{n}<\infty.$$

$${{\mathbb{X}}}_{s_{0}}^{s_{0}}\to{{\mathbb{X}}}_{s_{0}}^{s_{1}}\leftarrow{{\mathbb{X}}}_{s_{1}}^{s_{1}}\to{{\mathbb{X}}}_{s_{1}}^{s_{2}}\leftarrow\dots\to{{\mathbb{X}}}_{s_{n-1}}^{s_{n}}\leftarrow{{\mathbb{X}}}_{s_{n}}^{s_{n}}.$$

$$d_{PD}(G_{1},G_{2}):=d_{H}(\Phi(|G_{1}|),\Phi(|G_{2}|)),$$

$$d_{PD}(G_{1},G_{2})=\max\left\{\sup_{D_{1}\in\Phi(|G_{1}|)}\inf_{D_{2}\in\Phi(|G_{2}|)}d_{B}(D_{1},D_{2}),\ \sup_{D_{2}\in\Phi(|G_{2}|)}\inf_{D_{1}\in\Phi(|G_{1}|)}d_{B}(D_{1},D_{2})\right\}.$$

$$d_{match}({{\textbf{X}}},{{\textbf{Y}}}):=\sup_{L}\left(\min_{i}m_{i}\right)d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L},\mathrm{Dg}{{\textbf{Y}}}_{L}),$$

$$d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})\leq d_{match}({{\textbf{X}}},{{\textbf{Y}}})$$

$$d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L}^{I_{L}},\mathrm{Dg}{{\textbf{Y}}}_{L}^{I_{L}})\leq d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L},\mathrm{Dg}{{\textbf{Y}}}_{L}).$$

$$d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})-\epsilon<\left(\min_{i}m_{i}\right)d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L_{\epsilon}}^{I_{L_{\epsilon}}},\mathrm{Dg}{{\textbf{Y}}}_{L_{\epsilon}}^{I_{L_{\epsilon}}}).$$

$$d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})-\epsilon<\left(\min_{i}m_{i}\right)d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L_{\epsilon}},\mathrm{Dg}{{\textbf{Y}}}_{L_{\epsilon}}).$$

Taxonomy

Topics: Topological and Geometric Data Analysis · Advanced Graph Neural Networks · Homotopy and Cohomology in Algebraic Topology

Full text

Local Versus Global Distances for Zigzag Persistence Modules

Ellen Gasparovic, Union College, Schenectady, NY

Maria Gommel, University of Iowa, Iowa City, IA

Emilie Purvine, Pacific Northwest National Laboratory, Seattle, WA

Radmila Sazdanovic, North Carolina State University, Raleigh, NC

Bei Wang, University of Utah, Salt Lake City, UT

Yusu Wang, Ohio State University, Columbus, OH

Lori Ziegelmeier, Macalester College, Saint Paul, MN

Keywords: zigzag persistent homology, level set zigzag, bottleneck distance, metric graphs

1 Introduction

Persistence modules and zigzag persistence

The theory of persistence modules is at the core of topological data analysis. The theory begins with the study of 1-parameter persistence modules over ${{\mathbb{R}}}$ -valued functions. In the ordinary setting, given a diagram of topological spaces connected via inclusion maps,

$${{\mathbb{X}}}_{1}\to{{\mathbb{X}}}_{2}\to\dots\to{{\mathbb{X}}}_{n},$$

we apply the $p$ -dimensional homology functor ${{\mathsf{H}}}_{p}$ with coefficients in a field ${{\mathbb{K}}}$ to obtain a diagram of vector spaces with linear maps,

$${{\mathbb{V}}}_{1}\to{{\mathbb{V}}}_{2}\to\dots\to{{\mathbb{V}}}_{n},$$

where ${{\mathbb{V}}}_{i}={{\mathsf{H}}}_{p}({{\mathbb{X}}}_{i};{{\mathbb{K}}})$. Such a diagram is called a 1-parameter persistence module [9]. Various persistence modules generalizing the 1-parameter setting have been studied in the literature, including generalized persistence modules [7] (i.e., over posets), zigzag persistence modules [6, 9], and multiparameter persistence modules [22] (i.e., over ${{\mathbb{R}}}^{d}$ -valued functions); see [8] for a description of their relationships.

We focus on zigzag persistence modules, which, in a nutshell, allow arrows to point in either direction [9]. Given a diagram of topological spaces connected by inclusion maps,

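$${{\mathbb{X}}}_{1}\leftrightarrow{{\mathbb{X}}}_{2}\leftrightarrow\dots\leftrightarrow{{\mathbb{X}}}_{n},$$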

we apply the homology functor as usual to obtain a sequence of vector spaces and linear maps,

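$${{\mathbb{V}}}_{1}\leftrightarrow{{\mathbb{V}}}_{2}\leftrightarrow\dots\leftrightarrow{{\mathbb{V}}}_{n},$$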

where each $\leftrightarrow$ represents either a forward or a backward map. Zigzag persistence modules generalize the classic 1-parameter setting and handle several situations which are not covered by the classic theory. Linearity allows a zigzag persistence module (similar to a 1-parameter persistence module) to be uniquely decomposed into elementary pieces (called indecomposable modules) which are intervals. The information encoded by these intervals can be combinatorially represented by the persistence diagram [19]. In the case of multiparameter persistence, such indecomposable modules are complex and no longer intervals. We are interested in zigzag persistence as it involves the most general type of linear module that still gives rise to classic persistence diagrams. Furthermore, a zigzag persistence module can be used to compute ordinary persistent homology with good space efficiency (see Section 3 for details).

The interleaving distance [12] has been employed to measure the proximity between persistence modules. For 1-parameter persistence modules, it has been shown that the interleaving distance is equal to the well-known bottleneck distance [14] between the persistence diagrams of the corresponding persistence modules [22]. In this paper, we prove a straightforward inequality involving the bottleneck distance between persistence diagrams [14] that is useful for data analysis.

Global versus local perspectives on persistence

We are motivated by the study of persistence modules from both global and local perspectives. A persistence module provides a global description of a complex dataset, and we are interested in quantifying the amount of information that is preserved when restricted to local neighborhoods or intervals.

For a first example, consider the question of determining or approximating graph motif counts. A graph motif is a subgraph on a small number of vertices contained within a larger, more complex graph. Graph motifs have proven useful for characterizing networks in domains like biology [25] and cyber security [20]. The standard problem of counting the number of small motifs or patterns within a graph is equivalent to the subgraph isomorphism problem, which is NP-complete. Since restricted persistence modules reveal information about the local structure of a space, we posit that the restricted modules for a metric graph (see Section 4) can be used similarly to how graph motifs are currently used, e.g., as inputs to classification algorithms or anomaly detection algorithms in time-varying data [20, 23].

For a second example, consider persistent local homology, which studies a multi-scale notion of homology within a local neighborhood of the data relative to its boundary. It has applications in road network analysis [2], local dimension estimation [16], data visualization [26], graph reconstruction [13, 1], clustering and stratification learning [5, 3]. Furthermore, persistent local homology extracts local geometric and topological information in data, which can be used as input to machine learning algorithms [4].

Our contributions

We show that the bottleneck distance between two zigzag persistence modules restricted over an interval of parameter values is always bounded by the distance between the unrestricted versions (Theorem 2) and state a corollary in the case of level set zigzag persistence (Corollary 4). We also establish two results involving distance inequalities in the special case of metric graphs (Corollary 5 and Corollary 6) and point out how our results can be extended to multiparameter persistence modules.

The results in this short paper have the potential for many diverse applications across different settings. For instance, if one wishes to compare the persistence profiles of two very large data sets but finds that it is prohibitively computationally expensive, one has the option to compute a restricted version of the bottleneck distance as an approximation to the global distance. As the interval size increases, the bottleneck distance between the restricted versions approaches the distance for the global versions.

Relatedly, it may be the case that two long zigzag sequences need to be compared on a local scale. The question may be: are there any local differences between the two zigzag sequences? One could do many local comparisons to answer this question. However, our result means that a small global distance between the two zigzag persistence diagrams implies small local distances. To save computation one could compute the global distance as a first step. Local distances only need to be computed if the global distance is large.

Restricted persistence modules may be helpful for analyzing time-varying systems. Given data ${{\mathbb{X}}}_{t}$ at time $t$ (e.g., a graph, function, or point cloud), a zigzag persistence module can be constructed for the sequence

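$${{\mathbb{X}}}_{1}\to{{\mathbb{X}}}_{1}\cup{{\mathbb{X}}}_{2}\leftarrow{{\mathbb{X}}}_{2}\to{{\mathbb{X}}}_{2}\cup{{\mathbb{X}}}_{3}\leftarrow\dots$$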

where all of the maps are inclusion maps. A subinterval of this sequence corresponds to a time interval contained within the larger sample. Given two long time intervals, one could either compare them in full or compare smaller windows. Our result shows that the local differences contained in small time intervals are not “washed out” as one moves to larger intervals.
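As a rough illustration of the sequence above, the following sketch interleaves a list of snapshots with their pairwise unions and extracts the sub-sequence for a time window. It assumes each snapshot is given as a plain set of simplices or vertices; the names `union_zigzag` and `window` are hypothetical and not from the paper.

```python
def union_zigzag(snapshots):
    """Return X1, X1|X2, X2, X2|X3, ..., Xn as a list of frozensets."""
    sequence = []
    for current, following in zip(snapshots, snapshots[1:]):
        sequence.append(frozenset(current))
        # The union receives inclusions from both neighbouring snapshots.
        sequence.append(frozenset(current) | frozenset(following))
    if snapshots:
        sequence.append(frozenset(snapshots[-1]))
    return sequence


def window(zigzag, first, last):
    """Sub-sequence covering snapshots first..last (0-indexed, inclusive)."""
    return zigzag[2 * first : 2 * last + 1]


snapshots = [{"a", "b"}, {"b", "c"}, {"c"}]
zigzag = union_zigzag(snapshots)
local = window(zigzag, 1, 2)  # the spaces X2, X2|X3, X3
```

Snapshot `i` sits at position `2 * i` of the interleaved list, which is why a time window is a contiguous slice.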

The rest of the paper is organized as follows. In Section 2, we recall the necessary concepts for zigzag persistence. Our main theorem is contained in Section 3, and we consider applications of the theorem in the metric graph setting and for multi-parameter persistence in Section 4. We conclude with a discussion of future work in Section 5.

2 Brief Background and Definitions

Our treatment of zigzag persistence is brief; for more details, see [9] and [10]. A zigzag diagram of topological spaces ${{\mathbb{X}}}_{1},{{\mathbb{X}}}_{2},\ldots,{{\mathbb{X}}}_{n}$ is a sequence

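$${{\mathbb{X}}}_{1}\leftrightarrow{{\mathbb{X}}}_{2}\leftrightarrow\dots\leftrightarrow{{\mathbb{X}}}_{n}$$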

where each bidirectional arrow between two topological spaces represents a continuous function mapping either forwards or backwards. Applying the $p$ -th homology functor with coefficients in a field $\mathbb{K}$ yields a zigzag diagram of vector spaces

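$${{\mathsf{H}}}_{p}({{\mathbb{X}}}_{1})\leftrightarrow{{\mathsf{H}}}_{p}({{\mathbb{X}}}_{2})\leftrightarrow\dots\leftrightarrow{{\mathsf{H}}}_{p}({{\mathbb{X}}}_{n}),$$

known as a zigzag module, denoted ${{\textbf{X}}}$, from which zigzag persistence may be computed. A zigzag module decomposes into intervals ${{\textbf{X}}}\cong\displaystyle\bigoplus_{j\in J}{{\mathbb{I}}}[b_{j},d_{j}]$, where each ${{\mathbb{I}}}[b_{j},d_{j}]$ is defined as

$$0\leftrightarrow\dots\leftrightarrow 0\leftrightarrow{{\mathbb{K}}}\leftrightarrow\dots\leftrightarrow{{\mathbb{K}}}\leftrightarrow 0\leftrightarrow\dots\leftrightarrow 0$$

with nonzero values in the range $[b_{j},d_{j}]$. We will use $\mathrm{Dg}{{{\textbf{X}}}}$ to denote the resulting persistence diagram of a fixed homology dimension $p$. By Proposition 2.12 of [9], restricting the module ${{\textbf{X}}}$ to the range $[r_{1},r_{2}]$ (denoted ${{\textbf{X}}}[r_{1},r_{2}]$) yields a decomposition as the direct sum of the intervals in ${{\textbf{X}}}$ restricted to $[r_{1},r_{2}]$; that is,

$${{\textbf{X}}}[r_{1},r_{2}]\cong\bigoplus_{j\in J}{{\mathbb{I}}}([b_{j},d_{j}]\cap[r_{1},r_{2}]).\tag{1}$$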

The bottleneck distance between two persistence diagrams is the infimum of all $\delta$ for which there exists a matching between the points of the two diagrams (where points are allowed to be matched to diagonal elements) such that any pair of matched points are at distance at most $\delta$. Formally, for a fixed homology dimension, the bottleneck distance is given by

$$d_{B}(\mathrm{Dg}{{\textbf{X}}},\mathrm{Dg}{{\textbf{Y}}})=\inf_{\mu}\sup_{x}||x-\mu(x)||_{\infty},$$

where $\mu$ ranges over all bijections between the two diagrams [18].
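To make the definition concrete, here is a minimal brute-force sketch that enumerates every bijection after padding each diagram with diagonal slots. It assumes diagrams are short lists of `(b, d)` pairs; the function names are illustrative, and the factorial running time makes it suitable only for toy inputs, not as a substitute for a dedicated library.

```python
from itertools import permutations


def linf(p, q):
    """L-infinity distance between two diagram points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def diag_cost(p):
    """L-infinity distance from (b, d) to the nearest point on the diagonal."""
    return abs(p[1] - p[0]) / 2.0


def bottleneck_bruteforce(dgm_x, dgm_y):
    n, m = len(dgm_x), len(dgm_y)
    size = n + m  # rows: X points then m diagonal slots; columns: Y points then n slots
    if size == 0:
        return 0.0

    def cost(i, j):
        if i < n and j < m:
            return linf(dgm_x[i], dgm_y[j])
        if i < n:
            return diag_cost(dgm_x[i])  # X point matched to the diagonal
        if j < m:
            return diag_cost(dgm_y[j])  # Y point matched to the diagonal
        return 0.0                      # diagonal slot matched to diagonal slot

    best = float("inf")
    for perm in permutations(range(size)):
        worst = max(cost(i, perm[i]) for i in range(size))
        best = min(best, worst)
    return best


dgm_x = [(0, 4), (1, 2)]
dgm_y = [(0, 5)]
distance = bottleneck_bruteforce(dgm_x, dgm_y)
```

In this example the best matching pairs `(0, 4)` with `(0, 5)` at cost 1 and sends `(1, 2)` to the diagonal at cost 0.5, so the bottleneck cost is 1.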

We conclude this section by defining a projection map that keeps track of the points in the global persistence diagram that disappear in the restricted version. The validity of the projection map in the following definition is guaranteed by Proposition 2.12 of [9] which leads to equation (1).

Definition 1.

Given $I=[r_{1},r_{2}]\subset{{\mathbb{R}}}$ , we let $\mathrm{Dg}{{{\textbf{X}}}^{I}}$ denote the restriction of the persistence diagram $\mathrm{Dg}{{{\textbf{X}}}}$ to the interval $I$ defined via the following projection map:

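$$\Pi:\mathrm{Dg}{{\textbf{X}}}\to\mathrm{Dg}{{\textbf{X}}}^{I},\qquad(b,d)\mapsto\begin{cases}(b,d)&\text{if }r_{1}\leq b\leq d\leq r_{2}\text{ (Case A)}\\(b,r_{2})&\text{if }r_{1}\leq b\leq r_{2}\leq d\text{ (Case B)}\\(r_{1},d)&\text{if }b\leq r_{1}\leq d\leq r_{2}\text{ (Case C)}\\(r_{1},r_{2})&\text{if }b\leq r_{1}\leq r_{2}\leq d\text{ (Case D)}\\(b,b)&\text{if }r_{2}<b\text{ (Case E)}\\(d,d)&\text{if }d<r_{1}\text{ (Case F)}\end{cases}$$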

Typically, a persistence diagram is considered to be a set of points $\{(b,d)\}$ for which $b<d$. In order to compute the bottleneck distance, one adds countably many copies of the diagonal $\{(x,x):x\in\mathbb{R}\}$, which may intuitively correspond to topological features that are born and simultaneously die (and thus, never really exist at all). This allows for a point in one persistence diagram to be matched to the diagonal if it is far away from any point in the other diagram, and also accounts for the fact that two persistence diagrams may have different numbers of off-diagonal points. Notice that points like $E$ and $F$ in the above figure correspond to features that are born and die outside of the interval $I$ (either completely before or completely after). The restriction result cited above from [9], defining ${{\textbf{X}}}[r_{1},r_{2}]$, would not include points $\Pi(E)$ or $\Pi(F)$ in its diagram. But, since both $\Pi(E)$ and $\Pi(F)$ are on the diagonal, including them in $\mathrm{Dg}{{{\textbf{X}}}}^{I}$ does not change the bottleneck distance between two restricted diagrams.
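The projection can be sketched in a few lines. The case conditions below are an assumption inferred from the inequalities and images used in the proof of Lemma 3 (Case A unchanged, Case B to $(b,r_{2})$, Case C to $(r_{1},d)$, Case D to $(r_{1},r_{2})$, Case E to $(b,b)$, Case F to $(d,d)$); how ties at the endpoints are assigned is a choice made here, and `project` is a hypothetical name.

```python
def project(point, r1, r2):
    """Map a diagram point (b, d) to its image under restriction to [r1, r2]."""
    b, d = point
    if d < r1:
        return (d, d)  # Case F: the feature dies before the window starts
    if b > r2:
        return (b, b)  # Case E: the feature is born after the window ends
    # Cases A-D: clamp the birth from below and the death from above.
    return (max(b, r1), min(d, r2))


def restrict_diagram(diagram, r1, r2):
    """Apply the projection to every point, keeping multiplicities."""
    return [project(p, r1, r2) for p in diagram]


restricted = restrict_diagram(
    [(3, 4), (3, 7), (1, 4), (1, 7), (6, 8), (0, 1)], r1=2, r2=5
)
```

The last two input points fall entirely outside the window and land on the diagonal, so they contribute nothing to a bottleneck computation.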

3 Bottleneck Distance in the Local vs. Global Settings

In this section, we prove our main result relating the bottleneck distance between persistence diagrams with the bottleneck distance between their interval-restricted versions.

Theorem 2.

Let ${{\mathbb{X}}}_{1}\leftrightarrow{{\mathbb{X}}}_{2}\leftrightarrow\ldots\leftrightarrow{{\mathbb{X}}}_{n}$ and ${{\mathbb{Y}}}_{1}\leftrightarrow{{\mathbb{Y}}}_{2}\leftrightarrow\ldots\leftrightarrow{{\mathbb{Y}}}_{n}$ be two sequences of topological spaces and continuous maps, and let $\mathrm{Dg}{{{\textbf{X}}}}$ and $\mathrm{Dg}{{{\textbf{Y}}}}$ be their corresponding zigzag persistence diagrams. Consider the interval $I=[r_{1},r_{2}]\subset{{\mathbb{R}}}$ and let $\mathrm{Dg}{{{\textbf{X}}}^{I}}$ and $\mathrm{Dg}{{{\textbf{Y}}}^{I}}$ be the restrictions of these diagrams to $I$ . Then $d_{B}(\mathrm{Dg}{{{\textbf{X}}}^{I}},\mathrm{Dg}{{{\textbf{Y}}}^{I}})\leq d_{B}(\mathrm{Dg}{{{\textbf{X}}}},\mathrm{Dg}{{{\textbf{Y}}}}).$

Proof.

Let $\mu\subseteq\mathrm{Dg}{{{\textbf{X}}}}\times\mathrm{Dg}{{{\textbf{Y}}}}$ be a partial matching. For computation of the bottleneck distance, we say that any unpaired point in one of the persistence diagrams is matched to the nearest point (in the $L_{\infty}$ norm) on the diagonal $\Delta=\{(x,x):x\in{{\mathbb{R}}}\}$ .

Consider $\hat{\mu}\subseteq\mathrm{Dg}{{{\textbf{X}}}^{I}}\times\mathrm{Dg}{{{\textbf{Y}}}^{I}}$ defined such that, for each $(p,q)\in\mu$, we have $(\Pi(p),\Pi(q))\in\hat{\mu}$. We claim that this is a valid partial matching between the two restricted diagrams. A partial matching means that no two points $\hat{p},\hat{p}^{\prime}\in\mathrm{Dg}{{{\textbf{X}}}^{I}}$ are matched to the same $\hat{q}\in\mathrm{Dg}{{{\textbf{Y}}}^{I}}$, and similarly the same $\hat{p}\in\mathrm{Dg}{{{\textbf{X}}}^{I}}$ is not matched to two different points $\hat{q},\hat{q}^{\prime}\in\mathrm{Dg}{{{\textbf{Y}}}^{I}}$. We show the first case and remark that the second case is proved similarly. Assume, for the sake of contradiction, that $(\hat{p},\hat{q}),(\hat{p}^{\prime},\hat{q})\in\hat{\mu}$. By definition of $\hat{\mu}$ we must have $(p,q),(p^{\prime},q^{\prime})\in\mu$ such that $\hat{p}=\Pi(p)$, $\hat{p}^{\prime}=\Pi(p^{\prime})$, and $\hat{q}=\Pi(q)=\Pi(q^{\prime})$. Recall that persistence diagrams are multisets, so the fact that $\hat{q}=\Pi(q)=\Pi(q^{\prime})$ would indicate that there are two copies of the point $\hat{q}$ in $\mathrm{Dg}{{{\textbf{Y}}}^{I}}$, and that $\hat{p}$ is matched to one copy and $\hat{p}^{\prime}$ is matched to the other copy. The only case in which we wouldn’t have two copies of $\hat{q}$ is if $q=q^{\prime}$, but this would contradict $\mu$ being a partial matching. (Footnote 1: Of course we could have $q=q^{\prime}$ if they have the same coordinates, but if there are multiple copies of the same $(b,d)$ point, we count them as different points in the persistence diagram, and thus not equal.)

What is left to show is that the maximal distance between points matched by $\hat{\mu}$ is at most that for $\mu$, a fact proved in the following lemma. Indeed, if $\mu$ is the matching that achieves the bottleneck distance between $\mathrm{Dg}{{{\textbf{X}}}}$ and $\mathrm{Dg}{{{\textbf{Y}}}}$ and the cost of $\hat{\mu}$ is no larger, then the bottleneck distance between $\mathrm{Dg}{{{\textbf{X}}}^{I}}$ and $\mathrm{Dg}{{{\textbf{Y}}}^{I}}$, being an infimum over all matchings, can only be smaller still. ∎

Lemma 3.

For the partial matching $\hat{\mu}$ , $\displaystyle\sup_{(\hat{p},\hat{q})\in\hat{\mu}}||\hat{p}-\hat{q}||_{\infty}\leq\displaystyle\sup_{(p,q)\in\mu}||p-q||_{\infty}.$

Proof.

Consider two points $p\in\mathrm{Dg}{{{\textbf{X}}}}$ and $q\in\mathrm{Dg}{{{\textbf{Y}}}}$ achieving $\displaystyle\sup_{(p,q)\in\mu}||p-q||_{\infty}$ . A case analysis of the 21 possible pairings of points will establish the lemma. First, observe that if either $p$ or $q$ is in Case A, then after projecting onto the restricted region, at least one point is unchanged and at most one point is moved closer, yielding the desired inequality.

Next, we will consider the scenarios when one of the points, say (without loss of generality) $q=(b_{q},d_{q})$, belongs to Case B, so that $\Pi(q)=(b_{q},r_{2})$. If $p=(b_{p},d_{p})$ is also a Case B point, then the inequality holds because projecting the points does not change the horizontal distance and the vertical distance between the projections is 0. In the case that $p=(b_{p},d_{p})\in\textbf{Case C}$, we have $\Pi(p)=(r_{1},d_{p})$ and $||\Pi(p)-\Pi(q)||_{\infty}=\max\{b_{q}-r_{1},r_{2}-d_{p}\}$. Since $b_{p}\leq r_{1}\leq d_{p}$ and $b_{q}\leq r_{2}\leq d_{q}$, the horizontal distances satisfy $b_{q}-r_{1}\leq b_{q}-b_{p}$ and the vertical distances satisfy $r_{2}-d_{p}\leq d_{q}-d_{p}$, so that the inequality holds. If $p=(b_{p},d_{p})\in\textbf{Case D}$, then $\Pi(p)=(r_{1},r_{2})$ and $||\Pi(p)-\Pi(q)||_{\infty}=b_{q}-r_{1}$, since the vertical distance between the projections is 0. This in turn is at most $b_{q}-b_{p}\leq||p-q||_{\infty}$. Now, if $p=(b_{p},d_{p})\in\textbf{Case E}$, we have $\Pi(p)=(b_{p},b_{p})$ and $||\Pi(p)-\Pi(q)||_{\infty}=\max\{|b_{p}-b_{q}|,b_{p}-r_{2}\}$. Since $b_{q}\leq r_{2}\leq b_{p}$, the horizontal distance between the projections must be at least as large as the vertical distance. Therefore, $||\Pi(p)-\Pi(q)||_{\infty}=b_{p}-b_{q}\leq\max\{b_{p}-b_{q},|d_{p}-d_{q}|\}=||p-q||_{\infty}$. Finally, if $p=(b_{p},d_{p})\in\textbf{Case F}$, then $\Pi(p)=(d_{p},d_{p})$ and $||\Pi(p)-\Pi(q)||_{\infty}=\max\{b_{q}-d_{p},r_{2}-d_{p}\}$. Since $b_{p}\leq d_{p}\leq r_{1}\leq b_{q}\leq r_{2}\leq d_{q}$, the horizontal distances satisfy $b_{q}-d_{p}\leq b_{q}-b_{p}$ and the vertical distances satisfy $r_{2}-d_{p}\leq d_{q}-d_{p}$, yielding the desired inequality.

The case analysis for the remaining pairings proceeds in a similar manner. ∎
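As one illustration of a remaining pairing, worked out here rather than taken from the paper, suppose $p=(b_{p},d_{p})\in\textbf{Case D}$ and $q=(b_{q},d_{q})\in\textbf{Case E}$, so that $b_{p}\leq r_{1}\leq r_{2}\leq d_{p}$ and $r_{2}\leq b_{q}\leq d_{q}$. Then $\Pi(p)=(r_{1},r_{2})$ and $\Pi(q)=(b_{q},b_{q})$, and

$$\begin{aligned}||\Pi(p)-\Pi(q)||_{\infty}&=\max\{b_{q}-r_{1},\,b_{q}-r_{2}\}\\&=b_{q}-r_{1}&&\text{since }r_{1}\leq r_{2}\\&\leq b_{q}-b_{p}&&\text{since }b_{p}\leq r_{1}\\&\leq\max\{|b_{p}-b_{q}|,\,|d_{p}-d_{q}|\}=||p-q||_{\infty}.\end{aligned}$$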

Given an ${{\mathbb{R}}}$ -valued function, there is a natural construction of a level set zigzag (LZZ) persistence module [10] that sweeps its level sets from bottom to top [19]. Given a topological space ${{\mathbb{X}}}$ and a continuous function $f:{{\mathbb{X}}}\rightarrow\mathbb{R}$ of Morse type, let ${{\mathbb{X}}}_{t}=f^{-1}(t)$ denote the level set of $f$ for any $t\in\mathbb{R}$ and ${{\mathbb{X}}}_{I}=f^{-1}(I)$ denote the slice of ${{\mathbb{X}}}$ which $f$ maps to the interval $I\subset{{\mathbb{R}}}$ . If $I=[a,b]$ , we may denote this as ${{\mathbb{X}}}_{a}^{b}$ . Recall that $({{\mathbb{X}}},f)$ is of Morse type if, for the finite set of critical values $a_{1}<a_{2}<\ldots<a_{n}$ of $f$ , the open intervals $(-\infty,a_{1}),(a_{1},a_{2}),\ldots,(a_{n-1},a_{n}),(a_{n},\infty)$ are such that for each interval $I$ , $f^{-1}(I)$ is homeomorphic to ${{\mathbb{Y}}}\times I$ for some compact and locally connected space ${{\mathbb{Y}}}$ with $f$ serving as the projection onto $I$ [10]. The homeomorphisms should extend to continuous functions on ${{\mathbb{Y}}}\times\bar{I}$ , where $\bar{I}$ is the closure of $I$ in ${{\mathbb{R}}}$ , and each ${{\mathbb{X}}}_{t}$ should also have finitely-generated homology. Then, given $({{\mathbb{X}}},f)$ of Morse type with critical values $a_{i}$ as above, we choose arbitrary $s_{i}$ satisfying

$$-\infty<s_{0}<a_{1}<s_{1}<a_{2}<\dots<s_{n-1}<a_{n}<s_{n}<\infty.$$

The level set zigzag persistence of $({{\mathbb{X}}},f)$ is defined to be the zigzag persistence for the sequence

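$${{\mathbb{X}}}_{s_{0}}^{s_{0}}\to{{\mathbb{X}}}_{s_{0}}^{s_{1}}\leftarrow{{\mathbb{X}}}_{s_{1}}^{s_{1}}\to{{\mathbb{X}}}_{s_{1}}^{s_{2}}\leftarrow\dots\to{{\mathbb{X}}}_{s_{n-1}}^{s_{n}}\leftarrow{{\mathbb{X}}}_{s_{n}}^{s_{n}}.$$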

We denote the persistence diagram by $\mathrm{Dg}{f}$ .

The level set zigzag persistence can be used to compute the ordinary persistent homology of an ${{\mathbb{R}}}$ -valued function with good space efficiency. In particular, the LZZ module is related to the ordinary (extended) persistence module via the Mayer-Vietoris pyramid [10, Figure 3], where the zigzag sequence and the ordinary sequence are shown to contain the same information in their persistent homology. Therefore, we could use the algorithm for zigzag persistent homology to compute extended persistence, while using space that depends only on the size of the largest level set instead of the entire domain [10, 24].

We now state a straightforward corollary to Theorem 2 which we will use in Section 4.

Corollary 4.

Let $f:{{\mathbb{X}}}\to{{\mathbb{R}}}$ and $g:{{\mathbb{Y}}}\to{{\mathbb{R}}}$ be Morse type functions defined on topological spaces ${{\mathbb{X}}}$ and ${{\mathbb{Y}}}$ , and for an interval $I=[r_{1},r_{2}]$ , let $\mathrm{Dg}{f^{I}}$ and $\mathrm{Dg}{g^{I}}$ be the restrictions of the LZZ persistence diagrams $\mathrm{Dg}{f}$ and $\mathrm{Dg}{g}$ to the interval $I$ . Then $d_{B}(\mathrm{Dg}{f^{I}},\mathrm{Dg}{g^{I}})\leq d_{B}(\mathrm{Dg}{f},\mathrm{Dg}{g}).$

4 Applications to Metric Graphs and $d$ -Parameter Persistence

For uses of Corollary 4, we turn to the metric graph setting. Metric graphs commonly arise when studying road networks as well as biological or chemical structure graphs. Given a graph $G$ with a set of vertices and edges, a length function on the edges, and a geometric realization $|G|$ of the graph, one may specify a metric on $G$ by taking the minimum length of any path between any pair of points (not necessarily vertices) in the geometric realization. Given a base point $v\in|G|$, the geodesic distance function $f_{v}:|G|\rightarrow\mathbb{R}$ is given by $f_{v}(x)=d_{G}(v,x)$. Then $\mathrm{Dg}{f_{v}}$ denotes the $0$ -dimensional LZZ persistence diagram induced by $f_{v}$. Equivalently, $\mathrm{Dg}{f_{v}}$ is the union of the $0$ - and $1$ -dimensional extended persistence diagrams for $f_{v}$ (see [15] for the details of extended persistence). Corollary 4 can be used to compare local neighborhoods of two different metric graphs, $G_{1}$ and $G_{2}$, with base points $v\in G_{1}$ and $u\in G_{2}$. In particular, given $f_{v}:|G_{1}|\rightarrow\mathbb{R}$ and $g_{u}:|G_{2}|\rightarrow\mathbb{R}$, we have $d_{B}(\mathrm{Dg}{f_{v}}^{I},\mathrm{Dg}{g_{u}}^{I})\leq d_{B}(\mathrm{Dg}{f_{v}},\mathrm{Dg}{g_{u}})$ for any real interval $I$. Typically, for comparing local neighborhoods, $I=[0,r]$. The following corollary gives a stability-type result for comparing two local neighborhoods within a single metric graph.

Corollary 5.

Let $G$ be a metric graph with geometric realization $|G|$ . For a fixed interval $I$ and points $u,v\in|G|$ , we have $d_{B}(\mathrm{Dg}{f_{u}^{I}},\mathrm{Dg}{f_{v}^{I}})\leq d_{G}(u,v)$ .

Proof.

By Corollary 4, $d_{B}(\mathrm{Dg}{f_{u}}^{I},\mathrm{Dg}{f_{v}}^{I})\leq d_{B}(\mathrm{Dg}{f_{u}},\mathrm{Dg}{f_{v}})$ . Since $f_{u},f_{v}:|G|\to\mathbb{R}$ are two Morse type functions, $d_{B}(\mathrm{Dg}{f_{u}},\mathrm{Dg}{f_{v}})\leq||f_{u}-f_{v}||_{\infty}$ by the LZZ Stability Theorem of [10]. Furthermore, by the triangle inequality, for any $x\in|G|$ , $|d_{G}(x,u)-d_{G}(x,v)|\leq d_{G}(u,v)$ , meaning that $||f_{u}-f_{v}||_{\infty}\leq d_{G}(u,v)$ . Putting everything together proves the claim. ∎
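The triangle-inequality step of this proof can be checked numerically on a toy example. The sketch below assumes a small weighted graph with hypothetical edge lengths, and it evaluates $f_{u}$ and $f_{v}$ only at vertices rather than at every point of $|G|$, so it illustrates the inequality $||f_{u}-f_{v}||_{\infty}\leq d_{G}(u,v)$ without proving it.

```python
from itertools import product


def shortest_path_metric(num_vertices, weighted_edges):
    """All-pairs shortest path lengths via Floyd-Warshall."""
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(num_vertices)]
            for i in range(num_vertices)]
    for u, v, length in weighted_edges:
        dist[u][v] = min(dist[u][v], length)
        dist[v][u] = min(dist[v][u], length)
    # product varies its last index fastest, so k is the outermost loop.
    for k, i, j in product(range(num_vertices), repeat=3):
        if dist[i][k] + dist[k][j] < dist[i][j]:
            dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def sup_norm_gap(dist, u, v):
    """max over vertices x of |f_u(x) - f_v(x)|, with f_u(x) = d_G(u, x)."""
    return max(abs(dist[x][u] - dist[x][v]) for x in range(len(dist)))


# Hypothetical edge lengths, chosen only for illustration.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.5), (3, 0, 2.5), (1, 3, 1.0)]
dist = shortest_path_metric(4, edges)
gaps_ok = all(
    sup_norm_gap(dist, u, v) <= dist[u][v] + 1e-12
    for u in range(4) for v in range(4)
)
```

Taking $x=u$ gives a gap of exactly $d_{G}(u,v)$, so on the vertex set the bound is attained.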

Another application of Corollary 4 is as follows. Define $\Phi:|G|\rightarrow SpDg$ , $\Phi(v)=\mathrm{Dg}{f_{v}},$ where $SpDg$ denotes the space of persistence diagrams. Given metric graphs $(G_{1},d_{G_{1}})$ and $(G_{2},d_{G_{2}})$ , their persistence distortion distance [17] is

$$d_{PD}(G_{1},G_{2}):=d_{H}(\Phi(|G_{1}|),\Phi(|G_{2}|)),$$

where $d_{H}$ denotes the Hausdorff distance. In other words,

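$$d_{PD}(G_{1},G_{2})=\max\left\{\sup_{D_{1}\in\Phi(|G_{1}|)}\inf_{D_{2}\in\Phi(|G_{2}|)}d_{B}(D_{1},D_{2}),\ \sup_{D_{2}\in\Phi(|G_{2}|)}\inf_{D_{1}\in\Phi(|G_{1}|)}d_{B}(D_{1},D_{2})\right\}.$$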

Note that the diagram $\mathrm{Dg}{f_{v}}$ contains both $0$ - and $1$ -dimensional persistence points, but only points of the same dimension are matched under the bottleneck distance. A local version of the persistence distortion distance, which we will denote by $d_{PD}^{r}$, may be defined as follows: for each base point $v$, only consider the distance function to points within a fixed intrinsic radius $r$.
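Once base points have been sampled, the Hausdorff form of the persistence distortion distance reduces to a max-min computation on a table of bottleneck distances. The sketch below assumes such a table is already available; the numbers are made up for illustration, and a finite sample of base points only approximates the suprema and infima over $|G_{1}|$ and $|G_{2}|$.

```python
def hausdorff_from_matrix(pairwise):
    """pairwise[i][j] holds d_B(D1_i, D2_j) for sampled base points i in G1, j in G2."""
    # sup over diagrams of G1 of the distance to the closest diagram of G2
    forward = max(min(row) for row in pairwise)
    # sup over diagrams of G2 of the distance to the closest diagram of G1
    backward = max(
        min(row[j] for row in pairwise) for j in range(len(pairwise[0]))
    )
    return max(forward, backward)


# Hypothetical bottleneck distances: 3 base points in G1, 2 in G2.
pairwise = [
    [0.5, 2.0],
    [1.0, 0.25],
    [3.0, 1.5],
]
distortion = hausdorff_from_matrix(pairwise)
```

Here the third base point of $G_{1}$ has no diagram of $G_{2}$ closer than 1.5, which determines the value.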

Corollary 6.

If $r\leq r^{\prime}$ , then $d_{PD}^{r}(G_{1},G_{2})\leq d_{PD}^{r^{\prime}}(G_{1},G_{2}).$

Proof.

Let $D_{1}^{r}$ be the persistence diagram for some base point $v\in|G_{1}|$ , where the geodesic distance function is computed in the interval $[0,r]$ . Let $D_{1}^{r^{\prime}}$ be the persistence diagram for the same base point, but where the distance function is computed in the interval $[0,r^{\prime}]$ . Define $D_{2}^{r}$ and $D_{2}^{r^{\prime}}$ similarly for some base point in $|G_{2}|$ . By viewing $D_{i}^{r}$ as a restriction of $D_{i}^{r^{\prime}}$ for $i=1,2$ , we can apply Theorem 2 to show that $d_{B}(D_{1}^{r},D_{2}^{r})\leq d_{B}(D_{1}^{r^{\prime}},D_{2}^{r^{\prime}})$ . Since our choice of base points was arbitrary, this inequality holds for persistence diagrams across all choices of base points in $|G_{1}|$ and $|G_{2}|$ . Therefore, using the definition of the local version of the persistence distortion distance, we can conclude that $d_{PD}^{r}(G_{1},G_{2})\leq d_{PD}^{r^{\prime}}(G_{1},G_{2}).$ ∎

We end with a final remark on how Theorem 2 can be applied to a $d$ -parameter persistence module on any topological space (not restricted to the level set or metric graph settings). A $d$ -parameter persistence module is a family of vector spaces indexed by ${{\mathbb{R}}}^{d}$, $\{{{\mathbb{V}}}_{u}\}_{u\in{{\mathbb{R}}}^{d}}$, together with a family of linear maps $\{\varphi_{{\mathbb{V}}}(u,v):{{\mathbb{V}}}_{u}\to{{\mathbb{V}}}_{v}\}_{u\preceq v}$ such that for $u\preceq v\preceq w\in{{\mathbb{R}}}^{d}$, we have $\varphi_{{\mathbb{V}}}(u,u)={\mathrm{id}}_{{{\mathbb{V}}}_{u}}$ and $\varphi_{{\mathbb{V}}}(v,w)\circ\varphi_{{\mathbb{V}}}(u,v)=\varphi_{{\mathbb{V}}}(u,w)$ [11]. Here, $u\preceq v$ if and only if $u_{i}\leq v_{i}$ for $i=1,\ldots,d$. Any line $L$ in the set of all lines of $\mathbb{R}^{d}$ with direction $\textbf{m}=(m_{1},\ldots,m_{d})$ such that $\displaystyle\min_{i}m_{i}$ is strictly positive gives a one-parameter slice of the $d$ -parameter persistence module. Given two $d$ -parameter persistence modules ${{\textbf{X}}}$ and ${{\textbf{Y}}}$, we define their matching distance [21] to be

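$$d_{match}({{\textbf{X}}},{{\textbf{Y}}}):=\sup_{L}\left(\min_{i}m_{i}\right)d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L},\mathrm{Dg}{{\textbf{Y}}}_{L}),$$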

where $\mathrm{Dg}{{\textbf{X}}}_{L}$ and $\mathrm{Dg}{{\textbf{Y}}}_{L}$ are the persistence diagrams of the $d$ -parameter persistence modules ${{\textbf{X}}}$ and ${{\textbf{Y}}}$ restricted along line $L$. Our result extends naturally to this setting, since each line $L$ yields a one-parameter module. Indeed, if we restrict both $d$ -parameter persistence modules to a region ${{\mathbb{I}}}=I_{1}\times\cdots\times I_{d}$, where each $I_{i}$ is an interval of the real line, then Theorem 2 implies the following corollary.

Corollary 7.

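$$d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})\leq d_{match}({{\textbf{X}}},{{\textbf{Y}}})$$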

where $d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})$ is computed by restricting $\mathrm{Dg}{{{\textbf{X}}}_{L}}$ and $\mathrm{Dg}{{{\textbf{Y}}}_{L}}$ to the subinterval of the line $L$ passing through the region ${{\mathbb{I}}}$ .

Proof.

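For a fixed line $L$ with direction $\textbf{m}$, consider a region ${{\mathbb{I}}}$ restricted to $L$, denoted $I_{L}\subset{{\mathbb{I}}}\cap L$. Recall that $\mathrm{Dg}{{\textbf{X}}}_{L}$ and $\mathrm{Dg}{{\textbf{Y}}}_{L}$ are the persistence diagrams of the $d$ -parameter persistence modules ${{\textbf{X}}}$ and ${{\textbf{Y}}}$ restricted along the line $L$. Based on Theorem 2,

$$d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L}^{I_{L}},\mathrm{Dg}{{\textbf{Y}}}_{L}^{I_{L}})\leq d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L},\mathrm{Dg}{{\textbf{Y}}}_{L}).\tag{2}$$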

From the definition of supremum, we know that $\forall\epsilon>0$ , there is a line $L_{\epsilon}$ such that

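$$d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})-\epsilon<\left(\min_{i}m_{i}\right)d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L_{\epsilon}}^{I_{L_{\epsilon}}},\mathrm{Dg}{{\textbf{Y}}}_{L_{\epsilon}}^{I_{L_{\epsilon}}}).$$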

Using observation (2) above, we see that

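$$d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})-\epsilon<\left(\min_{i}m_{i}\right)d_{B}(\mathrm{Dg}{{\textbf{X}}}_{L_{\epsilon}},\mathrm{Dg}{{\textbf{Y}}}_{L_{\epsilon}}).$$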

The right-hand side is, of course, at most the supremum over all lines $L$, which is the definition of $d_{match}({{\textbf{X}}},{{\textbf{Y}}})$. Hence, for every $\epsilon>0$, we have $d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})-\epsilon<d_{match}({{\textbf{X}}},{{\textbf{Y}}})$; in other words, $d_{match}({{\textbf{X}}}^{{\mathbb{I}}},{{\textbf{Y}}}^{{\mathbb{I}}})\leq d_{match}({{\textbf{X}}},{{\textbf{Y}}})$, as desired. ∎
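The subinterval of a line passing through a box-shaped region, as used in this proof, can be computed coordinate by coordinate. The sketch below assumes the line is parameterized as `origin + t * direction`, which is a convention chosen here rather than one fixed by the text, and `line_box_interval` is a hypothetical helper name.

```python
def line_box_interval(origin, direction, box):
    """Parameter range (t_enter, t_exit) where origin + t*direction lies in the box.

    box is a list of (lo, hi) pairs, one per coordinate. Returns None when the
    line misses the box.
    """
    if min(direction) <= 0:
        raise ValueError("every direction component must be strictly positive")
    # Positive components mean each coordinate increases with t, so the line
    # enters slab i at (lo - o) / m and leaves it at (hi - o) / m.
    t_enter = max((lo - o) / m for o, m, (lo, hi) in zip(origin, direction, box))
    t_exit = min((hi - o) / m for o, m, (lo, hi) in zip(origin, direction, box))
    return (t_enter, t_exit) if t_enter <= t_exit else None


direction = (1.0, 2.0)
slice_weight = min(direction)  # the factor min_i m_i in the matching distance
interval = line_box_interval((0.0, 0.0), direction, [(1.0, 3.0), (0.0, 4.0)])
missed = line_box_interval((0.0, 5.0), (1.0, 1.0), [(0.0, 1.0), (0.0, 1.0)])
```

The strictly positive direction is what makes each slab condition a single interval in `t`, so the intersection is either empty or one interval.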

5 Discussion

Theorem 2 and its corollaries provide explicit relationships between distances computed locally and globally, and the resulting inequalities are very broadly applicable. For instance, the fact that the local bottleneck distance is bounded above by the global bottleneck distance allows for a single global computation to potentially rule out local differences if the global distance is low. If looking for local differences, starting with a global computation may save computational time if there are too many local comparisons to make. On the other hand, the global bottleneck distance being bounded below by the local version allows smaller computations to approach the global truth, while perhaps being more computationally tractable.
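The screening idea in the paragraph above can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the function `screen_local_comparisons`, the threshold, and the toy values are hypothetical, and the local distance is passed in as a callable so that any bottleneck implementation could be plugged in.

```python
from typing import Callable, Hashable, Iterable, List


def screen_local_comparisons(
    global_distance: float,
    windows: Iterable[Hashable],
    local_distance: Callable[[Hashable], float],
    threshold: float,
) -> List[Hashable]:
    """Return the windows whose local distance exceeds the threshold."""
    # Local distance <= global distance (Theorem 2), so a global value at or
    # below the threshold rules out every window without any local work.
    if global_distance <= threshold:
        return []
    return [w for w in windows if local_distance(w) > threshold]


# Hypothetical precomputed local distances, each at most the global value 0.8.
toy_local = {(0, 1): 0.1, (1, 2): 0.7, (2, 3): 0.3}
calls = []


def lookup(window):
    """Stand-in for an expensive local bottleneck computation."""
    calls.append(window)
    return toy_local[window]


# Global distance above the threshold: every window must be checked.
flagged = screen_local_comparisons(0.8, toy_local, lookup, threshold=0.5)
# Global distance below the threshold: no local distance is ever queried.
skipped = screen_local_comparisons(0.4, [(0, 1), (1, 2)], lookup, threshold=0.5)
print(flagged, skipped, len(calls))
```

The saving comes entirely from the early return: when the single global computation is small, none of the local computations run.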

In future work, we would like to extend these ideas to generalized persistence, where instead of a linear sequence of topological spaces one considers topological spaces and transformations that form a poset. In contrast to zigzag persistence, this generalized persistence does not have the notion of a persistence diagram. Instead, we would need to restate our results in terms of the interleaving distance between persistence modules. Moreover, a notion of “local” would have to be defined in the poset setting.

Bibliography


[1] M. Aanjaneya, F. Chazal, D. Chen, M. Glisse, L. Guibas, and D. Morozov. Metric graph reconstruction from noisy data. International Journal of Computational Geometry & Applications, 22(04):305–325, 2012.
[2] M. Ahmed, B. T. Fasy, and C. Wenk. Local persistent homology based distance between maps. In Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, SIGSPATIAL ’14, pages 43–52, New York, NY, USA, 2014. ACM.
[3] P. Bendich, D. Cohen-Steiner, H. Edelsbrunner, J. Harer, and D. Morozov. Inferring local homology from sampled stratified spaces. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS’07), pages 536–546, Oct 2007.
[4] P. Bendich, E. Gasparovic, J. Harer, R. Izmailov, and L. Ness. Multi-scale local shape analysis and feature selection in machine learning applications. In 2015 International Joint Conference on Neural Networks (IJCNN), pages 1–8, July 2015.
[5] P. Bendich, B. Wang, and S. Mukherjee. Local homology transfer and stratification learning. ACM-SIAM Symposium on Discrete Algorithms, pages 1355–1370, 2012.
[6] M. B. Botnan and M. Lesnick. Algebraic stability of zigzag persistence modules. Algebraic & Geometric Topology, 18(6):3133–3204, 2018.
[7] P. Bubenik, V. de Silva, and J. Scott. Metrics for generalized persistence modules. Foundations of Computational Mathematics, 15(6):1501–1531, 2015.
[8] P. Bubenik and T. Vergili. Topological spaces of persistence modules and their properties. arXiv:1802.08117, 2018.
