Metrics and stabilization in one parameter persistence
Wojciech Chachólski, Henri Riihimäki

TL;DR
This paper introduces a new perspective on one-parameter persistence by emphasizing the importance of metric choices, leading to stabilized invariants and practical data analysis applications.
Contribution
It proposes a metric-based framework for one-parameter persistence, focusing on stabilization of invariants rather than decomposition theorems, with theoretical development and empirical evidence.
Findings
Stabilization of discrete invariants via pseudometrics
Development of stable rank invariant theory
Evidence of practical usefulness in data analysis
Abstract
We propose a new way of thinking about one parameter persistence. We believe topological persistence is fundamentally not about decomposition theorems but a central role is played by a choice of metrics. Choosing a pseudometric between persistent vector spaces leads to stabilization of discrete invariants. We develop theory behind this stabilization and stable rank invariant. We give evidence of the usefulness of this approach in concrete data analysis.
Funding: The first author was funded by VR and the Göran Gustafsson foundation. The second author was partly supported by a collaboration agreement between the University of Aberdeen and EPFL.
Wojciech Chachólski, Mathematics Department, KTH, S-10044 Stockholm, Sweden.
Henri Riihimäki, Mathematics Department, University of Aberdeen, Aberdeen AB243UE, Scotland, UK.
Abstract
We propose the use of persistent homology in a supervised way. We believe homological persistence is fundamentally not about decomposition theorems but a central role is played by a choice of metrics. Choosing a pseudometric between persistent vector spaces leads to a model. Fitting this model is what we believe supervised homological persistence is. We develop theory behind constructing such models and we give evidence of the usefulness of this approach in concrete data analysis tasks.
1 Sense of geometry
During the last decade there has been a considerable increase in research focused on persistent homology. This has been fueled by an explosion of applications ranging from neuroscience [12], to vehicle tracking [2], and the characterization of nanomaterials [13], testifying to the usefulness of homology for understanding spaces described by measurements and samplings. All these applications of persistent homology have been in principle exploratory in nature, with some elements of learning based on persistence diagrams. In fact, persistent homology can be regarded as a generalization to higher homologies of clustering methods ($H_0$ persistence) that have been the core of exploratory data analysis for a long time. Although exploratory tools are important, the main research front in modern data science has shifted from exploratory to supervised learning, due to even more spectacular applications of machine learning methods.
Our aim in this paper is to explain how to use persistent homology in a supervised way, which allows optimizing over various models for the observed homological information. The focus is on studying the space of stable translations from homological information into information that can be analysed through more basic operations such as counting and integration, enabling the use of statistical tools on its outcomes. We show how a pseudometric on the set of tame $[0,\infty)$-parametrized vector spaces (see Section 3), which is a natural place where homological invariants of data live, leads to a stable translation. Every such pseudometric therefore gives a model for extracting information from persistent homology. Fitting this model to the training data is what persistence based supervised learning is in our approach. In Section 10 of the paper we illustrate that this strategy can indeed lead to improvements in classification tasks. We consider two such tasks: distinguishing between random point processes on a unit square generated according to different distributions, and distinguishing between activities of ascending and descending stairs of 7 people based on the activity monitoring PAMAP2 data obtained from [14]. By choosing a different model, the overall averaged accuracy improves in both cases. Our goal for this paper however is not to benchmark our approach. Our goal is to present the proof of concept of the key ideas, explain their mathematical background, and indicate that they can lead to improvements in analysing data. Discussing the effectiveness of this approach is planned for a sequel to this article.
Our models are built using a process called hierarchical stabilization of the rank, which assures the necessary stability requirement. It builds on work presented in [6], [7], and [17]. The input to the process is a pseudometric $d$ on the set $\text{Tame}([0,\infty),\text{Vect}_K)$. The output is a Lipschitz-continuous function $\widehat{\text{rank}}_d\colon\text{Tame}([0,\infty),\text{Vect}_K)\to\mathcal{M}$, where $\mathcal{M}$ is the space of Lebesgue measurable functions $[0,\infty)\to[0,\infty)$, in which probability and statistical methods are well developed. We think about this function as the model associated to the pseudometric $d$. In this framework (supervised) persistence analysis is about identifying those pseudometrics $d$ for which structural properties of the (training) data are reflected by the geometry of its image in $\mathcal{M}$ through the function $\widehat{\text{rank}}_d$. The strategy of looking for appropriate pseudometrics can only work if we are able to parametrize explicitly a rich subspace of pseudometrics on $\text{Tame}([0,\infty),\text{Vect}_K)$. Such rich parametrizations would enable the use of, for example, stochastic gradient descent techniques to search through the parameters for suitable pseudometrics, which we intend to explore in the mentioned sequel to this paper. This article builds on our discovery that such parametrizations are indeed possible using Lebesgue measurable functions $[0,\infty)\to[0,\infty)$ with positive values, referred to as densities (see 5.4 and 5.5).
Parametrizing models for persistence analysis by pseudometrics should not be surprising. Discovering appropriate metrics and units to measure physical phenomena is essential in understanding these phenomena. Comparison and interpretation of observations should depend on the phenomena and the experiments they came from, and not just on their values. Different phenomena might require different comparison metrics. We should not restrict ourselves to only bottleneck or Wasserstein distances to compare outcomes of persistence analysis of diverse data sets obtained from a variety of different experiments. We should be able to choose metrics that fit particular experiments. Our goal is to present mathematical foundations of how to do this for outcomes of persistent homology. This also fits well with many recent studies ([3], [8], [19], [18], [20]) which challenge the traditional view in persistence that bars with long lifespans are of importance and smaller bars are to be considered as noise. These studies show that shorter bars and their appearance in the filtration might also carry important information. For example in [8] the authors find that observed diffraction peaks of amorphous silica glass relate to small scale loops in the atomic configurations. What is to be taken as noise thus depends on the analysis at hand, and we advocate that emphasizing the meaningful features is a question of choosing an appropriate metric on $\text{Tame}([0,\infty),\text{Vect}_K)$.
2 Hierarchical stabilization
All the proofs of the propositions presented in this section are placed in Appendix A.
A discrete invariant is a function $I\colon T\to\mathbb{N}$ with values in the set of natural numbers $\mathbb{N}=\{0,1,2,\ldots\}$. We think about $T$ as a collection of data sets or objects that represent them. If $T$ consists of finite metric spaces for example, such an invariant might assign to a metric space in $T$ the number of clusters obtained by applying some clustering algorithm. Our method of converting such a discrete invariant into a stable one, which we call hierarchical stabilization, requires a choice of a pseudometric on $T$, and this is the key step central to persistence analysis in our approach. Recall that a pseudometric is a function $d\colon T\times T\to[0,\infty]$ satisfying reflexivity $d(x,x)=0$, symmetry $d(x,y)=d(y,x)$, and the triangle inequality $d(x,z)\le d(x,y)+d(y,z)$ for any $x$, $y$, and $z$ in $T$. Here $[0,\infty]$ denotes the extended set of non-negative real numbers including $\infty$ with the standard arithmetic and order relation. Its subset of real numbers is denoted by $[0,\infty)$.
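The three pseudometric axioms recalled above can be checked mechanically on a finite collection. The following is a minimal sketch, assuming the collection is indexed by `0, ..., n-1` and the candidate pseudometric is given as a square matrix (a list of lists, possibly containing `math.inf`). The function name `is_pseudometric` and the sample matrices are illustrative choices, not taken from the paper.

```python
import itertools
import math


def is_pseudometric(d, tol=1e-12):
    """Check reflexivity, symmetry and the triangle inequality for a square matrix d."""
    n = len(d)
    # Reflexivity: d(x, x) = 0.
    if any(d[i][i] != 0 for i in range(n)):
        return False
    # Values in [0, inf] and symmetry: d(x, y) = d(y, x).
    for i, j in itertools.product(range(n), repeat=2):
        if d[i][j] < 0 or d[i][j] != d[j][i]:
            return False
    # Triangle inequality: d(x, z) <= d(x, y) + d(y, z).
    for i, j, k in itertools.product(range(n), repeat=3):
        if d[i][k] > d[i][j] + d[j][k] + tol:
            return False
    return True


# Distinct objects at distance 0 are allowed (pseudometric, not metric).
d_ok = [[0, 0, 2], [0, 0, 2], [2, 2, 0]]
# Infinite distances are allowed as well.
d_inf = [[0, math.inf], [math.inf, 0]]
# Fails the triangle inequality: 5 > 1 + 1.
d_bad = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]

results = [is_pseudometric(m) for m in (d_ok, d_inf, d_bad)]
```

The first matrix illustrates why the weaker notion matters here: two different objects may be at distance zero.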
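Once a pseudometric $d$ on $T$ is chosen, for $x$ in $T$, we define $\widehat{I}_d(x)\colon[0,\infty)\to[0,\infty)$ to be the function given by the formula:
$$\widehat{I}_d(x)(\tau):=\min\{I(y)\mid d(x,y)\le\tau\}$$
The number $\widehat{I}_d(x)(\tau)$ is the minimum among all the values $I$ takes on the disk around $x$ with radius $\tau$ with respect to the pseudometric $d$. This function is non-increasing with values in natural numbers and hence Lebesgue measurable. Furthermore there is $t$ such that, for $\tau$ in $[t,\infty)$, the equality $\widehat{I}_d(x)(\tau)=\widehat{I}_d(x)(t)$ holds. This value is called the limit of $\widehat{I}_d(x)$ and is denoted by $\lim(\widehat{I}_d(x))$. Recall that $\mathcal{M}$ denotes the set of Lebesgue measurable functions $[0,\infty)\to[0,\infty)$.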
Definition 2.1.
Hierarchical stabilization of $I\colon T\to\mathbb{N}$ with respect to a pseudometric $d$ on $T$ is the function $\widehat{I}_d\colon T\to\mathcal{M}$ that maps $x$ in $T$ to $\widehat{I}_d(x)$.
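For a finite collection, the hierarchical stabilization can be evaluated directly from its description as a minimum over disks. The sketch below assumes the objects are indexed by `0, ..., n-1`, the invariant is a list of natural numbers, and the pseudometric is a distance matrix; the names `stabilize` and `stabilization_steps` and the toy data are hypothetical.

```python
def stabilize(invariant, dist, x, tau):
    """Minimum of the invariant over the closed disk of radius tau around x."""
    return min(invariant[y] for y in range(len(invariant)) if dist[x][y] <= tau)


def stabilization_steps(invariant, dist, x):
    """Describe the non-increasing step function tau -> stabilize(..., x, tau).

    Returns pairs (tau, value): the function takes `value` from `tau`
    up to the next listed radius. Only distances from x can be jump points.
    """
    steps = []
    for tau in sorted(set(dist[x])):
        value = stabilize(invariant, dist, x, tau)
        if not steps or value < steps[-1][1]:
            steps.append((tau, value))
    return steps


# Toy data: three objects with invariant values 3, 1, 0.
invariant = [3, 1, 0]
dist = [
    [0.0, 1.0, 2.5],
    [1.0, 0.0, 1.5],
    [2.5, 1.5, 0.0],
]

steps_for_first_object = stabilization_steps(invariant, dist, 0)
```

In this toy example the stabilized invariant of the first object starts at its own value 3, drops to 1 once the disk reaches the second object, and reaches its limit 0 once the disk contains the third.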
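The range $\mathcal{M}$ of $\widehat{I}_d$ has a much richer geometry than the set of natural numbers $\mathbb{N}$, the range of $I$. For example, $\mathcal{M}$ has many interesting pseudometrics, among them the standard $L_p$-metric (for $p\ge1$) and a so-called interleaving metric $d_{\bowtie}$:
$$L_p(f,g):=\left(\int_0^\infty|f(x)-g(x)|^p\,dx\right)^{1/p}\qquad d_{\bowtie}(f,g):=\inf\{\epsilon\in[0,\infty)\mid f(\tau)\ge g(\tau+\epsilon)\text{ and }g(\tau)\ge f(\tau+\epsilon)\text{ for all }\tau\text{ in }[0,\infty)\}$$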
The hierarchical stabilization satisfies the following Lipschitz properties:
Proposition 2.1.
Let $d$ be a pseudometric on $T$, $I\colon T\to\mathbb{N}$ a function, and $p\ge1$ a real number. Then, for $x$ and $y$ in $T$:
** 2. 2.
* where *
We think about the hierarchical stabilization as a process of converting a discrete invariant $I\colon T\to\mathbb{N}$ into a stable invariant $\widehat{I}_d\colon T\to\mathcal{M}$ whose values are in a space in which rich probability and statistical methods are well developed. We take advantage of this in our examples in Sections 10.2 and 10.3. Different pseudometrics on $T$ lead to different invariants. In our framework persistence analysis is about identifying pseudometrics on $T$ for which the associated invariants reflect structural properties of $T$. The expectation is that some of these properties should be reflected by the geometry of the image of $T$ in $\mathcal{M}$ described by the $L_p$ or interleaving metrics if an appropriate pseudometric on $T$ is chosen. We refer to the function $\widehat{I}_d\colon T\to\mathcal{M}$ also as the hierarchical stabilization model of $T$ associated to the invariant $I$ and the pseudometric $d$.
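In general there is a loss of information as $\widehat{I}_d$ may map objects that we do not intend to identify to the same function. To retain more information we are going to consider families of pseudometrics on $T$ and the induced stabilizations. Let $\mathcal{M}_2$ denote the set of measurable functions of the form $[0,\infty)\times[0,\infty)\to[0,\infty)$.
Definition 2.2.
Let $I\colon T\to\mathbb{N}$ be a function and $\{d_\alpha\}$ a sequence of pseudometrics on $T$ indexed by $\alpha$ in $[0,\infty)$.
1. The sequence $\{d_\alpha\}$ is called non-decreasing if $d_\alpha(x,y)\le d_\beta(x,y)$ for all $\alpha\le\beta$ in $[0,\infty)$ and $x$, $y$ in $T$.
2. For $x$ in $T$, $\widehat{I}(x)\colon[0,\infty)\times[0,\infty)\to[0,\infty)$ is a function defined as follows:
$$\widehat{I}(x)(\alpha,\tau):=\widehat{I}_{d_\alpha}(x)(\tau)$$
3. The sequence $\{d_\alpha\}$ is called admissible for $I$ if $\widehat{I}(x)$ is Lebesgue measurable for all $x$ in $T$.
4. Assume $\{d_\alpha\}$ is admissible for $I$. Then the function $\widehat{I}\colon T\to\mathcal{M}_2$, mapping $x$ in $T$ to $\widehat{I}(x)$, is called the hierarchical stabilization of $I$ along the sequence $\{d_\alpha\}$.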
Non-decreasing sequences are key examples of universally admissible sequences:
Proposition 2.2.
A non-decreasing sequence of pseudometrics on $T$ is admissible for any $I\colon T\to\mathbb{N}$.
Similarly to , we consider following pseudometrics on , the normalized (for ) and the interleaving metrics:
[TABLE]
The key parameter in the hierarchical stabilization is the choice of a pseudometric. It turns out that with respect to this key parameter the hierarchical stabilization along a sequence of pseudometrics is also stable. Here is one manifestation of this stability:
Proposition 2.3.
Let be a function. Assume is a non-decreasing sequence of pseudometrics on . Let be the hierarchical stabilization of along this sequence. Then, for and in :
** 2. 2.
* where *
The hierarchical stabilization process along a non-decreasing sequence of pseudometrics on converts a discrete invariant into a stable invariant . We can now state our key definition:
Definition 2.3.
Let be a category with its set of objects also denoted by . A sequence of pseudometrics on is called ample for if it is admissible for , and the hierarchical stabilization along this sequence has the following property: and in are isomorphic if and only if .
Let be a category. By definition ample for sequences of pseudometrics on lead to stable embeddings of isomorphism classes of objects in into . By choosing such an embedding, we can think about as a subspace of in which rich probability and statistical methods are well developed. Different sequences which are ample for give different embeddings. Our expectation is that by choosing an appropriate such embedding, structural properties of relevant to a data analysis task could be reflected by the geometry of the image of in described by the or interleaving metrics. The focus of this article is on being the category of tame -parametrized vector spaces (also called one parameter tame persistence modules) and being the minimal number of generators or equivalently the number of bars in the bar decomposition.
3 Formal homological persistence
A typical input for persistent homology is a collection of finite pseudometric spaces $(X,d)$. Since homological persistence is about looking for homological features, each element of such data needs to be transformed into an object reflecting these features more directly. The first step in this transformation is to convert the metric information into a simplicial complex parametrized by the poset $[0,\infty)$, whose elements are referred to as scales in this context. In this paper the Vietoris-Rips construction is used for that purpose, which at a scale $t$ is a simplicial complex $\text{VR}_t(X,d)$ whose $n$-simplices are subsets of $X$ consisting of $n+1$ points which are pairwise at most distance $t$ from each other. For example, two elements $x$ and $y$ in $X$ connect to an edge, or a 1-simplex, when $d(x,y)\le t$. If $s\le t$, then $\text{VR}_s(X,d)\subseteq\text{VR}_t(X,d)$. The obtained filtration indexed by the poset $[0,\infty)$ is denoted by the symbol $\text{VR}(X,d)$.
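The Vietoris-Rips construction at a single scale can be sketched by brute force. The example below assumes a finite pseudometric space given as a distance matrix and enumerates simplices up to a chosen dimension; the function name `vietoris_rips`, the `max_dim` cut-off, and the three-point space are illustrative assumptions, and the enumeration is exponential, so it is only meant for very small inputs.

```python
from itertools import combinations


def vietoris_rips(dist, t, max_dim=2):
    """Simplices of the Vietoris-Rips complex at scale t, as tuples of point indices.

    A subset of k + 1 points is a k-simplex when all its pairwise
    distances are at most t. Only simplices of dimension <= max_dim are listed.
    """
    n = len(dist)
    simplices = []
    for size in range(1, max_dim + 2):
        for subset in combinations(range(n), size):
            if all(dist[i][j] <= t for i, j in combinations(subset, 2)):
                simplices.append(subset)
    return simplices


# Hypothetical three-point space: d(0,1) = d(0,2) = 1 and d(1,2) = 2.
dist = [
    [0, 1, 2 - 1],
    [1, 0, 2],
    [1, 2, 0],
]

complex_at_0 = vietoris_rips(dist, 0)
complex_at_1 = vietoris_rips(dist, 1)
complex_at_2 = vietoris_rips(dist, 2)
```

Increasing the scale only adds simplices, which is the inclusion property that makes the construction a filtration.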
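Note that $\text{VR}(X,d)$ does not add or forget any information about $(X,d)$ and hence is as complicated as the metric space itself. Simplification is therefore necessary, and this is the purpose of the second step in this transformation, in which the $n$-th homology (with coefficients in a chosen field $K$) is applied to the simplicial complexes in the Vietoris-Rips filtration. This results in a functor $H_n(\text{VR}(X,d),K)\colon[0,\infty)\to\text{Vect}_K$ given by the linear functions $H_n(\text{VR}_s(X,d),K)\to H_n(\text{VR}_t(X,d),K)$ induced by the inclusions $\text{VR}_s(X,d)\subseteq\text{VR}_t(X,d)$ for $s\le t$ in $[0,\infty)$. A functor of the form $V\colon[0,\infty)\to\text{Vect}_K$ is also called a $[0,\infty)$-parametrized vector space and the linear function $V_{s\le t}\colon V_s\to V_t$, for $s\le t$ in $[0,\infty)$, is called a transition function.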
The functor $H_n(\text{VR}(X,d),K)$ is not an arbitrary $[0,\infty)$-parametrized vector space. It satisfies two additional properties which follow from the finiteness of $X$:
Definition 3.1.
A $[0,\infty)$-parametrized vector space $V$ is called tame if:
1. the vector space $V_t$ is finite dimensional for every $t$ in $[0,\infty)$,
2. there are finitely many $t_0<\cdots<t_k$ in $[0,\infty)$ such that, for $s<t$ in $[0,\infty)$, a transition function $V_{s\le t}$ may fail to be an isomorphism only if $s<t_i\le t$ for some $i$.
For example constant functors are tame, in particular the $0$ functor. If $X$ is finite, then, for any pseudometric $d$ on $X$, the functor $H_n(\text{VR}(X,d),K)$ is also tame. The symbol $\text{Tame}([0,\infty),\text{Vect}_K)$ denotes the collection of tame $[0,\infty)$-parametrized vector spaces. We refer to the process of assigning to a finite metric space a tame $[0,\infty)$-parametrized vector space as formal persistence, the prominent example being the homology of the Vietoris-Rips construction $H_n(\text{VR}(X,d),K)$.
4 The rank
We recall how to define, calculate, and interpret the rank of a tame $[0,\infty)$-parametrized vector space. These are standard known results, which are included since the rank is the key discrete invariant studied in this paper. The rank is the number of bars in a bar decomposition. However, to calculate the rank one does not need a bar decomposition: calculating the rank is a much easier task than describing a bar decomposition.
4.1 Rank of a parametrized vector space
Let be in . Choose in such that can fail to be an isomorphism only if for some . Set:
[TABLE]
The vector space is finite dimensional and does not depend on the choice of the sequence . Define:
[TABLE]
If and are tame -parametrized vector spaces, then their direct sum is also tame and is isomorphic to . In particular . Furthermore if and only if .
4.2 Maps of parametrized vector spaces
A map or a natural transformation between two -parametrized vector spaces and , denoted by , is a sequence of linear maps for which the following diagram commutes for every in :
[TABLE]
The set of natural transformations between and is denoted by . Tame -parametrized vector spaces together with maps between them and the composition given by the parameter-wise composition is a category which is denoted also by . This is the only category structure we consider on the set . In this category is an epimorphism, a monomorphism or an isomorphism if and only if, for every , the linear function is, respectively, an epimorphism, a monomorphism or an isomorphism of vector spaces.
4.3 Bars
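Let $s<e$ be in $[0,\infty]$ ($s$ for start and $e$ for end). Note that $e$ might be equal to $\infty$. Define $K(s,e)$ to be the $[0,\infty)$-parametrized vector space given by:
$$K(s,e)_t:=\begin{cases}K&\text{if }s\le t<e\\0&\text{otherwise}\end{cases}\qquad K(s,e)_{t\le u}:=\begin{cases}\text{id}&\text{if }s\le t\le u<e\\0&\text{otherwise}\end{cases}$$
We call $K(s,e)$ the bar starting in $s$ and ending in $e$. If $e<\infty$, then $K(s,e)$ is called finite. Note that $K(s,e)$ is tame and $\text{rank}(K(s,e))=1$.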
Let be in and be a -parametrized vector space. The function , assigning to a map the element in , is a bijection. Thus every element in yields a unique map denoted by the same symbol for which . Similarly a set of elements yields a unique map . Its image is denoted by and called the -parametrized vector subspace of generated by . It is the smallest subspace of containing all the ’s. If , then the set is said to generate .
Assume . Then the function , assigning to a map the element in is an inclusion. Its image coincides with . Thus every element in yields a unique for which . This map is also denoted by the symbol .
4.4 Monotonicity of the rank
Let be in . For any choice of a finite set , the subspace is tame. Furthermore . We refer to this property as the monotonicity of the rank.
4.5 Rank and the number of generators
Let be a map between tame -parametrized vector spaces. Let be in such that or can fail to be an isomorphism only if for some . For every , there is a unique linear map making the following diagram commutative:
[TABLE]
Define to be . Again the map does not depend on the choice of the sequence .
It turns out that is an epimorphism if and only if is surjective. This is the key observation that can be used to show that ** coincides with the smallest number of elements generating **. In particular tame -parametrized vector spaces are finitely generated.
4.6 Ends of elements
Let be a tame -parametrized vector space. For in , consider and define the end of to be . Note that either or in which case tameness implies . The induced map is a monomorphism.
4.7 Rank and the number of bars
Let be a tame -parametrized vector space. An element in is defined to generate a bar in , if has a retraction, i.e., a map for which the following composition is the identity:
[TABLE]
In this case is a direct summand of .
A sequence of elements is called a **sequence of bar generators ** for if the induced map is an isomorphism. The fundamental structure theorem states that every tame -parametrized vector space admits a sequence of bar generators. In particular every tame -parametrized vector space is isomorphic to a direct sum of bars.
This structure theorem can be proven by induction on the rank. If , then the empty sequence is a sequence of bar generators. Let . Assume the statement is true if the rank is smaller than . Let be a set of generators of . Set . Among choose for which is the largest. We claim that generates a bar. This implies that is isomorphic to . Thus and by induction admits a sequence of bar generators.
The discrete invariant we focus on in this paper is the rank or equivalently the minimal number of generators, or the number of bar generators:
$$\text{rank}\colon\text{Tame}([0,\infty),\text{Vect}_K)\to\mathbb{N}$$
Our aim is to study its hierarchical stabilizations as explained in Section 2. For that we need to produce pseudometrics on $\text{Tame}([0,\infty),\text{Vect}_K)$. Noise systems in [17] were introduced exactly for this purpose. For implementation on a computer, so-called simple noise systems [7, Definition 8.2] are more convenient. The reason is that simple noise systems are parametrized by contours [7, Theorem 9.6]. Instead of explaining the theory behind noise systems, we focus in this article on discussing only contours and how they can directly be used to define pseudometrics on $\text{Tame}([0,\infty),\text{Vect}_K)$. Contours are also effective in calculating induced hierarchical stabilizations of the rank. We believe however that it is important to be aware of the relation between contours and noise systems.
5 Contours
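Definition 5.1.
A contour is a function $C\colon[0,\infty]\times[0,\infty)\to[0,\infty]$ satisfying the following inequalities for all $a$ and $b$ in $[0,\infty]$ and $\epsilon$ and $\tau$ in $[0,\infty)$:
1. if $a\le b$ and $\epsilon\le\tau$, then $C(a,\epsilon)\le C(b,\tau)$;
2. $a\le C(a,\epsilon)$;
3. $C(C(a,\epsilon),\tau)\le C(a,\epsilon+\tau)$.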
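Let $C$ be a contour. $C$ is an action if $C(a,0)=a$ and $C(C(a,\epsilon),\tau)=C(a,\epsilon+\tau)$ for all $a$ in $[0,\infty]$ and $\epsilon$ and $\tau$ in $[0,\infty)$. $C$ is closed if the set $\{\epsilon\in[0,\infty)\mid C(a,\epsilon)\ge b\}$ is closed for all $a$ and $b$ in $[0,\infty)$. $C$ is regular if the following conditions are satisfied:
- $C(-,\epsilon)\colon[0,\infty]\to[0,\infty]$ is a monomorphism for every $\epsilon$ in $[0,\infty)$,
- $C(a,-)\colon[0,\infty)\to[0,\infty]$ is a monomorphism whose image is $[a,\infty)$ for every $a$ in $[0,\infty)$.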
The first condition of 5.1 makes sure that a contour preserves the poset structures. The second and third one can be depicted graphically as:
[TABLE]
If is a regular contour, then for in :
[TABLE]
Since all the sets on the right above are closed, regular contours are therefore closed. For contours to be useful as tools in data analysis we need methods to produce them. We now present several of them along with examples.
Definition 5.1 gives three functional inequalities implicitly characterizing contours. The last inequality however makes it difficult to give explicit formulas. We can in any case make initial guesses for the form of a contour and then try to find a formula satisfying the requirements of Definition 5.1.
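When guessing a formula for a contour, a numerical spot check of the three inequalities on a grid can rule out bad candidates before attempting a proof. The sketch below assumes the inequalities of Definition 5.1 read: (1) $C(a,\epsilon)\le C(b,\tau)$ whenever $a\le b$ and $\epsilon\le\tau$, (2) $a\le C(a,\epsilon)$, (3) $C(C(a,\epsilon),\tau)\le C(a,\epsilon+\tau)$. The helper `violated_axioms`, the grids, and the square-root candidate are hypothetical illustrations; passing the check on a grid is evidence, not a proof.

```python
import itertools
import math


def violated_axioms(C, a_values, eps_values, tol=1e-9):
    """Return the sorted list of contour inequalities (1, 2, 3) that fail on the grid."""
    violated = set()
    for a, b in itertools.product(a_values, repeat=2):
        for eps, tau in itertools.product(eps_values, repeat=2):
            # Inequality 1: order preservation in both variables.
            if a <= b and eps <= tau and C(a, eps) > C(b, tau) + tol:
                violated.add(1)
    for a, eps in itertools.product(a_values, eps_values):
        # Inequality 2: a <= C(a, eps).
        if a > C(a, eps) + tol:
            violated.add(2)
    for a in a_values:
        for eps, tau in itertools.product(eps_values, repeat=2):
            # Inequality 3: C(C(a, eps), tau) <= C(a, eps + tau).
            if C(C(a, eps), tau) > C(a, eps + tau) + tol:
                violated.add(3)
    return sorted(violated)


A_GRID = [0.0, 0.5, 1.0, 2.0, math.inf]
EPS_GRID = [0.0, 0.5, 1.0, 2.0]


def standard(a, eps):
    return a + eps


def parabolic(a, eps):
    return a + eps ** 2


def square_root(a, eps):
    # Hypothetical candidate, used only to show a failing case.
    return a + math.sqrt(eps)


standard_check = violated_axioms(standard, A_GRID, EPS_GRID)
parabolic_check = violated_axioms(parabolic, A_GRID, EPS_GRID)
square_root_check = violated_axioms(square_root, A_GRID, EPS_GRID)
```

The square-root candidate passes the first two inequalities but fails the third, since $\sqrt{\epsilon}+\sqrt{\tau}$ exceeds $\sqrt{\epsilon+\tau}$ for positive $\epsilon$ and $\tau$.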
5.2 Exponential contours
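Let $f\colon[0,\infty)\to[0,\infty)$ be a non-decreasing function such that $f(0)\ge1$. For $(a,\epsilon)$ in $[0,\infty]\times[0,\infty)$, define $C(a,\epsilon):=f(\epsilon)a$. Then $C$ satisfies the first two inequalities of 5.1. The third inequality is equivalent to:
$$f(\epsilon)f(\tau)\le f(\epsilon+\tau)$$
For instance, since $e^{\epsilon}e^{\tau}=e^{\epsilon+\tau}$, the function $C(a,\epsilon)=e^{\epsilon}a$ associated with the exponential function is a contour. In fact we could choose any base greater than $1$ in place of $e$. Such contours are called exponential. Exponential contours are actions.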
5.3 Standard contour
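Let $f\colon[0,\infty)\to[0,\infty)$ be a non-decreasing function. For $(a,\epsilon)$ in $[0,\infty]\times[0,\infty)$, define $C(a,\epsilon):=a+f(\epsilon)$. Then $C$ satisfies the first two inequalities of 5.1. The third inequality is equivalent to $f(\epsilon)+f(\tau)\le f(\epsilon+\tau)$. Thus for $C$ to be a contour, $f$ should be superlinear: $f(\epsilon)+f(\tau)\le f(\epsilon+\tau)$. For example $C(a,\epsilon)=a+\epsilon$ is a contour called the standard contour. The standard contour is an action which is regular. Another example is the parabolic contour $C(a,\epsilon)=a+\epsilon^2$. The parabolic contour is not an action, however it is regular. In fact all contours of the form $C(a,\epsilon)=a+f(\epsilon)$ are regular if $f$ is superlinear and strictly increasing.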
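As a worked check for the parabolic contour, assuming it is given by $C(a,\epsilon)=a+\epsilon^2$, the third inequality of Definition 5.1 can be verified directly:

```latex
\[
C\big(C(a,\epsilon),\tau\big)
  = a+\epsilon^{2}+\tau^{2}
  \;\le\; a+\epsilon^{2}+2\epsilon\tau+\tau^{2}
  = a+(\epsilon+\tau)^{2}
  = C(a,\epsilon+\tau).
\]
```

The inequality is strict whenever $\epsilon\tau>0$ and $a<\infty$, which is why the parabolic contour fails the equality required of an action, while for the standard contour both sides equal $a+\epsilon+\tau$.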
Contours can also be described by integral equations. In Sections 5.4 and 5.5 we consider a Lebesgue measurable function $f\colon[0,\infty)\to[0,\infty)$ with strictly positive values, referred to as a density. In Section 10 we illustrate visualizations of some densities and the associated contours of the following two types.
5.4 Contours of distance type
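Since $f$ has strictly positive values, for $(a,\epsilon)$ in $[0,\infty)\times[0,\infty)$, there is a unique $D_f(a,\epsilon)$ in $[a,\infty)$ for which:
$$\epsilon=\int_a^{D_f(a,\epsilon)}f(x)\,dx$$
Additivity of integrals gives for $a$, $\epsilon$, and $\tau$ in $[0,\infty)$:
$$\int_a^{D_f(D_f(a,\epsilon),\tau)}f(x)\,dx=\int_a^{D_f(a,\epsilon)}f(x)\,dx+\int_{D_f(a,\epsilon)}^{D_f(D_f(a,\epsilon),\tau)}f(x)\,dx=\epsilon+\tau$$
implying $D_f(D_f(a,\epsilon),\tau)=D_f(a,\epsilon+\tau)$. The inequality $D_f(a,\epsilon)\le D_f(b,\tau)$, for $a\le b$ and $\epsilon\le\tau$, is a consequence of the monotonicity of integrals. If in addition we set $D_f(\infty,\epsilon):=\infty$, then the obtained function $D_f$ is a contour, even an action. It is called of distance type as it describes the distance needed to move from $a$ to the right in order for the area under the graph of $f$ to reach $\epsilon$. Distance type contours are regular. If the density is the constant function $1$, then $D_f(a,\epsilon)=a+\epsilon$ and thus $D_f$ is the standard contour (see 5.3).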
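A distance type contour can be evaluated numerically once the cumulative integral of the density is available. The sketch below assumes the density is supplied through a hypothetical function `F` with $F(x)=\int_0^x f$, continuous and strictly increasing, so that the value $y$ solving $\int_a^y f=\epsilon$ is found by inverting $F$ at $F(a)+\epsilon$ with bisection. The names `invert_increasing` and `distance_contour`, the sample density $f(x)=2x+1$, and the convention of returning infinity when the area is never reached are my own assumptions.

```python
import math


def invert_increasing(F, target, lo):
    """Solve F(y) = target for y >= lo by bisection (F continuous, strictly increasing)."""
    hi = lo + 1.0
    while F(hi) < target:
        hi = lo + 2.0 * (hi - lo)  # expand the bracket
        if hi - lo > 1e12:
            return math.inf  # assumed convention: the area is never reached
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def distance_contour(F, a, eps):
    """Point y >= a at which the area under the density between a and y equals eps."""
    if math.isinf(a):
        return math.inf
    return invert_increasing(F, F(a) + eps, lo=a)


def F_constant(x):
    # Cumulative integral of the constant density 1.
    return x


def F_linear(x):
    # Cumulative integral of the sample density f(x) = 2x + 1.
    return x * x + x


standard_value = distance_contour(F_constant, 1.0, 2.0)   # close to 1 + 2
two_steps = distance_contour(F_linear, distance_contour(F_linear, 0.0, 2.0), 4.0)
one_step = distance_contour(F_linear, 0.0, 6.0)
```

With the constant density the sketch reproduces the standard contour up to bisection error, and for the sample density moving by 2 and then by 4 agrees with moving by 6 at once, as the action property predicts.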
5.5 Contours of shift type
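For $a$ in $[0,\infty)$, there is a unique $y$ in $[0,\infty)$ such that $a=\int_0^y f(x)\,dx$. Define:
$$S_f(a,\epsilon):=\int_0^{y+\epsilon}f(x)\,dx$$
Monotonicity of integrals implies that $S_f$ satisfies the first two inequalities of Definition 5.1. Since $a=\int_0^y f(x)\,dx$ and $S_f(a,\epsilon)=\int_0^{y+\epsilon}f(x)\,dx$, by definition:
$$S_f(S_f(a,\epsilon),\tau)=\int_0^{y+\epsilon+\tau}f(x)\,dx=S_f(a,\epsilon+\tau)$$
The function $S_f$ is therefore a contour which is an action. By writing $S_f(a,\epsilon)=a+\int_y^{y+\epsilon}f(x)\,dx$ for $a=\int_0^y f(x)\,dx$, we see that $S_f$ is a translation of $a$ by the $\epsilon$-step integral of the density. Therefore it is called of shift type. Shift type contours are regular. If the density is the constant function $1$, then $S_f(a,\epsilon)=a+\epsilon$ and hence $S_f$ is the standard contour.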
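A shift type contour can be sketched in the same spirit: first find the point $y$ at which the accumulated area equals $a$, then report the accumulated area at $y+\epsilon$. The code assumes a hypothetical cumulative integral `F` with $F(0)=0$, continuous and strictly increasing, and assumes that $a$ lies in the range of `F`; the names `invert_increasing` and `shift_contour` and the sample density $f(x)=2x+1$ are illustrative.

```python
import math


def invert_increasing(F, target, lo=0.0):
    """Solve F(y) = target for y >= lo by bisection (F continuous, strictly increasing)."""
    hi = lo + 1.0
    while F(hi) < target:
        hi = lo + 2.0 * (hi - lo)  # expand the bracket
        if hi - lo > 1e12:
            return math.inf  # assumed convention: target is outside the range of F
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def shift_contour(F, a, eps):
    """Accumulated area at y + eps, where y is the point with accumulated area a."""
    if math.isinf(a):
        return math.inf
    y = invert_increasing(F, a)
    return F(y + eps)


def F_constant(x):
    # Cumulative integral of the constant density 1.
    return x


def F_linear(x):
    # Cumulative integral of the sample density f(x) = 2x + 1.
    return x * x + x


standard_value = shift_contour(F_constant, 1.0, 2.0)          # close to 1 + 2
two_steps = shift_contour(F_linear, shift_contour(F_linear, 0.0, 1.0), 1.0)
one_step = shift_contour(F_linear, 0.0, 2.0)
```

For the sample density the two constructions differ: starting at $a=0$ with $\epsilon=1$, the shift type value is the area $F(1)=2$, whereas a distance type contour would return the point at which the area reaches 1.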
5.6 Truncating contours
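Let $C$ be a contour. Choose an element $\alpha$ in $[0,\infty]$. For $(a,\epsilon)$ in $[0,\infty]\times[0,\infty)$ define:
$$(C/\alpha)(a,\epsilon):=\begin{cases}C(a,\epsilon)&\text{if }C(a,\epsilon)<\alpha\\\infty&\text{if }C(a,\epsilon)\ge\alpha\end{cases}$$
For example $C/0=\infty$ and $C/\infty=C$. We claim that $C/\alpha$ is a contour. The first two inequalities of Definition 5.1 are clear. It remains to show:
$$(C/\alpha)\big((C/\alpha)(a,\epsilon),\tau\big)\le(C/\alpha)(a,\epsilon+\tau)$$
The inequality is clear if $(C/\alpha)(a,\epsilon+\tau)=\infty$. Assume $(C/\alpha)(a,\epsilon+\tau)<\infty$. This implies that also $C(a,\epsilon)<\alpha$ and $C(C(a,\epsilon),\tau)<\alpha$. Consequently $(C/\alpha)\big((C/\alpha)(a,\epsilon),\tau\big)=C(C(a,\epsilon),\tau)$ and $(C/\alpha)(a,\epsilon+\tau)=C(a,\epsilon+\tau)$, and hence in this case the inequality follows from the fact that $C$ is a contour.
The contour $C/\alpha$ is called the truncation of $C$ at $\alpha$. If $C$ is closed, then so is its truncation $C/\alpha$ for $\alpha$ in $[0,\infty]$. If $\alpha\le\beta$ in $[0,\infty]$, then for $V$ and $W$ in $\text{Tame}([0,\infty),\text{Vect}_K)$:
$$d_{C/\alpha}(V,W)\le d_{C/\beta}(V,W)$$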
5.7 Almost a contour
Let be a contour. Choose an element in . For in define:
[TABLE]
Is the function $C/\!\!/\alpha$ a contour? The second inequality of Definition 5.1 is clear. The third inequality $(C/\!\!/\alpha)\big((C/\!\!/\alpha)(a,\epsilon),\tau\big)\leq(C/\!\!/\alpha)(a,\epsilon+\tau)$ is clear if or and . Assume and . This implies:
[TABLE]
Thus $(C/\!\!/\alpha)\big((C/\!\!/\alpha)(a,\epsilon),\tau\big)=C(C(a,\epsilon),\tau)$ and $(C/\!\!/\alpha)(a,\epsilon+\tau)=C(a,\epsilon+\tau)$. The desired inequality follows from the fact that $C$ is a contour.
If , then since , the inequality holds for any . Thus the function satisfies almost all of the requirements of the Definition 5.1 except possibly for the preservation of the poset relation in the first variable. This last requirement can in fact fail to be satisfied. For example consider the distance contour with respect to the density given in Figure 7. In this case and . Thus and .
It turns out that all the results in this article regarding contours until Theorem 7.1 do not require the assumption of the preservation of the poset relation in the first variable. Exploring generalizations of contours to functions that do not preserve order in the first variable is a part of our current research.
Contours appeared independently in [5] under the name superlinear families, where they were used to define interleaving distances between generalized persistence modules. Since then generalized persistence modules have gathered some interest, see [11] and [15]. In this section we presented a variety of ways of constructing contours, greatly enlarging the supply of examples in [5], in which only the standard contour is given as a concrete example of a superlinear family. The aim of the next section is to show how contours lead to pseudometrics on $\text{Tame}([0,\infty),\text{Vect}_K)$.
6 Constructing pseudometrics from contours
In this section we explain how to use a contour to define a pseudometric on $\text{Tame}([0,\infty),\text{Vect}_K)$. The initial idea was developed together with Oliver Gäfvert, and some of the text and diagrams below were written by him. It is also planned for some of this material to be a part of Oliver's future work. We refer to the thesis work [6] and paper [7] for relevant background that led us together with Oliver to discover this connection between contours and pseudometrics.
To estimate how far apart tame parametrized vector spaces are from each other with respect to a contour we use the notion of equivalences:
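Definition 6.1.
Let $C$ be a contour, $V$ and $W$ be tame $[0,\infty)$-parametrized vector spaces, and $\epsilon$ be in $[0,\infty)$.
1. A map $f\colon V\to W$ is called an $\epsilon$-equivalence (with respect to $C$) if, for every $a$ in $[0,\infty)$ such that $C(a,\epsilon)<\infty$, there is a linear function $W_a\to V_{C(a,\epsilon)}$ making the following diagram commutative:
$$\begin{array}{ccc}V_a&\xrightarrow{\;V_{a\le C(a,\epsilon)}\;}&V_{C(a,\epsilon)}\\{\scriptstyle f_a}\downarrow&\nearrow&\downarrow{\scriptstyle f_{C(a,\epsilon)}}\\W_a&\xrightarrow{\;W_{a\le C(a,\epsilon)}\;}&W_{C(a,\epsilon)}\end{array}$$
2. The objects $V$ and $W$ are called $\epsilon$-equivalent (with respect to $C$) if there is a tame $[0,\infty)$-parametrized vector space $X$ and maps $f\colon V\to X\leftarrow W\colon g$ such that $f$ is a $\tau$-equivalence, $g$ is a $\mu$-equivalence, and $\tau+\mu\le\epsilon$.
3. Let $S:=\{\epsilon\in[0,\infty)\mid V\text{ and }W\text{ are }\epsilon\text{-equivalent}\}$. Define:
$$d_C(V,W):=\begin{cases}\infty&\text{if }S=\emptyset\\\inf S&\text{otherwise}\end{cases}$$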
If , then every map is a [math]-equivalence, all tame -parametrized vector spaces and are [math]-equivalent, and . A monomorphism in is an -equivalence with respect to if and only if the image of is included in the image of for all in such that . In particular is an -equivalence if and only if is the zero function for all such . Furthermore is -equivalent to [math] if and only if is an -equivalence. Thus if and only if is the zero function for all in such that . It is however not true in general that if , then is the zero function for in such that . This depends if is closed or not.
According to Definition 6.1, to estimate and calculate $d_C(V,W)$, the objects $V$ and $W$ are compared through a third tame $[0,\infty)$-parametrized vector space $X$ via a short zig-zag of equivalences $V\to X\leftarrow W$. In principle, to assure the triangle inequality and obtain a pseudometric, one should compare $V$ and $W$ not via short but via long zig-zags of equivalences. The main content of the next proposition is that in our case short zig-zags are sufficient.
Proposition 6.1.
$d_C$ is a pseudometric on $\text{Tame}([0,\infty),\text{Vect}_K)$.
To prove this proposition and explain why short zig-zags are sufficient we need:
Proposition 6.2.
Let be a contour and , , and be tame -parametrized vector spaces.
Composition of - and -equivalences is an -equivalence. 2. 2.
In the following pushout square, is also tame and, if is an -equivalence, then so is :
[TABLE]
Proof.
(1): Consider an -equivalence and a -equivalence . If , then and hence, for any such , there are linear functions making the following diagram commutative:
[TABLE]
The diagonal morphism in this diagram is a linear function whose existence is required for to be an -equivalence.
(2): Tameness of is clear. Assume . Let be a function given by the fact that is an -equivalence. This function fits into the following cube where the dotted arrow is the unique function making this cube commutative (its existence is guaranteed by the universal property of push-outs):
[TABLE]
∎
Proof of Proposition 6.1.
Symmetry is clear. For the triangle inequality consider -equivalent and , and -equivalent and and form the following diagram:
[TABLE]
where is an -equivalence, is an -equivalence, , is a -equivalence, is a -equivalence, , and the central square in this diagram is a push-out. According to 6.2.(2), is an -equivalence and is a -equivalence. Thus 6.2.(1) implies is a -equivalence and is a -equivalence. Since , we can conclude that and are -equivalent. The triangle inequality follows. ∎
We are interested in not just individual pseudometrics but also in their sequences, particularly the non-decreasing ones (see Definition 2.2). To produce such sequences of pseudometrics on we use:
Proposition 6.3.
Assume and are contours such that . Then:
An -equivalence with respect to implies -equivalence with respect to . 2. 2.
.
Proof.
Statement (2) is a direct consequence of (1). To show (1), let be an -equivalence with respect to . If , then also , and hence we can form the following commutative diagram whose diagonal is the function assuring is an -equivalence with respect to :
[TABLE]
∎
7 Stable ranks
We are now ready to discuss our models for supervised persistence:
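Definition 7.1.
Let $C$ be a contour and $d_C$ be the associated pseudometric on $\text{Tame}([0,\infty),\text{Vect}_K)$ (see Proposition 6.1). The hierarchical stabilization (see Section 2) of the rank function $\text{rank}\colon\text{Tame}([0,\infty),\text{Vect}_K)\to\mathbb{N}$ (see Section 4), with respect to $d_C$, is called stable rank and is denoted by:
$$\widehat{\text{rank}}_C\colon\text{Tame}([0,\infty),\text{Vect}_K)\to\mathcal{M}$$
By Definition 2.1, the stable rank assigns to a tame $[0,\infty)$-parametrized vector space $V$ the function $\widehat{\text{rank}}_C(V)\colon[0,\infty)\to[0,\infty)$ defined as follows:
$$\widehat{\text{rank}}_C(V)(\tau)=\min\{\text{rank}(W)\mid d_C(V,W)\le\tau\}$$
Thus $\widehat{\text{rank}}_C(V)$ is non-increasing with natural numbers as values, and therefore there are finitely many elements $0=\tau_0<\cdots<\tau_k$ in its domain such that $\widehat{\text{rank}}_C(V)$ is constant on the open intervals $(\tau_0,\tau_1)$,…, $(\tau_i,\tau_{i+1})$,…, $(\tau_k,\infty)$. Depending on the contour, $\widehat{\text{rank}}_C(V)$ may fail to be right or left continuous.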
The aim of this section is to provide effective ways of calculating the stable rank. If a contour is closed (see Definition 5.1), then the following fundamental properties of the stable rank explain how its values are related to a bar decomposition. One can then use for example the Ripser software [1] for effective calculations of the stable rank.
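Theorem 7.1.
If $C$ is a closed contour, then $\widehat{\text{rank}}_C$ satisfies the following properties:
1. The function $\widehat{\text{rank}}_C(V)$ is right continuous for any $V$ in $\text{Tame}([0,\infty),\text{Vect}_K)$.
2. The function $\widehat{\text{rank}}_C$ is linear: $\widehat{\text{rank}}_C(V\oplus W)=\widehat{\text{rank}}_C(V)+\widehat{\text{rank}}_C(W)$.
3. $\widehat{\text{rank}}_C(K(s,e))(\tau)=\begin{cases}1&\text{if }C(s,\tau)<e\\0&\text{if }C(s,\tau)\ge e\end{cases}$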
According to the third statement of Theorem 7.1, the values of the stable rank are certain counts of bars in a bar decomposition. Note that in our entire set up of the hierarchical stabilization process and in the definition of the stable rank no bar decomposition is mentioned or used. The aim of the hierarchical stabilization is to convert discrete invariants into stable invariants by minimizing over discs. The stable rank therefore encodes in a stable way some information about how ranks of tame -parametrized vector spaces change in certain neighbourhoods. Achieving stability is the key objective of this process. The linearity property (Theorem 7.1.(2)) is then what connects the stable rank with bar decompositions. Traditionally, in persistent homology one considers bar decompositions first and then proves that with respect to certain metrics the associated persistence diagrams are stable. The reversal of this perspective, stability first and decompositions after, has been an important step in our approach to homological persistence. Stability is so fundamental for any method aimed at data analysis that we believe it should be the primary guiding principle in homological persistence. Decompositions can be then used as effective tools for calculating the constructed stable invariants. This change of perspective is vital to multiparameter generalizations of homological persistence where decomposition methods are not available (see [7]).
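The counting interpretation can be sketched in a few lines, under the assumption that for a closed contour $C$ the stable rank of a direct sum of bars $K(s_i,e_i)$ at $\tau$ equals the number of indices $i$ with $C(s_i,\tau)<e_i$ (the form one obtains by combining the linearity and single-bar statements). Bars are given as plain `(start, end)` pairs, for instance as produced by a persistence software; the function names and the sample bars are hypothetical, and no particular software interface is assumed.

```python
import math


def stable_rank(bars, contour, tau):
    """Number of bars (s, e) with contour(s, tau) < e.

    Assumes `contour` is a closed contour and `bars` is a bar decomposition.
    """
    return sum(1 for s, e in bars if contour(s, tau) < e)


def standard(a, eps):
    # Standard contour: a + eps.
    return a + eps


def parabolic(a, eps):
    # Parabolic contour: a + eps^2.
    return a + eps ** 2


# Hypothetical bar decomposition with one infinite bar and two finite ones.
bars = [(0.0, math.inf), (0.0, 1.0), (0.5, 3.0)]

standard_values = [stable_rank(bars, standard, tau) for tau in (0.0, 1.0, 2.0, 2.5)]
parabolic_values = [stable_rank(bars, parabolic, tau) for tau in (0.0, 1.0, 1.6)]
```

With the standard contour a bar is counted at $\tau$ exactly when its length exceeds $\tau$, so the infinite bar is always counted. Changing the contour changes how quickly the finite bars drop out of the count, which is the sense in which the choice of contour acts as a model parameter.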
Theorem 7.1 is a direct consequence of Corollary 7.6 and Proposition 7.2. This strategy to prove Theorem 7.1 is taken from [7] (see [7, Section 8]) and is based on:
Definition 7.2.
Let be a contour, in , and in . The -shift of with respect to , denoted by , is the -parametrized vector subspace of generated by all the elements in the images of the transition functions for all in such that .
The shift operation enjoys the following properties:
Proposition 7.2.
Let be in , in , and be a contour.
1. If is generated by , then is generated by:
[TABLE]
2. is tame.
3. The inclusion is a -equivalence with respect to .
4. A monomorphism is a -equivalence if and only if is contained in the image of .
5. The shift is linear: and are isomorphic.
6. is isomorphic to .
7. .
Proof.
(1) is a direct consequence of the definitions; (2), (3), (4), (5), and (6) follow from (1), and (7) from (6). ∎
If in , then and therefore by the monotonicity of rank (see 4.4), . Thus the function is non-increasing and, similarly to the stable rank, there are finitely many elements in such that is constant on the open intervals ,…, ,…, .
Proposition 7.3.
Let be a tame -parametrized vector space. If is a closed contour, then is a right continuous function, i.e., there are finitely many in such that is constant on the left closed and right open intervals .
Proof.
It is enough to show that for any , there is for which . Let be such that may fail to be an isomorphism only if for some . Choose a sequence of generators of . Consider only these which lie in . Since is closed, there are for which both and are in one of the intervals ,…, . Consequently the transition functions , for all , are isomorphisms and and have the same rank. ∎
Here is the key relation between the stable rank and the shift operation:
Theorem 7.4.
Let be a contour and be in . Then for in :
[TABLE]
Proof.
Since is a -equivalence (see Proposition 7.2.(3)), . This gives the first inequality.
Let be a tame -parametrized vector space for which and . Since , by definition there are maps in where is -equivalence, is -equivalence, and . For any in such that , consider the following commutative diagram where the vertical arrows are the transition functions and is the lift given by the fact that is -equivalence:
[TABLE]
Let and be a minimal set of generators of . Set . For any in the set , define and . We claim that the following inclusions hold:
[TABLE]
which imply , proving the second inequality. The first inclusion is a consequence of . The last inclusion is by definition. It remains to show the middle inclusion .
For any in such that we have the following commutative diagram where all the horizontal arrows indicate the transition functions, vertical arrows are functions induced by and , and and are lifts guaranteed by the fact that is -equivalence and is -equivalence, respectively:
[TABLE]
Commutativity of this diagram implies that, for every such , the image of the transition function belongs to . Since is generated by these images, the inclusion holds. ∎
According to Theorem 7.4, the functions and agree for all but finitely many points:
Corollary 7.5.
Let be a contour and be in . Then there are in such that the functions and agree on the open intervals ,…, ,…, . In particular, for :
[TABLE]
Theorem 7.4 together with Proposition 7.3 gives:
Corollary 7.6.
Assume is a closed contour. Then for any in .
We finish this section with:
Corollary 7.7.
Assume is a closed contour such that for any in . Let be a tame -parametrized vector space. Then:
1. .
2. if and only if .
Proof.
Corollary 7.6 and Proposition 7.2.(1) imply . Since is non-increasing, the identity is equivalent to , which by statement (1) is equivalent to , proving (2). ∎
8 Life span
Let be in . Since , then, for a contour , the value of is either or [math]. As the function is non-increasing, there is in such that:
[TABLE]
We define and call this element in the life span of . If , then the value can be either or [math], depending on the contour. If is closed, then according to Theorem 7.1.(1), . This with the additivity property in Theorem 7.1.(2) gives:
Proposition 8.1.
If is a closed contour, then:
[TABLE]
According to Proposition 8.1, the stable rank, with respect to a closed contour, counts bars whose life span strictly exceeds . Here is how to calculate the life span for regular contours:
Proposition 8.2.
Let be in and be in . If is a regular contour (see Definition 5.1), then:
[TABLE]
Proof.
According to Theorem 7.1.(3):
[TABLE]
This together with the regularity of implies all the claimed equalities. ∎
Corollary 8.3.
If is a regular contour, then:
[TABLE]
9 Regular contours and ampleness
Let be a contour. For every in , we can take its truncation (see 5.6). In this way we get a sequence of contours indexed by such that for in :
[TABLE]
Each of these contours induces a pseudometric on as defined in 6.1. In this way, according to Proposition 6.1, we obtain a sequence of pseudometrics . Furthermore, for all in and in , Proposition 6.3.(2) gives the following inequalities:
[TABLE]
A contour therefore induces a non-decreasing sequence of pseudometrics on , leading to hierarchical stabilization (Definition 2.2.(4)):
[TABLE]
We are now ready to state and prove our key ampleness result (see Definition 2.3 and discussion after):
Theorem 9.1.
Consider the category . If is a regular contour, then the sequence of pseudometrics is ample for the rank function .
Proof.
Let and be tame -parametrized vector spaces. Assume, for every in , . We need to show and are isomorphic.
Since is regular, then it is closed and for all , and hence according to Corollaries 7.6 and 7.7:
[TABLE]
Thus and have the same rank. Assume is isomorphic to and is isomorphic to .
Step 1: Reduction to finite bars. According to Corollary 8.3:
[TABLE]
Thus and are isomorphic to, respectively:
[TABLE]
where for .
Choose in such that for all and , and define:
[TABLE]
Note that and are isomorphic if and only if and are isomorphic. Thus to prove the theorem it is enough to show and are isomorphic.
We claim that, for every in , . This follows from the assumption , the additivity of the stable rank (Theorem 7.1.(2)), and Proposition 8.2 which gives that for all :
[TABLE]
We reduced the theorem to the case when all the bars in the bar decompositions of and are finite.
Step 2: Induction on the rank. Assume is isomorphic to and is isomorphic to where . We are going to prove by induction on the rank that and are isomorphic. The statement is clear if , since in this case both and are isomorphic to [math]. Assume . Let and (see Section 8). Recall that according to Proposition 8.1, for in :
[TABLE]
It follows that . Let and . We claim that . If , then contradicting the assumption.
Since is regular, there is a unique such that . Thus both and contain the bar of the form in their bar decompositions. We can then split off this bar and proceed by induction. ∎
Corollary 9.2.
Tame -parametrized vector spaces and are isomorphic if and only if for all contours .
Let us summarize our methods of producing embeddings of isomorphism classes of tame -parametrized vector spaces into the space of measurable functions of the form . A density , which is a measurable function with strictly positive values, leads to two regular contours: the distance type (see 5.4) and the shift type (see 5.5). According to Theorem 9.1 each of these contours then leads to a sequence of pseudometrics on which is ample for the rank. In this way any density leads to embeddings illustrated in the following diagrams:
[TABLE]
If the density is , then the distance and the shift type contours coincide and so do the induced embeddings. For other densities the contours and the embeddings are different. For example consider the constant densities and , and the density displayed in Figure 7. In Figure 1 we illustrate the following functions:
[TABLE]
where is a tame -parametrized vector space given by the first homology with coefficients of the Vietoris-Rips construction on an IFS point process on a unit square as described in Section 10.2.
10 Using contours
In this section we illustrate how choosing a density and a contour can lead to improvements in classification results based on stable ranks. We emphasize that the focus is not on finding an optimal classifier for a specific case. The aim is to give concrete evidence to support our claim that the choice of metrics is fundamental in homological persistence and to explain how contours are used in concrete analysis. We also hope to convey that the presented theory leads to a practical TDA pipeline that is particularly amenable to machine learning. Further study of the efficacy of this pipeline and the choice of contours for particular tasks is the aim of ongoing research. Two case studies are considered: point processes on a unit square and real data from human activities. To generate the bars needed for our calculations we used the Ripser software [1] with the Vietoris-Rips filtration and homology with coefficients.
10.1 Visualizing bars and contours
The bars of a -parametrized vector space can be parametrized by the start and the life span with respect to the standard contour , . Bars can then be visualized in an -plot as vertical stems. We call this presentation a stem plot. For a reader accustomed to barcodes, the stem plot contains exactly the same information but plotted vertically. The horizontal axis is the filtration value and the vertical axis is the bar length. To account for several bars sharing the same start value, we extend the domain of the stem plot to , where is used to index bars with the same birth. With real data, however, this is needed essentially only for the 0th homology, since in the standard Vietoris-Rips filtration all the points, and hence all the 0th homology classes, are present at filtration value 0.
For a fixed , the relation in Theorem 7.1.(3) describes an area above the parametric curve in the -plane. Setting and applying the transformation , we get a curve . Such curves are typically called contour lines, hence the name contours.
Stem plots become most informative when overlaid with contour lines. The right plot of Figure 2 illustrates a persistence stem plot along with contour lines of distance and shift contours for a few values of (dashed curves). The stem plot comes from one realization of the point processes of Section 10.2. The density function used to calculate the contours is shown in the left plot. With contour lines the vertical axis of a stem plot corresponds to in and the horizontal axis is the filtration value as explained above.
Stem plots and contour lines make Proposition 8.1 easy to understand visually: the value of the stable rank at is the number of those bars that exceed the contour line at . Thus in those regions where contour lines attain low values, homological features are magnified, and vice versa for larger values of contour lines. Stem plots can be an effective tool for understanding stable ranks with respect to different contours and for exploring appropriate ones for a given task.
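The counting rule of Proposition 8.1 can be sketched in a few lines for the simplest case. The snippet below is only an illustration under the assumption that the standard contour is used, so that the life span of a bar is its length (end minus start); the function and variable names are hypothetical and not taken from the paper or from Ripser.

```python
from typing import List, Tuple

# A bar is a (start, end) pair; end may be float("inf") for an infinite bar.
Bar = Tuple[float, float]


def life_spans_standard(bars: List[Bar]) -> List[float]:
    """Life spans under the standard contour (assumed to be bar lengths)."""
    return [end - start for start, end in bars]


def stable_rank_standard(bars: List[Bar], t: float) -> int:
    """Number of bars whose life span strictly exceeds t."""
    return sum(1 for span in life_spans_standard(bars) if span > t)


# Three bars with life spans 0.5, 0.25 and infinity.
bars = [(0.0, 0.5), (0.25, 0.5), (0.5, float("inf"))]
values = [stable_rank_standard(bars, t) for t in (0.0, 0.25, 0.5, 10.0)]
```

Evaluating the function on a grid of values of t gives a non-increasing step function, which is the kind of curve compared between classes in the case studies below. Other contours would change how the life span of a bar is computed, not the counting step.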
10.2 Point processes
Point processes have gathered interest in the TDA community, see for example [4, 9, 16]. We simulated six different classes of point processes on a unit square, described below. For each class we produced 500 simulations, each containing 200 points on average. Let denote that random variable follows probability distribution with parameter . In particular, denotes the Poisson distribution with event rate .
Poisson: We first sampled number of events, where . We then sampled points from a uniform distribution defined on the unit square . Here .
Normal: Again number of events was sampled from , . We then created coordinate pairs , where both and are sampled from normal distribution with mean and standard deviation . Here and .
Matern: A Poisson process as above was simulated with event rate . The obtained points represent parent points, or cluster centers, on the unit square. For each parent, the number of child points was sampled from . A disk of radius centered on each parent point was defined. Then, for each parent, the corresponding number of child points was placed on the disk, distributed uniformly. Note that parent points are not part of the actual data set. We set =40, =5, and .
Thomas: The Thomas process is similar to the Matern process except that instead of uniform distributions, child points are sampled from bivariate normal distributions defined on the disks. The distributions were centered on the parents and had diagonal covariance $\bigl[\begin{smallmatrix}0.1^{2}&0\\ 0&0.1^{2}\end{smallmatrix}\bigr]$.
Baddeley-Silverman: For this process the unit square was divided into equal size tiles with side lengths . Then for each tile, points were sampled, . Baddeley-Silverman distribution is a discrete distribution defined on values with probabilities . For each tile, associated number of points were then uniformly distributed on the tile.
Iterated function system (IFS): We also generated point sets with an iterated function system. For this a discrete distribution is defined on values with corresponding probabilities . We denote this distribution by IFS. Starting from an initial point on the unit square, new points were generated by the recursive formula where , and the functions are given as
[TABLE]
[TABLE]
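As an illustration of the first and third descriptions above, the following sketch simulates a homogeneous Poisson process and a Matern cluster process on the unit square with NumPy. It is not the authors' simulation code. The rate 200 is an assumption based on the stated average of 200 points per simulation, the values 40 and 5 are read as the parent and child rates, and the disk radius 0.1 is a purely hypothetical placeholder because the value is not recoverable from the text.

```python
import numpy as np


def poisson_process(rate: float, rng: np.random.Generator) -> np.ndarray:
    """Sample a Poisson number of points, then place them uniformly on [0, 1]^2."""
    n_points = rng.poisson(rate)
    return rng.uniform(0.0, 1.0, size=(n_points, 2))


def matern_process(parent_rate: float, child_rate: float, radius: float,
                   rng: np.random.Generator) -> np.ndarray:
    """Parents from a Poisson process; children uniform on a disk around each parent."""
    parents = poisson_process(parent_rate, rng)
    clusters = []
    for center in parents:
        n_children = rng.poisson(child_rate)
        # The square root of a uniform variable gives a uniform distribution over the disk area.
        r = radius * np.sqrt(rng.uniform(0.0, 1.0, n_children))
        theta = rng.uniform(0.0, 2.0 * np.pi, n_children)
        offsets = np.column_stack((r * np.cos(theta), r * np.sin(theta)))
        clusters.append(center + offsets)
    if not clusters:
        return np.empty((0, 2))
    # Parents are discarded; children may fall slightly outside the unit square.
    return np.vstack(clusters)


rng = np.random.default_rng(0)
poisson_points = poisson_process(200, rng)       # assumed rate
matern_points = matern_process(40, 5, 0.1, rng)  # radius 0.1 is hypothetical
```

How child points that leave the unit square are handled is not stated in the text, so the sketch simply keeps them.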
Figure 3 shows realizations of the point processes with the given parameters. From the point of view of topological data analysis, the point sets hold no distinct large scale topology. They are therefore well suited to studying the geometric correlations or features in the filtration captured by homologies in degrees 0 and 1, denoted and respectively.
Figure 4 shows stable ranks from distance and shift contours for one realization of the point processes. Corresponding stem plot and contour lines are shown in Figure 2. Note the different character of stable ranks between contours. Distance contour decreases lifespans of bars relative to it making the stable ranks decrease to zero faster and also diminishing their separation (Figure 4 left). Comparing, for example, Poisson and Baddeley-Silverman point processes in Figure 3, Poisson seems to have larger features appearing at larger filtration values. The point structure of Baddeley-Silverman seems to indicate that it has smaller features at smaller filtration values. Shift contour increases lifespans of those smaller Baddeley-Silverman bars around filtration value (right plot in Figure 2) whereas more of the later Poisson bars are shortened by the contour after . This can be seen in Baddeley-Silverman dominating Poisson stable rank (Figure 4 right). Note that the horizontal axes in Figure 4 correspond to the variable of contour while the horizontal axes of stem plots are the filtration values .
The explanation in the previous paragraph exemplifies how the choice of metric allows an analyst to emphasize homological features differently in persistence analysis. As referenced in Section 1, various recent applications have shown that bars of different sizes and also their locations in the filtration might be deemed important for a given analysis task. The framework of hierarchical stabilization facilitates this kind of exploration. For instance, consider a set of samplings of some dynamical phenomenon. Analysis with contours might help in understanding whether observed smaller features are just sampling noise or indicate actual puncturing of the underlying topology of the dynamics. Quantifying the importance of holes is also interesting in relational databases where they indicate missing data values or non-allowed attribute combinations [10].
Figure 5 is a plot of the averages (point-wise means) of stable ranks with respect to the standard contour and distance and shift contours of Figure 2 for 200 simulations of the point processes. Shift contour increases the separation between the stable ranks as compared to the standard contour, whereas distance contour decreases the separation. It is worth noting that Matern and Thomas processes are well distinguished by the shift contour in the right plot even though in their definition they only differ in the distribution used for point clusters.
To test how well the stable ranks with respect to different contours perform in classifying different point processes we conducted the following mean classification procedure:
- For each class choose 200 simulations as a training set. The remaining 300 simulations form the test set for the class.
- Compute the point-wise means of the training set stable ranks with respect to the chosen contour. These mean invariants are used as classifiers, denoted by , where refers to the corresponding homology.
- Denote stable ranks in the test set by . Compute distances between each test element and all classifiers.
- Record the minimum distance found by adding 1 to the corresponding pair of the classifier and the test class. Classification is successful if the classifier and the test belong to the same class (in the optimal case the value of the pair (Poisson , Poisson ) would be 300, for example).
- For cross-validation use 20-fold random subsampling. Randomly sample 200 stable ranks for classifiers; the remaining 300 stable ranks in each class constitute the test sets. Repeat the classification procedure above 20 times and take the classification accuracy to be the average over the folds.
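The procedure above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes every stable rank has already been sampled on a common grid of values and stored as a row of a NumPy array, and it uses an L1 distance between sampled curves as a stand-in, since the distance used in the paper is not recoverable from the text. All function names are hypothetical.

```python
import numpy as np


def mean_classify(train: dict, test: dict):
    """Nearest-mean classification; returns class labels and a confusion-count matrix."""
    labels = sorted(train)
    # Point-wise mean of the training stable ranks of each class.
    classifiers = np.stack([train[label].mean(axis=0) for label in labels])
    confusion = np.zeros((len(labels), len(labels)), dtype=int)
    for i, true_label in enumerate(labels):
        for curve in test[true_label]:
            # Assumed distance: L1 norm of the difference on the common grid.
            dists = np.abs(classifiers - curve).sum(axis=1)
            confusion[i, int(np.argmin(dists))] += 1
    return labels, confusion


def subsample_accuracy(data: dict, n_train: int, n_rounds: int,
                       rng: np.random.Generator):
    """Repeated random subsampling; returns labels and the averaged relative confusion matrix."""
    labels = sorted(data)
    total = np.zeros((len(labels), len(labels)))
    for _ in range(n_rounds):
        train, test = {}, {}
        for label in labels:
            perm = rng.permutation(len(data[label]))
            train[label] = data[label][perm[:n_train]]
            test[label] = data[label][perm[n_train:]]
        _, confusion = mean_classify(train, test)
        # Divide each row by its test-set size to get relative accuracies.
        total += confusion / confusion.sum(axis=1, keepdims=True)
    return labels, total / n_rounds
```

With 500 stable ranks per class, the setting described above would correspond to calling `subsample_accuracy(data, n_train=200, n_rounds=20, rng=np.random.default_rng())`; the diagonal of the returned matrix then holds the per-class accuracies.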
Cross-validated classification accuracies with the standard contour are reported in the confusion matrices of Figure 6. The confusion matrices show relative accuracies after dividing by 300 after each fold and averaging after the full cross-validation run. The mean classification accuracy, taken as the average over classes (average of the diagonal), is 85% for and 73% for . The classification procedure performs comparably to or better than the hypothesis testing against the homogeneous Poisson process in [4]. Note that no assumptions or parameter selections were involved in our methodology other than the split between training and test samples (200 and 300, respectively).
Figure 8 shows cross-validated classification accuracies for stable ranks with the shift contour described in Figure 7. We thus increase the lifespans of features appearing in the middle of the filtration. The overall classification accuracy increased to 78%. In particular, the classification accuracy of the Thomas process was drastically improved, as shown in the confusion matrix of Figure 8. Also noteworthy is the improvement in the accuracy of the normal and Poisson processes. Using the shift contour thus captures more relevant distinguishing homological information of the point processes compared to the standard contour.
10.3 Activity monitoring
As an application to real data we studied activity monitoring of different physical activities. The data set used was PAMAP2, obtainable from [14]. It makes sense to use all the persistence information, i.e. to combine homologies of different degrees into a single classification scheme. In this section we demonstrate how this is enabled by stable ranks and our pipeline.
The data consisted of seven persons from the PAMAP2 data set performing different activities such as walking, cycling, vacuuming and sitting. Test subjects were fitted with three inertial measurement units (IMUs), one each on the wrist, ankle and chest, and a heart rate monitor. Measurements were registered every 0.1 seconds. Each IMU measured 3D acceleration, 3D gyroscopic and 3D magnetometer data. One data set thus consisted of 28-dimensional data points indexed by 0.1 second timesteps.
We looked at two activities in this case study: ascending and descending stairs. At the outset one would expect these activities to be very similar and therefore difficult to distinguish. For persistence analysis we randomly sampled without replacement 100 points from each data set, repeated 100 times. For each of the 7 subjects we thus obtained 100 resamplings from their activity data. We computed and persistence for each sampling. The classification procedure was the same as outlined in Section 10.2 except we combined both homologies in the classifier as follows. We took the mean of 40 out of 100 stable ranks both in and . We thus obtained 14 classifier pairs corresponding to all (subject, activity) classes. Remaining 60 stable ranks formed test data pairs in each class. For a pair we then considered
[TABLE]
Again the classification is successful if the minimum is obtained with and belonging to the same (subject, activity) class.
Results for 20-fold random subsampling cross-validation are shown on the left in Figure 9 for the standard contour, with an overall accuracy of 60%. For the classification with a different contour we use the standard contour for and the shift contour of Figure 10 for . The results are shown on the right in Figure 9. The shift contour increases lifespans of features appearing at larger filtration values. Exploring the stem plots (Figure 10) for different data sets shows that bars are sparse at larger filtration values (some data sets have none) and that their lengths vary significantly between different classes of data. This observation suggests using a contour that emphasizes bars at larger filtration values, with which the classification accuracy increases to 65%. Note in particular the increase in accuracy for subject 4. Also noteworthy is that ascents mainly get confused with other ascents, and likewise for descents. These data thus exhibit clearly different character and using an appropriate contour makes this difference more pronounced.
10.4 Choosing the contour
The examples above illustrate how one does data analysis by choosing a better-suited contour, which gives a pseudometric on . In the examples this choice was made by visually inspecting stem plots and contour lines. The next step of our pipeline is to automate this process. This is the central reason behind contours and induced metrics: we want to optimize over the space of metrics to find more distinguishing invariants for objects in . This leads to optimizing in an appropriate function space since contours arise from density functions (see Section 5). Another way would be to represent densities by functions whose shape is controlled by a few parameters, such as a beta distribution with its two shape parameters. The optimization then reduces to these parameters.
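A two-parameter family of densities of the kind mentioned above could look as follows. This is only a sketch of one possible parametrization: the rescaling of the beta density to an interval of filtration values and the small positive floor (added so that the density stays strictly positive, as required of densities in Section 9) are assumptions made for illustration, and the names `beta_density`, `lo`, `hi` and `floor` are hypothetical.

```python
import math


def beta_density(x: float, a: float, b: float,
                 lo: float = 0.0, hi: float = 1.0, floor: float = 1e-3) -> float:
    """Strictly positive density on the real line shaped by a Beta(a, b) profile on [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be larger than lo")
    u = (x - lo) / (hi - lo)  # rescale x to the unit interval
    if u <= 0.0 or u >= 1.0:
        # Outside the open interval only the floor contributes.
        return floor
    # Beta probability density computed in log space for numerical stability.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_pdf = log_norm + (a - 1.0) * math.log(u) + (b - 1.0) * math.log(1.0 - u)
    return floor + math.exp(log_pdf) / (hi - lo)


# A density peaked in the middle of the filtration range [0, 1].
middle_weight = beta_density(0.5, 2.0, 2.0)
```

With such a family, choosing a contour amounts to a search over the pair of shape parameters, for example a grid search scored by cross-validated classification accuracy. That search strategy is a suggestion and is not described in the text.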
Appendix A Proofs of Propositions from Section 2
Proof of Proposition 2.1.
(1): If , there is nothing to prove. Assume . For in , there are inclusions and which imply and . As this happens for all , we can conclude .
(2): Using (1), it is enough to prove that, for non-increasing functions and :
[TABLE]
The inequality is clear if . Assume there is such that and for any . This together with the fact that and are non-increasing imply and where and is the function . The desired inequality is then a consequence of being non-increasing and the fact for in and which give:
[TABLE]
[TABLE]
∎
Proof of Proposition 2.2.
Let be a non-decreasing sequence of pseudometrics on . Choose in . For in , set to be the biggest natural number not bigger than . Define and . In this way, any leads to a new non-decreasing sequence of pseudometrics on . Let be the function corresponding to this new sequence as defined in 2.2.(2). Since is constant on intervals of the form where is a natural number, the function is Lebesgue measurable as it is constant on left closed rectangles that cover . Note that is the limit of as goes to [math]. As a limit of measurable functions, is then also measurable. ∎
Proof of Proposition 2.3.
Since to prove this proposition one can use exactly the same arguments as in the proof of Proposition 2.1, we illustrate how to show statement (1) only. If , then the statement is clear. Assume . Since is non-decreasing, for any in , we have an inclusion which yields . By symmetry also , and hence . ∎
References
- [1] U. Bauer. Ripser software. github.com/Ripser/ripser.
- [2] P. Bendich, S. Chin, J. Clarke, J. Desena, J. Harer, E. Munch, A. Newman, D. Porter, D. Rouse, N. Strawn, and A. Watkins. Topological and statistical behavior classifiers for tracking applications. IEEE Transactions on Aerospace and Electronic Systems, 52:2644–2661, 2016.
- [3] P. Bendich, J. Marron, E. Miller, A. Pieloch, and S. Skwerer. Persistent homology analysis of brain artery trees. The Annals of Applied Statistics, 10:198–218, 2016.
- [4] C. Biscio and J. Møller. The accumulated persistence function, a new useful functional summary statistic for topological data analysis, with a view to brain artery trees and spatial point process applications. arXiv:1611.00630, 2016.
- [5] Peter Bubenik, Vin de Silva, and Jonathan Scott. Metrics for generalized persistence modules. Foundations of Computational Mathematics, 15(6):1501–1531, 2015.
- [6] Oliver Gäfvert. Algorithms for multidimensional persistence. Master’s Thesis, KTH, 2016.
- [7] Oliver Gäfvert and Wojciech Chachólski. Stable invariants for multidimensional persistence. arXiv:1703.03632, 2017.
- [8] Y. Hiraoka, T. Nakamura, A. Hirata, E. Escolar, K. Matsue, and Y. Nishiura. Hierarchical structures of amorphous solids characterized by persistent homology. Proceedings of the National Academy of Sciences, 113:7035–7040, 2016.
