On limit theorems for persistent Betti numbers from dependent data
Johannes Krebs

TL;DR
This paper establishes limit theorems for persistent Betti numbers derived from dependent time series and random fields, extending previous results beyond independent or stationary point process data.
Contribution
It provides the first general limit theorems for persistent Betti numbers in dependent data settings, broadening the applicability of topological data analysis.
Findings
Derived limit theorems for persistent Betti numbers under dependence
Extended convergence results to time series and random fields
Applicable to a wide range of dependence structures
Abstract
We study persistent Betti numbers and persistence diagrams obtained from time series and random fields. It is well known that the persistent Betti function is an efficient descriptor of the topology of a point cloud. So far, convergence results for the $(r,s)$-persistent Betti number of the $q$th homology group, $\beta^{r,s}_q$, were mainly considered for finite-dimensional point cloud data obtained from i.i.d. observations or stationary point processes such as a Poisson process. In this article, we extend these considerations. We derive limit theorems for the pointwise convergence of persistent Betti numbers in the critical regime under quite general dependence settings.
keywords:
Critical regime, Dependent data, Functional data, Limit theorems, Markov chains, Marton coupling, Persistent Betti numbers, Persistence diagrams, Point processes, Time series, Topological data analysis, Random fields, Random geometric complexes.
MSC:
[2010] Primary: 60D05, 60G55; Secondary: 60F10, 37M10, 60G60.
journal: arXiv.org
affiliation: Institute for Applied Mathematics, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
Topological data analysis (TDA) is a comparatively young field in (applied) mathematics at the intersection of computational geometry, probability theory, mathematical statistics and machine learning. Seminal papers which popularized the ideas of TDA are Edelsbrunner et al. (2000), Zomorodian and Carlsson (2005) and Carlsson (2009). The monograph of Boissonnat et al. (2018) offers an introduction. Statistical aspects of TDA are discussed in the surveys of Chazal and Michel (2017) and Bobrowski and Kahle (2018).
In this article, we will focus on a special topic in persistent homology, which itself is the major branch in TDA: We study the large sample behavior of persistent Betti numbers and the corresponding persistence diagram obtained from time series or random fields.
So far, the literature has focused on point cloud data obtained from two major sources. On the one hand, there are various limit theorems for persistent Betti numbers obtained from stationary point processes as a rather general class; a prominent example is the homogeneous Poisson process. On the other hand, the binomial process (a sample of i.i.d. data) has been studied intensively, too.
In early contributions, Kahle (2011) investigates the asymptotic behavior of Betti numbers in the subcritical, critical and supercritical regimes. Extensions are given by Kahle and Meckes (2013) and Yogeshwaran and Adler (2015). Of these three asymptotic regimes, the critical (or thermodynamic) regime certainly receives the most attention; in the following, we also limit the discussion in the introduction to this case.
One of the first major contributions to study large deviation inequalities and central limit theorems for the Poisson and binomial sampling schemes in the critical regime is the work of Yogeshwaran et al. (2017). Extensions to persistent Betti numbers and persistence diagrams are given in Hiraoka et al. (2018). Trinh (2019) provides an abstract result for the asymptotic normality of Betti numbers. Krebs and Polonik (2019) study the stabilizing properties of, and related central limit theorems for, Betti numbers built from non-homogeneous Poisson or binomial processes. Strong laws of large numbers for Betti numbers obtained from the Poisson or the binomial process on general manifolds are considered in Goel et al. (2018). Other recent contributions which also discuss limiting results for Betti numbers are Owada (2018) and Owada and Thomas (2020). Divol and Polonik (2019) study the limiting behavior of the persistence diagram.
In the context of time series, the behavior of Betti numbers has been mainly investigated in applications. Islambekov et al. (2020) combine the TDA methodology with classical methods for change point detection. Classification problems for time series using methods from TDA are considered in Seversky et al. (2016) and in Umeda (2017). The applications of TDA to networks obtained from financial data are studied in Gidea (2017) and Gidea and Katz (2018); here the methods of TDA measure a type of high-dimensional and time-dependent correlation in the network.
The persistence landscape (Bubenik (2015)) is an efficient summary statistic of the persistence diagram and is quite popular in machine learning; we also refer to Chazal et al. (2014) and Kim et al. (2020) for related contributions.
The aim of this paper is to provide two advances in the study of persistent Betti numbers in the context of time series and random fields. On the one hand, we study the large sample behavior of the expectation of persistent Betti numbers obtained from time series and random fields. More precisely, for the time series case, let be a stationary Markov chain of order (w.r.t. its natural filtration) with a continuous and strictly positive joint density of . Write for the marginal density of each . It is well known that for an -binomial process , which consists of i.i.d. observations with marginal density , the limit of exists. Using the nearly additive properties of persistent Betti numbers, we show that the corresponding normalized expectation for Markov chains converges to the same limit. In fact, denoting the Čech or Vietoris-Rips filtration, we have
[TABLE]
and where is the limit of for an -binomial process on with uniform density . We also prove a related strong law of large numbers. Doing so, we can also conclude convergence results for persistence diagrams. Moreover, we establish similar convergence results for stationary random fields.
On the other hand, we establish an exponential inequality and give strong laws of large numbers for persistent Betti numbers, which are not exclusively derived from point clouds on . Instead, we also allow for functional data as a potential data source. The presented exponential inequality relies on the concept of the Marton coupling, see Marton (2003). Marton couplings have also been successfully used in the past to derive concentration inequalities of the McDiarmid-type, see also Samson (2000) and Paulin (2015).
The remainder of this paper is organized as follows. In Section 1, we give the notation used throughout the manuscript and outline the basic concepts of persistent homology. In Section 2, we describe the dependence structure assumed for our time series model and present our main results related to the time series case. In Section 3, we study the extension of our results to random fields. The proofs are contained in Section 4; further deferred calculations are contained in Appendix A.
1 Notation
This section does not aim to make the paper self-contained, which would be impossible. Rather, it is meant to help readers from other areas become familiar with the vocabulary and the basic concepts of topological data analysis.
We begin with some general notation. We write for the natural numbers starting at 1; if we include 0, we write . We write for the cardinality of a countable set . We work on a separable Banach space . We write for the metric which is obtained from the norm on and for the Borel $\sigma$-field on . is equipped with the measure . The measure is non-atomic and $\sigma$-finite. Then we write for the closed -ball around . The diameter of a set is . Let and write for its -dimensional Lebesgue measure as well as for its -offset.
Write for the -fold product measure on the product space . The essential supremum of a real-valued function defined on is abbreviated by . We write simply for the supremum norm of a continuous function on .
Let be a probability space and let be two state spaces. Consider two random variables and . Assume that admits a conditional distribution given . We write for this distribution.
In order to abbreviate a subset of an ordered sample , say, we write for the subset , . Given a time series , we write for the associated point cloud which has no ordering.
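Given a metric space $M$ and Radon measures $\mu, \mu_1, \mu_2, \ldots$ on $M$, we say that $\mu_n$ converges vaguely to $\mu$ if
$$\lim_{n \to \infty} \int_M f \, \mathrm{d}\mu_n = \int_M f \, \mathrm{d}\mu \quad \text{for all } f \in C_c(M),$$
where $C_c(M)$ is the class of all continuous functions on $M$ with compact support. We indicate this by writing $\mu_n \xrightarrow{v} \mu$.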
We construct the filtration from the Čech or the Vietoris-Rips complex. If is a finite subset of and , these complexes are defined by
[TABLE]
In the following, the notation refers to both the Čech and the Vietoris-Rips complex. If we additionally want to specify the point cloud or the filtration parameter , we write or . The corresponding filtration is given by . It is a direct consequence of the homogeneity of that for the complexes and are combinatorially isomorphic.
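As an illustration of the construction, the following minimal sketch builds a Vietoris-Rips complex for a small point cloud in Euclidean space. It assumes the convention that a simplex is included whenever its diameter is at most the filtration parameter; other normalizations (for instance twice the parameter) are common, so this choice is an assumption and not necessarily the one used here. The names `vietoris_rips` and `max_dim` are hypothetical.

```python
from itertools import combinations
import math


def vietoris_rips(points, r, max_dim=2):
    """Simplices (as sorted index tuples) whose diameter is at most r.

    Assumption: a simplex belongs to the complex iff all pairwise
    Euclidean distances of its vertices are <= r.
    """
    n = len(points)
    simplices = []
    for size in range(1, max_dim + 2):  # size = number of vertices
        for sigma in combinations(range(n), size):
            # A single vertex has no pairs, so it is always included.
            if all(math.dist(points[i], points[j]) <= r
                   for i, j in combinations(sigma, 2)):
                simplices.append(sigma)
    return simplices
```

Because the Euclidean norm is homogeneous, scaling the points and the parameter by the same positive factor leaves the list of index tuples unchanged, which is the combinatorial isomorphism mentioned above.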
The dimension of a simplex is its cardinality minus 1. If has dimension , it is a -simplex. Write for the set of -simplices in a complex . Moreover, for a measurable set and a point cloud , we write for the number of -simplices in with at least one vertex in .
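We use the field $\mathbb{F}_2 = \{0,1\}$ to build the homology groups and the Betti numbers of a simplicial complex $K$. Define for $q \ge 0$ the space of $q$-chains $C_q(K)$ to be the $\mathbb{F}_2$-vector space generated by the $q$-simplices in $K$. So the elements of $C_q(K)$ are formal sums (“$q$-chains”) $\sum_j a_j \sigma_j$, $a_j \in \mathbb{F}_2$, $\sigma_j$ a $q$-simplex. The sum of two $q$-chains is their symmetric difference because the coefficients are in $\mathbb{F}_2$.
The boundary operator $\partial_q$ relates $C_q(K)$ and $C_{q-1}(K)$ by mapping a $q$-simplex $\sigma = \{x_0, \ldots, x_q\}$ to $\partial_q \sigma = \sum_{i=0}^{q} \{x_0, \ldots, x_{i-1}, x_{i+1}, \ldots, x_q\}$. For a general chain $c = \sum_j a_j \sigma_j$, the boundary operator is then $\partial_q c = \sum_j a_j \, \partial_q \sigma_j$.
The boundary operator satisfies $\partial_{q-1} \circ \partial_q = 0$ (“a boundary has no boundary”). This property enables the construction of homology groups of $K$. Let $Z_q(K)$ be the subspace of $C_q(K)$ consisting of the $q$-cycles, those elements whose boundary is 0 under $\partial_q$. Let $B_q(K)$ be the subspace of $C_q(K)$ that consists of the boundaries of elements in $C_{q+1}(K)$ (which lie in $Z_q(K)$).
The homology groups are defined as $H_q(K) = Z_q(K) / B_q(K)$, the cycles modulo the boundaries in dimension $q$. Loosely speaking, the elements in $H_q(K)$ represent “holes” in the simplicial complex $K$. These are closed loops, voids or cavities, whose interior cannot be filled by other elements of the complex. Like $Z_q(K)$ and $B_q(K)$, $H_q(K)$ is a vector space.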
The $q$th Betti number of a simplicial complex $K$ is the dimension of $H_q(K)$, viz.,
$$\beta_q(K) = \dim H_q(K) = \dim Z_q(K) - \dim B_q(K).$$
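Since the Betti number is the dimension of the cycles minus the dimension of the boundaries, it can be computed from ranks of boundary matrices over the two-element field. The following is a minimal sketch under the assumption that the complex is given as a list of vertex tuples and is closed under taking faces; the helper names `rank_mod2` and `betti_numbers` are hypothetical. It uses that the dimension of the cycle space in degree q equals the number of q-simplices minus the rank of the boundary operator in degree q.

```python
from itertools import combinations


def rank_mod2(vectors):
    """Rank over F_2 of vectors encoded as ints (bit i = i-th coordinate)."""
    basis = {}  # leading bit position -> reduced vector with that leading bit
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]  # cancel the leading bit and keep reducing
    return len(basis)


def betti_numbers(simplices, max_q):
    """Betti numbers beta_0, ..., beta_max_q of a face-closed complex."""
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, set()).add(tuple(sorted(s)))
    by_dim = {q: sorted(group) for q, group in by_dim.items()}

    def boundary_rank(q):
        # Rank of the boundary operator from q-chains to (q-1)-chains.
        if q == 0 or q not in by_dim:
            return 0
        face_index = {f: i for i, f in enumerate(by_dim[q - 1])}
        columns = []
        for s in by_dim[q]:
            col = 0
            for face in combinations(s, q):  # the q+1 faces of s
                col ^= 1 << face_index[face]
            columns.append(col)
        return rank_mod2(columns)

    # beta_q = (#q-simplices - rank d_q) - rank d_{q+1}
    return [
        len(by_dim.get(q, [])) - boundary_rank(q) - boundary_rank(q + 1)
        for q in range(max_q + 1)
    ]
```

For the boundary of a triangle (three vertices and three edges) this returns one connected component and one loop; adding the 2-simplex fills the loop.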
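So, $\beta_q(K)$ is the number of $q$-dimensional holes in $K$. $H_q(K)$ and $\beta_q(K)$ provide topological information from a single simplicial complex. Given a filtration $\mathcal{K} = (K_r)_{r \ge 0}$, the persistent homology provides more topological details. The natural inclusions $Z_q(K_r) \subseteq Z_q(K_s)$ and $B_q(K_r) \subseteq B_q(K_s)$ for $r \le s$ provide the inclusion map
$$\iota_q^{r,s} \colon H_q(K_r) \to H_q(K_s), \quad z + B_q(K_r) \mapsto z + B_q(K_s).$$
We define the persistent homology groups of the filtration by
$$H_q^{r,s}(\mathcal{K}) = \operatorname{im} \iota_q^{r,s} \cong \frac{Z_q(K_r)}{Z_q(K_r) \cap B_q(K_s)}, \quad 0 \le r \le s.$$
Loosely speaking, nonzero elements in $H_q^{r,s}(\mathcal{K})$ represent topological features born before or at time $r$ and which persist until a time greater than $s$. The dimension of $H_q^{r,s}(\mathcal{K})$, i.e., the number of these features, is the persistent Betti number.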
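Definition 1.1 (Persistent Betti number).
Let $\mathcal{K} = (K_r)_{r \ge 0}$ be a filtration and let $q \ge 0$. The persistent Betti number of dimension $q$ for the parameter pair $(r,s)$, $0 \le r \le s$, is
$$\beta_q^{r,s}(\mathcal{K}) = \dim H_q^{r,s}(\mathcal{K}) = \dim Z_q(K_r) - \dim \bigl( Z_q(K_r) \cap B_q(K_s) \bigr).$$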
At this point there is an important difference between the Čech and Vietoris-Rips complex in the special case where we consider the Euclidean space $\mathbb{R}^d$. While in the Čech complex the homology degree is bounded by $d-1$, the Vietoris-Rips complex can have nontrivial cycles of every possible dimension (see also Bobrowski and Kahle (2018)).
The $q$th persistence diagram summarizes the evolution of the homology groups; it is a multiset of points in $\Delta = \{(b,d) : 0 \le b < d \le \infty\}$. Each point $(b,d)$ in the $q$th persistence diagram corresponds to a $q$-dimensional hole (feature) in the filtration which is born (appears for the first time) at time $b$ and dies (disappears in the filtration) at time $d$. The lifetime $d - b$ of this feature is called the persistence. $d = \infty$ means that the feature has an infinite lifetime. Persistence diagrams exist given mild assumptions on the filtration, see Chazal et al. (2016). Also in the case of a random point cloud, e.g., an i.i.d. sample, the persistence diagram can inherit certain smoothness properties from the point cloud, see Chazal and Divol (2018).
Let $D_q$ be the $q$th persistence diagram given as a multiset of points. Then in the following we understand $D_q$ as a counting measure $\xi_q$ on $\Delta = \{(b,d) : 0 \le b < d \le \infty\}$ defined as
$$\xi_q = \sum_{(b,d) \in D_q} \delta_{(b,d)}.$$
$\xi_q$ is related to the $q$th persistent Betti number as follows
$$\beta_q^{r,s} = \xi_q \bigl( [0,r] \times (s,\infty] \bigr), \quad 0 \le r \le s.$$
This means $\beta_q^{r,s}$ counts the number of $q$-dimensional features in the upper left rectangular area with vertex $(r,s)$ in the persistence diagram. So when $s$ is much larger than $r$, the persistent Betti number represents the number of $q$-dimensional features with a high persistence. It is clear that the values of the $q$th persistence diagram also describe the persistent Betti function completely.
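The relation between the persistence diagram and the persistent Betti number can be sketched as a simple counting rule: count the diagram points that are born at or before the first parameter and die strictly after the second. The sketch below assumes the diagram is given as a list of (birth, death) pairs with `math.inf` for features that never die; the function name is hypothetical.

```python
import math


def persistent_betti_from_diagram(diagram, r, s):
    """Number of diagram points (b, d) with b <= r and d > s.

    `diagram` is a list of (birth, death) pairs, repeated pairs are
    counted with multiplicity, and death may be math.inf.
    """
    if r > s:
        raise ValueError("expected r <= s")
    return sum(1 for b, d in diagram if b <= r and d > s)
```

With the diagram `[(0.0, math.inf), (0.1, 0.5), (0.3, 0.9), (0.6, 1.2)]` and the pair (0.3, 0.5), two features are counted: the one with infinite lifetime and the one born at 0.3. The feature dying exactly at 0.5 does not persist beyond 0.5 and is excluded.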
2 Persistent Betti numbers obtained from time series
This section contains the main results of this paper. We derive an exponential inequality for persistent Betti numbers from a rather general class of stochastic processes, which also applies to functional data and random fields after a re-enumeration of the coordinates, as we will see below. For the special case of an $\mathbb{R}^d$-valued time series, we also give the large sample behavior of the expectation and study the vague convergence of the corresponding persistence diagram.
2.1 The data generating process
Consider a stationary process defined on and taking values in . (A special case would be equipped with the Borel $\sigma$-field and the Lebesgue measure.)
The observations admit a density w.r.t. . Furthermore, the observations admit conditional densities as follows. The distribution of conditional on , , admits a density for each . Also admits a density for all and all finite sets , which do not contain . Moreover, there is a such that uniformly
[TABLE]
for all and sets which do not contain . These requirements are not restrictive and are satisfied by a wide range of stochastic processes.
2.2 Marton couplings as the concept of dependence
We use the concept of Marton couplings to quantify the dependence within the observed data. These couplings were first defined in Marton (2003) and measure the strength of dependence within a collection of random variables by a mixing (or coupling) matrix.
Definition 2.1 (Marton coupling).
Let and let be Polish. Let be a vector of random variables taking values in . A Marton coupling of is a set of couplings , for every and every , which satisfies the conditions
[TABLE]
Write for the conditional distribution of given for . Construct a measure on the product space , which consists of the joint distribution of and the product measure for a Borel set of as follows:
[TABLE]
Then, we define the mixing matrix for a Marton coupling of as an upper diagonal matrix with and
[TABLE]
where we compute the essential supremum w.r.t. the measure . Note that for each entry in the mixing matrix is bounded above by
[TABLE]
where the supremum is taken over all .
We return to the data generating process . Write for the mixing matrix of the sample . As is stationary, (as long as all indices are between 1 and ). Consequently, for the choice . So the summation over all elements in row is equivalent to the summation over all elements in column (and vice versa), viz.,
[TABLE]
In particular, the maximum absolute column sum equals the maximum absolute row sum . In what follows, we assume that the coefficients of the mixing matrix are at least summable in the sense that
[TABLE]
Consider the spectral norm of the mixing matrix induced by the Euclidean norm on . Using , implies that is also uniformly bounded in the spectral norm over all , i.e., .
The condition on the mixing matrix in (A2) is satisfied for a wide range of stochastic processes. Consider, for instance, so-called delay embeddings of time series.
Example 2.2 (Delay embeddings from Markov chains).
Let be a stationary, uniformly geometrically ergodic Markov chain in a Polish space whose marginal distribution and transition kernel both admit a strictly positive density w.r.t. a reference measure . Construct a process from via a delay embedding, that is, , where are natural numbers. We show that this process satisfies (A2). We construct a Marton coupling , for every and every with Goldstein’s maximal coupling (Proposition A.4).
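To make the delay embedding concrete, the following minimal sketch turns a scalar series into a sequence of vectors of lagged values. It assumes the common form in which each embedded point stacks `dim` observations spaced `lag` steps apart; the exact indexing of the embedding in this example may differ, and the names `delay_embedding`, `dim` and `lag` are hypothetical.

```python
def delay_embedding(series, dim, lag):
    """Return the points (z_t, z_{t+lag}, ..., z_{t+(dim-1)*lag}).

    Assumption: a forward-looking embedding with a single common lag.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_points = len(series) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("series too short for this embedding")
    return [
        tuple(series[t + k * lag] for k in range(dim))
        for t in range(n_points)
    ]
```

Consecutive embedded points share coordinates of the underlying chain, which is why the embedded process is dependent even over several time steps and why a coupling argument is needed to control that dependence.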
For every and all states, Goldstein’s maximal coupling yields two coupled random variables , such that (i), (ii) and (iii) from Definition 2.1 are satisfied. By Proposition A.4, the marginals of each coupling satisfy
[TABLE]
Note that the essential supremum of the left-hand side w.r.t. equals the coefficient . Thus, we can easily bound above the norm of the mixing matrix with the properties of the Markov chain . For simplicity, we use for and only consider the asymptotic properties for . Set . We can derive from the Markov property of that
[TABLE]
Next, we use the Markov property to see that the total variation distance in (2.2) is determined by the observation because is closest to . Consequently, if , (2.2) equals
[TABLE]
By assumption, is uniformly geometrically ergodic. Hence, there are and such that uniformly in and for all
[TABLE]
So the quantity in (2.3) is at most . In particular, we have for a row of the mixing matrix of the following bound, which implies (A2):
[TABLE]
Consequently, .
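The bounds above rest on the fact that a maximal coupling makes two random variables disagree with probability equal to the total variation distance of their laws. The sketch below illustrates this for two distributions on a finite set. It is a simplified one-step stand-in and not Goldstein's coupling for whole sequences; the function names are hypothetical.

```python
import random


def total_variation(p, q):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def sample_maximal_coupling(p, q, rng):
    """Draw (x, y) with x ~ p, y ~ q and P(x != y) = total_variation(p, q)."""
    states = range(len(p))
    overlap = [min(a, b) for a, b in zip(p, q)]
    rest_p = [a - o for a, o in zip(p, overlap)]
    rest_q = [b - o for b, o in zip(q, overlap)]
    # With probability sum(overlap) = 1 - d_TV both coordinates coincide.
    if sum(rest_p) <= 0 or sum(rest_q) <= 0 or rng.random() < sum(overlap):
        x = rng.choices(states, weights=overlap)[0]
        return x, x
    # Otherwise draw from the residual parts, which have disjoint supports.
    x = rng.choices(states, weights=rest_p)[0]
    y = rng.choices(states, weights=rest_q)[0]
    return x, y
```

In a simulation with `p = [0.5, 0.3, 0.2]` and `q = [0.2, 0.3, 0.5]`, the empirical frequency of disagreement should be close to the total variation distance 0.3.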
The mixing time of a (uniformly ergodic) Markov chain is defined by
[TABLE]
Hence, using the Markov property, we can also give an upper bound on (2.3) in terms of the mixing time by simply writing for and . Then (2.3) is at most 1 if or if and if . Consequently, one obtains the following upper bound for
[TABLE]
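For a chain on a finite state space, the mixing time can be computed directly from powers of the transition matrix. The sketch below assumes the common convention that the mixing time is the first time at which the worst-case total variation distance to the stationary distribution drops to a threshold `eps`, with default 1/4; the threshold used in the definition above may differ. The names `mixing_time`, `eps` and `max_steps` are hypothetical.

```python
def total_variation(p, q):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mixing_time(P, pi, eps=0.25, max_steps=10_000):
    """Smallest t >= 1 with max_x d_TV(P^t(x, .), pi) <= eps.

    P is a row-stochastic matrix (list of rows), pi its stationary law.
    """
    k = len(P)
    Pt = [row[:] for row in P]  # holds P^t, starting with t = 1
    for t in range(1, max_steps + 1):
        if max(total_variation(row, pi) for row in Pt) <= eps:
            return t
        Pt = [
            [sum(Pt[i][l] * P[l][j] for l in range(k)) for j in range(k)]
            for i in range(k)
        ]
    raise ValueError("chain did not mix within max_steps")
```

For the symmetric two-state chain that stays put with probability 0.75, the worst-case distance halves at each step (0.25, 0.125, 0.0625, ...), so the threshold 0.1 is first met at the third step.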
2.3 Covering the state space
As we consider general state spaces , we work with the following covering condition, which is satisfied in many examples.
Condition 2.3 (Covering condition).
The state space is precompact. Write for the -covering number of w.r.t. , i.e., for each , admits a covering with closed balls w.r.t. the metric of radius located at positions .
Moreover for all there is a sequence of scaling factors , , such that
[TABLE]
for some .
Some discussion on the covering condition is needed. (A4) is clearly needed to bound above the complexity of the underlying metric space . Condition (A3) is more delicate as it regulates the ratio between the number of points and the -volume of the -ball. Many spaces satisfy this condition. We give here two examples, the finite-dimensional case, i.e., , and the functional case.
Example 2.4 (Coverings for finite-dimensional spaces).
Consider the unit cube which is endowed with the -norm and the Lebesgue measure . For each , can be covered with disjoint cubes of side length at most , i.e., balls w.r.t. the -norm . In that case, one finds with geometric arguments that a ball of radius at some position can be covered with at most balls of radius at fixed positions .
In this Euclidean setting, three regimes are classically studied for random geometric complexes. In the subcritical regime , i.e., the scaling factors grow faster than . In the critical regime, this growth is balanced, so that . Moreover, in the supercritical regime.
We study the situation for a point cloud of points obtained from a stationary time series , whose marginals admit a density w.r.t. the Lebesgue measure. In the subcritical regime, the points of the rescaled point cloud tend to become more and more isolated as the number of points per volume tends to zero. In the critical regime, the number of points per volume from tends to a constant. In the supercritical regime, the points from the point cloud lie increasingly dense.
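The three regimes can be illustrated by tracking the mean number of points per unit volume after rescaling. The sketch below assumes points in the unit cube that are multiplied by a scaling factor, so that the rescaled cloud occupies a cube of volume equal to the factor raised to the dimension; the function names and the specific example rates are hypothetical choices for illustration.

```python
def points_per_unit_volume(n, scale, d):
    """Mean number of points per unit volume after scaling n points
    in the unit cube [0, 1]^d by the factor `scale`."""
    return n / scale ** d


def density_profile(scale_fn, d, sample_sizes):
    """Evaluate the point density along a sequence of sample sizes."""
    return [points_per_unit_volume(n, scale_fn(n), d) for n in sample_sizes]


def critical(n, d=2):
    return n ** (1.0 / d)          # density stays constant


def subcritical(n, d=2):
    return n ** (2.0 / d)          # faster growth: density tends to zero


def supercritical(n, d=2):
    return n ** (1.0 / (2.0 * d))  # slower growth: density diverges
```

Evaluating `density_profile` for increasing sample sizes shows a constant density in the critical case, a vanishing density in the subcritical case, and a growing density in the supercritical case.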
Clearly, scaling factors which achieve the thermodynamic regime (e.g. ) satisfy the condition from (A3) because . The covering number is proportional to in this case.
Moreover, note that we can also allow for a slower increase in which then yields a supercritical regime. For instance, still satisfies (A3) (and also (A4)) for each .
Scaling factors which achieve a subcritical regime satisfy (A3), however, (A4) restricts the growth rate from above; for instance, any polynomial rate is allowed for (A4).
Example 2.5 (Coverings of functional spaces).
Let . We study the class of all functions on the unit interval that possess uniformly bounded derivatives on up to order (the greatest integer strictly smaller than ) and whose highest derivatives are Hölder continuous of order . Write for the th derivative of a function . Define
[TABLE]
Write for the set of continuous real-valued functions on with . We write for the supremum-norm of a real-valued function on and denote the corresponding -neighborhood of by (other norms are also possible). Then the covering number of w.r.t. satisfies by Theorem 2.7.1 in van der Vaart and Wellner (1996)
[TABLE]
for a certain constant which is independent of .
For each centered Gaussian measure on a real separable Banach space , there is a unique Hilbert space such that is determined by considering as an abstract Wiener space (Cameron-Martin space, see Gross (1970)). Then is dense in and we have for
[TABLE]
where , see Li and Shao (2001) Theorem 3.1. In the following, consider the centered Gaussian measure on the infinitely often differentiable functions determined by the covariance function for a given characteristic length-scale . In this case, it follows from Li and Shao (2001) Theorem 4.1 and Theorem 4.5 as well as some calculations that
[TABLE]
for all sufficiently small for certain .
For instance, define the scaling factor by for . Then on the one hand, we obtain which shows (A4). On the other hand, relying on (2.5)
[TABLE]
If , (A3) is satisfied and we have a functional subcritical regime in that ().
2.4 A concentration inequality for persistent Betti numbers from dependent data
We come to the first main result in this article which is an abstract exponential inequality for a certain class of functionals defined on the point clouds , .
Theorem 2.6.
Let the stochastic process satisfy assumption (A1) and (A2). Moreover, let (A3) and (A4) from Condition 2.3 be satisfied. Let be a functional defined on all finite subsets of . Set . The functional satisfies
- (1)
Universal bound. There are such that for all , and for all
[TABLE]
- (2)
Exchange-one costs are local. There are such that for all , and for all
[TABLE]
Set and let . Then for all and for all
[TABLE]
The persistent Betti function satisfies the conditions of Theorem 2.6; this follows from the Geometric Lemma (Lemma 4.3). More precisely, we have the following result.
Theorem 2.7.
Let the regularity conditions from Theorem 2.6 be satisfied. Then for each , for each , , for and such that the persistent Betti number from Definition 1.1 satisfies for each
[TABLE]
Hence, there are constants such that for each and
[TABLE]
In particular,
Yogeshwaran et al. (2017) obtain (among other results) an exponential inequality for Betti numbers computed from an -binomial process, whose marginals have a continuous and compactly supported density on . Their result, and in particular their rate, is very similar to ours.
The abstract concentration inequality in Theorem 2.6 can be considered as a generalization of McDiarmid’s inequality for functionals of dependent random vectors whose martingale differences are not necessarily bounded. Technically, it relies on the Marton coupling and an abstract concentration inequality of Chalker et al. (1999) for martingale differences as well as on the observation that point cloud data from the data generating process tends to spread evenly across the state space . So exchanging one point in the argument of the functional as in condition (2) of Theorem 2.6 tends to have a much smaller impact than the worst case bound in (2.6).
Many important functionals in stochastic geometry do not possess a deterministic local exchange-one cost function as required in Theorem 2.6. Instead, these functionals are often stabilizing (at an exponential rate). We refer to Penrose and Yukich (2001) and Lachièze-Rey et al. (2019) for an introduction and examples of stabilizing functionals. Loosely speaking, stabilization implies that the exchange-one cost function is determined by the points in a “small” -neighborhood of the exchanged points with a high probability. So the abstract result in Theorem 2.6 will also be relevant for such stabilizing functionals because usually these can be truncated in such a way that the error is negligible for large sample sizes and that the exchange-one cost of the truncated functional actually satisfies condition (2) of Theorem 2.6. Of course, this remark is only an outline of how to apply Theorem 2.6 to such functionals, and we leave it to future research to prove this claim rigorously.
2.5 Convergence results in Euclidean space
It remains to show that the normalized expectation of the persistent Betti numbers converges to a limit. Here we restrict our considerations to point cloud data on because of the following reason: In order to obtain limit theorems for persistent Betti numbers in the critical regime from dependent data realized on a general measure space such as a manifold or a function space, a possible way would be to first derive the limit of for a certain underlying Poisson process on this space (see Last and Penrose (2017) for the notion of a Poisson process on a general space). In this step, the topological properties of the underlying space are of course crucial. Second, one needs to apply a de-Poissonization argument to obtain a limit for the binomial process which treats the situation for an i.i.d. sample. Finally, as we will see from the applied techniques in the proofs, the nearly additive properties of persistent Betti numbers in the critical regime enable us to use certain continuity results and then allow us to conclude the case for dependent data. This entire procedure is quite comprehensive. So far, to the best of our knowledge, these extensions have only been considered for manifolds (Goel et al. (2018)) in the literature. For this reason we have decided to limit our considerations to -valued data.
We study two kinds of processes in the critical regime, namely, (1) processes which can be coupled to a process with a discrete state space and (2) Markov chains of finite order.
First consider a process which is obtained from a stationary discrete process as follows. Let be a blocked density w.r.t. the Lebesgue measure on , i.e., there is an such that
[TABLE]
and where the subcubes partition . Note that the may have different volumes.
We assume that the process admits a Marton coupling which satisfies (A2) and takes values in a finite set , for , such that . Define with the help of by
[TABLE]
where the are independent and uniformly distributed on for and . Then if , . Hence, each admits a marginal density .
Then the conditional distribution of the process works as follows. In the first step and conditional on the past , we choose a subcube , according to
[TABLE]
In the second step, we choose at random a point in the subcube as the realization of .
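The two-step sampling scheme can be sketched as follows: a hidden discrete process selects a subcube, and the observation is then drawn uniformly inside it. The sketch assumes, purely for illustration, that the hidden process is a first-order Markov chain with a given transition matrix (the text only requires a suitable Marton coupling) and that it starts from an arbitrary rather than a stationary state. All names are hypothetical.

```python
import random


def simulate_blocked_process(n, transition, subcubes, rng):
    """Simulate n observations of a process with a blocked density.

    transition: row-stochastic matrix of the hidden index chain
                (assumption: first-order Markov dependence).
    subcubes:   list of (lower_corner, side_length) pairs that
                partition the unit cube.
    Returns the hidden indices and the observed points.
    """
    k = len(subcubes)
    y = rng.randrange(k)  # arbitrary start, not the stationary law
    ys, xs = [], []
    for _ in range(n):
        lower, side = subcubes[y]
        ys.append(y)
        # Step 2: uniform point inside the selected subcube.
        xs.append(tuple(c + side * rng.random() for c in lower))
        # Step 1 for the next time point: choose the next subcube.
        y = rng.choices(range(k), weights=transition[y])[0]
    return ys, xs
```

The subcubes may have different side lengths, which matches the remark that the cells of the partition need not have equal volume.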
Consequently, admits a Marton coupling which satisfies (A2). The conditional distribution of is invariant in the sense that
[TABLE]
for all , , , . If we can only observe the process , then we can think of as a hidden process. We have the following theorem.
Theorem 2.8.
Let be a -valued process which admits a Marton coupling that satisfies (A2). Each has a marginal density as in (A5) and (A6) such that . Then for each
[TABLE]
where is the limit of for an -binomial process with unit density on .
So the expectation of the persistent Betti number obtained from this kind of time series has the same limit properties as the corresponding binomial process.
We extend Theorem 2.8 to general marginal density functions which can be approximated by blocked density functions . To this end, we restrict ourselves to the case of uniformly ergodic Markov chains of order , viz., is a stationary process such that , for some . For such a Markov chain all transition probabilities are determined by the joint density of which is assumed to be continuous and strictly positive on in that . It is known that this kind of aperiodic restriction ensures the Markov chain to be uniformly geometrically ergodic, see also Meyn and Tweedie (2012) Theorem 16.0.2.
Furthermore, the limit on the right-hand side in (2.9) is continuous: Indeed, Divol and Polonik (2019) show that the limit
[TABLE]
exists, where are blocked density functions on (from a regular grid) as in (A5) which converge to in the -norm and where the (resp. ) have density (resp. ).
For this kind of Markov chains we obtain from the previous Theorem 2.8 the following result.
Theorem 2.9.
Let be a homogeneous Markov chain of order taking values in such that the joint density of is continuous and satisfies . The have marginal density . Then for each the convergence results from (2.9) and (2.10) in Theorem 2.8 are valid.
Consequently, we also obtain the well-known limit for this natural generalization of the binomial process. The generalization to arbitrary stationary processes which admit a Marton coupling is rather elaborate and complex. Actually, when following the current scheme of the proof, one first has to assume that this process can be coupled to a discrete process which approximates sufficiently closely in terms of the conditional distribution functions. This would mainly result in a complex notation. For this reason, we have limited our considerations to processes whose conditional distributions only depend on lags of their past; this is sufficient for many applications and also serves as an approximation to the general case.
We conclude with an immediate result which follows from Theorem 2.8 and Theorem 2.9 and the work of Hiraoka et al. (2018) concerning the vague convergence of persistence diagrams.
Corollary 2.10 (Vague convergence of persistence diagrams obtained from dependent data).
Let the assumptions of Theorem 2.8 or Theorem 2.9 be satisfied. Then for each , there is a Radon measure depending on such that and as .
3 Extensions to random fields
We extend the theory from above to random fields in two settings, which correspond to the situations discussed in Theorems 2.8 and 2.9 for the time series case.
The extension to random fields requires mainly notational changes. We consider stationary random fields indexed by the regular -dimensional lattice . The main difference is the ordering of the data which we assume to be located in the subset . If are two positions on the lattice, we write () if and only if () for all . Moreover, we construct a total ordering on with the -norm as follows. Let , then
[TABLE]
where . The relations follow in the same spirit.
For a vector , we denote the cardinality of the corresponding -cube by . For a given random field and an , we write for the associated point cloud , which represents the sample data. In the following, we will consider only such which satisfy
[TABLE]
for some constant . We write for a sequence which satisfies the relation (3.2) for each and also fulfills as .
Consider a Marton coupling of a stationary random field on . We define the mixing matrix w.r.t. the ordering . The line corresponding to location in the mixing matrix is given by
[TABLE]
where and where is defined in the same spirit as in (2.1).
We study the structure of the entries of the mixing matrix in a simple example. Consider a stationary random field on the lattice whose joint distribution can entirely be described by four (conditional) densities and . This means for any the joint distribution can be simulated with these four (conditional) densities and we can do this also using the ordering , beginning at the corner point (1,1). So we first simulate according to . All observations for (resp. for ) are simulated with (resp. ). All remaining observations are simulated with the conditional density . Figure 1 illustrates the scheme.
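The simulation scheme just described can be sketched in a few lines. This is only an illustration under stated assumptions: the four Gaussian transition rules and the parameters `a` and `b` are hypothetical placeholders for the four (conditional) densities, and the row-by-row traversal is one admissible causal order starting at the corner, not necessarily the ordering used in the text. These placeholder choices are not claimed to yield a stationary field.

```python
import random

def simulate_field(n1, n2, rng, a=0.4, b=0.4):
    """Simulate a field on an n1 x n2 lattice, starting at the corner (index [0][0]).

    Hypothetical Gaussian rules stand in for the four (conditional) densities:
      corner:        x[0][0] ~ N(0, 1)
      first row:     x[0][j] ~ N(a * x[0][j-1], 1)
      first column:  x[i][0] ~ N(b * x[i-1][0], 1)
      interior:      x[i][j] ~ N(a * x[i][j-1] + b * x[i-1][j], 1)
    """
    x = [[0.0] * n2 for _ in range(n1)]
    x[0][0] = rng.gauss(0.0, 1.0)                      # marginal density at the corner
    for j in range(1, n2):                             # one boundary axis
        x[0][j] = rng.gauss(a * x[0][j - 1], 1.0)
    for i in range(1, n1):                             # the other boundary axis
        x[i][0] = rng.gauss(b * x[i - 1][0], 1.0)
    for i in range(1, n1):                             # all remaining locations
        for j in range(1, n2):
            x[i][j] = rng.gauss(a * x[i][j - 1] + b * x[i - 1][j], 1.0)
    return x

field = simulate_field(5, 7, random.Random(0))
```

Every location is generated only from locations that were visited earlier, which is the property the factorization of the joint distribution relies on.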
Consider a location in the lattice and a configuration of the Marton coupling which agrees at all locations of the past of w.r.t. (all locations with ). Consider a point in the future of w.r.t. (all locations with ). Then the distributions of and are affected by the different configurations at location if and only if . Hence, (3.3) is only affected by the locations which satisfy , which is a strict subset of all those locations which satisfy .
We come to the description of the dependence patterns. First we consider again the blocked density function from (A5) and proceed as in the time series case. Let be a stationary random field on the regular -dimensional lattice. The state space of is discrete, i.e., , , for , such that . Also admits a Marton coupling whose mixing matrix from (3.3) satisfies (similarly to (A2))
[TABLE]
Define a new random field with the help of by
[TABLE]
where the are independent and uniformly distributed on for and . Then if , . In particular, each has a density on . Also all other properties from the time series case are inherited. So, we have once more an invariance property as in (2.8): For and such that
[TABLE]
for all , , for all such that and .
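The construction of a field with densities from a discrete field can be illustrated as follows. The sketch assumes, purely for illustration, that the discrete states are the index vectors of a regular partition of the unit cube into blocks of side length `1/m`; the function name `jitter` and this particular parametrization are hypothetical and not taken from the text.

```python
import random

def jitter(block_index, m, rng):
    """Map a block index k in {0,...,m-1}^d to a point that is uniformly
    distributed on the block k/m + [0, 1/m)^d (one independent uniform
    variable per coordinate)."""
    return tuple((k + rng.random()) / m for k in block_index)

rng = random.Random(1)
k = (2, 0, 3)          # discrete state: a block of the partition with m = 4
y = jitter(k, 4, rng)  # continuous observation inside that block
```

Conditionally on the block, the jittered observation is uniform on it and independent of everything else, which is the mechanism behind a blocked (piecewise constant) density.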
Consequently, we obtain the following generalized variant of Theorem 2.8.
Theorem 3.1.
Let be a -valued random field on , which admits a Marton coupling that satisfies (A7) w.r.t. for each . Each has a marginal density as in (A8) such that . Then for each
[TABLE]
We refer to Chazottes et al. (2007); Külske (2003) who consider couplings for high-temperature Gibbs measures for the discrete random field whose components take the values in . Given certain upper bounds on the dependence within the random field, they obtain for the two state Gibbs model a coupling which satisfies
[TABLE]
for a certain constant . So the probability of an unsuccessful coupling decays exponentially fast in the -distance on the lattice, which is the minimal number of edges between and w.r.t. the standard -neighborhood structure. In particular, the Marton coupling satisfies (A7).
For a generalization of Theorem 2.9 we need a decay assumption on the mixing matrices. In the case of a Markov chain of finite order, Theorem 16.0.2 in Meyn and Tweedie (2012) states that strictly positive and continuous conditional densities ensure uniform geometric ergodicity. So concerning the Marton coupling, we obtain a mixing matrix whose entries in one line decay at an exponential rate.
For random fields the situation is far more complex. To this end, we restrict ourselves to stationary Markov random fields of order 1 w.r.t. the neighborhood structure of the regular lattice whose joint distribution can be described with (conditional) density functions
[TABLE]
is the marginal density. More precisely, the distribution can be modeled with a scheme as in Figure 1, however, on a -dimensional lattice. The conditional density describes the transition within the set , where .
We give an example for a cube . First we can simulate the random variable in the lower left corner according to . Let be the standard basis elements of for , i.e., the vector whose th entry is 1 and 0 otherwise. Then the conditional densities describe the transition on the coordinate axes of the cube. Similarly, with the remaining functions , , we can completely simulate the transition on the lower envelope of the cube, i.e., the locations which are zero in at least one coordinate. Finally, the conditional density describes the transition to those locations , which are nonzero in each entry.
It is an important fact that due to the Markov structure we can factorize the distribution of the random field on with these conditional densities and use the ordering at the same time.
In contrast to the one-dimensional situation of a Markov chain, this time it is not enough that the conditional densities from (A9) are strictly positive in order to ensure a successful Marton coupling. To this end, we assume that the dependence within decays at a polynomial rate in the sense that
[TABLE]
where , and where is the mixing matrix of the entire random field. Note that a uniform exponential decay as in (3.6) is obviously sufficient for (A10). Note that due to the factorization property of from (A9), the mixing matrix at position is nontrivial if and only if . Also due to stationarity, it is entirely determined by the entries , . Using this last condition on the decay, we conclude with a generalized convergence result for persistent Betti numbers obtained from Markov random fields.
Theorem 3.2.
Let the stationary random field be given by the (conditional) density functions , , from (A9), which are all continuous. Each is strictly positive on in that . Let the mixing matrix of satisfy (A10). Then fulfills the convergence results from (3.4) and (3.5).
4 Technical results
4.1 Helpful tools
Before we come to the proofs of the central results, we start with some auxiliary results.
Lemma 4.1 (Concentration inequality for bounded transition kernels).
Let be a sequence whose components take values in the measure space . Moreover assume that the conditional distributions admit a conditional density . These densities are uniformly bounded in the sense that the first part of (A1) holds, i.e.,
[TABLE]
Let be a sequence of measurable sets such that , for certain . Then there is a constant such that for all and
[TABLE]
In particular, let be an -valued homogeneous Markov chain which admits uniformly bounded conditional densities. For each , let be the -neighborhood of a point w.r.t. the Euclidean distance such that . Then (4.1) holds with .
Proof.
First, we bound the Laplace transform of w.r.t. , where is the natural filtration of the process with being trivial. We have
[TABLE]
Thus, we obtain for the entire process , using Markov’s inequality. This finishes the proof because . ∎
The next lemma is a generalization of Lemma 3.1 in Yogeshwaran et al. (2017).
Lemma 4.2.
Let and be a process which takes values in a measure space with a non-atomic measure . Let be a set of distinct natural numbers. Assume that the distribution of the vector , when conditioned on another observation , , admits a density w.r.t. . Assume that these densities are uniformly essentially bounded in the sense that the second part of (A1) holds, i.e.,
[TABLE]
Then for each and for each it is true that
[TABLE]
In particular, if is a subset of , and equals the Lebesgue measure, then (4.2) is of constant order and (4.2) is of order .
Proof.
We only prove the statement in (4.2), the statement in (4.3) follows in the same fashion. The first inequality in (4.2) is obvious. Thus, we only show the second one. Observe that
[TABLE]
because the distance between any two points in a -simplex in the Čech or the Vietoris-Rips complex is at most . On the one hand,
[TABLE]
and also On the other hand
[TABLE]
Combining (4.5), (4.6) with (4.4) yields the conclusion. ∎
The following result is well-known to topologists.
Lemma 4.3 (Geometric Lemma, Lemma 2.11 in Hiraoka et al. (2018)).
Let be two finite point sets in . Then
4.2 Technical details on Section 2
We come to the proofs of Theorems 2.6 and 2.7. As in Yogeshwaran et al. (2017), we use a result of Chalker et al. (1999) to establish an exponential inequality without the need of bounding the martingale differences in the supremum-norm.
Proof of Theorem 2.6.
Consider the natural filtration of the process , for with the convention that . We rewrite in terms of martingale differences as follows
[TABLE]
where . An abstract result of Chalker et al. (1999) yields
[TABLE]
for any . Hence, it remains to compute bounds of . In all cases, we have the universal bound from (2.6). So, .
Next, we investigate the probabilities on the right-hand side of (4.8). Define for and
[TABLE]
Write for the conditional distribution of given on , viz.,
[TABLE]
Then, it follows with elementary calculations that for each
[TABLE]
Let be arbitrary but fixed. Choose such that and . Consider the Marton coupling of and write for the point cloud associated to the coupling element . The notation is used for the point cloud obtained from the counterpart . Consequently,
[TABLE]
(by abusing the notation slightly). We write and and consider the difference of the functionals in (LABEL:Eq:AbstractExpIneq4). The point clouds and in (LABEL:Eq:AbstractExpIneq4) differ at most in entries for each . These entries are and . Thus, we can transform into in steps exchanging one entry in each step, i.e., we consider the transformations
[TABLE]
where , for .
Using this definition, the difference of the functionals in (LABEL:Eq:AbstractExpIneq4) is bounded above by
[TABLE]
The symmetric difference is at most for and for . Let ; clearly, if , then . If , we can use (LABEL:E:ExchOneCost), which states that the exchange-one costs are local, to obtain
[TABLE]
A similar argument applies to , which admits the same bound as in (4.12) using the points
Write for the -covering number of w.r.t. from Condition 2.3. Use the family of coverings to define for each and each the set
[TABLE]
Loosely speaking, when considering only sets in , we can control the degree of accumulation within a point cloud on if we choose for some fixed . Using the definition of the set , (LABEL:Eq:AbstractExpIneq4) is at most
[TABLE]
Clearly, if both and are in , then each point cloud of the type and is in .
By Condition 2.3 there is a covering of . Consequently, there is a such that and the neighborhood is contained in . So, in the case where and are both in , we have
[TABLE]
the same applies to the neighborhoods of . Thus, we obtain an upper bound of the following type for the integral in (LABEL:Eq:AbstractExpIneq4)
[TABLE]
The last term in (LABEL:Eq:AbstractExpIneq7) is at most uniformly in and due to the condition from (A2), which implies
[TABLE]
Moreover as was arbitrary, this last bound from (LABEL:Eq:AbstractExpIneq7) is also true for the limit and we have
[TABLE]
where all constants are uniform in and . We fix the value of in the following as and return to (LABEL:Eq:AbstractExpIneq4). We choose . Thus, using (LABEL:Eq:AbstractExpIneq7b)
[TABLE]
where we use Markov’s inequality in the last step. We bound above both expectations in (LABEL:Eq:AbstractExpIneq8) as follows: First note that for and
[TABLE]
In particular, we have for each
[TABLE]
Additionally, we apply Lemma 4.1 to and use that is at most 1. Then we obtain for a state
[TABLE]
Combining this last inequality with (LABEL:Eq:AbstractExpIneq8), we see that
[TABLE]
Moreover, inserting this result in (4.8) for the above choice of and , yields
[TABLE]
Finally, applying the definition of completes the proof. ∎
Proof of Theorem 2.7.
It remains to verify that the persistent Betti function satisfies the condition in (2.6) and in (LABEL:E:ExchOneCost). It follows from the definition of Betti numbers that (2.6) is satisfied for and the exponent .
Next, we inspect the condition in (LABEL:E:ExchOneCost). Let be two point clouds of points, which differ in exactly one point, viz., . We can use the Geometric Lemma (Lemma 4.3) to obtain
[TABLE]
where we use for the last inequality the scaling relation , which is valid for the Čech and the Vietoris-Rips complex for all because of the homogeneity of .
Observe that a -simplex in the filtration has a diameter of at most . Thus, a -simplex with a node in a point , resp. , lies in the -neighborhood of , resp. . Consequently, (4.17) is at most
[TABLE]
Hence, the condition in (LABEL:E:ExchOneCost) is satisfied with , and exponent . ∎
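The locality used in this proof can be made concrete with a small brute-force count. The sketch below is an illustration only: the function name is hypothetical, and it assumes the convention that a simplex belongs to the Vietoris-Rips complex at scale `r` when all pairwise distances of its vertices are at most `r` (the scaling convention in the text may differ by a constant factor).

```python
from itertools import combinations
from math import dist

def rips_simplices_containing(points, idx, r, k):
    """Count the k-simplices (k + 1 vertices) of the Vietoris-Rips complex at
    scale r that contain points[idx]. Assumed convention: a simplex is present
    when all pairwise distances of its vertices are <= r."""
    p = points[idx]
    # Only points within distance r of p can share a simplex with p.
    nbrs = [q for j, q in enumerate(points) if j != idx and dist(p, q) <= r]
    count = 0
    for combo in combinations(nbrs, k):  # choose the k other vertices
        if all(dist(a, b) <= r for a, b in combinations(combo, 2)):
            count += 1
    return count

cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
edges = rips_simplices_containing(cloud, 0, 1.5, 1)      # edges through the origin
triangles = rips_simplices_containing(cloud, 0, 1.5, 2)  # triangles through the origin
```

Moving the far point `(5.0, 5.0)` even further away leaves both counts unchanged, so the cost of exchanging the point at the origin is governed by its neighborhood alone.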
Proof of Theorem 2.8.
We split the proof in three parts. We show in the first part that
[TABLE]
Define a filtration, which is the union of the single filtrations when restricted to the cubes , by the complexes
[TABLE]
Since this union is of disjoint complexes, we have . We use Lemma 4.3 and Lemma 4.2 to arrive at
[TABLE]
Then is of order . So, we can consider the expectation on the blocks instead.
From now on, let be an arbitrary but fixed index. Write for the edge lengths of , so that equals . Also write for the diagonal matrix . Note that . This completes the first part.
In the second part, we use McDiarmid’s inequality from Theorem A.2. Set and . Since admits a Marton coupling which satisfies (A2), we can apply Theorem A.2 to arrive at
[TABLE]
Using the definition and the fact that the Betti numbers of dimension are polynomially bounded by , we obtain
[TABLE]
In particular, is negligible and we can focus in the following on the restriction . For this purpose, write , then it follows from Lemma 5.5 in Krebs and Polonik (2019) that for each
[TABLE]
where the are independent and uniformly distributed on . We will use (LABEL:E:UnifConvergenceBettiPoisson) later.
In the third part, we study the success runs of and the sum : If an falls in , we term this a success and a failure otherwise. Consider a path with exactly successes , where and where each is a sequence of 1’s and each a sequence of 0’s (potentially and have length 0) for . So, on the path , we have .
Consider the expectation on this path . For this write for the index set which contains the positions in that mark a success. Write for the conditional distribution of given the past . Then
[TABLE]
Consider the situation for the last success which is given at a position . Note that each admits a conditional density because the distribution of on each block , , is uniform and independent of the past observations given that falls in the block . So this is constant for all from a block . Due to the blocked structure of the conditional densities of and the invariance property from (2.8), the contribution of the observations to the integral in (LABEL:E:ConvergenceExpectationDiscrete1) is then
[TABLE]
where is an arbitrary but fixed element of and where we use for the last equation that
[TABLE]
Using recursively this conditional independence argument, one obtains for (LABEL:E:ConvergenceExpectationDiscrete1)
[TABLE]
where the are independent and uniformly distributed on .
Moreover, using the uniform approximation result from (LABEL:E:UnifConvergenceBettiPoisson) shows that (4.20) equals
[TABLE]
where the remainder is uniform in and . Furthermore, using the dilatation rules of the expectation of persistent Betti numbers computed from the Čech or Vietoris-Rips filtration, we obtain for the main term of (4.21)
[TABLE]
where the remainder is uniform in and where the last equality follows as in the proof of Lemma 10 in Divol and Polonik (2019). Summing over all paths with exactly successes, over all and over all yields then the conclusion, viz.,
[TABLE]
This proves the first assertion in (2.9). Combining this last statement with Theorem 2.7 and the Borel-Cantelli-Lemma shows the second assertion in (2.10). So, the proof is complete. ∎
Proof of Theorem 2.9.
In the proof, we sometimes abuse the notation slightly in order to keep formulas shorter. To be precise, we write for the simplicial complex of a vector at filtration time to save space. The related expressions are abbreviated in this way, too.
In the first step of the proof, we construct a discrete Markov chain of order , , which approximates closely. To this end, let be arbitrary but fixed. We use a discrete density function , which is an approximation of the joint density of . We write for the conditional density of given with the convention that is the marginal density .
Since we assume that the process is a Markov chain of order , we are actually dealing with the conditional densities only. Using the approximation , we obtain approximations , which are defined in the same spirit as the . We choose the precision between and sufficiently high (in the -norm) such that
[TABLE]
Thus, at each step of the evolution of the Markov chain, we can approximate each conditional density with a discrete conditional density at a precision of at least (measured in the total variation distance). Note that this is possible because we assume that , so that all conditional densities are well defined.
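To illustrate the kind of discrete approximation used here, the following sketch replaces a continuous, strictly positive density on the unit interval by a piecewise constant one and measures the distance between the two. All names, the example density `f` and the midpoint-rule integration are assumptions made for illustration; the total variation distance is one half of the computed L1 distance.

```python
def blocked_density(f, m, sub=200):
    """Piecewise constant approximation of a density f on [0, 1]: the value on
    bin j is the (midpoint-rule) average of f over [j/m, (j+1)/m)."""
    heights = []
    for j in range(m):
        xs = [(j + (s + 0.5) / sub) / m for s in range(sub)]
        heights.append(sum(f(x) for x in xs) / sub)
    return heights

def l1_distance(f, heights, sub=200):
    """Midpoint-rule approximation of the L1 distance between f and its
    piecewise constant approximation given by `heights`."""
    m = len(heights)
    total = 0.0
    for j, h in enumerate(heights):
        for s in range(sub):
            x = (j + (s + 0.5) / sub) / m
            total += abs(f(x) - h) / (m * sub)
    return total

f = lambda x: 0.5 + x  # hypothetical strictly positive, continuous density on [0, 1]
errors = {m: l1_distance(f, blocked_density(f, m)) for m in (2, 4, 8, 16)}
```

For this linear example the error halves each time the number of bins doubles, so any prescribed precision can be reached by refining the partition; the approximation also stays strictly positive and integrates to one.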
We write for the Markov chain of order obtained from the above -approximation scheme; note that we can also choose to be strictly positive. In particular, this implies that satisfies the assumptions of Theorem 2.8 because it is uniformly geometrically ergodic, see Meyn and Tweedie (2012) Theorem 16.0.2. Clearly, also is uniformly geometrically ergodic and hence admits a Marton coupling (see also Example 2.2 and Paulin (2015) Proposition 2.4), whose mixing matrix (based on ) satisfies
[TABLE]
In the second step, we use the decomposition
[TABLE]
where the random variables (resp. ) have density (resp. ).
If converges to 0, (4.26) converges to 0, too, see as well (2.11) and Divol and Polonik (2019) for details. Moreover, from Theorem 2.8, we conclude that (4.25) converges to 0 as tends to for each and corresponding approximation .
So, the term in (4.24) remains. Let and be such that . For the remainder of this proof, we show that (4.24) is at most uniformly in , where the constant depends neither on the choice of the approximation parameter nor on . It solely depends on and . For this we rewrite the expectations in (4.24) as
[TABLE]
We transform (LABEL:EQ:ConvergenceExpectationContinuous6) into (LABEL:EQ:ConvergenceExpectationContinuous7) in -steps using a specific coupling in each step. For this purpose we write the difference between (LABEL:EQ:ConvergenceExpectationContinuous6) and (LABEL:EQ:ConvergenceExpectationContinuous7) as a telescoping sum as follows (the exchanged factor is given in square brackets)
[TABLE]
Each integral in the sum can be interpreted as a difference between the expectation of two persistent Betti numbers of two coupled processes for . We explain this coupling in three steps and refer to the term in (LABEL:EQ:ConvergenceExpectationContinuous8), which shows the general situation. First, the th coupling starts with ; so and have the same distribution as the stationary discrete Markov chain (with the densities from time 1 to .
Second, at time , we simulate a random variable using the conditional density (where the index depends on the position of ). Also at time , we simulate a random variable using the conditional density . Note that and can be coupled such that
[TABLE]
because of the choices in (LABEL:EQ:ConvergenceExpectationContinuous1); we refer to Den Hollander (2012) for an abstract maximal coupling result on Polish spaces.
Third, we find two chains and , , using the conditional densities such that the single elements at time satisfy . This last inequality follows from the properties of the Marton coupling, see (4.23).
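The coupling in the second step is a maximal coupling. As a minimal sketch, assuming a finite state space instead of a general Polish space, the standard construction draws a common value from the overlap of the two distributions and otherwise draws from the two residual parts. The function names are hypothetical.

```python
import random

def tv_distance(p, q):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def sample_maximal_coupling(p, q, rng):
    """Draw (X, Y) with X ~ p, Y ~ q and P(X != Y) equal to the total
    variation distance of p and q (finite state space {0, ..., n-1})."""
    states = range(len(p))
    overlap = [min(a, b) for a, b in zip(p, q)]
    res_p = [a - o for a, o in zip(p, overlap)]  # mass of p above the overlap
    res_q = [b - o for b, o in zip(q, overlap)]  # mass of q above the overlap
    if sum(res_p) <= 0 or rng.random() < sum(overlap):
        x = rng.choices(states, weights=overlap)[0]
        return x, x                               # the coupling succeeds
    # The residual parts have disjoint supports, so X != Y in this branch.
    x = rng.choices(states, weights=res_p)[0]
    y = rng.choices(states, weights=res_q)[0]
    return x, y

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
rng = random.Random(0)
draws = [sample_maximal_coupling(p, q, rng) for _ in range(20000)]
mismatch = sum(x != y for x, y in draws) / len(draws)
```

The empirical mismatch frequency should be close to `tv_distance(p, q)`, which is the reason why a small total variation distance between the conditional densities translates into a small probability that the two chains separate at the exchanged step.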
In the following, we will use the abbreviation for the vector ; we use the notation in the same spirit for . Using the above coupling, we see that (4.24) is at most
[TABLE]
So for each the point clouds differ at most in the points resp. and we can always transform one point cloud into the other in steps.
Regarding (4.31), we show that there is a such that for each and for each the coupling satisfies
[TABLE]
Define the coupling time between and by
[TABLE]
i.e., for all the chains evolve again in lockstep, viz., for . Note that given , we have .
The coupling times admit a tail bound which involves the coefficients from the Marton coupling as follows: If , then
[TABLE]
Since the coefficients of the Marton coupling satisfy (4.23), this shows that the moments of the coupling times (when conditioned on ) are uniformly bounded: We have for
[TABLE]
We begin our considerations with the restriction to the event for :
[TABLE]
where we use for the last equality that can be at least conditional on the event . We study the expectations in (LABEL:EQ:ConvergenceExpectationContinuous16). First we apply the Geometric Lemma to obtain
[TABLE]
Consider the th moment of a simplex count for some . For this purpose denote by the data . Then one finds with elementary combinatorial arguments that the th moment of the simplex count in the first line in (LABEL:EQ:ConvergenceExpectationContinuous17) is at most
[TABLE]
for pairwise disjoint indices . The last inequality can be derived as follows: The number of different observations is in . Given pairwise different indices (observations ) each occurs with multiplicity such that .
Applying finally the same reasoning as in the proof of Lemma 4.2 shows that (LABEL:EQ:ConvergenceExpectationContinuous18) is bounded above by a universal constant , which depends on but not on . Clearly, the th moment of the simplex count in the second line in (LABEL:EQ:ConvergenceExpectationContinuous17) is at most , too.
We can now return to (LABEL:EQ:ConvergenceExpectationContinuous16). Set for some . Then relying on the coupling result from (4.30), (LABEL:EQ:ConvergenceExpectationContinuous16) is at most
[TABLE]
Relying on (4.23), (4.33) and (4.34) this last term is of order . This shows that (LABEL:EQ:ConvergenceExpectationContinuous16) is of order uniformly in and .
In order to complement the considerations following (LABEL:EQ:ConvergenceExpectationContinuous16), it remains to consider the restriction to the event for . Here we additionally need to consider the average over all
[TABLE]
Carrying out similar calculations as in (LABEL:EQ:ConvergenceExpectationContinuous17) and (LABEL:EQ:ConvergenceExpectationContinuous18) it is not difficult to see that (4.38) is at most
[TABLE]
Using once more the result in (4.33), (4.34) and (4.23) yields directly that this last upper bound vanishes.
Combining these results yields (4.32). This completes the proof. ∎
Proof of Corollary 2.10.
It is shown in Proposition 3.4 in Hiraoka et al. (2018) that the pointwise convergence of persistent Betti numbers implies the vague convergence of the corresponding sequence of persistence diagrams. ∎
4.3 Technical details on Section 3
Proof of Theorem 3.1.
The statement follows immediately from Theorem 2.8. ∎
Proof of Theorem 3.2.
The proof is very similar to that of Theorem 2.9 and we only study the main differences in detail. First, we construct an -approximation of . For this purpose we consider the joint distribution of , which is completely determined by the joint density . Let and choose a discrete approximation of such that conditional densities () are derived from in the same spirit as in the proof of Theorem 2.9; we refer to Figure 1. These densities are strictly positive and satisfy
[TABLE]
as well as . (This requirement is the analog of (LABEL:EQ:ConvergenceExpectationContinuous1).) Obviously, the discrete (conditional) densities determine the random field completely. Also, due to the blocked structure of the densities and the condition from (A10), the random field satisfies the requirements of Theorem 3.1. Reasoning as in (4.24) to (4.26), it is sufficient to study the difference
[TABLE]
for an arbitrary but fixed .
We use the same expansion for this difference as in the case of Markov chains, see (LABEL:EQ:ConvergenceExpectationContinuous6), (LABEL:EQ:ConvergenceExpectationContinuous7) and (LABEL:EQ:ConvergenceExpectationContinuous8). But this time we use the ordering for the expansion. We obtain for each a coupling with the properties
[TABLE]
Consequently, using that , for all , we can write the difference in (4.39) as
[TABLE]
Given a location and a coupling , we define the coupling time
[TABLE]
So is determined by the causal dependence pattern which is derived from the factorization of the joint distribution according to the ordering . Note that both random fields and move in lockstep after .
Consider the tail of the coupling time at location
[TABLE]
for a constant , which depends on but not on .
Choose such that we have with the abbreviation that , which is possible because by assumption .
In the following, we will consider such a single difference in (LABEL:EQ:ConvExpMRF2) and show that it is of order uniformly in ; the calculations follow in a similar spirit as in the proof of Theorem 2.9, see (LABEL:EQ:ConvergenceExpectationContinuous16) to (LABEL:EQ:ConvergenceExpectationContinuous18), so we omit some details. Clearly, we can restrict our considerations to the event . Again, use but this time for . Then using a similar bound on simplex counts, there is a constant such that
[TABLE]
for all and for all . Regarding the sum in (LABEL:EQ:ConvExpMRF4), we use the definition of to see
[TABLE]
Applying the upper bound from (4.41) together with the condition from (A10) shows that (4.43) is uniformly bounded in and .
We use (LABEL:EQ:ConvExpMRF4) together with (4.43) as well as (4.41) to give a bound on (LABEL:EQ:ConvExpMRF2) (up to a universal multiplicative constant) as follows
[TABLE]
because is stationary. We can now repeat the calculations which lead to (4.43) to see that this last sum satisfies
[TABLE]
Relying once more on (4.41) and (A10) shows then that this integral is uniformly bounded in and . This shows that (LABEL:EQ:ConvExpMRF2) is of order . Consequently, (4.39) is of order , too. ∎
Appendix A McDiarmid inequalities for Marton couplings
In this section we study McDiarmid inequalities for Marton couplings. Notable contributions to this topic are Samson (2000), Chazottes et al. (2007), Kontorovich and Ramanan (2008), Redig and Chazottes (2009). We shall first state a result of Paulin (2015) who uses Marton couplings to characterize the dependence of the data.
Definition A.1 (Partition).
A partition of a random vector is a deterministic division of into random variables , for some such that the set is partitioned by . Denote the number of elements of by and write for the size of the partition which is .
Theorem A.2 (McDiarmid’s inequality, Paulin (2015)).
Let be a random variable in . Assume that admits a partitioning which allows a Marton coupling with mixing matrix . Let be Lipschitz continuous w.r.t. the Hamming distance, i.e., there is a such that
[TABLE]
Set and for . Then
[TABLE]
In particular,
[TABLE]
The proof uses the following lemma of Devroye and Lugosi (2012):
Lemma A.3.
Let be a sub--algebra, random variables which satisfy Moreover, are -measurable and . Then
[TABLE]
Proof of Theorem A.2.
We consider the natural filtration of the random vector , i.e., for and define for . Then is also Lipschitz continuous w.r.t. Hamming distance, more precisely,
[TABLE]
Set for . Moreover, define for
[TABLE]
And write for the conditional distribution of given , i.e.,
[TABLE]
Then, it follows with elementary calculations that
[TABLE]
Now let be arbitrary but fixed. Choose such that and . Next, we use the Marton coupling of to obtain
[TABLE]
And as was arbitrary,
[TABLE]
Moreover, we have
[TABLE]
and both the left- and the right-hand-side are -measurable. Consequently, using the lemma of Devroye and Lugosi (2012), we find that
[TABLE]
This establishes the claim in (A.2). The final result in (A.3) follows from the inequalities
∎
The next proposition is due to Fiebig (1993) and a consequence of Goldstein’s maximal coupling, Goldstein (1979). See also Paulin (2015) Proposition 2.6 and Samson (2000) Proposition 2.
Proposition A.4 (Fiebig (1993), p. 482, (2.1)).
Let and be two probability distributions on some common Polish space both admitting a strictly positive density w.r.t. a measure . Then there is a coupling of random vectors , such that , and
[TABLE]
Acknowledgments
The author thanks an anonymous referee whose careful reading and detailed reports improved the manuscript considerably. This research was supported by the German Research Foundation (DFG), Grant Number KR-4977/1-1.
References
- [1] Bobrowski and Kahle (2018). O. Bobrowski and M. Kahle. Topology of random geometric complexes: a survey. Journal of Applied and Computational Topology, 1(3):331–364, 2018.
- [2] Boissonnat et al. (2018). J.-D. Boissonnat, F. Chazal, and M. Yvinec. Geometric and Topological Inference, volume 57. Cambridge University Press, 2018.
- [3] Bubenik (2015). P. Bubenik. Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res., 16(1):77–102, 2015.
- [4] Carlsson (2009). G. Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46(2):255–308, 2009.
- [5] Chalker et al. (1999). T. Chalker, A. Godbole, P. Hitczenko, J. Radcliff, and O. Ruehr. On the size of a random sphere of influence graph. Advances in Applied Probability, 31(3):596–609, 1999.
- [6] Chazal and Divol (2018). F. Chazal and V. Divol. The density of expected persistence diagrams and its kernel based estimation. In LIPIcs-Leibniz International Proceedings in Informatics, volume 99. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2018.
- [7] Chazal and Michel (2017). F. Chazal and B. Michel. An introduction to topological data analysis: fundamental and practical aspects for data scientists. arXiv preprint arXiv:1710.04019, 2017.
- [8] Chazal et al. (2014). F. Chazal, B. T. Fasy, F. Lecci, A. Rinaldo, and L. Wasserman. Stochastic convergence of persistence landscapes and silhouettes. In Proceedings of the 30th Annual Symposium on Computational Geometry, pages 474–483, 2014.
