On the tightness of graph-based statistics
Lynna Chu, Hao Chen

Lynna Chu, Department of Statistics, Iowa State University
Hao Chen, Department of Statistics, University of California, Davis
Abstract
We establish tightness of graph-based stochastic processes in the space $D[0+\epsilon, 1-\epsilon]$ with $\epsilon > 0$ that allows for discontinuities of the first kind. The graph-based stochastic processes are based on statistics constructed from similarity graphs. In this setting, the classic characterization of tightness is intractable, making it difficult to obtain convergence of the limiting distributions for graph-based stochastic processes. We take an alternative approach and study the behavior of the higher moments of the graph-based test statistics. We show that, under mild conditions of the graph, tightness of the stochastic process can be established by obtaining upper bounds on the graph-based statistics' higher moments. Explicit analytical expressions for these moments are provided. The results are applicable to generic graphs, including dense graphs where the number of edges can be of higher order than the number of observations.
MSC classes: 60G99, 60C05
Keywords: change-point, graph-based tests, nonparametric, scan statistic, Gaussian process, tightness, network data, non-Euclidean data
1 Introduction
Change-point detection aims to estimate and test for the presence of change-points, locations where the distribution abruptly changes, in a sequence of observations. Research interest in change-point problems has surged in recent years and substantial contributions by the statistics community have resulted in a range of works (Aue et al., 2009; Zhang et al., 2010; Frick, Munk and Sieling, 2014; Garreau and Arlot, 2018; Wang and Samworth, 2018; Zou, Wang and Li, 2020; Wang, Yu and Rinaldo, 2021). In particular, emphasis has been given to handling complex data types such as high-dimensional data or non-Euclidean data objects, including networks and images. Most change-point methods targeting complex data types are non-parametric and aim to make minimal assumptions on the underlying data generating mechanism in order to be widely applicable (see Harchaoui, Moulines and Bach (2009); Matteson and James (2014); Shi, Wu and Rao (2018); Dubey and Müller (2020) and references therein). An obstacle for non-parametric works is that theoretical guarantees can pose immense challenges. For example, fast type I error control via analytical $p$-value approximations is generally difficult to work out in the non-parametric setting. While the increasing complexity and volume of modern datasets necessitate methods that can offer fast ways to assess changes while controlling type I error, most non-parametric approaches still depend on re-sampling techniques to obtain $p$-value approximations.
Recently, a graph-based framework for change-point detection was proposed in Chen and Zhang (2015) and further studied in Chu and Chen (2019) that aims to address the needs of modern change-point applications by offering flexibility and fast type I error control. The framework is a non-parametric approach that utilizes test statistics constructed from similarity graphs and is applicable to any data type, including multivariate and object data, as long as a similarity measure can be defined on the sample space. The similarity graph can be provided by domain knowledge or it can be generated according to some criteria, such as the minimum spanning tree or the nearest neighbor graph. This flexibility makes the approach applicable to a broad range of problems. Moreover, simulation studies and real data applications demonstrate that the approach is powerful under many settings involving high-dimensional and non-Euclidean data types (Chen and Zhang, 2015; Chu and Chen, 2019).
The graph-based framework is also equipped with analytical $p$-value approximations for testing the significance of change-points. This extends the graph-based framework's applicability to settings where the volume or complexity of the observations makes it computationally infeasible to assess significance by resampling. A key step in obtaining these analytical $p$-value approximations is proving, under certain regularity conditions, that the stochastic processes of the graph-based test statistics converge to Gaussian processes in finite dimensional distribution (see Theorem 3.1 in Chen and Zhang (2015) and Theorem 4.1 in Chu and Chen (2019)). Notably, the existing theorems do not imply convergence in distribution to Gaussian processes since tightness of the processes is not established. Tightness guarantees the existence of limit points for weak convergence and it ensures that intervals between the time points considered in the finite-dimensional distribution are well-behaved. This is essential for the type of test statistic, the maximum scan statistic, used in this framework (see (6) below).
In this paper, we establish tightness of the stochastic processes for graph-based test statistics under mild conditions of the graph. In terms of theoretical work, our proof provides the final piece in establishing the limiting distribution of these graph-based processes. To do so, we derive explicit expressions for higher product moments of graph-based test statistics, which are obtained by studying configurations of the graph and combinatorial analysis. Importantly, our results hold for any generic graph, including dense graphs, and can be generalized to other graph-based stochastic processes to establish weak convergence. In terms of practical applications, our results provide further confidence in utilizing the asymptotic $p$-value approximations for modern data applications and the testing of change-points.
The paper is organized as follows: Section 2 provides a brief overview of the graph-based framework. The main results are given in Section 3 and the proof is provided in Section 4, with additional details in the Supplementary Material.
2 Review of the graph-based framework
Let $\{y_i\}_{i=1,\dots,n}$ be a data sequence indexed by time or some other meaningful ordering, where $y_i$ could be a high-dimensional observation or non-Euclidean object. In the single change-point setting, there possibly exists a change-point $\tau$ such that $y_i$ follows some unknown distribution $F_0$ for $i \le \tau$ and follows a different (unknown) distribution $F_1$ for $i > \tau$. Consider that each time $t$ divides the sequence of observations into two samples: those observations before time $t$ and those observations after time $t$. The graph-based framework utilizes graph-based two-sample test statistics to test whether or not these two samples are from the same distribution. By graph-based two-sample tests we refer to tests that are based on graphs with the observations as nodes. The graph, $G$, is constructed from all observations in the sequence and is usually derived from a distance or a generalized dissimilarity on the sample space, with edges in the graph connecting observations that are "close" in some sense. For example, $G$ could be the minimum spanning tree (MST), which is a tree connecting all observations such that the sum of the distances of edges in the tree is minimized; $G$ could also be the $k$-nearest neighbor graph ($k$-NNG) where each observation connects to its $k$ nearest neighbors. Four statistics are considered in Chen and Zhang (2015) and Chu and Chen (2019). These are based on three quantities of the graph, which we briefly discuss below.
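As a rough illustration of how a similarity graph of the nearest-neighbor type might be built, the sketch below constructs an undirected nearest-neighbor graph from Euclidean distances using only the standard library. The function name `knn_graph`, the toy `points`, and the choice of Euclidean distance are assumptions made for this example; any similarity measure on the sample space could take its place.

```python
import math

def knn_graph(points, k):
    """Return an undirected k-nearest-neighbor graph as a set of edges (i, j), i < j.

    Nodes are 0-based indices into `points` (hypothetical toy data).
    """
    n = len(points)
    edges = set()
    for i in range(n):
        # Rank all other observations by distance to observation i (ties broken by index).
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        for j in others[:k]:
            # Store each edge once, with the smaller index first.
            edges.add((min(i, j), max(i, j)))
    return edges

# Two well-separated groups on the real line.
points = [(0.0,), (1.0,), (2.5,), (10.0,), (11.0,)]
edges = knn_graph(points, k=1)
print(sorted(edges))  # [(0, 1), (1, 2), (3, 4)]
```

With one neighbor per observation, no edge joins the two groups, which is the kind of structure the edge-count statistics are designed to pick up.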
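For any event $x$, let $I(x)$ be the indicator function that takes value 1 if $x$ is true and 0 otherwise. We define $g_i(t) = I(i > t)$ as an indicator function for the event that $y_i$ is observed after $t$. For an edge $e = (i, j)$, we define
$$J_e(t) = \begin{cases} 0 & \text{if } g_i(t) \neq g_j(t), \\ 1 & \text{if } g_i(t) = g_j(t) = 0, \\ 2 & \text{if } g_i(t) = g_j(t) = 1. \end{cases}$$
For any candidate value $t$ of $\tau$, the three quantities are:
$$R_k(t) = \sum_{e \in G} I(J_e(t) = k), \quad k = 0, 1, 2.$$
Then $R_0(t)$ is the number of edges connecting observations before and after $t$, $R_1(t)$ is the number of edges connecting observations prior to $t$, and $R_2(t)$ is the number of edges that connect observations after $t$.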
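The three edge counts described above can be computed in one pass over the edge list. The following is a minimal sketch, assuming observations are indexed 1 to n in time order and the graph is given as a list of index pairs; the names `edge_counts`, `r0`, `r1`, and `r2` are chosen for this example only.

```python
def edge_counts(edges, t):
    """Classify each edge relative to a candidate change-point t.

    r0: edges joining an observation at or before t with one after t
    r1: edges joining two observations at or before t
    r2: edges joining two observations after t
    """
    r0 = r1 = r2 = 0
    for i, j in edges:
        after_i, after_j = i > t, j > t
        if after_i != after_j:
            r0 += 1
        elif not after_i:
            r1 += 1
        else:
            r2 += 1
    return r0, r1, r2

# Hypothetical 5-node graph (a cycle), with candidate change-point t = 2.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]
print(edge_counts(edges, 2))  # (2, 1, 2)
```

The three counts always sum to the total number of edges, so any two of them determine the third.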
The four statistics considered are the edge-count test statistic (2), generalized edge-count test statistic (3), weighted edge-count test statistic (4), and max-type edge-count test statistic (5):
[TABLE]
[TABLE]
[TABLE]
with ,
[TABLE]
where
The expected value and variance of the four test statistics are computed under the permutation null distribution and their explicit expressions can be found in Chen and Zhang (2015) and Chu and Chen (2019). Each of the test statistics has its own niche where it dominates; a detailed discussion can be found in Chu and Chen (2019).
The null hypothesis of no change-point is rejected when the maximum scan statistic
[TABLE]
is greater than a threshold, with $n_0$ and $n_1$ being pre-specified constraints controlling where we search for the change-point. When $n$ is small, this threshold can be obtained from permutation directly. However, this becomes computationally expensive for large $n$ and instead, Chen and Zhang (2015) and Chu and Chen (2019) provide accurate analytical formulas to approximate the $p$-values for these scan statistics.
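To make the permutation route concrete, here is a hedged sketch of a maximum scan over the between-sample edge count. It does not use the analytical mean and variance from the cited papers: it standardizes with the empirical permutation mean and standard deviation, and it assumes the convention that a small between-sample count is evidence of a change (hence the sign flip). All names (`between_counts`, `permutation_scan`, `ring`) and the toy graph are hypothetical.

```python
import random
import statistics

def between_counts(edges, n):
    """Return the between-sample edge count for t = 1, ..., n-1 (list index t-1)."""
    counts = [0] * (n - 1)
    for i, j in edges:
        lo, hi = min(i, j), max(i, j)
        for t in range(lo, hi):  # edge (lo, hi) is split by every t with lo <= t < hi
            counts[t - 1] += 1
    return counts

def permutation_scan(edges, n, n0, n1, n_perm=500, seed=0):
    """Max scan statistic over n0 <= t <= n1 and its permutation p-value (a sketch)."""
    rng = random.Random(seed)
    labels = list(range(1, n + 1))
    perm_counts = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute the time order; the graph itself stays fixed
        relabeled = [(labels[i - 1], labels[j - 1]) for i, j in edges]
        perm_counts.append(between_counts(relabeled, n))
    # Empirical permutation mean and sd at each t (assumption: stands in for exact formulas).
    mean = [statistics.fmean([c[t] for c in perm_counts]) for t in range(n - 1)]
    sd = [statistics.pstdev([c[t] for c in perm_counts]) for t in range(n - 1)]

    def scan(counts):
        return max(
            (-(counts[t - 1] - mean[t - 1]) / sd[t - 1]
             for t in range(n0, n1 + 1) if sd[t - 1] > 0),
            default=0.0,
        )

    observed = scan(between_counts(edges, n))
    null = [scan(c) for c in perm_counts]
    p_value = sum(s >= observed for s in null) / n_perm
    return observed, p_value

def ring(offset):
    """Ten nodes offset+1..offset+10, each linked to the node three steps ahead (cyclically)."""
    return [(offset + i, offset + (i + 2) % 10 + 1) for i in range(1, 11)]

# Two tightly knit groups joined by a single edge: a change after observation 10.
edges = ring(0) + ring(10) + [(10, 11)]
observed, p_value = permutation_scan(edges, n=20, n0=3, n1=17)
print(observed, p_value)
```

Each call re-evaluates the scan on every permuted sequence, which is the cost that the analytical approximations are meant to avoid when the sequence is long.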
2.1 Notation
Let $a_n = O(b_n)$ denote that $a_n$ is bounded above by $b_n$ (up to a constant) asymptotically and $a_n = o(b_n)$ denote that $a_n$ is dominated by $b_n$ asymptotically. We also write $a_n = \Theta(b_n)$ to denote that $a_n$ is bounded above and below by $b_n$, asymptotically; this will also be notated as $a_n \asymp b_n$.
3 Tightness of basic processes
3.1 Asymptotic null distributions of the basic processes
Given the scan statistics, we reject the null hypothesis of no change-point if the scan statistic is larger than a threshold. Explicitly, we are interested in the following tail probabilities: and
To obtain analytical approximations of these tail probabilities, Chen and Zhang (2015) and Chu and Chen (2019) studied the properties of the stochastic processes and under the null hypothesis. Based on Lemma 3.1 in Chu and Chen (2019), can be expressed as , where and are uncorrelated. Furthermore, can be expressed as
[TABLE]
where , , and and are defined as in (4). Therefore, these stochastic processes boil down to the basic processes: and .
In order to show that the limiting distributions of the basic processes converge to Gaussian processes, the classic approach as presented in Billingsley (1968) is to establish:
1. The convergence of the basic processes to multivariate Gaussian in finite dimensional distributions. (Throughout the paper, we use $[x]$ to denote the largest integer that is no larger than $x$.)
2. The tightness of the basic processes.
The first point has been proven in Chen and Zhang (2015) and Chu and Chen (2019). We prove here that the second point, tightness of the graph-based stochastic processes, does indeed hold under mild conditions for the graph.
3.2 Main Results
We first state our main results and then give an outline of the proof. We use $G$ to denote both the graph and its set of edges. Let $G_i$ be the subgraph of $G$ containing all the edges that connect to node $y_i$. Then $|G_i|$ is the number of edges in $G_i$, or equivalently the node degree of $y_i$ in $G$. These results hold for generic similarity graphs, including dense graphs. We refer to a graph as dense if the number of edges is of higher order than the number of observations, i.e. if such that .
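The quantities in this paragraph are easy to compute from an edge list. The sketch below tallies node degrees and compares the edge-to-observation ratio for a complete graph and a path; the helper name `node_degrees` and both toy graphs are assumptions for illustration, and a single small graph cannot show an asymptotic order, only the quantities whose growth the definition of density refers to.

```python
from collections import Counter

def node_degrees(edges):
    """Return a Counter mapping each node to the number of edges incident to it."""
    degrees = Counter()
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees

n = 6
complete = [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]  # n(n-1)/2 edges
path = [(i, i + 1) for i in range(1, n)]                                    # n-1 edges

for name, edges in [("complete", complete), ("path", path)]:
    degrees = node_degrees(edges)
    # Edge count, edges per observation, and largest node degree.
    print(name, len(edges), len(edges) / n, max(degrees.values()))
```

For the complete graph the number of edges grows quadratically in the number of observations, while for the path it grows linearly, so only the former is dense in the sense used here.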
Theorem 3.1.
Under the condition that is at least and , the stochastic process is tight on the space , where is a positive constant.
Theorem 3.2.
Under the condition that is at least and is at least , the stochastic process is tight on the space , where is a positive constant.
These conditions are more relaxed than the conditions in Chen and Zhang (2015) and Chu and Chen (2019) when obtaining convergence in finite dimensional distributions.
Let $D[0,1]$ be the space of real functions $x$ on $[0,1]$ that are right-continuous and have left-hand limits:
(i) For $0 \le t < 1$, $x(t+) = \lim_{s \downarrow t} x(s)$ exists and $x(t+) = x(t)$.
(ii) For $0 < t \le 1$, $x(t-) = \lim_{s \uparrow t} x(s)$ exists.
Functions satisfying these two properties are known as cadlag functions. A function $x$ is said to have a discontinuity of the first kind at $t$ if the left and right limits $x(t-)$ and $x(t+)$ exist but differ and $x(t)$ lies between them. Any discontinuities of a cadlag function, an element of $D[0,1]$, are of the first kind. Since
[TABLE]
[TABLE]
it follows that and are right-continuous and have left-hand limits and therefore belong to the space .
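A process indexed through the integer part of a rescaled time is a step function, which is why it is right-continuous with left-hand limits. The sketch below checks this numerically at one jump point, using exact rational arithmetic to avoid floating-point rounding at the jump. The `values` list is hypothetical and simply stands in for the statistic evaluated at successive time points.

```python
from fractions import Fraction
from math import floor

def step_path(values):
    """Return x(u) = values[floor(n*u)], a step function of u on [0, 1]."""
    n = len(values)

    def x(u):
        # Clamp so that u = 1 maps to the last value.
        return values[min(floor(n * u), n - 1)]

    return x

values = [0.3, -1.2, 0.8, 0.8, 2.1]
x = step_path(values)

u = Fraction(2, 5)       # floor(5u) moves from 1 to 2 here, so the path jumps at u
h = Fraction(1, 10**6)   # a small offset to probe the one-sided limits

print(x(u), x(u + h), x(u - h))  # 0.8 0.8 -1.2
```

The value at the jump agrees with the value just to the right and differs from the value just to the left, matching properties (i) and (ii) above.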
The classical characterization of tightness on the space is given by Theorem 13.2 in Billingsley (1968), a version of which is presented here:
*A sequence of stochastic processes in is tight if and only if: *
(i) The sequence is stochastically bounded in ,
(ii) For each ,
[TABLE]
where
[TABLE]
In general these conditions are difficult to verify, since they involve understanding the limit superior of a sequence. We instead take an alternative approach and use the tightness criterion proposed by Kolmogorov-Chentsov (Chentsov (1956), Theorem 1); a variant can also be found in Billingsley (1968). The criterion is as follows:
A sequence of stochastic processes , right continuous with left-hand limits, is tight if there are positive constants not depending on such that for any ,
[TABLE]
We set so the condition becomes:
[TABLE]
[TABLE]
where the notation and .
Both inequalities automatically hold when since at least one of the following is true: (i) , (ii) . In what follows, we focus on the case when
Observe that and are not well-defined at the boundaries, when or . We further assume that and therefore, cannot be too close to the boundaries. As such, we establish tightness on the domain , where is a positive constant. The proof of this result involves obtaining explicit expressions for the th moments and product moments of and using combinatorial analysis. This involves determining the different graph configurations for 4 edges to be randomly selected (with replacement) from the graph and obtaining the probabilities that each configuration will occur for the graph. Focusing on the leading terms of each configuration, we show these are bounded by .
4 Proof of Theorems 3.1 and 3.2
For simplicity, let , , and and . Then, expanding (7), we have
[TABLE]
and similarly for (8).
For the two basic processes, the following analytical expressions are needed for
[TABLE]
[TABLE]
and the following analytical expressions are needed for
[TABLE]
[TABLE]
It is straightforward to see that all the expressions can be decomposed as combinations of and . Since explicit expressions for the expectation, variance, and third moments of , , and can be found in Chen and Zhang (2015) and Chu and Chen (2019), the remaining unknown quantities to be derived are the product moments of and , which can be expressed as
[TABLE]
where such that and . The full list of product moments can be found in Supplement A.
To derive the analytical expressions for the product moments we need to:
1. Determine the different configurations for 4 edges to be randomly selected (with replacement) from the graph,
2. Derive probabilities separately for each configuration.
There are in total nineteen different configurations for four edges randomly chosen (with replacement) from the graph; see Figure 1 for an illustration of each configuration.
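The counting idea can be checked by brute force on a very small graph. The sketch below enumerates every ordered 4-tuple of edges drawn with replacement and groups the tuples by a coarse signature: the number of distinct edges and the number of distinct nodes they touch. This signature is an assumption made for illustration and is coarser than the nineteen configurations used in the paper; it only separates a few of them, such as "all four edges identical" and "no two edges share a node".

```python
from collections import Counter
from itertools import product

def tuple_shapes(edges):
    """Tally ordered 4-tuples of edges by (number of distinct edges, number of distinct nodes).

    Edges must be stored in a canonical form (smaller node first) so duplicates compare equal.
    """
    tally = Counter()
    for quad in product(edges, repeat=4):
        distinct_edges = set(quad)
        nodes = {v for e in distinct_edges for v in e}
        tally[(len(distinct_edges), len(nodes))] += 1
    return tally

# Hypothetical graph: four edges with no shared nodes.
matching = [(1, 2), (3, 4), (5, 6), (7, 8)]
tally = tuple_shapes(matching)

print(sum(tally.values()))  # 256 ordered tuples in total (4 ** 4)
print(tally[(1, 2)])        # 4: all four draws are the same edge
print(tally[(4, 8)])        # 24: four distinct, pairwise disjoint edges (4!)
```

On a realistic graph the number of tuples grows as the fourth power of the number of edges, which is why closed-form counts in terms of node degrees are needed rather than enumeration.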
Let $G$ be the similarity graph and $G_i$ be the subgraph of $G$ containing all edges that connect to node $y_i$. Then $|G_i|$ is the degree of node $y_i$ in $G$. Among all possible ways of randomly selecting the four edges, the number of occurrences for each of the configurations is:
2. 2)
3. 3)
4. 4)
5. 5)
6. 6)
7. 7)
8. 8)
9. 9)
10. 10)
11. 11)
12. 12)
13. 13)
14. 14)
15. 15)
16. 16)
17. 17)
18. 18)
19. 19)
with defined as:
[TABLE]
We will use two examples ( and ) to illustrate how to derive the probability for each configuration. The remaining product moments can be obtained in a similar way. The explicit formulas for all the product moments can be found in Supplement A.
Example 1: To derive the probability of each configuration for (Supplement (S62)) , observe that
[TABLE]
We derive for each of the 19 configurations separately.
1) The four edges are actually the same edge.
[TABLE]
2) Three edges are the same and share one node with the fourth edge or two pairs of the edges are the same and share one node.
[TABLE]
3) Three edges are the same and do not share any node with the fourth edge or two pairs of the edges are the same and do not share any node with each other.
[TABLE]
4) Two edges are the same and share one node with the other two edges. None of them share the other node (star-shaped configuration).
[TABLE]
5) Linear chain of edges such that one edge shares one node with another edge and then shares the other node with the third edge. The fourth edge can be the same as any of the other three edges.
[TABLE]
6) Two edges are the same and the edges form a triangle.
[TABLE]
7) Two edges share one node and do not share any node with the third edge. The fourth edge can be the same as any of the other three edges.
[TABLE]
8) Two edges are the same and no pair of edges share any node.
[TABLE]
9) The four edges share one node, and none of them share the other node (star-shaped).
[TABLE]
10) Linear chain of edges such that two distinct edges share one node with the other two edges and share a node with each other.
[TABLE]
11) All four edges form a box.
[TABLE]
12) Three edges form a triangle and one edge connects to one node of the triangle.
[TABLE]
13) Three edges share the same node and the fourth edge shares the other node of one of the edges.
[TABLE]
14) Three edges share the same node and the fourth edge does not share any node with the other edges.
[TABLE]
15) Three edges form a linear chain and the fourth edge does not share any node with the other edges.
[TABLE]
16) Three edges form a triangle and the fourth edge does not share any node with the other edges.
[TABLE]
17) Two pairs of edges share one node with each other. The pairs of edges do not share any nodes with each other.
[TABLE]
18) Two edges share one node with each other. The other edges do not share any nodes with any of the other edges.
[TABLE]
19) None of the four edges share any node.
[TABLE]
Example 2: To derive the probability of each configuration for (Supplement (S84)) , observe that
[TABLE]
We derive for each of the 19 configurations separately.
1) The four edges are actually the same edge.
[TABLE]
2) Three edges are the same and share one node with the fourth edge or two pairs of the edges are the same and share one node.
[TABLE]
3) Three edges are the same and do not share any node with the fourth edge or two pairs of the edges are the same and do not share any node with each other.
[TABLE]
4) Two edges are the same and share one node with the other two edges. None of them share the other node (star-shaped configuration).
[TABLE]
5) Linear chain of edges such that one edge shares one node with another edge and then shares the other node with the third edge. The fourth edge can be the same as any of the other three edges.
[TABLE]
6) Two edges are the same and the edges form a triangle.
[TABLE]
7) Two edges share one node and do not share any node with the third edge. The fourth edge can be the same as any of the other three edges.
[TABLE]
8) Two edges are the same and no pair of edges share any node.
[TABLE]
9) The four edges share one node, and none of them share the other node (star-shaped).
[TABLE]
10) Linear chain of edges such that two distinct edges share one node with the other two edges and share a node with each other.
[TABLE]
11) All four edges form a box.
[TABLE]
12) Three edges form a triangle and one edge connects to one node of the triangle.
[TABLE]
13) Three edges share the same node and the fourth edge shares the other node of one of the edges.
[TABLE]
14) Three edges share the same node and the fourth edge does not share any node with the other edges.
[TABLE]
15) Three edges form a linear chain and the fourth edge does not share any node with the other edges.
[TABLE]
16) Three edges form a triangle and the fourth edge does not share any node with the other edges.
[TABLE]
17) Two pairs of edges share one node with each other. The pairs of edges do not share any nodes with each other.
[TABLE]
18) Two edges share one node with each other. The other edges do not share any nodes with any of the other edges.
[TABLE]
19) None of the four edges share any node.
[TABLE]
For the remaining expressions, similar derivations using combinatorial analysis can be obtained.
4.1 Expression for
The similarity graph can be a generic graph constructed from a similarity measure, such as the Euclidean distance. Without loss of generality, with . We assume that . To establish (7), we focus on the leading terms on the left-hand side of the inequality. After extensive simplification, the leading term for the denominator of is
[TABLE]
The leading term for the numerator is:
[TABLE]
with
[TABLE]
Since for , the expression can be bounded by as long as the ratio of graph configurations in the numerator and denominator can be bounded asymptotically by . Specifically, since , the terms can be bounded asymptotically by a constant. The remaining terms in the numerator involve configurations of the graph: , , , , , , , , , and . If the ratio of each of these terms with the denominator’s is bounded by , then the entire expression can be asymptotically bounded by a constant times .
In the following, we assume that and we check each configuration (in their order of appearance).
Clearly .
For , we have
[TABLE]
Then and . Following similar arguments, since , we have and .
For , we have
[TABLE]
Since the largest can be is (every other observation connects to node ), it follows that and .
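For orientation, degree-based bounds of this kind typically rest on a few elementary facts. The following is a sketch of the standard inequalities, writing $|G|$ for the number of edges, $|G_i|$ for the degree of node $i$, and $n$ for the number of observations; whether the argument here uses exactly these forms is an assumption, since the displayed expressions are not available.

```latex
% Handshake identity: every edge contributes to exactly two node degrees.
\sum_{i=1}^{n} |G_i| = 2|G|.

% Lower bound by Cauchy-Schwarz, upper bound from |G_i| \le n - 1
% (a node can be connected to at most every other observation).
\frac{4|G|^2}{n}
  \;\le\; \sum_{i=1}^{n} |G_i|^2
  \;\le\; (n-1) \sum_{i=1}^{n} |G_i|
  \;=\; 2(n-1)|G|.
```

The lower bound is attained when all degrees are equal, and the upper bound is attained by a complete graph, where every degree equals $n-1$.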
Similarly, since , we have
We have , and so
Finally, since
[TABLE]
it follows that the ratio of these configurations with is bounded asymptotically by .
4.2 Expression for
We adopt a similar approach for : we study the analytical expression for . This expression can be written as the combination of terms involving and and terms involving configurations from the graph. We first show that the expressions involving and can be bounded by or . We then show that the graph-configurations are bounded asymptotically by or . It follows then that the entire expression can be bounded by a constant times .
Let , , and . The leading term for the denominator of is:
[TABLE]
with .
For the numerator of , we group the leading terms by their graph configurations. The numerator can be expressed as
[TABLE]
We first show that the coefficients , and can be bounded by or .
1. : The leading coefficient for can be expanded as
[TABLE]
with
[TABLE]
It is clear that since and can be chosen to be large enough such that . In the following we focus on the next two terms. For the third term, we need to show that . Let and define
[TABLE]
which is continuous everywhere on .
If is convex for , it follows that . Since and , what remains is to check its second derivative is non-negative:
[TABLE]
Since we have established that is convex, it follows that and . Moreover, the minimum of is achieved when and , for . Therefore .
Following a similar argument, we can establish that . Let . We have and . Its first and second derivatives are
[TABLE]
and therefore . Since , it follows that . Note that .
Therefore, for some constant .
2. : The leading coefficient for is
[TABLE]
with
[TABLE]
Since , we have for some constant . In order to show that the remaining terms can also be bounded by , we follow the same argument detailed above for . Observe that and . It follows that the terms with and of can be bounded by as well.
Finally, for the last term in , we see that and .
It follows that for some constant .
3. : The leading coefficient for is
[TABLE]
with
[TABLE]
Again, the first two terms involving and can be bounded by . Repeating the convexity argument, , which allows us to bound the remaining terms by as well. Therefore, the entire expression can also be bounded by .
4. : The leading coefficient for is
[TABLE]
with
[TABLE]
The first two terms involving , , and can be bounded by . Repeating a combination of the convexity arguments from above, the remaining terms can also be bounded by . It follows that .
5. : The leading coefficient for is
[TABLE]
with
[TABLE]
and utilizing the arguments above, this term is also bounded by .
6. : The leading coefficient for is
[TABLE]
with
[TABLE]
We have that can be bounded by a constant and by convexity, . Therefore, the leading coefficient is bounded by .
Although we have established that the coefficients can be bounded, in order for the entire expression to be bounded by we need the graph configurations in the numerator and denominator to be bounded by or . Recall that the leading term in the denominator is . Let , then . The graph configurations in the numerator involve:
2. 2.
3. 3.
4. 4.
5. 5.
6. 6.
Let . Suppose the largest (centered) degree , where .
We first focus on the second configuration in the numerator. We have:
[TABLE]
Since , it follows that the entire expression .
In the denominator, if , then , and . Then the ratio of the numerator 2 and denominator gives us
[TABLE]
If , then . With the assumption that , we have . Other terms can be done in a similar way. Notice that:
. 2. 3.
3. 4.
. 4. 5.
5. 6.
.
Therefore, the ratio of the first 5 configurations can be bounded by and the 6th configuration can be bounded by . To see that the 6th configuration can be bounded by , consider that if , then and the ratio of the numerator and denominator is . If , then and the ratio becomes . Recall that the expression for can be written as a linear combination of the leading coefficients multiplied by their respective graph configurations. We have established that are bounded by and is bounded by . Combining these results with the fact that we are considering the case , it follows that the expression for can be bounded by .
References
- Aue, A., Hörmann, S., Horváth, L., Reimherr, M. et al. (2009). Break detection in the covariance structure of multivariate time series models. The Annals of Statistics 37, 4046–4087.
- Billingsley, P. (1968). Convergence of Probability Measures.
- Chen, H. and Zhang, N. (2015). Graph-based change-point detection. The Annals of Statistics 43, 139–176.
- Chentsov, N. N. (1956). Weak convergence of stochastic processes whose trajectories have no discontinuities of the second kind and the "heuristic" approach to the Kolmogorov-Smirnov tests. Theory of Probability & Its Applications 1, 140–144.
- Chu, L. and Chen, H. (2019). Asymptotic distribution-free change-point detection for multivariate and non-Euclidean data. The Annals of Statistics 47, 382–414.
- Dubey, P. and Müller, H.-G. (2020). Fréchet change-point detection. The Annals of Statistics 48, 3312–3335.
- Frick, K., Munk, A. and Sieling, H. (2014). Multiscale change point inference. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76, 495–580.
- Garreau, D. and Arlot, S. (2018). Consistent change-point detection with kernels. Electronic Journal of Statistics 12, 4440–4486.
