On the tightness of graph-based statistics

Lynna Chu; Hao Chen

arXiv:2303.00136·math.PR·March 2, 2023

TL;DR

This paper proves the tightness of graph-based stochastic processes with potential discontinuities, using higher moments analysis, applicable to various graph types including dense graphs, to facilitate convergence results.

Contribution

It introduces an alternative method to establish tightness via higher moments bounds for graph-based statistics, overcoming intractability of classic approaches.

Findings

01

Established tightness of graph-based processes with discontinuities.

02

Derived explicit formulas for higher moments of graph-based statistics.

03

Applicable to a wide range of graphs, including dense graphs.

Taxonomy

Topics: Statistical Methods and Inference · Probability and Risk Models · Bayesian Modeling and Causal Inference

Full text

On the Tightness of Graph-based Statistics

Lynna Chu, Department of Statistics, Iowa State University

Hao Chen, Department of Statistics, University of California, Davis

Abstract

We establish tightness of graph-based stochastic processes in the space $D[0+\epsilon,1-\epsilon]$ with $\epsilon>0$ that allows for discontinuities of the first kind. The graph-based stochastic processes are based on statistics constructed from similarity graphs. In this setting, the classic characterization of tightness is intractable, making it difficult to obtain convergence of the limiting distributions for graph-based stochastic processes. We take an alternative approach and study the behavior of the higher moments of the graph-based test statistics. We show that, under mild conditions of the graph, tightness of the stochastic process can be established by obtaining upper bounds on the graph-based statistics’ higher moments. Explicit analytical expressions for these moments are provided. The results are applicable to generic graphs, including dense graphs where the number of edges can be of higher order than the number of observations.

MSC subject classifications: 60G99, 60C05

Keywords: change-point, graph-based tests, nonparametric, scan statistic, Gaussian process, tightness, network data, non-Euclidean data

1 Introduction

Change-point detection aims to estimate and test for the presence of change-points, locations where the distribution abruptly changes, in a sequence of observations. Research interest in change-point problems has surged in recent years and substantial contributions by the statistics community have resulted in a range of works (Aue et al., 2009; Zhang et al., 2010; Frick, Munk and Sieling, 2014; Garreau and Arlot, 2018; Wang and Samworth, 2018; Zou, Wang and Li, 2020; Wang, Yu and Rinaldo, 2021). In particular, emphasis has been given to handling complex data types such as high-dimensional data or non-Euclidean data objects, including networks and images. Most change-point methods targeting complex data types are non-parametric and aim to make minimal assumptions on the underlying data generating mechanism in order to be widely applicable (see Harchaoui, Moulines and Bach (2009); Matteson and James (2014); Shi, Wu and Rao (2018); Dubey and Müller (2020) and references therein). An obstacle for non-parametric methods is that establishing theoretical guarantees can be very challenging. For example, fast type I error control via analytical $p$-value approximations is generally difficult to work out in the non-parametric setting. While the increasing complexity and volume of modern datasets necessitate methods that can offer fast ways to assess changes while controlling type I error, most non-parametric approaches still depend on re-sampling techniques to obtain $p$-value approximations.

Recently, a graph-based framework for change-point detection was proposed in Chen and Zhang (2015) and further studied in Chu and Chen (2019) that aims to address the needs of modern change-point applications by offering flexibility and fast type I error control. The framework is a non-parametric approach that utilizes test statistics constructed from similarity graphs and is applicable to any data type, including multivariate and object data, as long as a similarity measure can be defined on the sample space. The similarity graph can be provided by domain knowledge or it can be generated according to some criteria, such as the minimum spanning tree or the nearest neighbor graph. This flexibility makes the approach applicable to a broad range of problems. Moreover, simulation studies and real data applications demonstrate that the approach is powerful under many settings involving high-dimensional and non-Euclidean data types (Chen and Zhang, 2015; Chu and Chen, 2019).

The graph-based framework is also equipped with analytical $p$-value approximations for testing the significance of change-points. This extends the graph-based framework's applicability to settings where the volume or complexity of the observations makes it computationally infeasible to assess significance by re-sampling. A key step in obtaining these analytical $p$-value approximations is proving, under certain regularity conditions, that the stochastic processes of the graph-based test statistics converge to Gaussian processes in finite dimensional distribution (see Theorem 3.1 in Chen and Zhang (2015) and Theorem 4.1 in Chu and Chen (2019)). Notably, the existing theorems do not imply convergence in distribution to Gaussian processes since tightness of the processes is not established. Tightness guarantees the existence of limit points for weak convergence and it ensures that the process is well-behaved on the intervals between the time points considered in the finite-dimensional distribution. This is essential for the type of test statistic, the maximum scan statistic, used in this framework (see (6) below).

In this paper, we establish tightness of the stochastic processes for graph-based test statistics under mild conditions on the graph. In terms of theoretical work, our proof provides the final piece in establishing the limiting distribution of these graph-based processes. To do so, we derive explicit expressions for higher product moments of graph-based test statistics, which are obtained by studying configurations of the graph and combinatorial analysis. Importantly, our results hold for any generic graph, including dense graphs, and can be generalized to other graph-based stochastic processes to establish weak convergence. In terms of practical applications, our results provide further confidence in utilizing the asymptotic $p$-value approximations for modern data applications and the testing of change-points.

The paper is organized as follows: Section 2 provides a brief overview of the graph-based framework. The main results are given in Section 3 and the proof is provided in Section 4, with additional details in the Supplementary Material.

2 Review of the graph-based framework

Let $\{\mathbf{y}_{i}:i=1,\ldots,n\}$ be a data sequence indexed by time or some other meaningful ordering, where $\mathbf{y}_{i}$ could be a high-dimensional observation or non-Euclidean object. In the single change-point setting, there possibly exists a change-point $\tau$ such that $\mathbf{y}_{t}$ follows some unknown distribution for $t\leq\tau$ and follows a different (unknown) distribution for $t>\tau$. Each time $t$ divides the sequence of observations into two samples: those observations before time $t$ and those observations after time $t$. The graph-based framework utilizes graph-based two-sample test statistics to test whether or not these two samples are from the same distribution. By graph-based two-sample tests we refer to tests that are based on graphs with the observations $\{\mathbf{y}_{i}\}$ as nodes. The graph, $G$, is constructed from all observations in the sequence and is usually derived from a distance or a generalized dissimilarity on the sample space, with edges in the graph connecting observations that are “close” in some sense. For example, $G$ could be the minimum spanning tree (MST), which is a tree connecting all observations such that the sum of the distances of edges in the tree is minimized; $G$ could also be the nearest neighbor graph (NNG), where each observation connects to its nearest neighbors. Four statistics are considered in Chen and Zhang (2015) and Chu and Chen (2019). These are based on three quantities of the graph, which we briefly discuss below.
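As an illustration of how a similarity graph might be built, the following is a minimal sketch of a nearest neighbor graph under Euclidean distance. The function name `nearest_neighbor_graph`, the tie-breaking rule (smallest index wins), and the toy data are assumptions made for this example and are not taken from the paper.

```python
import math

def nearest_neighbor_graph(points, k=1):
    """Undirected k-NN graph as a sorted list of (i, j) pairs, 1-based, i < j.

    Hypothetical helper: each observation is linked to its k nearest
    neighbors in Euclidean distance; ties are broken by the smaller index.
    """
    n = len(points)
    edges = set()
    for i in range(n):
        # Sort the other observations by (distance, index)
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        for _, j in dists[:k]:
            edges.add((min(i, j) + 1, max(i, j) + 1))
    return sorted(edges)

# Toy sequence: two well-separated groups on the real line
toy_points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
toy_edges = nearest_neighbor_graph(toy_points, k=1)
print(toy_edges)
```

In this toy sequence no edge joins the first three observations to the last three, which is the kind of structure the edge-count quantities defined next are designed to pick up.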

For any event $x$, let $I_{x}$ be the indicator function that takes the value $1$ if $x$ is true and $0$ otherwise. We define $g_{i}(t)$ as an indicator function for the event that $\mathbf{y}_{i}$ is observed after $t$, $g_{i}(t)=I_{i>t}$. For an edge $e=(i,j)$, we define

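$$J_{e}(t)=\begin{cases}0&\text{if }g_{i}(t)\neq g_{j}(t),\\ 1&\text{if }g_{i}(t)=g_{j}(t)=0,\\ 2&\text{if }g_{i}(t)=g_{j}(t)=1.\end{cases}$$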

For any candidate value $t$ of $\tau$ , the three quantities are:

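$$R_{0}(t)=\sum_{e\in G}I_{J_{e}(t)=0},\quad R_{1}(t)=\sum_{e\in G}I_{J_{e}(t)=1},\quad R_{2}(t)=\sum_{e\in G}I_{J_{e}(t)=2}.$$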

Then $R_{0}(t)$ is the number of edges connecting observations before and after $t$ , $R_{1}(t)$ is the number of edges connecting observations prior to $t$ , and $R_{2}(t)$ is the number of edges that connect observations after $t$ .
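The three counts can be computed directly from an edge list. The sketch below assumes edges are given as pairs of 1-based time indices and codes each edge as between-sample, both-before, or both-after, following the description above; the function name `edge_counts` and the toy path graph are illustrative choices.

```python
def edge_counts(edges, t):
    """Return (R0, R1, R2) for a candidate change-point t.

    edges: iterable of (i, j) pairs of 1-based time indices, i != j.
    g_i(t) = 1 if i > t and 0 otherwise, as in the text.
    """
    r0 = r1 = r2 = 0
    for i, j in edges:
        gi, gj = int(i > t), int(j > t)
        if gi != gj:
            r0 += 1  # edge connects an observation before t to one after t
        elif gi == 0:
            r1 += 1  # both endpoints observed at or before t
        else:
            r2 += 1  # both endpoints observed after t
    return r0, r1, r2

# Toy similarity graph: the path 1-2-3-4-5-6
path_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
print(edge_counts(path_edges, 3))  # (1, 2, 2)
```

Since every edge falls in exactly one of the three classes, the counts always sum to $|G|$, the number of edges.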

The four statistics considered are the edge-count test statistic (2), generalized edge-count test statistic (3), weighted edge-count test statistic (4), and max-type edge-count test statistic (5):

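$$Z(t)=-\frac{R_{0}(t)-\mathbf{E}(R_{0}(t))}{\sqrt{\mathbf{Var}(R_{0}(t))}},\tag{2}$$

$$S(t)=\begin{pmatrix}R_{1}(t)-\mathbf{E}(R_{1}(t))\\ R_{2}(t)-\mathbf{E}(R_{2}(t))\end{pmatrix}^{T}\Sigma^{-1}(t)\begin{pmatrix}R_{1}(t)-\mathbf{E}(R_{1}(t))\\ R_{2}(t)-\mathbf{E}(R_{2}(t))\end{pmatrix},\tag{3}$$

$$Z_{w}(t)=\frac{R_{w}(t)-\mathbf{E}(R_{w}(t))}{\sqrt{\mathbf{Var}(R_{w}(t))}},\tag{4}$$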

with $R_{w}(t)=p(t)R_{1}(t)+q(t)R_{2}(t),\quad\quad p(t)=\frac{n-t-1}{n-2},\quad q(t)=\frac{t-1}{n-2}$ ,

$$M(t)=\max\left(|Z_{\text{diff}}(t)|,\,Z_{w}(t)\right),\tag{5}$$

where $Z_{\text{diff}}(t)=\frac{R_{\text{diff}}(t)-\mathbf{E}(R_{\text{diff}}(t))}{\sqrt{\text{\bf Var}(R_{\text{diff}}(t))}},\text{ with }R_{\text{diff}}(t)=R_{1}(t)-R_{2}(t).$

The expected value and variance of the four test statistics are computed under the permutation null distribution and their explicit expressions can be found in Chen and Zhang (2015) and Chu and Chen (2019). Each of the test statistics has its own niche where it dominates; a detailed discussion can be found in Chu and Chen (2019).

The null hypothesis of no change-point is rejected when the maximum scan statistic

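$$\max_{n_{0}\leq t\leq n_{1}}Z(t),\quad\max_{n_{0}\leq t\leq n_{1}}Z_{w}(t),\quad\max_{n_{0}\leq t\leq n_{1}}S(t),\quad\max_{n_{0}\leq t\leq n_{1}}M(t)\tag{6}$$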

is greater than a threshold, with $n_{0}$ and $n_{1}$ being pre-specified constraints controlling where we search for the change-point. When $n$ is small, this threshold can be obtained from permutation directly. However, this becomes computationally expensive for large $n$; instead, Chen and Zhang (2015) and Chu and Chen (2019) provide accurate analytical formulas to approximate the $p$-values for these scan statistics.
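For small $n$, the permutation route can be sketched as follows for the scan of the edge-count statistic based on $R_{0}(t)$. This is only an illustration under stated assumptions: the mean and standard deviation of $R_{0}(t)$ are replaced by Monte Carlo estimates from the permuted samples rather than the exact permutation-null formulas of the cited papers, and the names `r0_curve` and `permutation_scan_pvalue`, the number of permutations, and the toy graph are all hypothetical.

```python
import random
import statistics
from itertools import combinations

def r0_curve(edges, t_range):
    """R_0(t) for each t: edges with exactly one endpoint index <= t."""
    return [sum((i <= t) != (j <= t) for i, j in edges) for t in t_range]

def permutation_scan_pvalue(edges, n, n0, n1, n_perm=200, seed=0):
    """Permutation p-value for max over n0 <= t <= n1 of -(R_0(t) - mean) / sd."""
    rng = random.Random(seed)
    t_range = range(n0, n1 + 1)
    observed = r0_curve(edges, t_range)
    labels = list(range(1, n + 1))
    null_curves = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # random reordering of the observations
        relabeled = [(labels[i - 1], labels[j - 1]) for i, j in edges]
        null_curves.append(r0_curve(relabeled, t_range))
    # Monte Carlo stand-ins for E(R_0(t)) and sd(R_0(t)) under the null
    means = [statistics.fmean(col) for col in zip(*null_curves)]
    sds = [statistics.pstdev(col) for col in zip(*null_curves)]

    def scan(curve):
        return max(-(r - m) / s for r, m, s in zip(curve, means, sds) if s > 0)

    obs = scan(observed)
    exceed = sum(scan(c) >= obs for c in null_curves)
    return (exceed + 1) / (n_perm + 1)

# Toy graph: two 10-node cliques joined by one edge, with the split at t = 10
clique_a = list(combinations(range(1, 11), 2))
clique_b = list(combinations(range(11, 21), 2))
scan_edges = clique_a + clique_b + [(10, 11)]
p_value = permutation_scan_pvalue(scan_edges, n=20, n0=3, n1=17)
print(p_value)
```

Each call recomputes the whole scan for every permutation, which is the cost that motivates the analytical approximations for large $n$.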

2.1 Notation

Let $f_{n}\precsim g_{n}$ denote that $f_{n}$ is bounded above by $g_{n}$ (up to a constant) asymptotically and $f_{n}=o(g_{n})$ denote that $f_{n}$ is dominated by $g_{n}$ asymptotically. We also write $f_{n}=O(g_{n})$ to denote that $f_{n}$ is bounded above and below by $g_{n}$ , asymptotically; this will also be notated as $f_{n}\asymp g_{n}$ .

3 Tightness of basic processes

3.1 Asymptotic null distributions of the basic processes

Given the scan statistics, we reject the null hypothesis of no change-point if the scan statistic is larger than a threshold. Explicitly, we are interested in the following tail probabilities: $P\left(\max_{n_{0}\leq t\leq n_{1}}Z(t)>b_{Z}\right)$, $P\left(\max_{n_{0}\leq t\leq n_{1}}S(t)>b_{S}\right)$, $P\left(\max_{n_{0}\leq t\leq n_{1}}Z_{w}(t)>b_{Z_{w}}\right)$, and $P\left(\max_{n_{0}\leq t\leq n_{1}}M(t)>b_{M}\right)$.

To obtain analytical approximations of these tail probabilities, Chen and Zhang (2015) and Chu and Chen (2019) studied the properties of the stochastic processes $\{Z(t)\},\{S(t)\},\{Z_{w}(t)\},$ and $\{M(t)\}$ under the null hypothesis. Based on Lemma 3.1 in Chu and Chen (2019), $S(t)$ can be expressed as $S(t)=Z^{2}_{w}(t)+Z^{2}_{\text{diff}}(t)$, where $Z_{w}(t)$ and $Z_{\text{diff}}(t)$ are uncorrelated. Furthermore, $Z(t)$ can be expressed as

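$$Z(t)=\frac{2\sigma_{R_{w}}}{\sqrt{4\sigma^{2}_{R_{w}}+(p(t)-q(t))^{2}\sigma^{2}_{R_{\text{diff}}}}}\,Z_{w}(t)+\frac{(q(t)-p(t))\,\sigma_{R_{\text{diff}}}}{\sqrt{4\sigma^{2}_{R_{w}}+(p(t)-q(t))^{2}\sigma^{2}_{R_{\text{diff}}}}}\,Z_{\text{diff}}(t),$$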

where $\sigma^{2}_{R_{w}}=\text{\bf Var}(R_{w}(t))$ , $\sigma^{2}_{R_{\text{diff}}}=\text{\bf Var}(R_{\text{diff}}(t))$ , and $p(t)$ and $q(t)$ are defined as in (4). Therefore, these stochastic processes boil down to the basic processes: $\{Z_{\text{diff}}(t)\},$ and $\{Z_{w}(t)\}$ .
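The reduction of $Z(t)$ to the two basic processes can be checked in a few lines. The sketch below assumes only that every edge belongs to exactly one of the three classes, so that $R_{0}(t)+R_{1}(t)+R_{2}(t)=|G|$, that $p(t)+q(t)=1$ with $R_{w}(t)=p(t)R_{1}(t)+q(t)R_{2}(t)$, and that $R_{w}(t)$ and $R_{\text{diff}}(t)$ are uncorrelated, as stated above.

```latex
\begin{align*}
R_{1}(t) &= R_{w}(t) + q(t)\,R_{\text{diff}}(t), \qquad
R_{2}(t) = R_{w}(t) - p(t)\,R_{\text{diff}}(t), \\
-\bigl(R_{0}(t)-\mathbf{E}(R_{0}(t))\bigr)
  &= \bigl(R_{1}(t)+R_{2}(t)\bigr)-\mathbf{E}\bigl(R_{1}(t)+R_{2}(t)\bigr) \\
  &= 2\bigl(R_{w}(t)-\mathbf{E}(R_{w}(t))\bigr)
   + \bigl(q(t)-p(t)\bigr)\bigl(R_{\text{diff}}(t)-\mathbf{E}(R_{\text{diff}}(t))\bigr), \\
\mathbf{Var}(R_{0}(t))
  &= 4\sigma^{2}_{R_{w}} + \bigl(p(t)-q(t)\bigr)^{2}\sigma^{2}_{R_{\text{diff}}}.
\end{align*}
```

Dividing the centered expression by the square root of this variance writes $Z(t)$ as a unit-variance linear combination of $Z_{w}(t)$ and $Z_{\text{diff}}(t)$, with weights $2\sigma_{R_{w}}$ and $(q(t)-p(t))\sigma_{R_{\text{diff}}}$ over $\sqrt{4\sigma^{2}_{R_{w}}+(p(t)-q(t))^{2}\sigma^{2}_{R_{\text{diff}}}}$.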

To show that the basic processes converge in distribution to Gaussian processes, the classic approach, as presented in Billingsley (1968), is to establish:

1. The convergence of $\{Z_{w}(\lfloor nu\rfloor):0<u<1\}$ and $\{Z_{\text{diff}}(\lfloor nu\rfloor):0<u<1\}$ to multivariate Gaussian in finite dimensional distributions. (Throughout the paper, we use $\lfloor x\rfloor$ to denote the largest integer that is no larger than $x$.)

2. The tightness of $\{Z_{w}(\lfloor nu\rfloor):0<u<1\}$ and $\{Z_{\text{diff}}(\lfloor nu\rfloor):0<u<1\}$.

The first point has been proven in Chen and Zhang (2015) and Chu and Chen (2019). We prove here that the second point, tightness of the graph-based stochastic processes, does indeed hold under mild conditions for the graph.

3.2 Main Results

We first state our main results and then give an outline of the proof. We use $G$ to denote both the graph and its set of edges. Let $G_{i}$ be the subgraph of $G$ containing all the edges that connect to node $\mathbf{y}_{i}$. Then, $|G_{i}|$ is the number of edges in $G_{i}$, or the node degree of $\mathbf{y}_{i}$ in $G$. These results hold for generic similarity graphs, including dense graphs. We refer to a graph as dense if the number of edges is of higher order than the number of observations, i.e. if $|G|=O(kn)$ such that $k=O(n^{\alpha})$.

Theorem 3.1.

Under the condition that $k$ is at least $O(1)$ and $\sum_{i=1}^{n}|G_{i}|^{2}=o(kn^{2})$ , the stochastic process $\{Z_{w}(\lfloor nu\rfloor):0<u<1\}$ is tight on the space $D[0+\epsilon,1-\epsilon]$ , where $\epsilon$ is a positive constant.

Theorem 3.2.

Under the condition that $k$ is at least $O(1)$ and $\sum_{i}|G_{i}|^{2}-\frac{4|G|^{2}}{n}$ is at least $O(k^{2})$ , the stochastic process $\{Z_{\text{diff}}(\lfloor nu\rfloor):0<u<1\}$ is tight on the space $D[0+\epsilon,1-\epsilon]$ , where $\epsilon$ is a positive constant.

These conditions are more relaxed than those required in Chen and Zhang (2015) and Chu and Chen (2019) to obtain convergence in finite dimensional distributions.
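The quantities appearing in the two theorems depend only on the degree sequence, so they are easy to inspect for a given graph. The sketch below reports finite-sample values; since the conditions are asymptotic, it can only suggest how a graph family behaves as $n$ grows. Taking $k=|G|/n$, and the names `degree_summary`, `ratio_thm31`, and `gap_thm32`, are assumptions made for this example.

```python
def degree_summary(edges, n):
    """Finite-sample versions of the degree quantities in Theorems 3.1 and 3.2."""
    deg = [0] * (n + 1)  # deg[i] = |G_i| for 1-based node i
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    size = len(edges)    # |G|
    k = size / n         # assumption: k chosen so that |G| = k n
    sum_sq = sum(d * d for d in deg[1:])
    return {
        "k": k,
        # Theorem 3.1 asks for sum |G_i|^2 = o(k n^2): this ratio should vanish
        "ratio_thm31": sum_sq / (k * n * n),
        # Theorem 3.2 compares this gap with k^2; it equals
        # sum_i (|G_i| - 2|G|/n)^2, i.e. n times the variance of the degrees
        "gap_thm32": sum_sq - 4 * size * size / n,
    }

n_nodes = 100
path_graph = [(i, i + 1) for i in range(1, n_nodes)]
star_graph = [(1, j) for j in range(2, n_nodes + 1)]
print(degree_summary(path_graph, n_nodes))
print(degree_summary(star_graph, n_nodes))
```

For the path, the ratio is $(4n-6)/(n(n-1))$ and shrinks as $n$ grows. For the star, it equals $1$ for every $n$, so a single hub connected to all other observations does not satisfy the degree condition of Theorem 3.1.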

Let $D=D[0,1]$ be the space of real functions $x$ on $[0,1]$ that are right-continuous and have left-hand limits:

(i) For $0\leq t<1$, $x(t+)=\lim_{s\downarrow t}x(s)$ exists and $x(t+)=x(t)$.

(ii) For $0<t\leq 1$, $x(t-)=\lim_{s\uparrow t}x(s)$ exists.

Functions satisfying these two properties are known as cadlag functions. A function $x$ is said to have a discontinuity of the first kind at $t$ if the left and right limits exist but differ and $x(t)$ lies between them. Any discontinuities of a cadlag function, an element of $D$ , are of the first kind. Since

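$$\lim_{u\downarrow c}Z_{w}(\lfloor nu\rfloor)=Z_{w}(\lfloor nc\rfloor),\qquad\lim_{u\uparrow c}Z_{w}(\lfloor nu\rfloor)=Z_{w}(\lceil nc\rceil-1),$$

$$\lim_{u\downarrow c}Z_{\text{diff}}(\lfloor nu\rfloor)=Z_{\text{diff}}(\lfloor nc\rfloor),\qquad\lim_{u\uparrow c}Z_{\text{diff}}(\lfloor nu\rfloor)=Z_{\text{diff}}(\lceil nc\rceil-1)$$

(with $\lceil x\rceil$ denoting the smallest integer that is no smaller than $x$),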

it follows that $Z_{w}(\lfloor nu\rfloor)$ and $Z_{\text{diff}}(\lfloor nu\rfloor)$ are right-continuous and have left-hand limits and therefore belong to the space $D$ .

The classical characterization of tightness on the space $D$ is given by Theorem 13.2 in Billingsley (1968), a version of which is presented here:

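A sequence of stochastic processes $\{X^{n}(u):0\leq u\leq 1\}$ in $D$ is tight if and only if:

(i) The sequence $\{X^{n}(u):0\leq u\leq 1\}$ is stochastically bounded in $D$.

(ii) For each $\epsilon>0$,

$$\lim_{\delta\to 0}\limsup_{n\to\infty}P\left(\omega^{\prime}(X^{n},\delta)>\epsilon\right)=0,$$

where

$$\omega^{\prime}(x,\delta)=\inf_{\{t_{i}\}}\max_{i}\sup_{s,t\in[t_{i-1},t_{i})}|x(s)-x(t)|,$$

and the infimum is taken over partitions $0=t_{0}<t_{1}<\cdots<t_{v}=1$ with $\min_{i}(t_{i}-t_{i-1})>\delta$.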

In general these conditions are difficult to verify, since they involve understanding the limit superior of a sequence. We instead take an alternative approach and use the Kolmogorov-Chentsov tightness criterion (Chentsov (1956), Theorem 1); a variant can also be found in Billingsley (1968). The criterion is as follows:

A sequence of stochastic processes $X^{n}(u)$ , $n=1,2,\ldots,$ right continuous with left-hand limits, is tight if there are positive constants $C,\beta,\alpha$ not depending on $n$ such that for any $0\leq u\leq v\leq w\leq 1$ ,

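$$\mathbf{E}\left(|X^{n}(v)-X^{n}(u)|^{2\beta}\,|X^{n}(w)-X^{n}(v)|^{2\beta}\right)\leq C(w-u)^{1+\alpha}.$$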

We set $\alpha=1,\beta=1$ so the condition becomes:

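$$\mathbf{E}\left((Z^{n}_{w}(v)-Z^{n}_{w}(u))^{2}(Z^{n}_{w}(w)-Z^{n}_{w}(v))^{2}\right)\leq C_{w}(w-u)^{2},\tag{7}$$

$$\mathbf{E}\left((Z^{n}_{\text{diff}}(v)-Z^{n}_{\text{diff}}(u))^{2}(Z^{n}_{\text{diff}}(w)-Z^{n}_{\text{diff}}(v))^{2}\right)\leq C_{\text{diff}}(w-u)^{2},\tag{8}$$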

where the notation $Z^{n}_{w}(u)=Z_{w}(\lfloor nu\rfloor)$ and $Z^{n}_{\text{diff}}(u)=Z_{\text{diff}}(\lfloor nu\rfloor)$ .

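Both inequalities automatically hold when $(w-u)\leq\frac{1}{n}$, since then $\lfloor nw\rfloor-\lfloor nu\rfloor\leq 1$ and at least one of the following is true: (i) $\lfloor nu\rfloor=\lfloor nv\rfloor$, (ii) $\lfloor nv\rfloor=\lfloor nw\rfloor$. In either case one of the two increments is zero, so the left-hand side vanishes. In what follows, we focus on the case when $(w-u)>\frac{1}{n}.$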
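The claim about the floors can be checked with exact rational arithmetic. The sketch below is an illustration only; `floors_collapse` and `check_random_triples` are hypothetical helper names, and the grid of rationals is an arbitrary choice.

```python
from fractions import Fraction
from math import floor
import random

def floors_collapse(u, v, w, n):
    """True if floor(nu) == floor(nv) or floor(nv) == floor(nw)."""
    r, s, t = floor(n * u), floor(n * v), floor(n * w)
    return r == s or s == t

def check_random_triples(n, trials=10_000, seed=1):
    """Sample u <= v <= w in [0, 1] with w - u <= 1/n and test the claim."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = Fraction(rng.randrange(0, 10_000), 10_000)
        gap = Fraction(rng.randrange(0, 101), 100) / n  # 0 <= gap <= 1/n
        w = min(u + gap, Fraction(1))
        v = u + (w - u) * Fraction(rng.randrange(0, 101), 100)
        if not floors_collapse(u, v, w, n):
            return False
    return True

print(check_random_triples(n=50))  # True
# With a wider gap the three floors can all differ:
print(floors_collapse(Fraction(1, 100), Fraction(3, 100), Fraction(5, 100), 50))  # False
```

Using `Fraction` avoids floating-point rounding at the points where $nu$ is an integer, which is exactly where the floor function jumps.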

Observe that $Z^{n}_{w}(u)$ and $Z^{n}_{\text{diff}}(u)$ are not well-defined at the boundaries, when $u=0$ or $u=1$. We further assume that $u,v,w=O(1)$ and therefore cannot be too close to the boundaries. As such, we establish tightness on the domain $[0+\epsilon,1-\epsilon]$, where $\epsilon$ is a positive constant. The proof of this result involves obtaining explicit expressions for the fourth moments and product moments of $Z_{w}$ and $Z_{\text{diff}}$ using combinatorial analysis. This involves determining the different graph configurations for 4 edges to be randomly selected (with replacement) from the graph and obtaining the probabilities that each configuration will occur for the graph. Focusing on the leading terms of each configuration, we show these are bounded by $C(w-u)^{2}$.

4 Proof of Theorems 3.1 and 3.2

For simplicity, let $\lfloor nu\rfloor=r$ , $\lfloor nv\rfloor=s$ , and $\lfloor nw\rfloor=t$ and $r<s<t$ . Then, expanding (7), we have

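$$\begin{aligned}\mathbf{E}\left((Z^{n}_{w}(v)-Z^{n}_{w}(u))^{2}(Z^{n}_{w}(w)-Z^{n}_{w}(v))^{2}\right)={}&\mathbf{E}(Z_{w}^{2}(r)Z_{w}^{2}(s))-2\mathbf{E}(Z_{w}^{2}(r)Z_{w}(s)Z_{w}(t))+\mathbf{E}(Z_{w}^{2}(r)Z_{w}^{2}(t))\\&-2\mathbf{E}(Z_{w}(r)Z_{w}^{3}(s))+\mathbf{E}(Z_{w}^{2}(s)Z_{w}^{2}(t))-2\mathbf{E}(Z_{w}(r)Z_{w}(s)Z_{w}^{2}(t))\\&+\mathbf{E}(Z_{w}^{4}(s))-2\mathbf{E}(Z_{w}^{3}(s)Z_{w}(t))+4\mathbf{E}(Z_{w}(r)Z_{w}^{2}(s)Z_{w}(t)),\end{aligned}$$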

and similarly for the left-hand side of (8), $\mathbf{E}\left((Z^{n}_{\text{diff}}(v)-Z^{n}_{\text{diff}}(u))^{2}(Z^{n}_{\text{diff}}(w)-Z^{n}_{\text{diff}}(v))^{2}\right)$.
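The expansion is a polynomial identity in the three values $a=Z_{w}(r)$, $b=Z_{w}(s)$, $c=Z_{w}(t)$: the product $(b-a)^{2}(c-b)^{2}$ has nine monomials, and taking expectations term by term gives nine product moments. A quick mechanical check, with the function names chosen only for this illustration:

```python
from itertools import product

def increments_product(a, b, c):
    """(b - a)^2 (c - b)^2 with a = Z(r), b = Z(s), c = Z(t)."""
    return (b - a) ** 2 * (c - b) ** 2

def nine_term_expansion(a, b, c):
    """The nine monomials whose expectations are the required product moments."""
    return (a * a * b * b - 2 * a * a * b * c + a * a * c * c
            - 2 * a * b ** 3 + b * b * c * c - 2 * a * b * c * c
            + b ** 4 - 2 * b ** 3 * c + 4 * a * b * b * c)

# Both sides have degree at most 4 in each variable, so agreement on a
# 7 x 7 x 7 integer grid implies they are the same polynomial.
identity_holds = all(
    increments_product(a, b, c) == nine_term_expansion(a, b, c)
    for a, b, c in product(range(-3, 4), repeat=3)
)
print(identity_holds)  # True
```

The same identity applies to $Z_{\text{diff}}$, since it does not depend on which process supplies the three values.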

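For the two basic processes, the following analytical expressions are needed for $Z_{w}$:

$$\begin{gathered}\mathbf{E}(Z_{w}^{2}(r)Z_{w}(s)Z_{w}(t)),\quad\mathbf{E}(Z_{w}(r)Z_{w}(s)Z_{w}^{2}(t)),\quad\mathbf{E}(Z_{w}(r)Z_{w}^{2}(s)Z_{w}(t)),\\ \mathbf{E}(Z_{w}^{2}(r)Z_{w}^{2}(s)),\quad\mathbf{E}(Z_{w}^{2}(r)Z_{w}^{2}(t)),\quad\mathbf{E}(Z_{w}^{2}(s)Z_{w}^{2}(t)),\\ \mathbf{E}(Z_{w}(r)Z_{w}^{3}(s)),\quad\mathbf{E}(Z_{w}^{3}(s)Z_{w}(t)),\quad\mathbf{E}(Z_{w}^{4}(s)),\end{gathered}$$

and the following analytical expressions are needed for $Z_{\text{diff}}$:

$$\begin{gathered}\mathbf{E}(Z_{\text{diff}}^{2}(r)Z_{\text{diff}}(s)Z_{\text{diff}}(t)),\quad\mathbf{E}(Z_{\text{diff}}(r)Z_{\text{diff}}(s)Z_{\text{diff}}^{2}(t)),\quad\mathbf{E}(Z_{\text{diff}}(r)Z_{\text{diff}}^{2}(s)Z_{\text{diff}}(t)),\\ \mathbf{E}(Z_{\text{diff}}^{2}(r)Z_{\text{diff}}^{2}(s)),\quad\mathbf{E}(Z_{\text{diff}}^{2}(r)Z_{\text{diff}}^{2}(t)),\quad\mathbf{E}(Z_{\text{diff}}^{2}(s)Z_{\text{diff}}^{2}(t)),\\ \mathbf{E}(Z_{\text{diff}}(r)Z_{\text{diff}}^{3}(s)),\quad\mathbf{E}(Z_{\text{diff}}^{3}(s)Z_{\text{diff}}(t)),\quad\mathbf{E}(Z_{\text{diff}}^{4}(s)).\end{gathered}$$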

It is straightforward to see that all the expressions can be decomposed as combinations of $R_{1}$ and $R_{2}$ . Since explicit expressions for the expectation, variance, and third moments of $R_{w}(\cdot)$ , $R_{\text{diff}}(\cdot)$ , and $R(\cdot)$ can be found in Chen and Zhang (2015) and Chu and Chen (2019), the remaining unknown quantities to be derived are the product moments of $R_{1}(\cdot)$ and $R_{2}(\cdot)$ , which can be expressed as

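$$\mathbf{E}\left(R_{1}^{a}(t^{\star}_{1})\,R_{2}^{b}(t^{\star}_{2})\,R_{1}^{c}(t^{\star}_{3})\,R_{2}^{d}(t^{\star}_{4})\right)$$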

where $a,b,c,d\in\{0,1,2,3,4\}$ such that $a+b+c+d=4$ and $t^{\star}_{1},t^{\star}_{2},t^{\star}_{3},t^{\star}_{4}\in\{r,s,t\}$. The full list of product moments can be found in Supplement A.

To derive the analytical expressions for the product moments we need to:

1. Determine the different configurations for 4 edges to be randomly selected (with replacement) from the graph.

2. Derive probabilities separately for each configuration.

There are in total nineteen different configurations for four edges randomly chosen (with replacement) from the graph; see Figure 1 for an illustration of each configuration.

Let $G$ be the similarity graph and $G_{i}$ be the subgraph of $G$ containing all edges that connect to node $\textbf{y}_{i}$. Then $|G_{i}|$ is the degree of node $\textbf{y}_{i}$ in $G$. Among all $|G|^{4}$ possible ways of randomly selecting the four edges, the number of occurrences of each configuration is:

1) $|G|$

2) $7x_{1}$

3) $7|G|(|G|-1)-7x_{1}$

4) $6x_{2}$

5) $36x_{3}$

6) $12x_{5}$

7) $18x_{4}-72x_{3}+36x_{5}$

8) $6|G|(|G|-1)(|G|-2)-12x_{5}-18x_{4}+36x_{3}-6x_{2}$

9) $x_{6}$

10) $12x_{7}-24x_{8}$

11) $6x_{8}$

12) $24x_{9}$

13) $12x_{10}-48x_{9}$

14) $4x_{11}-12x_{10}+24x_{9}$

15) $24x_{12}-24x_{7}+24x_{8}$

16) $8x_{13}-24x_{9}$

17) $3x_{14}-12x_{7}+12x_{8}$

18) $6x_{15}+36x_{7}-24x_{8}+72x_{9}-12x_{10}-48x_{12}-24x_{13}-6x_{14}$

19) $12x_{10}-12x_{7}-x_{6}-4x_{11}+24x_{12}+3x_{14}-6x_{15}+6x_{8}+16x_{13}+|G|(|G|-1)(|G|-2)(|G|-3)$

with $x_{1},\ldots,x_{15}$ defined as:

[TABLE]
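Some of these counts can be sanity-checked by brute force on a small graph without knowing $x_{1},\ldots,x_{15}$. Grouping the ordered 4-tuples of edges by how many distinct edges they contain, the tuples with one distinct edge number $|G|$ (configuration 1), those with two distinct edges number $7|G|(|G|-1)$ (configurations 2 and 3 together), and those with four distinct edges number $|G|(|G|-1)(|G|-2)(|G|-3)$, the leading term of configuration 19. The sketch below is an illustration with a hypothetical helper name and a toy graph; it does not classify tuples into the nineteen shapes.

```python
from itertools import product

def tuple_counts_by_distinct_edges(edges):
    """Count ordered 4-tuples of edges (with replacement) by number of distinct edges."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for quad in product(range(len(edges)), repeat=4):
        counts[len(set(quad))] += 1
    return counts

# Toy graph: a triangle on nodes 1, 2, 3 with a pendant edge to node 4
small_graph = [(1, 2), (2, 3), (1, 3), (3, 4)]
m = len(small_graph)
quad_counts = tuple_counts_by_distinct_edges(small_graph)
print(quad_counts)  # {1: 4, 2: 84, 3: 144, 4: 24}
print(sum(quad_counts.values()) == m ** 4)  # True
```

The factor 7 arises because a fixed ordered pair of distinct edges can fill four positions in $(2^{4}-2)/2=7$ ways per ordering.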

We will use two examples ( $\mathbf{E}(R_{1}^{2}(r)R_{1}(s)R_{1}(t))$ and $\mathbf{E}(R_{2}(r)R_{1}(s)R_{2}(s)R_{1}(t))$ ) to illustrate how to derive the probability for each configuration. The remaining product moments can be obtained in a similar way. The explicit formulas for all the product moments can be found in Supplement A.

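Example 1: To derive the probability of each configuration for $\mathbf{E}(R_{1}^{2}(r)R_{1}(s)R_{1}(t))$ (Supplement (S62)), observe that

$$\mathbf{E}(R_{1}^{2}(r)R_{1}(s)R_{1}(t))=\sum_{(i_{1},j_{1}),\ldots,(i_{4},j_{4})\in G}P\left(g_{i_{1}}(r)=g_{j_{1}}(r)=0,\,g_{i_{2}}(r)=g_{j_{2}}(r)=0,\,g_{i_{3}}(s)=g_{j_{3}}(s)=0,\,g_{i_{4}}(t)=g_{j_{4}}(t)=0\right).$$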

We derive $P(g_{i_{1}}(r)=g_{j_{1}}(r)=0,g_{i_{2}}(r)=g_{j_{2}}(r)=0,g_{i_{3}}(s)=g_{j_{3}}(s)=0,g_{i_{4}}(t)=g_{j_{4}}(t)=0)\triangleq P_{1}$ for each of the 19 configurations separately.

1) The four edges are actually the same edge.

[TABLE]

2) Three edges are the same and share one node with the fourth edge, or two pairs of the edges are the same and share one node.

[TABLE]

3) Three edges are the same and do not share any node with the fourth edge, or two pairs of the edges are the same and do not share any node with each other.

[TABLE]

4) Two edges are the same and share one node with the other two edges. None of them share the other node (star-shaped configuration).

[TABLE]

5) Linear chain of edges such that one edge shares one node with another edge and shares the other node with the third edge. The fourth edge can be the same as any of the other three edges.

[TABLE]

6) Two edges are the same and the edges form a triangle.

[TABLE]

7) Two edges share one node and do not share any node with the third edge. The fourth edge can be the same as any of the other three edges.

[TABLE]

8) Two edges are the same and no pair of edges share any node.

[TABLE]

9) The four edges share one node, and none of them share the other node (star-shaped).

[TABLE]

10) Linear chain of edges such that two distinct edges share one node with the other two edges and share a node with each other.

[TABLE]

11) All four edges form a box.

[TABLE]

12) Three edges form a triangle and one edge connects to one node of the triangle.

[TABLE]

13) Three edges share the same node and the fourth edge shares the other node of one of the edges.

[TABLE]

14) Three edges share the same node and the fourth edge does not share any node with the other edges.

[TABLE]

15) Three edges form a linear chain and the fourth edge does not share any node with the other edges.

[TABLE]

16) Three edges form a triangle and the fourth edge does not share any node with the other edges.

[TABLE]

17) Two pairs of edges share one node with each other. The pairs of edges do not share any nodes with each other.

[TABLE]

18) Two edges share one node with each other. The other edges do not share any nodes with any of the other edges.

[TABLE]

19) None of the four edges share any node.

[TABLE]
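For a very small graph, the moment in Example 1 can be computed exactly in two ways, which gives a check on any configuration-by-configuration calculation. The sketch below is not the authors' derivation: it enumerates all $n!$ orderings directly, and separately sums over ordered 4-tuples of edges the probability that every endpoint falls at or before the relevant time. The closed form in `cap_probability` is an elementary counting fact used here as a stand-in for the configuration tables, and all function names and the toy graph are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

def r1(edges, pos, t):
    """R_1(t): edges whose endpoints both sit at times <= t (pos: node -> time)."""
    return sum(pos[i] <= t and pos[j] <= t for i, j in edges)

def moment_by_permutations(edges, n, r, s, t):
    """Exact E(R_1(r)^2 R_1(s) R_1(t)) over all n! orderings of the nodes."""
    total, count = 0, 0
    for perm in permutations(range(1, n + 1)):
        pos = dict(zip(range(1, n + 1), perm))
        total += r1(edges, pos, r) ** 2 * r1(edges, pos, s) * r1(edges, pos, t)
        count += 1
    return Fraction(total, count)

def cap_probability(caps, n):
    """P(each node v lands at a time <= caps[v]) under a uniform random ordering."""
    prob = Fraction(1)
    # Place nodes in increasing order of their caps: the k-th node (0-based)
    # has cap - k admissible positions left out of n - k remaining positions.
    for k, c in enumerate(sorted(caps.values())):
        prob *= Fraction(max(c - k, 0), n - k)
    return prob

def moment_by_edge_tuples(edges, n, r, s, t):
    """Sum over ordered 4-tuples of edges of the probability P_1 for that tuple."""
    times = (r, r, s, t)
    total = Fraction(0)
    for quad in product(edges, repeat=4):
        caps = {}
        for (i, j), time in zip(quad, times):
            for node in (i, j):
                caps[node] = min(caps.get(node, n), time)
        total += cap_probability(caps, n)
    return total

toy_graph = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 3)]
direct = moment_by_permutations(toy_graph, 5, 2, 3, 4)
by_tuples = moment_by_edge_tuples(toy_graph, 5, 2, 3, 4)
print(direct == by_tuples)  # True
```

Grouping the 4-tuples in `moment_by_edge_tuples` by shape is what yields the nineteen cases above: tuples with the same configuration and the same assignment of times to shared nodes contribute the same probability.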

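Example 2: To derive the probability of each configuration for $\mathbf{E}(R_{2}(r)R_{1}(s)R_{2}(s)R_{1}(t))$ (Supplement (S84)), observe that

$$\mathbf{E}(R_{2}(r)R_{1}(s)R_{2}(s)R_{1}(t))=\sum_{(i_{1},j_{1}),\ldots,(i_{4},j_{4})\in G}P\left(g_{i_{1}}(r)=g_{j_{1}}(r)=1,\,g_{i_{2}}(s)=g_{j_{2}}(s)=0,\,g_{i_{3}}(s)=g_{j_{3}}(s)=1,\,g_{i_{4}}(t)=g_{j_{4}}(t)=0\right).$$

We derive $P(g_{i_{1}}(r)=g_{j_{1}}(r)=1,g_{i_{2}}(s)=g_{j_{2}}(s)=0,g_{i_{3}}(s)=g_{j_{3}}(s)=1,g_{i_{4}}(t)=g_{j_{4}}(t)=0)\triangleq P_{2}$ for each of the 19 configurations separately.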

1) The four edges are actually the same edge.

[TABLE]

2) Three edges are the same and share one node with the fourth edge, or two pairs of the edges are the same and share one node.

[TABLE]

3) Three edges are the same and do not share any node with the fourth edge, or two pairs of the edges are the same and do not share any node with each other.

[TABLE]

4) Two edges are the same and share one node with the other two edges. None of them share the other node (star-shaped configuration).

[TABLE]

5) Linear chain of edges such that one edge shares one node with another edge and shares the other node with the third edge. The fourth edge can be the same as any of the other three edges.

[TABLE]

6) Two edges are the same and the edges form a triangle.

[TABLE]

7) Two edges share one node and do not share any node with the third edge. The fourth edge can be the same as any of the other three edges.

[TABLE]

8) Two edges are the same and no pair of edges share any node.

[TABLE]

9) The four edges share one node, and none of them share the other node (star-shaped).

[TABLE]

10) Linear chain of edges such that two distinct edges share one node with the other two edges and share a node with each other.

[TABLE]

11) All four edges form a box.

[TABLE]

12) Three edges form a triangle and one edge connects to one node of the triangle.

[TABLE]

13) Three edges share the same node and the fourth edge shares the other node of one of the edges.

[TABLE]

14) Three edges share the same node and the fourth edge does not share any node with the other edges.

[TABLE]

15) Three edges form a linear chain and the fourth edge does not share any node with the other edges.

[TABLE]

16) Three edges form a triangle and the fourth edge does not share any node with the other edges.

[TABLE]

17) Two pairs of edges share one node with each other. The pairs of edges do not share any nodes with each other.

[TABLE]

18) Two edges share one node with each other. The other edges do not share any nodes with any of the other edges.

[TABLE]

19) None of the four edges share any node.

[TABLE]

For the remaining expressions, similar derivations using combinatorial analysis can be obtained.

4.1 Expression for $Z_{w}$

The similarity graph $G$ can be a generic graph constructed from a similarity measure, such as the Euclidean distance. Without loss of generality, $|G|=O(kn)$ with $k=O(n^{\alpha}),0\leq\alpha<1$ . We assume that $u,v,w=O(1)$ . To establish (7), we focus on the leading terms on the left-hand side of the inequality. After extensive simplification, the leading term for the denominator of $\mathbf{E}((Z^{n}_{w}(v)-Z^{n}_{w}(u))^{2}(Z^{n}_{w}(w)-Z^{n}_{w}(v))^{2})$ is

[TABLE]

The leading term for the numerator is:

[TABLE]

with

[TABLE]

Since $(w-v)(v-u)<(w-u)^{2}$ for $u<v<w$ , the expression $\text{num}_{Z_{w}}/\text{den}_{Z_{w}}$ can be bounded by $C(w-u)^{2}$ as long as the ratio of graph configurations in the numerator and denominator can be bounded asymptotically by $O(1)$ . Specifically, since $u,v,w=O(1)$ , the terms $C_{w,1},\ldots,C_{w,12}$ can be bounded asymptotically by a constant. The remaining terms in the numerator involve configurations of the graph: $k$ , $n$ , $\sum_{i=1}^{n}|G_{i}|^{4}$ , $\sum_{i=1}^{n}|G_{i}|^{3}$ , $\sum_{i=1}^{n}|G_{i}|^{2}$ , $\sum_{i}\sum_{j\in G_{i};j\neq i}(|G_{i}|-1)^{2}(|G_{j}|-1)$ , $\sum_{i,j\in G}(|G_{i}|-1)(|G_{j}|-1)$ , $x_{7}$ , $x_{8}$ , and $x_{9}$ . If the ratio of each of these terms with the denominator’s $(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}$ is bounded by $O(1)$ , then the entire expression can be asymptotically bounded by a constant $C_{w}$ times $(w-u)^{2}$ .
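The inequality used at the start of this argument follows from the arithmetic-geometric mean inequality applied to the two increments, which in fact gives a constant of $1/4$. A one-line derivation, included here only as a worked check:

```latex
\begin{align*}
(w-v)(v-u) \;\le\; \left(\frac{(w-v)+(v-u)}{2}\right)^{2}
          \;=\; \frac{(w-u)^{2}}{4} \;<\; (w-u)^{2},
\qquad u<v<w .
\end{align*}
```

Equality in the first step holds only when $v$ is the midpoint of $u$ and $w$, so any factor of $(w-v)(v-u)$ in the numerator can be absorbed into a constant multiple of $(w-u)^{2}$.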

In the following, we assume that $\sum_{i=1}^{n}|G_{i}|^{2}=o(kn^{2})$ and we check each configuration (in their order of appearance).

Clearly $\frac{k^{2}n^{4}}{(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}}\precsim O(1)$ .

For $x_{14}$ , we have

[TABLE]

Then $x_{14}<(\sum_{i=1}^{n}|G_{i}|^{2})^{2}$ and $\frac{x_{14}}{(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}}\precsim O(1)$ . Following similar arguments, since $\sum_{i=1}^{n}|G_{i}|^{2}=o(kn^{2})$ , we have $\frac{\sum_{i=1}^{n}|G_{i}|^{4}}{(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}}\precsim O(1)$ and $\frac{n\sum_{i=1}^{n}|G_{i}|^{3}}{(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}}\precsim O(1)$ .

For $\sum_{i}\sum_{j\in G_{i};j\neq i}(|G_{i}|-1)^{2}(|G_{j}|-1)$ , we have

[TABLE]

Since the largest $|G_{i}|$ can be is $n-1$ (every other observation connects to node $\textbf{y}_{i}$ ), it follows that $2|G|\sum_{i=1}^{n}|G_{i}|^{3}\precsim 2|G|kn^{3}\asymp k^{2}n^{4}$ and $\frac{k^{2}n^{4}}{(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}}\precsim O(1)$ .

Similarly, since $kn^{2}\sum_{i=1}^{n}|G_{i}|^{2}\precsim k^{2}n^{4}$ , we have $\frac{kn^{2}\sum_{i=1}^{n}|G_{i}|^{2}}{(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}}\precsim O(1).$

We have $\sum_{(i,j)\in G}(|G_{i}|-1)(|G_{j}|-1)<\sum_{i=1}^{n}|G_{i}|(|G|-|G_{i}|)=|G|\sum_{i=1}^{n}|G_{i}|-\sum_{i=1}^{n}|G_{i}|^{2}<2|G|^{2}\asymp 2k^{2}n^{2}$ , and so $\frac{\sum_{(i,j)\in G}(|G_{i}|-1)(|G_{j}|-1)}{(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}}\precsim O(1).$
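As a rough illustration of the chain of inequalities above, the following sketch computes the relevant graph configurations on a small nearest-neighbour graph. It assumes that $|G_{i}|$ is read as the degree of node $i$ and that the sum over $(i,j)\in G$ runs over unordered edges. The helper names `knn_graph` and `configurations`, and the use of one-dimensional points, are hypothetical choices made here, not the paper's construction. The first comparison is checked in non-strict form, since equality can occur on degenerate graphs such as a single edge.

```python
import random


def knn_graph(points, k):
    """Undirected k-nearest-neighbour graph on 1-D points.

    Returns a set of edges (i, j) with i < j. An edge is present if either
    endpoint lists the other among its k nearest neighbours.
    """
    n = len(points)
    edges = set()
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: abs(points[i] - points[j]))[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return edges


def configurations(n, edges):
    """Degree-based quantities appearing in the bound for Z_w."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    num_edges = len(edges)
    return {
        "edges": num_edges,                                   # |G|
        "sum_deg": sum(deg),                                  # sum_i |G_i| = 2|G|
        "sum_deg_sq": sum(d * d for d in deg),                # sum_i |G_i|^2
        # sum over edges of (|G_i| - 1)(|G_j| - 1)
        "edge_pairs": sum((deg[i] - 1) * (deg[j] - 1) for i, j in edges),
        # sum_i |G_i| (|G| - |G_i|), the middle term of the chain
        "middle": sum(d * (num_edges - d) for d in deg),
    }


if __name__ == "__main__":
    random.seed(1)
    pts = [random.random() for _ in range(40)]
    print(configurations(len(pts), knn_graph(pts, 3)))
```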

Finally, since

[TABLE]

it follows that the ratios of these configurations with $(kn^{2}-\sum_{i=1}^{n}|G_{i}|^{2})^{2}$ are bounded asymptotically by $O(1)$ .

4.2 Expression for $Z_{\text{diff}}$

We adopt a similar approach for $Z_{\text{diff}}$ : we study the analytical expression for $\mathbf{E}\left((Z^{n}_{\text{diff}}(v)-Z^{n}_{\text{diff}}(u))^{2}(Z^{n}_{\text{diff}}(w)-Z^{n}_{\text{diff}}(v))^{2}\right)$ . This expression can be written as the combination of terms involving $u,v,$ and $w$ and terms involving configurations from the graph. We first show that the expressions involving $u,v,$ and $w$ can be bounded by $C(w-u)^{2}$ or $C(w-u)$ . We then show that the graph-configurations are bounded asymptotically by $O(1)$ or $O(1/n)$ . It follows then that the entire expression can be bounded by a constant $C_{\text{diff}}$ times $(w-u)^{2}$ .

Let $e_{v}=v(1-v)$ , $e_{w}=w(1-w)$ , and $e_{u}=u(1-u)$ . The leading term for the denominator of $\mathbf{E}\left((Z^{n}_{\text{diff}}(v)-Z^{n}_{\text{diff}}(u))^{2}(Z^{n}_{\text{diff}}(w)-Z^{n}_{\text{diff}}(v))^{2}\right)$ is:

[TABLE]

with $V_{G}=\sum_{i}|G_{i}|^{2}-4|G|^{2}/n$ .

For the numerator of $\mathbf{E}\left((Z^{n}_{\text{diff}}(v)-Z^{n}_{\text{diff}}(u))^{2}(Z^{n}_{\text{diff}}(w)-Z^{n}_{\text{diff}}(v))^{2}\right)$ , we group the leading terms by their graph configurations. The numerator can be expressed as

[TABLE]

We first show that the coefficients $K_{1}(u,v,w),K_{2}(u,v,w),K_{3}(u,v,w),K_{4}(u,v,w)$ , and $K_{5}(u,v,w)$ can be bounded by $C(w-u)^{2}$ or $C(w-u)$ .

$K_{1}(u,v,w)$ : The leading coefficient for $k^{4}n^{2}$ can be expanded as

[TABLE]

with

[TABLE]

It is clear that $C_{d,1}(w-v)^{2}+C_{d,2}(v-u)(w-v)\leq C(w-u)^{2}$ since $C_{d,1}(w-v)^{2}+C_{d,2}(v-u)(w-v)\leq(C_{d,1}+C_{d,2})(w-u)^{2}$ and $C$ can be chosen to be large enough such that $C_{d,1}+C_{d,2}\leq C$ . In the following we focus on the next two terms. For the third term, we need to show that $\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{v}})\leq(v-u)$ . Let $\delta=v-u$ and define

[TABLE]

which is continuous everywhere on $0\leq\delta\leq 1-u$ .

If $g(\delta)$ is convex for $0\leq\delta\leq 1-u$ , it follows that $g(\delta)\leq\delta$ , since $g(0)=0$ and $g(1-u)=u(1-u)\leq 1-u$ . What remains is to check that its second derivative is non-negative:

[TABLE]

Since we have established that $g(\delta)=\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{v}})$ is convex, it follows that $\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{v}})\leq(v-u)$ and $\sqrt{e_{v}}(\sqrt{e_{v}}-\sqrt{e_{w}})\leq(w-v)$ . Moreover, the minimum of $g(\delta)$ is achieved when $\delta=0.5-u$ and $-g(0.5-u)=\sqrt{e_{u}}(\frac{1}{2}-\sqrt{e_{u}})\leq\frac{1}{2}-u$ , for $u<\frac{1}{2}$ . Therefore $|\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{v}})|\leq(v-u)$ .

Following a similar argument, we can establish that $\sqrt{u(1-v)}(\sqrt{v(1-u)}-\sqrt{u(1-v)})\leq(v-u)$ . Let $h(\delta)=\sqrt{(u+\delta)(1-u)}\left(\sqrt{(u+\delta)(1-u)}-\sqrt{u(1-u-\delta)}\right)$ . We have $h(0)=0$ and $h(1-u)=1-u$ . Its first and second derivatives are

[TABLE]

and therefore $\sqrt{v(1-u)}(\sqrt{v(1-u)}-\sqrt{u(1-v)})\leq(v-u)$ . Since $\sqrt{u(1-v)}<\sqrt{v(1-u)}$ , it follows that $\sqrt{u(1-v)}(\sqrt{v(1-u)}-\sqrt{u(1-v)})\leq(v-u)$ . Note that $\sqrt{v(1-u)}-\sqrt{u(1-v)}>0$ .

Therefore, $K_{1}(u,v,w)\leq C(w-u)^{2}$ for some constant $C$ .

$K_{2}(u,v,w)$ : The leading coefficient for $k^{2}n(\sum_{i=1}^{n}|G_{i}|^{2})$ is

[TABLE]

with

[TABLE]

Since $u<v<w$ , we have $C_{d,5}(v-u)^{2}+C_{d,6}(w-u)(v-u)\leq C(w-u)^{2}$ for some constant $C$ . To show that the remaining terms can also be bounded by $C(w-u)^{2}$ , we follow the same argument detailed above for $K_{1}(u,v,w)$ . Observe that $|\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{v}})|\leq(v-u)$ and $|\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{w}})|\leq(w-u)$ . It follows that the terms with $C_{d,7},C_{d,8},$ and $C_{d,9}$ of $K_{2}(u,v,w)$ can be bounded by $C(w-u)^{2}$ as well.

Finally, for the last term in $K_{2}(u,v,w)$ , we see that $\sqrt{u(1-w)}(\sqrt{w(1-u)}-\sqrt{u(1-w)})\leq(w-u)$ and $\sqrt{w(1-u)}(\sqrt{w(1-u)}-\sqrt{u(1-w)})\leq(w-u)$ .

It follows that $K_{2}(u,v,w)\leq C(w-u)^{2}$ for some constant $C$ .

$K_{3}(u,v,w)$ : The leading coefficient for $\sum_{i=1}^{n}|G_{i}|^{4}$ is

[TABLE]

with

[TABLE]

Again, the first two terms involving $C_{d,11}$ and $C_{d,12}$ can be bounded by $C(w-u)^{2}$ . Repeating the convexity argument, $|\sqrt{e_{v}}(\sqrt{e_{v}}-\sqrt{e_{u}})|\leq(v-u)$ , which allows us to bound the remaining terms by $C(w-u)^{2}$ as well. Therefore, the entire expression $K_{3}(u,v,w)$ can also be bounded by $C(w-u)^{2}$ .

$K_{4}(u,v,w)$ : The leading coefficient for $k\sum_{i=1}^{n}|G_{i}|^{3}$ is

[TABLE]

with

[TABLE]

The terms involving $C_{d,17}$ , $C_{d,18}$ , and $C_{d,20}$ can be bounded by $C(w-u)^{2}$ . Repeating a combination of the convexity arguments from above, the remaining terms can also be bounded by $C(w-u)^{2}$ . It follows that $K_{4}(u,v,w)\leq C(w-u)^{2}$ .

$K_{5}(u,v,w)$ : The leading coefficient for $x_{14}$ is

[TABLE]

with

[TABLE]

and utilizing the arguments above, this term is also bounded by $C(w-u)^{2}$ .

$K_{6}(u,v,w)$ : The leading coefficient for $\sum_{i}\sum_{j\in G_{i};j\neq i}(|G_{i}|-1)^{2}(|G_{j}|-1)$ is

[TABLE]

with

[TABLE]

We have that $C_{d,17}(v-u)$ can be bounded by $C(w-u)$ for some constant $C$ and, by convexity, $C_{d,29}\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{v}})\leq C(w-u)$ . Therefore, the leading coefficient $K_{6}(u,v,w)$ is bounded by $C(w-u)$ .
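The two square-root inequalities used repeatedly for $K_{1},\ldots,K_{6}$ can also be sanity-checked numerically. The sketch below is an illustration added here, not the paper's convexity argument, and the function names `g`, `h`, and `worst_slack` are assumptions. It relies on two elementary identities: $e_{u}-e_{v}=(v-u)(u+v-1)$ and $v(1-u)-u(1-v)=v-u$. Writing each difference of square roots as a difference of squares divided by a sum of square roots then gives $|\sqrt{e_{u}}(\sqrt{e_{u}}-\sqrt{e_{v}})|\leq(v-u)|u+v-1|\leq v-u$ and $\sqrt{v(1-u)}(\sqrt{v(1-u)}-\sqrt{u(1-v)})\leq v-u$ .

```python
import math


def e(t):
    """e_t = t (1 - t), as defined in the text."""
    return t * (1.0 - t)


def g(u, v):
    """sqrt(e_u) * (sqrt(e_u) - sqrt(e_v))."""
    return math.sqrt(e(u)) * (math.sqrt(e(u)) - math.sqrt(e(v)))


def h(u, v):
    """sqrt(v(1-u)) * (sqrt(v(1-u)) - sqrt(u(1-v)))."""
    a = math.sqrt(v * (1.0 - u))
    b = math.sqrt(u * (1.0 - v))
    return a * (a - b)


def worst_slack(steps=100):
    """Smallest value of (v - u) - max(|g(u, v)|, h(u, v)) over a grid.

    A value that is non-negative up to rounding error is consistent with
    both inequalities holding at every grid point with 0 <= u < v <= 1.
    """
    worst = float("inf")
    for i in range(steps + 1):
        for j in range(i + 1, steps + 1):
            u, v = i / steps, j / steps
            worst = min(worst, (v - u) - abs(g(u, v)), (v - u) - h(u, v))
    return worst


if __name__ == "__main__":
    print(worst_slack())
```

The bound for $h$ is attained at $u=0$ , where $h=v$ , so the slack can be zero up to floating-point rounding; the check therefore uses a small tolerance rather than a strict comparison.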

Although we have established that the coefficients $K_{1}(u,v,w),K_{2}(u,v,w),\ldots,K_{6}(u,v,w)$ can be bounded, in order for the entire expression to be bounded by $C(w-u)^{2}$ we need the graph configurations in the numerator and denominator to be bounded by $O(1)$ or $O(1/n)$ . Recall that the leading term in the denominator is $(nV_{G})^{2}$ . Let $\tilde{d_{i}}=|G_{i}|-\frac{2|G|}{n}$ , then $V_{G}=\sum_{i=1}^{n}\tilde{d_{i}}^{2}$ . The graph configurations in the numerator involve:

1. $k^{4}n^{2}$

2. $k^{2}n\sum_{i=1}^{n}|G_{i}|^{2}$

3. $\sum_{i=1}^{n}|G_{i}|^{4}$

4. $k\sum_{i=1}^{n}|G_{i}|^{3}$

5. $x_{14}$

6. $\sum_{i}\sum_{j\in G_{i};j\neq i}(|G_{i}|-1)^{2}(|G_{j}|-1)$

Let $k=O(n^{\alpha}),0\leq\alpha<1$ . Suppose the largest (centered) degree $\tilde{d}_{i}\precsim O(n^{\beta})$ , where $0\leq\beta<1$ .
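The identity $V_{G}=\sum_{i=1}^{n}\tilde{d_{i}}^{2}$ for the centered degrees follows from $\sum_{i=1}^{n}|G_{i}|=2|G|$ : expanding the square gives $\sum_{i}|G_{i}|^{2}-\frac{8|G|^{2}}{n}+\frac{4|G|^{2}}{n}=\sum_{i}|G_{i}|^{2}-\frac{4|G|^{2}}{n}$ . A minimal numerical check is sketched below, assuming $|G_{i}|$ is the degree of node $i$ ; the function names and the example degree sequence are hypothetical.

```python
def v_g_direct(degrees):
    """V_G = sum_i |G_i|^2 - 4 |G|^2 / n, with |G| = (sum of degrees) / 2."""
    n = len(degrees)
    num_edges = sum(degrees) / 2
    return sum(d * d for d in degrees) - 4 * num_edges ** 2 / n


def v_g_centered(degrees):
    """V_G written as the sum of squared centered degrees d_i - 2|G|/n."""
    n = len(degrees)
    mean_degree = sum(degrees) / n  # equals 2|G|/n
    return sum((d - mean_degree) ** 2 for d in degrees)


if __name__ == "__main__":
    example = [3, 1, 2, 2, 4, 2]  # illustrative degree sequence, 7 edges
    print(v_g_direct(example), v_g_centered(example))
```

Written this way, $V_{G}$ is $n$ times the empirical variance of the degrees, which is why its growth is governed by the largest centered degree.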

We first focus on configuration 2 in the numerator. We have:

[TABLE]

Since $k^{2}n\precsim O(n^{2\alpha+1})$ , it follows that the entire expression $k^{2}n\sum_{i=1}^{n}|G_{i}|^{2}\precsim n^{2\beta+2\alpha+2}+n^{4\alpha+2}$ .

In the denominator, if $\alpha\leq\beta$ , then $V_{G}=\sum_{i=1}^{n}\tilde{d_{i}}^{2}\succsim n^{2\beta}$ , and $(nV_{G})^{2}\succsim n^{4\beta+2}$ . The ratio of configuration 2 to the denominator then gives us

[TABLE]

If $\alpha>\beta$ , then $k^{2}n\sum_{i=1}^{n}|G_{i}|^{2}\precsim n^{4\alpha+2}$ . With the assumption that $V_{G}\succsim k^{2}\asymp n^{2\alpha}$ , we have $(nV_{G})^{2}\succsim n^{4\alpha+2}$ , so the ratio is again bounded by $O(1)$ . The other terms can be handled in a similar way. Notice that:

1. $k^{4}n^{2}\precsim O(n^{4\alpha+2})$ .

3. $\sum_{i=1}^{n}|G_{i}|^{4}=\sum_{i=1}^{n}(\tilde{d_{i}}+\frac{2|G|}{n})^{4}\precsim\sum_{i=1}^{n}(n^{\beta}+n^{\alpha})^{4}\precsim n^{4\beta+1}+n^{4\alpha+1}.$

4. $k\sum_{i=1}^{n}|G_{i}|^{3}\precsim n^{\alpha}\sum_{i=1}^{n}(n^{\beta}+n^{\alpha})^{3}\precsim n^{3\beta+\alpha+1}+n^{4\alpha+1}$ .

5. $x_{14}=\sum_{i}\sum_{j\neq i}(|G_{i}\setminus\{j\in G_{i}\}|)(|G_{i}\setminus\{j\in G_{i}\}|-1)(|G_{j}\setminus\{i\in G_{j}\}|)(|G_{j}\setminus\{i\in G_{j}\}|-1)\precsim\sum_{i=1}|G_{i}|^{2}\sum_{j=1}|G_{j}|^{2}\precsim\sum_{i,j}^{n}(n^{\beta}+n^{\alpha})^{4}\precsim n^{4\beta+2}+n^{4\alpha+2}.$

6. $\sum_{i=1}\sum_{j\in G_{i};j\neq i}(|G_{i}|-1)^{2}(|G_{j}|-1)\precsim\sum_{i=1}\sum_{j\in G_{i};j\neq i}|G_{i}|^{2}|G_{j}|\precsim n^{3\beta+1+\alpha}$ . $\square$

Therefore, the ratios for the first five configurations are bounded by $O(1)$ and the ratio for the 6th configuration is bounded by $O(1/n)$ . To see the latter, note that if $\alpha\leq\beta$ , then $(nV_{G})^{2}\succsim n^{4\beta+2}$ and the ratio of the numerator to the denominator is at most of order $\frac{1}{n^{(1+\beta-\alpha)}}$ . If $\alpha>\beta$ , then $(nV_{G})^{2}\succsim n^{4\alpha+2}$ and the ratio is at most of order $\frac{1}{n^{(3(\alpha-\beta)+1)}}.$ Recall that the expression for $Z_{\text{diff}}$ can be written as a linear combination of the leading coefficients $K_{1}(u,v,w),\ldots,K_{6}(u,v,w)$ multiplied by their respective graph configurations. We have established that $K_{1}(u,v,w),\ldots,K_{5}(u,v,w)$ are bounded by $C(w-u)^{2}$ and $K_{6}(u,v,w)$ is bounded by $C(w-u)$ . Since we are considering the case $(w-u)>\frac{1}{n}$ , the 6th term is bounded by $C(w-u)\cdot O(1/n)\leq C(w-u)^{2}$ . Combining these results, it follows that the expression for $Z_{\text{diff}}$ can be bounded by $C(w-u)^{2}$ .
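The exponent bookkeeping in this argument can be tabulated mechanically. The sketch below is illustrative: it takes the upper bounds stated above for the six configurations and the lower bound $(nV_{G})^{2}\succsim n^{4\max(\alpha,\beta)+2}$ (which, for $\alpha>\beta$ , relies on the assumption $V_{G}\succsim k^{2}$ ), and returns the exponent of $n$ in each ratio. The function name `ratio_exponents` is an assumption made here.

```python
from fractions import Fraction


def ratio_exponents(alpha, beta):
    """Exponent of n bounding each configuration divided by (n V_G)^2.

    Numerator exponents are the upper bounds stated in the text; the
    denominator exponent is 4 * max(alpha, beta) + 2.
    """
    den = 4 * max(alpha, beta) + 2
    num = {
        1: 4 * alpha + 2,
        2: max(2 * beta + 2 * alpha + 2, 4 * alpha + 2),
        3: max(4 * beta + 1, 4 * alpha + 1),
        4: max(3 * beta + alpha + 1, 4 * alpha + 1),
        5: max(4 * beta + 2, 4 * alpha + 2),
        6: 3 * beta + alpha + 1,
    }
    return {config: exponent - den for config, exponent in num.items()}


if __name__ == "__main__":
    # Example with alpha = 1/2 and beta = 1/4 (so alpha > beta).
    for config, exponent in ratio_exponents(Fraction(1, 2), Fraction(1, 4)).items():
        print(config, exponent)
```

A non-positive exponent corresponds to an $O(1)$ ratio, and an exponent of at most $-1$ corresponds to $O(1/n)$ , which is what the 6th configuration needs in order to compensate for $K_{6}(u,v,w)$ being bounded only by $C(w-u)$ .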

