Poisson fluctuations for edge counts in high-dimensional random geometric graphs
Jens Grygierek

TL;DR
This paper establishes a Poisson limit theorem for edge counts in high-dimensional random geometric graphs, demonstrating phase transition phenomena as dimension and intensity grow, using the Malliavin-Stein method.
Contribution
It introduces a novel Poisson approximation result for geometric graph edge counts in high dimensions, extending previous normal approximation bounds.
Findings
Poisson limit theorem for edge counts in high-dimensional graphs
Quantitative bounds involving first and second order difference operators
Phase transition phenomenon confirmed in high-dimensional setting
Jens Grygierek, Institute of Mathematics, Osnabrück University, Germany. Email: [email protected]
Abstract
We prove a Poisson limit theorem in the total variation distance for functionals of a general Poisson point process using the Malliavin-Stein method. Our estimates only involve first- and second-order difference operators and are closely related to the corresponding bounds for the normal approximation in the Wasserstein distance by Last, Peccati and Schulte, see [LPS16]. As an application of this Poisson limit theorem, we consider a stationary Poisson point process in $\mathbb{R}^d$ and connect any two points whenever their distance is less than or equal to a prescribed distance parameter. This construction gives rise to the well-known random geometric graph. We count the edges of this graph whose midpoint lies in the $d$-dimensional unit ball. A quantitative Poisson limit theorem for this counting statistic is derived as the space dimension and the intensity of the Poisson point process tend to infinity simultaneously. This extends our previous work [GT16], where we derived a central limit theorem, and shows that the phase transition phenomenon also holds in the high-dimensional set-up.
Keywords. Poisson limit theorem, edge counting statistic, high dimensional random geometric graph, Poisson point process, second-order Poincaré inequality, stochastic geometry, Mehler’s formula, Stein’s method, Malliavin calculus, phase transition
MSC (2010). 60D05, 60F05
1 Introduction and main results
Fix an intensity and a distance parameter, and consider a stationary Poisson point process in $\mathbb{R}^d$ with that intensity. The points of this process are taken as the vertices of a random graph, and we connect any two distinct vertices by an edge provided that their distance is less than or equal to the distance parameter. This construction yields the random geometric graph in $\mathbb{R}^d$.
This paper is a direct continuation of [GT16], where we derived a quantitative central limit theorem for the number of edges that have their midpoint in the $d$-dimensional unit ball, as the space dimension and the intensity tend to infinity simultaneously such that the expectation of the edge-counting statistic tends to infinity. In this paper we derive the corresponding Poisson limit theorem for the case that the expectation tends to a positive but finite constant. To this end, we first prove a general Poisson limit theorem for Poisson functionals using the Malliavin-Stein method, in the spirit of the remarkable central limit theorem [LPS16, Theorem 1.1].
1.1 Poisson approximation for $\mathbb{N}_0$-valued Poisson functionals
We first rephrase a version of the main result from [LPS16], a so-called second order Poincaré inequality for Poisson functionals, see also [LP18, Theorem 2.13], that only involves moments of first and second order difference operators.
Theorem 1.1.
Let be a Poisson functional such that and . Define
[TABLE]
and let be a standard Gaussian random variable, then
[TABLE]
where $d_W$ denotes the Wasserstein distance, see Definition 2.1.
Replacing the third term in the approximation bound with
[TABLE]
we can formulate the analogue of Theorem 1.1 for Poisson approximation, which is our first main result and will be used later to derive Theorem 1.4.
Theorem 1.2 (Poisson Approximation).
Let be a Poisson point process on with $\sigma$-finite non-atomic intensity measure and let be an $\mathbb{N}_0$-valued Poisson functional satisfying . Further, let be a Poisson distributed random variable with parameter . Then
[TABLE]
where $d_{TV}$ denotes the total variation distance, see Definition 2.2.
Note that the first two terms of this bound also appear in the central limit theorem, Theorem 1.1. This will be useful in the proof of our second main result, since it allows us to reuse some of the calculations from our previous work [GT16].
1.2 Poisson fluctuations for edge counts in high-dimensional random geometric graphs
Let be a stationary Poisson point process on with dimension-dependent intensity , i.e. the intensity measure is given by , where denotes the -dimensional Lebesgue measure. We choose a dimension-dependent distance parameter with for , namely we take
[TABLE]
which implies that for all . The motivation for our choice is explained in Remark 5.2 below, where we also give the precise conditions for to allow for more general choices. We notice that . Finally we choose the dimension-dependent intensity such that for .
Let denote the number of edges of the random geometric graph that have their midpoint in the -dimensional unit ball , that is the edge-counting statistic given by
[TABLE]
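The edge-counting statistic described above can be explored numerically. The following is a minimal simulation sketch, not the author's method: it samples a homogeneous Poisson point process on a cube large enough to contain every edge whose midpoint lies in the unit ball, and then counts those edges by brute force. The function names, the cube window, and the parameter values are assumptions chosen for illustration only.

```python
import math
import random
from itertools import combinations


def sample_poisson_count(mean, rng):
    """Poisson(mean) variate: number of rate-1 arrivals in [0, mean]."""
    if mean <= 0.0:
        return 0
    count = 0
    arrival = rng.expovariate(1.0)
    while arrival <= mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def sample_ppp_cube(intensity, half_width, dim, rng):
    """Homogeneous Poisson point process on the cube [-half_width, half_width]^dim."""
    volume = (2.0 * half_width) ** dim
    n_points = sample_poisson_count(intensity * volume, rng)
    return [
        tuple(rng.uniform(-half_width, half_width) for _ in range(dim))
        for _ in range(n_points)
    ]


def count_edges_midpoint_in_unit_ball(points, delta):
    """Count pairs at distance <= delta whose midpoint lies in the unit ball."""
    count = 0
    for x, y in combinations(points, 2):
        if math.dist(x, y) <= delta:
            midpoint_sq_norm = sum(((a + b) / 2.0) ** 2 for a, b in zip(x, y))
            if midpoint_sq_norm <= 1.0:
                count += 1
    return count


# Illustrative parameters (assumed, not taken from the paper).
rng = random.Random(0)
delta = 0.3
# Both endpoints of a counted edge lie within distance 1 + delta/2 of the origin.
points = sample_ppp_cube(intensity=20.0, half_width=1.0 + delta / 2.0, dim=2, rng=rng)
print(count_edges_midpoint_in_unit_ball(points, delta))
```

The brute-force pair loop is quadratic in the number of points, which is adequate for small experiments but not for large intensities.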
To simplify our notation we shall use the abbreviation for . The expectation and the variance of the edge-counting statistic were already derived in our previous work [GT16, eq. 4, eq. 5, Lemma 7], namely:
[TABLE]
and
[TABLE]
Here and below, $\kappa_d$ denotes the volume of the $d$-dimensional unit ball. Note that the exponential decay of behaves like , as , according to Stirling’s formula.
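As a plausibility check on the expectation, the following short computation sketches how it can be obtained from the multivariate Mecke formula. The symbols $t$ (intensity), $\delta$ (distance parameter), $\mathcal{E}$ (edge count) and $\mathbb{B}^d$ (unit ball) are notation assumed here for illustration and may differ from the notation of the source; the substitution $m = (x+y)/2$, $u = x - y$ has Jacobian determinant of absolute value one.

```latex
\begin{align*}
\mathbb{E}[\mathcal{E}]
  &= \frac{t^2}{2} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d}
     \mathbf{1}\{\|x-y\| \le \delta\}\,
     \mathbf{1}\Big\{\tfrac{x+y}{2} \in \mathbb{B}^d\Big\}\, dx\, dy \\
  &= \frac{t^2}{2} \int_{\mathbb{B}^d} \int_{\mathbb{R}^d}
     \mathbf{1}\{\|u\| \le \delta\}\, du\, dm
   = \frac{t^2}{2}\, \kappa_d \cdot \kappa_d\, \delta^d
   = \frac{\kappa_d^2}{2}\, t^2\, \delta^d .
\end{align*}
```

The factor $1/2$ accounts for counting unordered pairs. The result should agree with the formula quoted from [GT16], although that agreement is not verified here.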
We investigate the asymptotic distributional behavior of the edge-counting statistic as the intensity and the space dimension tend to infinity simultaneously. This set-up is in contrast to most of the existing literature, in which the focus lies on random geometric graphs in $\mathbb{R}^d$ with some fixed space dimension $d$; see [Bub+16] and [Dev+11] for notable exceptions, where, however, questions concerning the high-dimensional fluctuations are not addressed.
The asymptotic behavior of the edge-counting statistic depends on how fast the sequence of intensities increases as $d \to \infty$. This phenomenon is quite common for asymptotic results related to edge counts (or, more generally, subgraph counts) and component counts. In particular, one has to distinguish the following phases, determined by the limit of the expectation of the edge-counting statistic:
(1) the expectation tends to infinity,
(2) the expectation tends to a positive and finite constant,
(3) the expectation tends to zero.
Remark 1.3.
If the expectation tends to infinity, (1), the edge-counting statistic satisfies a central limit theorem, see [GT16, Theorem 1].
In this paper, we obtain a Poisson limit theorem for a finite non-zero limit (2) showing that the phase-transition phenomenon for the edge-counting statistic holds also in the high-dimensional set-up:
Theorem 1.4.
Assume for and let be a Poisson distributed random variable with parameter . Then one can find absolute constants such that
[TABLE]
whenever . In particular, one has that
[TABLE]
Remark 1.5.
If the expectation tends to zero, (3), we also have , indicating that the edge-counting statistic vanishes in the limit, since the random graph contains almost surely no edges.
The rest of this text is structured as follows. In Section 2 we recall some necessary background material on Poisson functionals and the Malliavin-Stein method. In particular, we introduce Mehler’s formula, which is the core ingredient in the proof of Theorem 1.2 in Section 3. In Section 4 we derive a general bound for second-order $U$-statistics. The final Section 5 contains the proof of Theorem 1.4.
2 Preliminaries
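The $d$-dimensional Euclidean space is denoted by $\mathbb{R}^d$ and we let $\mathcal{B}(\mathbb{R}^d)$ be the Borel $\sigma$-field on $\mathbb{R}^d$. The Lebesgue measure on $\mathbb{R}^d$ is indicated by . A $d$-dimensional ball with radius $r > 0$ and center $x$ in $\mathbb{R}^d$ is defined by
$$\mathbb{B}^d_r(x) := \{ y \in \mathbb{R}^d : \|x - y\| \le r \},$$
where $\|\cdot\|$ stands for the usual Euclidean norm. We shall write $\mathbb{B}^d$ instead of $\mathbb{B}^d_1(0)$ and denote by
$$\kappa_d := \frac{\pi^{d/2}}{\Gamma\left(1 + \frac{d}{2}\right)}$$
the volume of the $d$-dimensional unit ball $\mathbb{B}^d$, where $\Gamma$ is Euler’s gamma function.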
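The volume of the unit ball has the standard closed form $\pi^{d/2} / \Gamma(1 + d/2)$, and its rapid decay in the dimension is easy to inspect numerically. The sketch below evaluates it on the log scale to avoid overflow in the gamma function; the function names are chosen here for illustration.

```python
import math


def log_unit_ball_volume(d):
    """Logarithm of the volume of the d-dimensional Euclidean unit ball."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(1.0 + 0.5 * d)


def unit_ball_volume(d):
    """Volume pi^(d/2) / Gamma(1 + d/2) of the d-dimensional unit ball."""
    return math.exp(log_unit_ball_volume(d))


for d in (1, 2, 3, 10, 50, 100):
    print(d, unit_ball_volume(d))
```

The volume increases up to dimension 5 and then decreases super-exponentially, which is why the intensity has to grow with the dimension for the expected edge count to stay bounded away from zero.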
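We will use the Wasserstein distance for the normal approximation and the total variation distance for the Poisson approximation, see for instance [BP16, Section 2.1].
Definition 2.1.
We denote by $\operatorname{Lip}(1)$ the class of Lipschitz functions $h : \mathbb{R} \to \mathbb{R}$ with Lipschitz constant less than or equal to one, i.e. $h$ is absolutely continuous and almost everywhere differentiable with $\|h'\|_\infty \le 1$. Given two $\mathbb{R}$-valued random variables $X$, $Y$ with $\mathbb{E}|X| < \infty$ and $\mathbb{E}|Y| < \infty$, the Wasserstein distance between the laws of $X$ and $Y$, written $d_W(X, Y)$, is defined as
$$d_W(X, Y) := \sup_{h \in \operatorname{Lip}(1)} \left| \mathbb{E}[h(X)] - \mathbb{E}[h(Y)] \right|.$$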
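Definition 2.2.
Given two $\mathbb{N}_0$-valued random variables $X$, $Y$, the total variation distance between the laws of $X$ and $Y$, written $d_{TV}(X, Y)$, is defined as
$$d_{TV}(X, Y) := \sup_{A \subseteq \mathbb{N}_0} \left| \mathbb{P}(X \in A) - \mathbb{P}(Y \in A) \right|.$$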
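For random variables taking values in the non-negative integers, the supremum over sets in the total variation distance equals half the sum of the absolute differences of the probability mass functions. The sketch below uses this identity to compare a binomial law with the Poisson law of the same mean. The helper names, the truncation of the support, and the parameters are assumptions for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(Z = k) for Z ~ Poisson(lam), lam > 0, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def binomial_pmf(k, n, p):
    """P(B = k) for B ~ Binomial(n, p) with 0 < p < 1; zero for k > n."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def total_variation(pmf_a, pmf_b, support):
    """Half the l1 distance of two mass functions over a (truncated) support."""
    return 0.5 * sum(abs(pmf_a(k) - pmf_b(k)) for k in support)


n, p = 100, 0.02
distance = total_variation(
    lambda k: binomial_pmf(k, n, p),
    lambda k: poisson_pmf(k, n * p),
    range(0, 200),  # truncation: the neglected Poisson tail mass is negligible here
)
print(distance)
```

The computed value can be compared with the classical Le Cam bound $n p^2$ for this pair of laws.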
2.1 Poisson functionals and Malliavin-Stein Method
Let be a Borel measure space with -finite and non-atomic measure such that . For and we denote by the set of all measurable functions such that .
We use the symbol to indicate the class of all -finite measures on with for all and supply the space with the smallest -field such that all mappings of the form with and are measurable.
It will be convenient for us to identify a counting measure with its support and to write if the point is charged by . The Dirac measure concentrated at a point is denoted by . This construction mostly follows [Pec12] and [LPS16]. We let be our underlying probability space and denote by , , the space of all random variables such that .
Consider a -finite non-atomic measure on . A Poisson point process with intensity measure is a random counting measure on , that is a random element in , such that
a) For all and all it holds that , i.e.,
[TABLE]
and for , we set for all .
b) For all and all pairwise disjoint measurable sets , the random variables are independent.
By a Poisson functional we understand a random variable , that is almost surely of the form , where is some measurable function, the so-called representative of . For a Poisson functional with representative and we define the first-order difference operator
[TABLE]
and for points the -th-order difference operator is defined inductively by
[TABLE]
where . Note that this definition does not depend on the choice of the representative -a.e. and -a.s. and further that is symmetric in the arguments .
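The difference operators can be made concrete for the edge count of a geometric graph: adding one point changes the edge count by the number of existing points within the connection distance, and the second-order difference of two added points is the indicator that they are connected to each other. The sketch below assumes the usual add-one-cost form of the first-order operator and its iteration; all function names are chosen here for illustration.

```python
import math
from functools import partial


def edge_count(points, delta):
    """Number of unordered pairs of points at Euclidean distance <= delta."""
    pts = list(points)
    return sum(
        1
        for i in range(len(pts))
        for j in range(i + 1, len(pts))
        if math.dist(pts[i], pts[j]) <= delta
    )


def first_difference(f, points, x):
    """Add-one cost: f(points plus x) - f(points)."""
    pts = list(points)
    return f(pts + [x]) - f(pts)


def second_difference(f, points, x1, x2):
    """Iterated add-one cost in the two extra points x1 and x2."""
    pts = list(points)
    return f(pts + [x1, x2]) - f(pts + [x1]) - f(pts + [x2]) + f(pts)


f = partial(edge_count, delta=1.5)
configuration = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(first_difference(f, configuration, (0.5, 0.0)))
print(second_difference(f, configuration, (0.5, 0.0), (0.6, 0.0)))
```

For this functional the second-order difference does not depend on the underlying configuration, which is one reason why bounds involving only first- and second-order difference operators are tractable for edge counts.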
In the following we will denote by resp. the mappings
[TABLE]
For a short introduction to the Malliavin calculus, we recall some of the important tools in the development of the theory. For a deeper discussion of Fock spaces and chaos expansions as well as the Malliavin calculus and the Malliavin-Stein method, we refer the reader to [Las16] and the books [LP18, PR16]. We introduce the notion of the Wiener-Itô chaos expansion, see [LPS16] and the references therein, especially [LP11], for more details and proofs.
Every Poisson functional admits a representation of the type
[TABLE]
where the series converges in . For each , the kernel is given by the (scaled) expectation of the -order difference operator, i.e. and denotes the -th order Wiener-Itô integral. This representation is known as the Wiener-Itô chaos expansion of .
We say a Poisson functional lies in the domain of , , if
[TABLE]
In this case is called the Malliavin derivative operator associated with the Poisson process , and it holds -a.s. and -a.e., , that
[TABLE]
where the right hand side is the definition of the Malliavin derivative operator and the left hand side is the path-wise defined first-order difference operator given by (4).
Note that the following Lemma can be used to easily check if a Poisson functional lies in the domain of .
Lemma 2.3 ([PT13, Lemma 3.1]).
Let denote a Poisson functional with representative such that
[TABLE]
Then .
The Wiener-Itô chaos expansion gives rise to the Ornstein-Uhlenbeck generator , that is defined for all Poisson functionals , i.e.
[TABLE]
by
[TABLE]
and its (pseudo) inverse is given by
[TABLE]
In [Pec+10, Section 3, Theorem 3.1] the Malliavin calculus was combined with Stein’s method to derive a bound on the Wasserstein distance between the law of a standardized Poisson functional and the standard Gaussian distribution. This bound, as well as the bound for Poisson approximation in the total variation distance derived in [Pec12, Theorem 3.1] and stated here as Theorem 3.1, relies on the inverse of the Ornstein-Uhlenbeck generator , which generally requires the calculation of the Wiener-Itô chaos expansion of . In [LPS16] this problem was solved for the normal approximation by establishing and applying a general Mehler formula for Poisson processes. This formula allows one to represent the inverse Ornstein-Uhlenbeck generator in terms of thinned Poisson point processes and thus to derive bounds that only rely on the moments of the first- and second-order difference operators and .
2.2 Mehler’s formula
For the sake of brevity we only introduce Mehler’s formula and the derived results needed in the proof of Theorem 1.2, and refer the reader to [LPS16] for full coverage.
Let and denote by the -thinning of our Poisson point process and by the distribution of a Poisson point process with intensity measure . We define the operator by
[TABLE]
where the conditional expectation is taken with respect to the random thinning and the Poisson point process , conditioned on . Using the operator , we derive Mehler’s formula:
Theorem 2.4 (Mehler’s formula, [LPS16, Theorem 3.2]).
Let be a Poisson functional and , then we have -a.s. that
[TABLE]
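The operator built from the thinning can be approximated by simulation: keep each point of the given configuration independently with some probability, add an independent Poisson process carrying the complementary fraction of the intensity, evaluate the functional, and average. The sketch below assumes this standard form of the operator and, purely for illustration, takes the intensity measure to be a constant multiple of Lebesgue measure on the unit cube; the names `thin`, `mehler_operator_estimate` and all parameters are hypothetical.

```python
import random


def sample_poisson_count(mean, rng):
    """Poisson(mean) variate: number of rate-1 arrivals in [0, mean]."""
    if mean <= 0.0:
        return 0
    count = 0
    arrival = rng.expovariate(1.0)
    while arrival <= mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def thin(points, keep_prob, rng):
    """Independent thinning: keep each point with probability keep_prob."""
    return [x for x in points if rng.random() < keep_prob]


def mehler_operator_estimate(f, points, keep_prob, intensity, dim, rng, n_samples=2000):
    """Monte Carlo average of f(thinned points + fresh points) on [0, 1]^dim.

    Assumption: the fresh points form a Poisson process with intensity
    (1 - keep_prob) * intensity with respect to Lebesgue measure on the cube.
    """
    total = 0.0
    for _ in range(n_samples):
        kept = thin(points, keep_prob, rng)
        n_fresh = sample_poisson_count((1.0 - keep_prob) * intensity, rng)
        fresh = [tuple(rng.random() for _ in range(dim)) for _ in range(n_fresh)]
        total += f(kept + fresh)
    return total / n_samples


rng = random.Random(0)
configuration = [tuple(rng.random() for _ in range(2)) for _ in range(10)]
# For f = number of points the exact value is keep_prob * 10 + (1 - keep_prob) * intensity.
print(mehler_operator_estimate(len, configuration, 0.5, 4.0, 2, rng))
```

With keep probability one the estimate returns the functional of the original configuration, and with keep probability zero it averages over a fresh, independent process.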
We will need the following inequalities in the proof of our Poisson limit theorem, Theorem 1.2.
Lemma 2.5 ([LPS16, Lemma 3.4]).
Let be a Poisson functional and , then
[TABLE]
and
[TABLE]
Since and , we can rephrase the result on the covariance, see [LPS16, Theorem 4.1], to obtain a result on the variance of our Poisson functional :
Theorem 2.6.
Let , then
[TABLE]
3 Proof of Theorem 1.2
Let us first recall the Malliavin bounds for Poisson approximation from [Pec12, Theorem 3.1]:
Theorem 3.1.
Let be a Poisson point process on with $\sigma$-finite and non-atomic intensity measure and let be an $\mathbb{N}_0$-valued Poisson functional satisfying . Further let be a Poisson distributed random variable with parameter . Then
[TABLE]
The main idea of the proof is to take Mehler’s formula and its application from [LPS16, Sections 3 and 4] and adapt this technique for the bound given by Theorem 3.1.
Proof of Theorem 1.2:
Using the Cauchy-Schwarz inequality we can bound the first term by
[TABLE]
and apply Theorem 2.6 to derive
[TABLE]
which yields the first part of our bound
[TABLE]
The second term can be bounded by using Fubini’s theorem and Hölder’s inequality with parameters . Thus
[TABLE]
which can be bounded using Lemma 2.5 by
[TABLE]
yielding the second part of our bound
[TABLE]
completing the proof of Theorem 1.2. ∎
4 A general bound for second-order $U$-statistics
In this section, we adapt the general bound for the normal approximation of second-order $U$-statistics provided in [GT16, Section 3] to the Poisson case, showing that some of the results therein can be reused. Let denote a second-order $U$-statistic in the sense of [RS13] based on a Poisson point process in having intensity measure . Formally we define
[TABLE]
and assume that is a symmetric measurable function, which we allow to depend on the space dimension . Furthermore, we assume that . Finally we define the two parameter integrals
[TABLE]
cf. [GT16, Section 3], where we already omit the exponents of .
Following [GT16, Section 3], by Mecke’s formula we have that
[TABLE]
and
[TABLE]
Next, we compute the expectations occurring at the right-hand side of Theorem 1.2 to prepare the bounds for the three terms , , and .
Lemma 4.1.
Let . Then
(a) ,
(b) ,
(c) , with
[TABLE]
(d) , with
[TABLE]
(e) .
Proof:
Assertions (b), (c) and (e) follow directly from [GT16, Lemma 3] using . Additionally, the proof of (a) is similar to the proof of (b), writing
[TABLE]
To prove (d) we write
[TABLE]
and obtain using (a) and (b) combined with and (c). ∎
We shall now provide the announced expressions for the terms , and .
Lemma 4.2.
We have that
[TABLE]
Proof:
The expressions for and follow similarly to [GT16, Lemma 4] by replacing the standardized $U$-statistic with the non-standardized . Using Lemma 4.1 (a) and (d) we have that
[TABLE]
and the proof is complete. ∎
Now we can combine the expressions established so far to reformulate Theorem 1.2 for our second-order $U$-statistic .
Proposition 4.3.
Let be a Poisson point process on with $\sigma$-finite non-atomic intensity measure and let be a second-order $U$-statistic with symmetric kernel . Suppose that . Defining
[TABLE]
one has that
[TABLE]
where is a Poisson distributed random variable with parameter .
5 Proof of Theorem 1.4
Let us recall that denotes a stationary Poisson point process on with intensity given by (2). We denote by the intensity measure of , that is, is times the Lebesgue measure on . Moreover, from now on we will assume without loss of generality that all the random variables are defined on a common probability space .
It is easy to see that the edge-counting statistic is a second-order $U$-statistic with measurable, symmetric and $d$-dependent kernel , given by
[TABLE]
To derive Theorem 1.4 we apply the Poisson approximation bound derived in Proposition 4.3 using the bounds on the parameter integrals and the expectation and variance of given by [GT16, eq. 15, Lemma 6, Lemma 7]. We have
[TABLE]
and
[TABLE]
Lemma 5.1.
Let be the function given by (7). Then for all it holds that
[TABLE]
Remark 5.2 (cf. [GT16, Remark 8]).
Our particular choice ensures that we can find absolute constants and such that
[TABLE]
and
[TABLE]
for all . The existence of such constants is important for deriving the final bounds on the right-hand side of our main result and imposes restrictions on more general choices of , see the proof of Lemma 5.4. If one is only interested in the Poisson limit, the first condition (11) can be omitted, since it is only involved in the lower variance bound used in the Gaussian approximation, see [GT16, Lemma 11 and eq. 20].
In the next step, we check the integrability condition in Proposition 4.3. Note that this condition determines the limiting distribution, yielding the Gaussian limit if (1) holds resp. the Poisson limit if (2) holds:
Lemma 5.3.
If (1) holds, we have
[TABLE]
and [GT16, Theorem 1] yields the Gaussian limit for the standardized edge counting statistics .
If (2) holds, we have
[TABLE]
thus we can apply the Poisson approximation given by Proposition 4.3.
Proof:
The first claim was already shown by [GT16, Lemma 9]. For the second claim, note that
[TABLE]
[TABLE]
Assumption (2), , implies that . The choice of according to Remark 5.2 ensures that can be bounded. Thus and further . ∎
Now, we will use the bounds for the parameter integral to derive an upper bound for the three terms appearing in Proposition 4.3.
Lemma 5.4.
There are absolute constants and such that
[TABLE]
for all .
Proof:
Applying (10) to the definition of in Lemma 4.1 we see that
[TABLE]
Therefore, it follows that
[TABLE]
We now use Fubini’s theorem to re-write the double integral. Together with (10) this implies
[TABLE]
Note that (2), , implies , thus the speed of convergence is dominated by the term with the lowest exponent. Additionally and Remark 5.2 imply, that we can bound and by absolute constants for sufficiently large. Thus there are absolute constants and such that
[TABLE]
for all . Using (10) we obtain in a similar way that
[TABLE]
for all , where and are absolute constants. Applying (10) to the definition of in Lemma 4.1 we see that
[TABLE]
and
[TABLE]
Therefore, it follows that
[TABLE]
for all , where and are absolute constants. Setting completes the proof. ∎
After these preparations, we can now present the proof of our second main result.
Proof of Theorem 1.4:
We use Proposition 4.3 and the results of the last lemma. Assuming (2) we find absolute constants and such that
[TABLE]
holds for all . Since the first and the last term converge to zero faster than the second term, we can find absolute constants and such that
[TABLE]
holds for all . Using our assumption (2) it follows that and hence as . This completes the proof of Theorem 1.4. ∎
References
- [BP16] Solesne Bourguin and Giovanni Peccati, “The Malliavin-Stein method on the Poisson space”. In: Stochastic analysis for Poisson point processes, Bocconi Springer Ser. 7, Bocconi Univ. Press, 2016, pp. 185–228.
- [Bub+16] Sébastien Bubeck, Jian Ding, Ronen Eldan and Miklós Z. Rácz, “Testing for high-dimensional geometry in random graphs”. In: Random Structures Algorithms 49.3, 2016, pp. 503–532. DOI: 10.1002/rsa.20633
- [Dev+11] Luc Devroye, András György, Gábor Lugosi and Frederic Udina, “High-dimensional random geometric graphs and their clique number”. In: Electron. J. Probab. 16, 2011, no. 90, pp. 2481–2508. DOI: 10.1214/EJP.v16-967
- [GT16] Jens Grygierek and Christoph Thäle, “Gaussian fluctuations for edge counts in high-dimensional random geometric graphs”. In: arXiv e-prints, 2016. arXiv: 1612.03286 [math.PR]
- [Las16] Günter Last, “Stochastic analysis for Poisson processes”. In: Stochastic analysis for Poisson point processes, Bocconi Springer Ser. 7, Bocconi Univ. Press, 2016, pp. 1–36. DOI: 10.1007/978-3-319-05233-5_1
- [LP11] Günter Last and Mathew D. Penrose, “Poisson process Fock space representation, chaos expansion and covariance inequalities”. In: Probab. Theory Related Fields 150.3-4, 2011, pp. 663–690. DOI: 10.1007/s00440-010-0288-5
- [LP18] Günter Last and Mathew Penrose, “Lectures on the Poisson process”. Institute of Mathematical Statistics Textbooks 7, Cambridge University Press, Cambridge, 2018, pp. xx+293.
- [LPS16] Günter Last, Giovanni Peccati and Matthias Schulte, “Normal approximation on Poisson spaces: Mehler’s formula, second order Poincaré inequalities and stabilization”. In: Probab. Theory Related Fields 165.3-4, 2016, pp. 667–723. DOI: 10.1007/s00440-015-0643-7
