Infill asymptotics and bandwidth selection for kernel estimators of spatial intensity functions
M.N.M. van Lieshout

TL;DR
This paper analyzes the asymptotic mean squared error of kernel estimators for spatial intensity functions, deriving optimal bandwidths for both fixed and adaptive estimators under smoothness assumptions.
Contribution
It establishes the asymptotic optimal bandwidth rates for kernel estimators and adaptive estimators of spatial intensity functions, extending existing theory.
Findings
Optimal bandwidth for fixed kernel estimator: n^{-1/(d+4)}
Optimal adaptive bandwidth: n^{-1/(d+8)}
Theoretical foundation for adaptive kernel estimation in spatial processes
Abstract
We investigate the asymptotic mean squared error of kernel estimators of the intensity function of a spatial point process. We show that when $n$ independent copies of a point process in $\mathbb{R}^d$ are superposed, the optimal bandwidth $h_n$ is of the order $n^{-1/(d+4)}$ under appropriate smoothness conditions on the kernel and true intensity function. We apply the Abramson principle to define adaptive kernel estimators and show that asymptotically the optimal adaptive bandwidth is of the order $n^{-1/(d+8)}$ under appropriate smoothness conditions.
CWI, P.O. Box 94079, NL-1090 GB Amsterdam
University of Twente, P.O. Box 217, NL-7500 AE Enschede
The Netherlands
Keywords & Phrases: Adaptive kernel estimator; Bandwidth; Infill asymptotics; Intensity function; Kernel estimator; Mean squared error; Point process.
2010 Mathematics Subject Classification: 60G55, 62G07, 60D05.
1 Introduction
Often the first step in the analysis of a spatial point pattern is to estimate its intensity function. Various non-parametric estimators are available to do so. Some techniques are based on local neighbourhoods of a point, expressed for example by its nearest neighbours [7], its Voronoi [11] or Delaunay tessellation [13, 14]. By far the most popular technique, however, is kernel smoothing [6]. Specifically, let $\Phi$ be a point process that is observed in a bounded open subset $W$ of $\mathbb{R}^d$ and assume that its first order moment measure exists as a $\sigma$-finite Borel measure and is absolutely continuous with respect to Lebesgue measure with a Radon–Nikodym derivative $\lambda : \mathbb{R}^d \to [0, \infty)$ known as its intensity function. A kernel estimator of $\lambda$ based on $\Phi \cap W$ takes the form
$$\widehat{\lambda}(x_0) = \widehat{\lambda}(x_0; h, \Phi, W) = \frac{1}{h^d} \sum_{y \in \Phi \cap W} \kappa\left( \frac{x_0 - y}{h} \right), \qquad x_0 \in W. \tag{1}$$
The function $\kappa : \mathbb{R}^d \to [0, \infty)$ is supposed to be a kernel, that is, a $d$-dimensional symmetric probability density function [15, p. 13]. The choice of bandwidth $h > 0$ determines the amount of smoothing. In principle, the support of $\kappa((x_0 - y)/h)$ as a function of $y$ could overlap the complement of $W$. Therefore, various edge corrections have been proposed [2, 9]. In the sequel, though, we will be concerned with very small bandwidths, so this aspect may be ignored.
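A minimal Python sketch of the estimator (1) may help fix ideas. It assumes the usual fixed-bandwidth form without edge correction, that is, the sum over observed points $y$ of $h^{-d}\kappa((x_0 - y)/h)$. The function names `box_kernel_2d` and `kernel_intensity` and the three-point toy pattern are hypothetical and serve illustration only.

```python
import math


def box_kernel_2d(u):
    """Uniform density on the closed unit disc: a symmetric probability density on R^2."""
    return 1.0 / math.pi if u[0] ** 2 + u[1] ** 2 <= 1.0 else 0.0


def kernel_intensity(x0, points, h, kernel=box_kernel_2d):
    """Fixed-bandwidth kernel estimate at x0, without edge correction."""
    d = len(x0)
    total = 0.0
    for y in points:
        # Rescale the displacement x0 - y by the bandwidth h.
        u = tuple((a - b) / h for a, b in zip(x0, y))
        total += kernel(u)
    return total / h ** d


# Toy pattern (hypothetical data): two points close to x0, one far away.
points = [(0.50, 0.50), (0.55, 0.50), (0.90, 0.10)]
estimate = kernel_intensity((0.5, 0.5), points, h=0.1)
```

With bandwidth 0.1 only the first two points fall inside the kernel support around (0.5, 0.5), so the estimate equals $2/(\pi \cdot 0.01) = 200/\pi$.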
The aim of this paper is to derive asymptotic expansions for the bias and variance of (1) in terms of the bandwidth. This problem is well known when dealing with probability density functions. Indeed, there exists a vast literature, for example the textbooks [3, 15, 16] and the references therein. In a spatial context, bandwidth selection is dominated by ad hoc [2] and non-parametric methods [5]. The first rigorous study into bandwidth selection to the best of our knowledge is that by Lo [10] who studies infill asymptotics for spatial patterns consisting of independent and identically distributed points. Our goal is to extend his approach to point processes that may exhibit interactions between their points and to investigate adaptive versions thereof.
The plan of this paper is as follows. In Section 2 we focus on the regime in which $n$ independent copies of the same point process are superposed and the bandwidth $h_n$ tends to zero as $n$ tends to infinity but does not depend on the points of the pattern. We derive Taylor expansions and deduce the asymptotically optimal bandwidth. Intuitively, however, one feels that in sparse regions more smoothing is necessary than in regions that are rich in points. Indeed, in the context of estimating a probability density function, Abramson [1] proposed to scale the bandwidth in inverse proportion to the square root of the density. Analogously, in Section 3 we let the bandwidth decrease in proportion to the square root of the intensity function and show that by doing so the bias can be reduced. For the sake of readability, all proofs are deferred to Section 4.
2 Infill asymptotics
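Let $\Phi_1, \Phi_2, \dots$ be independent and identically distributed point processes for which the first order moment measure exists, is locally finite and admits an intensity function $\lambda$. For $n \in \mathbb{N}$, let
$$Y_n = \bigcup_{i=1}^n \Phi_i$$
denote the union. Upon taking the limit for $n \to \infty$, one obtains an asymptotic regime known as ‘infill asymptotics’ [12]. Since the $\Phi_i$ are independent, the intensity function of $Y_n$ is $n \lambda$. Therefore $\lambda(x_0)$, $x_0 \in W$, may be estimated by
$$\widehat{\lambda_n}(x_0) = \frac{1}{n h_n^d} \sum_{y \in Y_n \cap W} \kappa\left( \frac{x_0 - y}{h_n} \right). \tag{2}$$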
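The superposition estimator can be sketched in Python as follows, under the assumption that (2) is the fixed-bandwidth kernel estimate computed from the union of the $n$ patterns and divided by $n$. The binomial patterns (a fixed number of independent uniform points on the unit square), the seed and all function names are hypothetical choices made only for this illustration.

```python
import math
import random


def box_kernel_2d(u):
    """Uniform density on the unit disc, a simple symmetric kernel on R^2."""
    return 1.0 / math.pi if u[0] ** 2 + u[1] ** 2 <= 1.0 else 0.0


def superposed_estimate(x0, patterns, h, kernel=box_kernel_2d):
    """Kernel estimate from the union of the patterns, divided by their number n."""
    n = len(patterns)
    d = len(x0)
    total = 0.0
    for pattern in patterns:  # looping over all copies sums over the union
        for y in pattern:
            u = tuple((a - b) / h for a, b in zip(x0, y))
            total += kernel(u)
    return total / (n * h ** d)


def binomial_pattern(m, rng):
    """m independent uniform points on the unit square (intensity m on the square)."""
    return [(rng.random(), rng.random()) for _ in range(m)]


rng = random.Random(2024)
copies = [binomial_pattern(50, rng) for _ in range(200)]
estimate = superposed_estimate((0.5, 0.5), copies, h=0.1)
```

Each simulated copy has constant intensity 50 on the unit square, so with 200 copies the estimate at the centre should lie close to 50, up to sampling variability.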
Lemma 1
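Let $\Phi$ be a point process observed in a bounded open subset $W \subset \mathbb{R}^d$ whose factorial moment measures exist up to second order and are absolutely continuous with intensity function $\lambda$ and second order product densities $\rho^{(2)}$. Let $\kappa$ be a kernel. Then the first two moments of (1) are
$$\mathbb{E}\, \widehat{\lambda}(x_0; h, \Phi, W) = \frac{1}{h^d} \int_W \kappa\left( \frac{x_0 - u}{h} \right) \lambda(u) \, du$$
and
$$\mathbb{E}\left[ \widehat{\lambda}(x_0; h, \Phi, W)^2 \right] = \frac{1}{h^{2d}} \int_W \int_W \kappa\left( \frac{x_0 - u}{h} \right) \kappa\left( \frac{x_0 - v}{h} \right) \rho^{(2)}(u, v) \, du \, dv + \frac{1}{h^{2d}} \int_W \kappa\left( \frac{x_0 - u}{h} \right)^2 \lambda(u) \, du.$$
The proof follows directly from the definition of product densities, see for example [4, Section 4.3.3]. Provided $\lambda > 0$, the variance of $\widehat{\lambda}(x_0)$ can be expressed in terms of the pair correlation function $g$ defined by $g(u, v) = \rho^{(2)}(u, v) / (\lambda(u) \lambda(v))$ as
$$\frac{1}{h^{2d}} \int_W \int_W \kappa\left( \frac{x_0 - u}{h} \right) \kappa\left( \frac{x_0 - v}{h} \right) \left( g(u, v) - 1 \right) \lambda(u) \lambda(v) \, du \, dv + \frac{1}{h^{2d}} \int_W \kappa\left( \frac{x_0 - u}{h} \right)^2 \lambda(u) \, du.$$
For Poisson processes, the first integral vanishes as $g \equiv 1$.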
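In this paper, we will restrict ourselves to kernels that belong to the Beta class
$$\kappa^\gamma(x) = \frac{\Gamma(d/2 + \gamma + 1)}{\pi^{d/2} \, \Gamma(\gamma + 1)} \left( 1 - x^T x \right)^\gamma 1\{ x \in B(0, 1) \}, \qquad x \in \mathbb{R}^d, \tag{3}$$
for $\gamma \geq 0$. Here $B(0, 1)$ is the closed unit ball in $\mathbb{R}^d$ centred at the origin. The normalising constant will be abbreviated by
$$c(d, \gamma) = \int_{B(0, 1)} \left( 1 - x^T x \right)^\gamma dx = \frac{\pi^{d/2} \, \Gamma(\gamma + 1)}{\Gamma(d/2 + \gamma + 1)}.$$
Note that Beta kernels are supported on the compact unit ball and that their smoothness is governed by the parameter $\gamma$. Indeed, the box kernel defined by $\gamma = 0$ is constant and therefore continuous on the interior of the unit ball; the Epanechnikov kernel corresponding to the choice $\gamma = 1$ is Lipschitz continuous. For $\gamma > k$ the function $\kappa^\gamma$ is $k$ times continuously differentiable on $\mathbb{R}^d$.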
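The Beta class is easy to evaluate numerically. The sketch below assumes the usual form of these kernels, proportional to $(1 - x^T x)^\gamma$ on the closed unit ball with normalising factor $\Gamma(d/2 + \gamma + 1) / (\pi^{d/2} \Gamma(\gamma + 1))$. The function names and the midpoint-rule check are hypothetical additions for illustration.

```python
import math


def beta_kernel(x, gamma):
    """Beta kernel on the closed unit ball of R^d, d = len(x), with parameter gamma >= 0."""
    d = len(x)
    r2 = sum(xi * xi for xi in x)
    if r2 > 1.0:
        return 0.0
    norm = math.gamma(d / 2 + gamma + 1) / (math.pi ** (d / 2) * math.gamma(gamma + 1))
    return norm * (1.0 - r2) ** gamma


def integrate_1d(gamma, steps=20000):
    """Midpoint rule for the integral of the one-dimensional Beta kernel over [-1, 1]."""
    width = 2.0 / steps
    return sum(beta_kernel((-1.0 + (k + 0.5) * width,), gamma) for k in range(steps)) * width


box_value = beta_kernel((0.0, 0.0), 0)       # box kernel on the unit disc: 1 / pi
epanechnikov_value = beta_kernel((0.0,), 1)  # Epanechnikov kernel on [-1, 1]: 3 / 4
total_mass = integrate_1d(2)                 # should be close to 1
```

The two special cases recover the familiar constants $1/\pi$ for the box kernel on the unit disc and $3/4$ for the one-dimensional Epanechnikov kernel at the origin.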
The following Lemma collects further basic properties of the Beta kernels. The proof can be found in Section 4.1.
Lemma 2
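For the Beta kernels $\kappa^\gamma$, $\gamma \geq 0$, defined in equation (3), the integrals
$$\int_{\mathbb{R}^d} x_i \, \kappa^\gamma(x) \, dx \quad \text{and} \quad \int_{\mathbb{R}^d} x_i x_j \, \kappa^\gamma(x) \, dx$$
vanish for all $i, j \in \{1, \dots, d\}$ such that $i \neq j$. Furthermore
$$Q(d, \gamma) = \int_{\mathbb{R}^d} \kappa^\gamma(x)^2 \, dx$$
is finite and so are, for all $i = 1, \dots, d$,
$$V_1(d, \gamma) = \int_{\mathbb{R}^d} x_i^2 \, \kappa^\gamma(x) \, dx,$$
$$V_2(d, \gamma) = \int_{\mathbb{R}^d} x_i^4 \, \kappa^\gamma(x) \, dx,$$
as well as, for $d \geq 2$ and $i \neq j$,
$$V_3(d, \gamma) = \int_{\mathbb{R}^d} x_i^2 x_j^2 \, \kappa^\gamma(x) \, dx.$$
Their values do not depend on the particular choices of $i$ and $j$.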
For the important special case ,
[TABLE]
Lemma 1 can be used to derive the mean squared error of (2). Its proof can be found in Section 4.2.
Proposition 1
Let be independent and identically distributed point processes observed in a bounded open subset . Assume that their factorial moment measures exist up to second order and are absolutely continuous with strictly positive intensity function and second order product densities . Write for the union, , and let be a Beta kernel (3) with . Then the mean squared error of (2) is given by
[TABLE]
The first term in the above expression is the squared bias. It depends on and the bandwidth but not on . The remaining terms come from the variance and depend on , on , on and on .
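Our aim in the remainder of this section is to derive an asymptotic expansion of the mean squared error for bandwidths $h_n$ that depend on $n$ in such a way that $h_n \to 0$ as $n \to \infty$. In order to achieve this, first recall some basic facts from analysis. Let $E$ be an open subset of $\mathbb{R}^d$ and denote by $C^k(E)$ the class of functions $f : E \to \mathbb{R}$ for which all $k$-th order partial derivatives $D_{j_1 \cdots j_k} f$ exist and are continuous on $E$. For such functions the order of taking partial derivatives may be interchanged and the Taylor theorem states that if $x \in E$ and $x + t y \in E$ for all $0 \leq t \leq 1$, then a $\theta \in (0, 1)$ can be found such that
$$f(x + y) - f(x) = \sum_{r=1}^{k-1} \frac{1}{r!} D^r f(x)\left( y^{(r)} \right) + \frac{1}{k!} D^k f(x + \theta y)\left( y^{(k)} \right) \tag{5}$$
where $y^{(r)}$ is the $r$-tuple $(y, \dots, y)$ and
$$D^r f(x)\left( y^{(r)} \right) = \sum_{j_1 = 1}^d \cdots \sum_{j_r = 1}^d D_{j_1 \cdots j_r} f(x) \, y_{j_1} \cdots y_{j_r}$$
for $r = 1, \dots, k$.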
We are now ready to state the main result of this section, generalising [10, Theorem 2] for the union of independent random points. The proof can be found in Section 4.2.
Theorem 1
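Let $\Phi_1, \Phi_2, \dots$ be i.i.d. point processes observed in a bounded open subset $W \subset \mathbb{R}^d$ with well-defined intensity function $\lambda$ and pair correlation function $g$. Suppose that $g$ is bounded and that $\lambda$ is twice continuously differentiable with second order partial derivatives $D_{ij} \lambda$, $i, j = 1, \dots, d$, that are Hölder continuous with index $\alpha > 0$ on $W$, that is, there exists some $C > 0$ such that for all $i, j = 1, \dots, d$:
$$| D_{ij} \lambda(x) - D_{ij} \lambda(y) | \leq C \, \| x - y \|^\alpha, \qquad x, y \in W.$$
Consider the estimator $\widehat{\lambda_n}$ based on the unions $Y_n$, $n \in \mathbb{N}$, and Beta kernel $\kappa^\gamma$, $\gamma \geq 0$, with bandwidth $h_n$ chosen in such a way that, as $n \to \infty$, $h_n \to 0$ and $n h_n^d \to \infty$. Then, for $x_0 \in W$, as $n \to \infty$,
1. the bias of $\widehat{\lambda_n}(x_0)$ is $\frac{h_n^2}{2} V_1(d, \gamma) \sum_{i=1}^d D_{ii} \lambda(x_0) + O(h_n^{2 + \alpha})$;
2. the variance of $\widehat{\lambda_n}(x_0)$ is $\frac{\lambda(x_0) \, Q(d, \gamma)}{n h_n^d} + O\left( \frac{1}{n h_n^{d-1}} \right)$,
where $V_1(d, \gamma) = \int x_1^2 \, \kappa^\gamma(x) \, dx$ and $Q(d, \gamma) = \int \kappa^\gamma(x)^2 \, dx$.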
The bias depends on the second order partial derivatives of the unknown intensity function and on the smoothness parameter . The smoothness of the kernel, measured by , also plays a role. The leading term of the variance depends on and on the smoothness of the kernel.
Theorem 1 readily yields the asymptotically optimal bandwidth, cf. Section 4.2.
Corollary 1
Consider the setting of Theorem 1. Then
[TABLE]
The asymptotic mean squared error is optimised at
[TABLE]
In words, $h_n^*$ is of the order $n^{-1/(d+4)}$. Clearly $h_n^*$ tends to zero as $n \to \infty$. Moreover, $n (h_n^*)^d$ is of the order $n$ to the power $4/(d+4)$ and therefore tends to infinity with $n$. For the special case ,
[TABLE]
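The bias and variance trade-off behind the optimal bandwidth can be checked numerically. The sketch below assumes that the asymptotic mean squared error has the generic form $a h^4 + b / (n h^d)$, with $a$ collecting the squared bias constants and $b$ the variance constants; the numerical values of `a` and `b` are hypothetical placeholders, not quantities taken from the corollary.

```python
def amse(h, n, d, a, b):
    """Leading terms of the mean squared error: a * h^4 (squared bias) + b / (n * h^d) (variance)."""
    return a * h ** 4 + b / (n * h ** d)


def optimal_bandwidth(n, d, a, b):
    """Root of the derivative 4 * a * h^3 - d * b / (n * h^(d + 1))."""
    return (d * b / (4.0 * a * n)) ** (1.0 / (d + 4))


d, a, b = 2, 3.0, 5.0  # hypothetical dimension and constants
h_small_n = optimal_bandwidth(1_000, d, a, b)
h_large_n = optimal_bandwidth(1_000 * 2 ** (d + 4), d, a, b)
```

Multiplying $n$ by $2^{d+4}$ halves the minimiser, which is the $n^{-1/(d+4)}$ rate in action.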
The following Proposition generalises [10, Proposition 5]. Its proof can be found in Section 4.2.
Proposition 2
Let be i.i.d. point processes observed in a bounded open subset with well-defined intensity function and pair correlation function . Suppose that is bounded and that is twice continuously differentiable with second order partial derivatives , , that are Hölder continuous with index on . Consider based on the unions , , and Beta kernel , , with bandwidth chosen in such a way that as , and . Then, for , as ,
[TABLE]
3 Adaptive infill asymptotics
Up to now, estimators based on (1) were considered in which the same bandwidth was applied at every point. However, at least intuitively, it seems clear that the bandwidth should be smaller in regions with many points, larger when points are scarce. This suggests that the bandwidth should be decreasing in the local intensity.
Motivated by similar considerations in the context of density estimation, Abramson [1] suggested considering point-dependent bandwidths of the form $h(x) = h / c(x)$ for $c(x)$ equal to the square root of the probability density function. He found that a significant reduction in bias could be obtained by the use of such adaptive bandwidths. Our aim in this section is to show that a similar result holds for spatial intensity function estimation.
Define an estimator
[TABLE]
of , , that is the average of data-adaptive estimators
[TABLE]
As in Section 2, is a kernel and the , , are independent and identically distributed point processes on observed in a bounded non-empty open subset for which the first order moment measure exists and admits an intensity function ; is assumed to be a measurable positive-valued weight function on .
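A data-adaptive estimator of this kind can be sketched in Python. The code assumes that each data point $y$ receives its own bandwidth $h / c(y)$, in line with the Abramson form of bandwidth described above, so that its contribution is $(c(y)/h)^d \, \kappa(c(y)(x_0 - y)/h)$. This form, the function names and the toy data are assumptions made for illustration; the exact definition (6) is not reproduced here.

```python
import math


def box_kernel_2d(u):
    """Uniform density on the unit disc, a simple symmetric kernel on R^2."""
    return 1.0 / math.pi if u[0] ** 2 + u[1] ** 2 <= 1.0 else 0.0


def adaptive_intensity(x0, points, h, c, kernel=box_kernel_2d):
    """Sum over data points y of (c(y) / h)^d * kernel(c(y) * (x0 - y) / h)."""
    d = len(x0)
    total = 0.0
    for y in points:
        scale = c(y) / h  # reciprocal of the local bandwidth h / c(y)
        u = tuple(scale * (a - b) for a, b in zip(x0, y))
        total += scale ** d * kernel(u)
    return total


# Toy pattern (hypothetical data) and two constant weight functions.
points = [(0.50, 0.50), (0.58, 0.50)]
fixed = adaptive_intensity((0.5, 0.5), points, h=0.1, c=lambda y: 1.0)
narrow = adaptive_intensity((0.5, 0.5), points, h=0.1, c=lambda y: 2.0)
```

With the weight function identically one the fixed-bandwidth estimate is recovered; doubling the weight halves the local bandwidth, so the second point drops out of the kernel support while the first is weighted more heavily. In practice the weight function would be built from a pilot estimate of the intensity.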
The next result summarises the first two moments.
Lemma 3
Let be a point process observed in a bounded open subset , whose factorial moment measures exist up to second order and are absolutely continuous with intensity function and second order product densities . Let be a kernel. Then the first two moments of (6) are
[TABLE]
and
[TABLE]
The proof follows directly from the definition of product densities, see for example [4, Section 4.3.3]. For the special case , we retrieve Lemma 1.
Provided , the variance of , the average of the , can be expressed in terms of the pair correlation function as
[TABLE]
[TABLE]
We are now ready to state the first main result of this section in analogy to [1, Theorem, p. 1218]. The proof can be found in Section 4.3.
Theorem 2
Let be i.i.d. point processes observed in a bounded open subset with well-defined intensity function and pair correlation function . Suppose that is bounded and that is bounded, bounded away from zero and twice continuously differentiable on with bounded second order partial derivatives , .
Consider the estimator with
[TABLE]
based on the unions , , and Beta kernel , , with bandwidth chosen in such a way that, as , and . Then, for , as ,
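1. the bias of the adaptive estimator at $x_0$ is of the order $o(h_n^2)$;
2. its variance has the same leading term as that of the non-adaptive estimator in Theorem 1.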
In comparison with Theorem 1, the variance is the same as that for a non-adaptive bandwidth. The bias term on the other hand is of a smaller order. Note that, since the leading bias term is not specified, Theorem 2 cannot be used to calculate an asymptotically optimal bandwidth. To remedy this, stronger smoothness assumptions seem needed.
Theorem 3
Let be i.i.d. point processes observed in a bounded open subset with well-defined intensity function and pair correlation function . Suppose that is bounded and that is bounded, bounded away from zero and five times continuously differentiable on with bounded partial derivatives.
Consider the estimator with
[TABLE]
based on the unions , , and Beta kernel , , with bandwidth chosen in such a way that, as , and . Then, for , as ,
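1. the bias of the adaptive estimator at $x_0$ equals $h_n^4$ times a coefficient determined by
[TABLE]
up to a remainder of smaller order;
2. its variance has the same leading term as in Theorem 2.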
For the important special cases , the expression for may be simplified. All the proofs are given in Section 4.3.
Proposition 3
Consider the framework of Theorem 3 in one dimension . Then the coefficient of in the expansion of is
[TABLE]
where and the superscript indicates the fourth order derivative.
Proposition 4
Consider the framework of Theorem 3 in two dimensions . Then the coefficient of in the expansion of is
[TABLE]
with , and constants
[TABLE]
and
[TABLE]
Theorem 3 immediately yields the asymptotically optimal bandwidth, which should be compared with that in Corollary 1.
Corollary 2
Consider the setting of Theorem 3. Then
[TABLE]
The asymptotic mean squared error is optimised at
[TABLE]
The optimal bandwidth and the weights depend on the unknown intensity function. In practice, a non-parametric pilot estimator (for example the one proposed in [5]) would be plugged in.
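To compare the two regimes, the following sketch evaluates the bandwidth orders $n^{-1/(d+4)}$ and $n^{-1/(d+8)}$ together with the resulting orders of the mean squared error, assuming the squared bias behaves like $h^4$ in the fixed case and like $h^8$ in the adaptive case. All multiplicative constants are ignored, so the numbers indicate rates only; the function names are hypothetical.

```python
def bandwidth_rate(n, d, adaptive):
    """Order of the optimal bandwidth: n^(-1/(d+8)) if adaptive, else n^(-1/(d+4))."""
    return n ** (-1.0 / (d + 8 if adaptive else d + 4))


def mse_rate(n, d, adaptive):
    """Order of the mean squared error at the optimal bandwidth (constants ignored)."""
    power = 8 if adaptive else 4
    return bandwidth_rate(n, d, adaptive) ** power


n, d = 10_000, 2
fixed_h = bandwidth_rate(n, d, adaptive=False)     # n^(-1/6)
adaptive_h = bandwidth_rate(n, d, adaptive=True)   # n^(-1/10)
fixed_mse = mse_rate(n, d, adaptive=False)         # n^(-2/3)
adaptive_mse = mse_rate(n, d, adaptive=True)       # n^(-4/5)
```

The adaptive bandwidth shrinks more slowly, yet the attainable mean squared error is of smaller order, reflecting the bias reduction.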
To conclude this section, we present the analogue of Proposition 2. The proof can be found in Section 4.3.
Proposition 5
Let be i.i.d. point processes observed in a bounded open subset with well-defined intensity function and pair correlation function . Suppose that is bounded and that is bounded, bounded away from zero and five times continuously differentiable on with bounded partial derivatives. Consider with based on the unions , , and Beta kernel , , with bandwidth chosen in such a way that as , and . Then, for , as ,
[TABLE]
where is as defined in Theorem 3.
4 Proofs and technicalities
4.1 Auxiliary lemmas for the Beta kernel
Proof of Lemma 2: The first two claims follow from the symmetry of the Beta kernel. Furthermore
[TABLE]
Due to the symmetry of the Beta kernel it is clear that the definitions of , and do not depend on the choices of and . First consider the case . By the symmetry of and a change of variables , , it follows that
[TABLE]
Similarly,
[TABLE]
For dimensions , write and as a repeated integral and note that the innermost integral takes the form
[TABLE]
for . By the symmetry and a change of parameters , it follows that
[TABLE]
and
[TABLE]
in accordance with the claim.
Finally for , can be written as
[TABLE]
The inner integral is equal to so
[TABLE]
in accordance with the claim.
In the sequel, the following additional properties of the Beta kernels will be needed.
Lemma 4
Consider the Beta kernels with defined in equation (3). Then, for all ,
[TABLE]
the integrals of second order products in with respect to vanish and for distinct ,
[TABLE]
The integrals of other third order products in with respect to vanish. Finally the following identities hold for all :
[TABLE]
and
[TABLE]
Proof of Lemma 4: The proof relies on partial integrations, which involve evaluations of for where . These take the value zero, as . Therefore
[TABLE]
Similarly
[TABLE]
and
[TABLE]
Hence, for , the penultimate equation in the lemma holds. To prove the last equation in the lemma, note that there are contributions of to the left-hand side and one of .
Lemma 5
Consider the Beta kernels with defined in equation (3). Then, for all ,
[TABLE]
and
[TABLE]
Proof of Lemma 5: Apply integration by parts and Lemma 4 to obtain that for all distinct in ,
[TABLE]
The evaluations of products in multiplied by are zero since
[TABLE]
which take the value zero when and . All other integrals of fourth order products in with respect to or vanish.
Consider the two equations to be proven. For , all contributions to the left-hand side of the first equation are zero. For , there are contributions with , of which are of size for and of size for . To this are added contributions when exactly one of is equal to , and one contribution when . Adding them all up gives
[TABLE]
and rearranging terms completes the proof.
Lemma 6
For fixed , the function defined by is, for the Beta kernel with , four times continuously differentiable. The first three derivatives are given by
[TABLE]
and the fourth order derivative is
[TABLE]
[TABLE]
Proof of Lemma 6: For , the function is four times continuously differentiable. The expressions for the derivatives follow by straightforward calculation.
Lemma 7
Consider the Beta kernels with defined in equation (3). Then, for all ,
[TABLE]
[TABLE]
Proof of Lemma 7: The proof relies on repeated integration by parts. The evaluations of at where all take the value zero for .
4.2 Proofs of propositions and theorems: non-adaptive case
Proof of Proposition 1: Since is the average of independent random variables , ,
[TABLE]
and
[TABLE]
Therefore, by Lemma 1,
[TABLE]
and
[TABLE]
[TABLE]
Since is the sum of the squared bias and the variance, the claim is seen to hold.
Proof of Theorem 1: To prove 1. note that since goes to zero, and is open, for large enough is equal to . For such , by a change of variables, the symmetry of the Beta kernels and the proof of Proposition 1, the bias is
[TABLE]
The intensity can be brought under the integral since is a probability density.
Fix . As for all and is twice continuously differentiable on , the term between curly brackets in the integrand may be expanded as a Taylor series (5) with :
[TABLE]
for some that may depend on . Write
[TABLE]
Now,
[TABLE]
is dominated by
[TABLE]
since . Since was chosen large enough for to lie in , we may use the Hölder assumption to obtain the inequality
[TABLE]
The right hand side does not depend on the particular choice of nor on . In summary,
[TABLE]
for a remainder term that satisfies .
Returning to the bias (8), for large ,
[TABLE]
By Lemma 2,
[TABLE]
Furthermore,
[TABLE]
because by Lemma 2, the cross terms with are zero. Finally, since is a probability density and is uniformly bounded in ,
[TABLE]
To prove 2. note that, as for the bias, may be chosen large enough for the ball to fall entirely in . For such , by a change of variables and the symmetry of the Beta kernels,
[TABLE]
Fix . As for all and is continuously differentiable on , we may use the Taylor expansion (5) with to write
[TABLE]
for some that may depend on . Since the partial derivatives are continuous and hence bounded on closed balls contained in , say by ,
[TABLE]
for a remainder term that satisfies and consequently
[TABLE]
by Lemma 2. The bound on the remainder term implies that
[TABLE]
so that
[TABLE]
We will now show that the contribution of the interaction structure (through the pair correlation function) to the variance vanishes. Choose so large that . Then, by a change of variables and the symmetry of the Beta kernels, the double integral in Proposition 1 reduces to
[TABLE]
Since the pair correlation is assumed to be bounded on , say , and for all , the double integral can be bounded in absolute value by
[TABLE]
cf. equation (9). The integrand in the right hand side is bounded in absolute value by and therefore the interaction structure contributes to the mean squared error. Upon adding (10),
[TABLE]
The last term is negligible with respect to the middle one, and the proof is complete.
Proof of Corollary 1: By Theorem 1,
[TABLE]
for a remainder term for which there exists a scalar such that for large . Hence
[TABLE]
and the claimed expression for the mean squared error follows. Consequently, the asymptotic mean squared error takes the form
[TABLE]
for some scalars . Equating the derivative with respect to to zero yields
[TABLE]
The second derivative with respect to , is strictly positive, so is the unique minimum. Plugging in the expressions for and completes the proof.
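The minimisation step can be written out explicitly. In the display below, $a$ and $b$ are our labels for the two positive scalars, assuming the asymptotic mean squared error has the form $a h^4 + b / (n h^d)$ that is consistent with the $n^{-1/(d+4)}$ rate; they are placeholders rather than the constants of the corollary.

```latex
\begin{align*}
\mathrm{amse}(h) &= a\,h^4 + \frac{b}{n\,h^d}, \\
\frac{\partial}{\partial h}\,\mathrm{amse}(h) &= 4a\,h^3 - \frac{d\,b}{n\,h^{d+1}} = 0
  \quad\Longleftrightarrow\quad h^{d+4} = \frac{d\,b}{4a\,n}, \\
h_n^* &= \left( \frac{d\,b}{4a\,n} \right)^{1/(d+4)}, \\
\frac{\partial^2}{\partial h^2}\,\mathrm{amse}(h) &= 12a\,h^2 + \frac{d(d+1)\,b}{n\,h^{d+2}} > 0 .
\end{align*}
```

Since the second derivative is strictly positive for every $h > 0$, the stationary point is the unique minimiser.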
Proof of Proposition 2: Since , and is open, if is large enough then . For such , by Lemma 1,
[TABLE]
can be written as an average of independent random variables
[TABLE]
with . Furthermore, by Theorem 1,
[TABLE]
for a remainder term satisfying for some and large . By Chebychev’s inequality, for all ,
[TABLE]
The upper bound tends to as so that
[TABLE]
To finish the proof, add the bias expansion 1. in Theorem 1.
4.3 Proofs of propositions and theorems: adaptive case
Proof of Theorem 2: To prove 1. note that since goes to zero, , is open and is bounded away from zero, for large enough
[TABLE]
for all . For such , by a change of variables, the symmetry of the Beta kernels and Lemma 3, the bias is equal to
[TABLE]
for the functions , , defined by
[TABLE]
Note that the integral in (11) is compactly supported, say on , a property it inherits from the Beta kernel since is bounded away from zero.
Since we are after the coefficient of and, for , is twice continuously differentiable, we use a Taylor expansion (5) with . Thus, fix . Then
[TABLE]
where the remainder term is
[TABLE]
for some that may depend on . Moreover, can be written as
[TABLE]
Recall that is evaluated at of the form . Since the function is bounded we may restrict ourselves to a compact interval for and on this interval is bounded as and its partial derivatives are bounded too. Moreover, the bound can be chosen uniformly in over the compact set . In summary, there exists a constant such that for all and .
We also need a Taylor expansion (5) with for the function around :
[TABLE]
where the remainder term is
[TABLE]
for some that may depend on . The second order partial derivatives are, for ,
[TABLE]
where we use the notation . On the compact set , the are bounded and so are the since is bounded away from zero and twice continuously differentiable. Hence there exists a constant such that for all .
Our next step is to combine the Taylor series (12) and (13). Write . For large the bias (11) can then be written as
[TABLE]
for some .
We will show that the first and second order terms vanish. By (12)–(13), the first order term is equal to multiplied by
[TABLE]
and vanishes because of Lemma 2 and Lemma 4.
Also by (12)–(13), the second order term reads where
[TABLE]
for some and in . Recall that is compactly supported and that the integrand is bounded. Therefore, by the dominated convergence theorem,
[TABLE]
The first double sum is zero because of Lemma 2 and Lemma 4, the second one because of Lemma 2, Lemma 4 and Lemma 5. By the bounds on the remainder terms and , all other terms in (14) are of the order and the proof is complete.
To prove 2. note that, as for the bias, may be chosen so large that
[TABLE]
For such , by a change of variables and the symmetry of the Beta kernels,
[TABLE]
for the function , , defined by
[TABLE]
Note that the integral in (15) is compactly supported, say on , a property it inherits from the Beta kernel since is bounded away from zero.
Fix . Then by a Taylor expansion (5) with
[TABLE]
for some , with
[TABLE]
Recall that is evaluated at of the form . Since the function is bounded we may restrict ourselves to a compact interval for and on this interval is bounded as and its partial derivatives are bounded too. Moreover, the bound can be chosen uniformly in over the compact set . In summary, there exists a constant such that for all and . Hence, with as before, (15) can be written as
[TABLE]
for a remainder term
[TABLE]
with . By (13),
[TABLE]
As and, for such , ,
[TABLE]
We will finally show that the contribution of the interaction structure (through the pair correlation function) to the variance (7) vanishes. Again, choose so large that
[TABLE]
For such , by a change of variables and the symmetry of the Beta kernels, and writing for an upper bound to the pair correlation function, the integral in the last line in (7) can be bounded in absolute value by
[TABLE]
since the integral is compactly supported and both and are bounded.
Proof of Theorem 3:
As in the proof of Theorem 2, the bias is given by (11) and the integral involved is supported on a compact set . Since we are after the coefficient of and, for , a Taylor expansion (5) with applies for both and . For the former, is equal to
[TABLE]
up to a remainder term
[TABLE]
for some that may depend on . Since is bounded away from zero and five times continuously differentiable, for all . Similarly, for fixed ,
[TABLE]
where for some in that may depend on . Recall that is evaluated at of the form , . Since the function is bounded we may restrict ourselves to a compact interval for and on this interval is bounded as and its partial derivatives up to fifth order are bounded too. Moreover, the bound can be chosen uniformly in over the compact set . In summary, for and .
Next, plug the Taylor expansions into (11). Then
[TABLE]
[TABLE]
By Theorem 2, the first and second order terms are zero. We will show that the third order term vanishes too. By (16),
[TABLE]
Lemma 6 implies that the first term of is
[TABLE]
which vanishes by the symmetry properties of , Lemma 2 and Lemma 4. By Lemma 6, the second term is a linear combination of integrals of the form
[TABLE]
which vanish because of the symmetry properties of the Beta kernel and integration by parts. Similar arguments apply to the third and last term of , which by Lemma 6 is a linear combination of integrals of the form
[TABLE]
[TABLE]
The coefficient of in (17) reads with as claimed, and does not vanish in general. Finally, by the bounds on the remainder terms and , all other terms in (17) are of the order and the proof is complete.
Proof of Proposition 3: By Theorem 3, the coefficient of is where
[TABLE]
Lemma 6 and Lemma 7 can be used to derive the following equations:
[TABLE]
Hence, upon a rearrangement of terms,
[TABLE]
It remains to calculate and plug in expressions for the derivatives of in terms of the underlying intensity function . Now
[TABLE]
can be plugged into (18) to obtain
[TABLE]
[TABLE]
[TABLE]
[TABLE]
and the claim follows upon a rearrangement of terms.
Proof of Proposition 4: Theorem 3 states that the coefficient of is with an explicit expression for . The non-zero terms in this expression can be reduced by repeated partial integration to a scalar multiple of either or as the integrals of other fourth order products in with respect to vanish by the symmetry properties of the Beta kernel.
The scalar multipliers can be calculated as in Lemma 7: for , integrals with respect to first order partial derivatives reduce to
[TABLE]
integrals with respect to second order partial derivatives reduce to
[TABLE]
[TABLE]
and integrals with respect to third order partial derivatives are reduced as
[TABLE]
and
[TABLE]
[TABLE]
Finally,
[TABLE]
[TABLE]
and
[TABLE]
[TABLE]
Evaluation of the expression for implies the claim by elementary but tedious calculation. For example, the coefficient of arises from terms with these coefficients in
[TABLE]
which, by Lemma 6, is equal to
[TABLE]
The desired coefficients occur when and or when and . Therefore
[TABLE]
so the coefficient of is equal to
[TABLE]
Proof of Corollary 2: By Theorem 3,
[TABLE]
for a remainder term satisfying for large . Hence
[TABLE]
from which the claimed expression for the mean squared error follows. Consequently, the asymptotic mean squared error takes the form
[TABLE]
for some scalars . Equating the derivative with respect to to zero yields
[TABLE]
The second derivative with respect to , is strictly positive, so is the unique minimum. Plugging in the expressions for and completes the proof.
Proof of Proposition 5: Since , and is open, if is large enough the ball centred at with radius is contained in . For such , by Lemma 3,
[TABLE]
can be written as an average of independent random variables
[TABLE]
with . Furthermore, by Theorem 3
[TABLE]
for a remainder term satisfying for some and large . By Chebychev’s inequality, for all ,
[TABLE]
The upper bound tends to as so that
[TABLE]
To finish the proof, add the bias expansion 1. in Theorem 3.
- [1] I.A. Abramson. On bandwidth variation in kernel estimates – A square root law. The Annals of Statistics 10:1217–1223, 1982.
- [2] M. Berman and P.J. Diggle. Estimating weighted integrals of the second-order intensity of a spatial point process. Journal of the Royal Statistical Society Series B 51:81–92, 1989.
- [3] A.W. Bowman and A. Azzalini. Applied smoothing techniques for data analysis. The kernel approach with S-Plus illustrations. Oxford: University Press, 1997.
- [4] S.N. Chiu, D. Stoyan, W.S. Kendall and J. Mecke. Stochastic geometry and its applications, third edition. Chichester: John Wiley & Sons, 2013.
- [5] O. Cronie and M.N.M. van Lieshout. A non-model based approach to bandwidth selection for kernel estimators of spatial intensity functions. Biometrika 105:455–462, 2018.
- [6] P.J. Diggle. A kernel method for smoothing point process data. Applied Statistics 34:138–147, 1985.
- [7] V. Granville. Estimation of the intensity of a Poisson point process by means of nearest neighbour distances. Statistica Neerlandica 52:112–124, 1998.
- [8] P. Hall, T.C. Hu and J.S. Marron. Improved variable window kernel estimates of probability densities. The Annals of Statistics 23:1–10, 1995.
