Quick inference for log Gaussian Cox processes with non-stationary   underlying random fields

Ji\v{r}\'i Dvo\v{r}\'ak; Jesper M{\o}ller; Tom\'a\v{s} Mrkvi\v{c}ka; and Samuel Soubeyrand

arXiv:1903.12035·math.ST·October 10, 2019

Quick inference for log Gaussian Cox processes with non-stationary underlying random fields

Ji\v{r}\'i Dvo\v{r}\'ak, Jesper M{\o}ller, Tom\'a\v{s} Mrkvi\v{c}ka, and Samuel Soubeyrand

PDF

TL;DR

This paper introduces a fast three-step composite likelihood method for non-stationary log Gaussian Cox processes, enabling modeling of spatially varying point patterns with non-stationary mean and covariance functions.

Contribution

It proposes a novel parametric and semiparametric modeling framework for non-stationary LGCPs, addressing limitations of traditional stationary assumptions and estimation methods.

Findings

01

Effective modeling of fish aggregation and biological particle dispersal.

02

Demonstrated computational efficiency of the three-step estimation procedure.

03

Captured complex spatial heterogeneity in real datasets.

Abstract

For point patterns observed in natura, spatial heterogeneity is more the rule than the exception. In numerous applications, this can be mathematically handled by the flexible class of log Gaussian Cox processes (LGCPs); in brief, a LGCP is a Cox process driven by an underlying log Gaussian random field (log GRF). This allows the representation of point aggregation, point vacuum and intermediate situations, with more or less rapid transitions between these different states depending on the properties of GRF. Very often, the covariance function of the GRF is assumed to be stationary. In this article, we give two examples where the sizes (that is, the number of points) and the spatial extents of point clusters are allowed to vary in space. To tackle such features, we propose parametric and semiparametric models of non-stationary LGCPs where the non-stationarity is included in both the mean…

Tables6

Table 1. Table 1: Estimated parameter values of the LGCP fitted to the fish aggregation dataset, together with their central 90%-percentile intervals obtained by a parametric bootstrap approach as discussed in Section 4.1 .

Parameter	Estimate	90%-CI
$γ_{0}$	1.35	[-0.17 , 1.81]
$γ_{1}$	-0.033	[-0.057 , 0.076]
$α$	1.02	[0.10 , 1.22]

Table 2. Table 2: Parameter values used in the simulation study related to the estimation procedure for the fish aggregation dataset.

	Series of simulations
	1	2	3	4	5	6	7	8
$α$	0.25	0.25	0.25	0.25	0.50	0.50	0.50	0.50
$γ_{0}$	1.25	1.50	1.25	1.50	1.25	1.50	1.25	1.50
$γ_{1}$	0	0	-0.03	-0.03	0	0	-0.03	-0.03

Table 3. Table 3: Numbers of rejections (over 1000) in the test of γ 1 = 0 subscript 𝛾 1 0 \gamma_{1}=0 applied to the 8 series of simulations with different significance levels.

Significance	Series of simulations
level	1	2	3	4	5	6	7	8
0.01	6	10	56	57	4	5	42	40
0.05	40	50	236	254	33	23	196	156
0.10	77	101	359	410	62	64	305	284

Table 4. Table 4: Estimation of parameters of the LGCP with space transformation fitted to the particle dispersal dataset. Original parameters are transformed to obtain interpretable quantities. The infection strength gives the expected total number of spores released by the source. The dispersal range parameter is half the mean dispersal distance of spores. The random field scale parameter is the distance at which the auto-correlation is 1 / e 1 𝑒 1/e when there is no space transformation.

Name of transformed parameter	Parameter expression	Estimate	90%-CI
Infection strength	$2 π e^{β_{0}} / β_{1}^{2}$	1760	[1140 , 2820]
Dispersal range parameter	$- 1 / β_{1}$	0.62	[0.44 , 0.98]
Random field standard-deviation	$σ$	0.73	[0.55 , 3.50]
Random field scale	$1 / α$	0.15	[0.02 , 34.3]
Space transformation parameter	$δ$	0.93	[0.42 , 5.54]

Table 5. Table 5: Parameter values used in the simulation study related to the estimation procedure for the particle dispersal dataset.

	Series of simulations
	1	2	3	4	5	6	7	8
$σ$	0.5	2	0.5	2	0.5	2	0.5	2
$1 / α$	0.05	0.05	0.2	0.2	0.05	0.05	0.2	0.2
$δ$	1	1	1	1	10	10	10	10

Table 6. Table 6: Relative estimation accuracy obtained for each parameter using the two or three step procedure. The simulation series are those performed for the particle dispersal study (see Section 5.3 ). The relative estimation accuracy was computed as RMSLE(3-steps procedure)/RMSLE(2-steps procedure), i.e., the ratio of the root mean squared logarithmic errors when using the 3-steps and 2-steps procedure, respectively.

Simulation	RMSLE ratio
series	$2 π e^{β_{0}} / β_{1}^{2}$	$- 1 / β_{1}$	$σ$	$1 / α$	$δ$
1	1.00	1.00	0.18	0.73	0.71
2	1.00	1.00	0.12	0.85	0.75
3	1.00	1.00	0.19	0.89	0.47
4	1.00	1.00	0.08	0.76	1.11
5	1.00	1.00	0.23	0.86	1.18
6	1.00	1.00	0.27	0.42	0.58
7	1.00	1.00	0.20	1.18	1.42
8	1.00	1.00	0.16	0.78	0.78

Equations67

E {N (A)} = \int_{A} ρ (u) d u,

E {N (A)} = \int_{A} ρ (u) d u,

Cov {N (A), N (B)} = \int_{A} \int_{B} ρ (u) ρ (v) {g (u, v) - 1} d u d v .

Cov {N (A), N (B)} = \int_{A} \int_{B} ρ (u) ρ (v) {g (u, v) - 1} d u d v .

ρ (u) = E {Λ (u)}, ρ (u) ρ (v) g (u, v) = E {Λ (u) Λ (v)},

ρ (u) = E {Λ (u)}, ρ (u) ρ (v) g (u, v) = E {Λ (u) Λ (v)},

Λ (u) = ρ (u) Λ_{0} (u), E {Λ_{0} (u)} = 1.

Λ (u) = ρ (u) Λ_{0} (u), E {Λ_{0} (u)} = 1.

ξ (u) = i = 0 \sum I z_{i} (u) β_{i}

ξ (u) = i = 0 \sum I z_{i} (u) β_{i}

c (u, v) = σ (u) r (∥ φ (u) - φ (v) ∥) σ (v) .

c (u, v) = σ (u) r (∥ φ (u) - φ (v) ∥) σ (v) .

lo g Λ (u) = ξ (u) + Y_{0} (u), Y_{0} (u) = - σ^{2} (u) /2 + σ (u) Q {φ (u)}, u \in R^{2},

lo g Λ (u) = ξ (u) + Y_{0} (u), Y_{0} (u) = - σ^{2} (u) /2 + σ (u) Q {φ (u)}, u \in R^{2},

C L_{1} (β) = i = 1 \sum n lo g ρ (x_{i}; β) - \int_{W} ρ (u; β) d u .

C L_{1} (β) = i = 1 \sum n lo g ρ (x_{i}; β) - \int_{W} ρ (u; β) d u .

C L_{2} (β, ν) =

C L_{2} (β, ν) =

\displaystyle-\log\int_{W}\int_{W}1\left(\|u-v\|\leq R\right)\rho\left(u;\beta\right)\rho\left(v;\beta\right)g\left(u,v;\nu\right)\,\mathrm{d}u\,\mathrm{d}v\Big{]}

Λ_{0} (u) = c \in C \sum k_{c} (u),

Λ_{0} (u) = c \in C \sum k_{c} (u),

\int ρ_{C} (c) k_{c} (u) d c = 1, u \in R^{2} .

\int ρ_{C} (c) k_{c} (u) d c = 1, u \in R^{2} .

g (u, v) = 1 + \int k_{c} (u) k_{c} (v) ρ_{C} (c) d c, u, v \in R^{2}, u \neq = v,

g (u, v) = 1 + \int k_{c} (u) k_{c} (v) ρ_{C} (c) d c, u, v \in R^{2}, u \neq = v,

g (u, v) = exp {c (u, v)} .

g (u, v) = exp {c (u, v)} .

C L_{2}^{k} (β, ν_{0}) =

C L_{2}^{k} (β, ν_{0}) =

\displaystyle-\log\int_{W_{k}}\int_{W_{k}}1\left(\|u-v\|\leq R_{k}\right)\rho\left(u;\beta\right)\rho\left(v;\beta\right)g_{0}\left(u,v;\nu_{0}\right)\,\mathrm{d}u\,\mathrm{d}v\Big{]}.

\hat{β} = argmax_{β} C L_{1} (β) .

\hat{β} = argmax_{β} C L_{1} (β) .

\overset{ν}{^}_{0, k} = argmax_{ν_{0}} C L_{2}^{k} (\hat{β}, ν_{0});

\overset{ν}{^}_{0, k} = argmax_{ν_{0}} C L_{2}^{k} (\hat{β}, ν_{0});

\overset{ν}{^}^{joint} = argmax_{ν} C L_{2} (\hat{β}, ν) .

\overset{ν}{^}^{joint} = argmax_{ν} C L_{2} (\hat{β}, ν) .

\int_{W_{k}} \int_{W_{k}} 1 (∥ u - v ∥ \leq R_{k}) h (u, v) g_{0} (∥ u - v ∥; ν_{0}) d u d v

\int_{W_{k}} \int_{W_{k}} 1 (∥ u - v ∥ \leq R_{k}) h (u, v) g_{0} (∥ u - v ∥; ν_{0}) d u d v

\int_{0}^{R_{k}} g_{0} (t; ν_{0}) d H_{k} (t)

\int_{0}^{R_{k}} g_{0} (t; ν_{0}) d H_{k} (t)

σ (u) = γ_{0} + γ_{1} z (u), u \in W,

σ (u) = γ_{0} + γ_{1} z (u), u \in W,

r (t) = exp (- α t), t \geq 0.

r (t) = exp (- α t), t \geq 0.

σ (u) = γ_{0} + γ_{1} z (u) \approx γ_{0} + γ_{1} \overset{ˉ}{F}_{k} \mbox f or u \in W_{k} .

σ (u) = γ_{0} + γ_{1} z (u) \approx γ_{0} + γ_{1} \overset{ˉ}{F}_{k} \mbox f or u \in W_{k} .

φ (u) = lo g (1 + δ ∥ u ∥) ∥ u ∥^{- 1} u,

φ (u) = lo g (1 + δ ∥ u ∥) ∥ u ∥^{- 1} u,

r (t) = exp (- α t), t \geq 0,

r (t) = exp (- α t), t \geq 0,

c (u, v)

c (u, v)

= σ^{2} r (∥ lo g (1 + δ ∥ u ∥) ∥ u ∥^{- 1} u - lo g (1 + δ ∥ v ∥) ∥ v ∥^{- 1} v ∥)

\approx σ^{2} r (∥ lo g (1 + δ \overset{ˉ}{F}_{k}) \overset{ˉ}{F}_{k}^{- 1} u - lo g (1 + δ \overset{ˉ}{F}_{k}) \overset{ˉ}{F}_{k}^{- 1} v ∥)

= σ^{2} exp (- α \overset{ˉ}{F}_{k}^{- 1} lo g (1 + δ \overset{ˉ}{F}_{k}) ∥ u - v ∥) .

P (φ (X) \cap A = \emptyset)

P (φ (X) \cap A = \emptyset)

= E exp {- M (φ^{- 1} (A))} = E exp {- \int_{φ^{- 1} (A)} Λ (u) d u},

\int_{φ^{- 1} (A)} Λ (u) d u = \int_{A} Λ (φ^{- 1} (u)) ∣ D_{φ^{- 1}} (u) ∣ d u,

\int_{φ^{- 1} (A)} Λ (u) d u = \int_{A} Λ (φ^{- 1} (u)) ∣ D_{φ^{- 1}} (u) ∣ d u,

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Quick inference for log Gaussian Cox processes with non-stationary underlying random fields

Jiří Dvořák111Department of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic; email: [email protected], Jesper Møller222Department of Mathematical Sciences, Aalborg University, Denmark; email: [email protected], Tomáš Mrkvička333Department of Applied Mathematics and Informatics, Faculty of Economics, University of South Bohemia, České Budějovice, Czech Republic; email: [email protected], and Samuel Soubeyrand 444BioSP, INRA, 84914, Avignon, France; email: [email protected]

(July 31, 2019)

Abstract: For point patterns observed in natura, spatial heterogeneity is more the rule than the exception. In numerous applications, this can be mathematically handled by the flexible class of log Gaussian Cox processes (LGCPs); in brief, a LGCP is a Cox process driven by an underlying log Gaussian random field (log GRF). This allows the representation of point aggregation, point vacuum and intermediate situations, with more or less rapid transitions between these different states depending on the properties of GRF. Very often, the covariance function of the GRF is assumed to be stationary. In this article, we give two examples where the sizes (that is, the number of points) and the spatial extents of point clusters are allowed to vary in space. To tackle such features, we propose parametric and semiparametric models of non-stationary LGCPs where the non-stationarity is included in both the mean function and the covariance function of the GRF. Thus, in contrast to most other work on inhomogeneous LGCPs, second-order intensity-reweighted stationarity is not satisfied and the usual two step procedure for parameter estimation based on e.g. composite likelihood does not easily apply. Instead we propose a fast three step procedure based on composite likelihood. We apply our modelling and estimation framework to analyse datasets dealing with fish aggregation in a reservoir and with dispersal of biological particles.

Keywords: composite likelihood estimation; group dispersal; pair correlation function; spatial point process; spatial random field

2010 MSC: 60G55, 62-07, 62M30

1 Introduction

1.1 Motivation and objective

Inhomogeneity and aggregation (also called clumping or clustering) in spatial point patterns are frequently observed in nature because of spatially-structured covariates and random effects. Being able to infer these characteristics generally contributes to unravel influential underlying processes.

For example, Figure 1 shows the locations of 485 brown rust lesions on a wheat leaf. The lesions were obtained by the dispersal of spores emitted from one mother lesion which we refer to as the point source (the point (0,0) in the figure; further details are given in Section 5). Our aim is to describe how the intensity and the random clumping of lesions depends on the distance to the point source. Another example is given in Figure 2, which shows a part of a dataset consisting of the positions of 706 fish recorded along a boat trace. Here, our aim is to describe how the covariate shown in the figure (the spatially varying depth of water) influences the random organisation of the fish shoals (further details are given in Section 4).

Inhomogeneity and aggregation in spatial point patterns are usually described in terms of the intensity function $\rho$ and the pair correlation function $g$ defined as follows. We consider an observed planar point pattern (such as in Figures 1–2) as a realization of the restriction of a planar point process $X$ to a bounded observation window $W\subset\mathbb{R}^{2}$ . Excluding the case of multiple points, we can view $X$ as a locally finite random subset of $\mathbb{R}^{2}$ . The points in $X$ are called events the points in $X$ , and we denote by $N(A)=\#(X\cap A)$ the number of events within a region $A\subseteq\mathbb{R}^{2}$ (strictly speaking $A$ needs to be a Borel set, but we suppress measure theoretical details; Section 2 provides more details on planar point processes). Now, it is assumed that whenever $A\subset\mathbb{R}^{2}$ is bounded, the expected number of events in $A$ is finite and given by

[TABLE]

and whenever $A,B\subset\mathbb{R}^{2}$ are bounded and disjoint, the covariance between $N(A)$ and $N(B)$ exists and is given by

[TABLE]

Whilst the intensity function describes the inhomogeneity, the pair correlation function describes the positive ( $g>1$ ) or negative ( $g<1$ ) dependence between pairs of counts; generally speaking, $g>1$ corresponds to aggregation of the events.

Thus, with a real dataset such as those shown in Figures 1-2, inhomogeneity and aggregation can be described by estimating $\rho$ and $g$ . This estimation can be tackled by constructing flexible parametric models for $\rho$ and $g$ and evaluating the most likely values for the model parameters. In this article, $\rho$ and $g$ are modelled in the framework of parametric and semiparametric models for spatial Cox processes, where we include first- and second-order non-stationarity into $\rho$ and $g$ , respectively. We develop a fast method for parameter estimation based on only $\rho$ and $g$ , which enables processing large point pattern data sets.

1.2 Spatial Cox processes

Recall that $X$ is a planar Poisson process with intensity function $\rho$ if for any bounded region $A\subset\mathbb{R}^{2}$ , $N(A)$ is Poisson distributed with mean $\int_{A}\rho(u)\,\mathrm{d}u$ , and conditioned on $N(A)=n$ , the $n$ events in $A$ are independent and identically distributed with density proportional to $\rho$ restricted to $A$ . Poisson processes provide models for first order inhomogeneity as specified by $\rho$ , whilst they are not flexible enough for modelling second and higher order spatial dependence properties as reflected, for instance, by the pair correlation function; see e.g. the discussion in Møller and Waagepetersen (2007) and the references therein.

In contrast, Cox processes (Cox, 1955) provide a large and flexible class of models for aggregation, where expressions for $\rho$ and $g$ are available. Recall that $X$ is a planar Cox process driven by a random field $\Lambda:\mathbb{R}^{2}\mapsto[0,\infty)$ if $X$ conditioned on $\Lambda$ is a Poisson process with intensity function $\Lambda$ . For distinct points $u,v\in\mathbb{R}^{2}$ , we have

[TABLE]

provided these functions are locally integrable, that is, the integrals in (1)–(2) are well-defined. Usually, for specific parametric Cox process models (including the models in this paper), we have positive dependence ( $g>1$ ), although it is possible to construct Cox processes where $g(u,v)<1$ for some locations $u,v\in\mathbb{R}^{2}$ . Møller and Waagepetersen (2007) suggested separating first order inhomogeneity specified by $\rho$ from random effects specified by a residual random field $\Lambda_{0}$ such that for all $u\in\mathbb{R}^{2}$ ,

[TABLE]

This allows us to model independently the intensity and the kind of aggregation which is not explained by the intensity. Typically, when covariate functions $z_{1},\ldots,z_{I}:W\mapsto\mathbb{R}$ are available, a log linear intensity function $\xi:=\log\rho$ is assumed:

[TABLE]

where $z_{0}\equiv 1$ corresponds to an intercept and $\beta_{0},\ldots,\beta_{I}$ are real regression parameters. Note that the Cox process driven by $\Lambda_{0}$ has also pair correlation function $g$ . In Møller and Waagepetersen (2007) and subsequent work (e.g. Diggle (2013) and the references therein), $\Lambda_{0}$ is assumed to be stationary, that is, its distribution is invariant under translations in $\mathbb{R}^{2}$ ; this implies that $g(u,v)=g_{0}(u-v)$ depends only on $u-v$ . However, we shall not make this assumption as we will need more flexibility when modelling $\Lambda_{0}$ for datasets as in Figures 1–2.

1.3 Spatial Cox processes with non-stationary underlying random fields

In this paper, we do not assume that $\Lambda_{0}$ is stationary, because we want to allow cluster sizes and extents to depend on spatial location. This point of view was also taken in Mrkvička (2014) and Mrkvička and Soubeyrand (2017) who considered the detection of different types of inhomogeneities by using shot noise Cox process models, as discussed further in Section 2.2. Here, in contrast, we will adopt the Cox process framework, but will extend it to particularly flexible cases.

Consider a log Gaussian Cox process (LGCP), that is, a Cox process as above when the log residual random field $Y_{0}=\log\Lambda_{0}$ is a Gaussian random field (GRF) such that $\mathrm{E}[\exp\{Y_{0}(u)\}]=1$ . Thus, denoting the covariance function of the GRF by $c(u,v)=\mathrm{Cov}\{Y_{0}(u),Y_{0}(v)\}$ and the variance function by $\sigma^{2}(u)=c(u,u)$ , the GRF has mean function $\mathrm{E}\{Y_{0}(u)\}=-\sigma^{2}(u)/2$ . Often (including the cases of this paper) $c\geq 0$ or equivalently $g\geq 1$ (as discussed further in Section 2.2). We will consider models where $\rho(u)$ describes the overall ‘average’ inhomogeneity, and $\sigma(u)=\sqrt{\sigma(u)^{2}}$ controls the spatial varying weights (in the sense of the number of events) of the clumps/clusters. Additionally, inspired by the space transformation approach in Sampson and Guttorp (1992) and subsequent work (e.g. Perrin and Senoussi, 2000; Fouedjio et al., 2015), we also allow the covariance function to be of the form

[TABLE]

Here, $r:[0,\infty)\mapsto[-1,1]$ is a positive semi-definite function with $r(0)=1$ , $\|\cdot\|$ denotes the Euclidean distance in $\mathbb{R}^{2}$ , and $\varphi:\mathbb{R}^{2}\mapsto\mathbb{R}^{2}$ is a bijective mapping. The specification of $\varphi$ and $r$ allows us to model the spatial varying shape and scale of clusters. For instance, we can model elliptical shapes by linear functions $\varphi(u)=Au$ (Sampson and Guttorp, 1992; Møller and Toftager, 2014), and model a varying spatial scale with $\varphi(u)=\|u\|^{\delta-1}Au$ or $\varphi(u)=\log(1+\delta\|u\|)\|u\|^{-1}Au$ , where $A$ is a $2\times 2$ regular matrix and $\delta>0$ ; the latter is illustrated later in Section 5 when $A$ is the identity matrix. Moreover, the geostatistical literature (Cressie, 1993; Chilès and Delfiner, 1999; Stein, 1999) suggests numerous parametric models for the function $r$ which corresponds to the correlation function for a stationary and isotropic random field.

In summary, we let $Q$ be a zero-mean and unit-variance GRF with an isotropic correlation function $r(t)=\mathrm{Cov}\{Q(u),Q(v)\}$ for $t=\|u-v\|$ , and consider

[TABLE]

where $\xi$ is given by (4). Examples of realizations of LGCPs based on such random fields are shown in Figures 8 and 13. To the best of our knowledge, such LGCPs have only been studied in the literature when $\sigma$ is constant and $\varphi$ is the identity mapping.

1.4 Outline and software

The remainder of this paper is organized as follows. Section 2 provides the needed background material on Cox processes and parametric inference procedures, and summarizes the attractive properties of LGCPs. Section 3 presents our proposal for performing rapid estimation of parameters of a LGCPs with non-stationary underlying random field using a three step procedure based on composite likelihoods. Our approach is illustrated in Section 4 in a case study concerning fish aggregation (Figure 2), and in Section 5 in a case study dealing with the dispersal of particles from a point source (Figure 1). Section 6 summarizes our findings in Sections 3–5. Finally, Appendices A–C provide details related to the data in Figure 1 and the space transformation function $\varphi$ .

Computations were carried out with the R statistical software, including the packages spatstat, RandomFields and GET. Codes for running the estimation procedure are available upon request.

2 Background

Henceforth, we assume that $X$ is a Cox process driven by a random intensity function $\Lambda$ as in (3) and a parametric statistical model has been specified in terms of an unknown parameter vector $\theta=(\beta,\nu)$ , where $\beta$ is used to parametrize $\rho(u)=\rho(u;\beta)$ , $\nu$ to parametrize the distribution of $\Lambda_{0}$ , and $\beta$ and $\nu$ are variation independent. Thus, $g(u,v)=g(u,v;\nu)$ depends only on $\nu$ .

2.1 Composite likelihood estimation

Suppose $W\subset\mathbb{R}^{2}$ is a bounded observation window and a realization $X\cap W=\{x_{1},\ldots,x_{n}\}$ has been observed. Due to the computational obstacles (especially if $n$ is large) related to the calculation of the likelihood function, computationally easier approaches for statistical inference based on functional summary statistics, including $\rho(u;\beta)$ and $g(u,v;\nu)$ , together with estimation equations (obtained e.g. by minimum contrast methods, composite likelihoods or Palm likelihoods) have been suggested, see the review in Møller and Waagepetersen (2017). We will concentrate on the first and second order composite likelihoods (Guan, 2006; Waagepetersen, 2007; Tanaka et al., 2008; Waagepetersen and Guan, 2009; Guan et al., 2015; Zhuang, 2015). The first-order log composite likelihood is given by

[TABLE]

This is also known as the Poisson log likelihood and it can be interpreted as the composite log likelihood for binary indicators $I_{k}=1\left(N(C_{k})>0\right)$ of the presence of events in cells $C_{k}$ of an infinitesimal partition of $W$ (Schoenberg, 2005; Waagepetersen, 2007; Guan and Loh, 2007; Møller and Waagepetersen, 2007). Further, for a tuning parameter $R>0$ , the second-order log composite likelihood is given by

[TABLE]

and it can be interpreted as the log composite likelihood for the binary indicators $I_{k}I_{l}$ of simultaneous occurrence of events in $R$ -close pairs of distinct cells $C_{k}$ and $C_{l}$ (Waagepetersen, 2007; Møller and Waagepetersen, 2007).

The following two step procedure is widely used for finding estimates $(\hat{\beta},\hat{\nu})$ of $(\beta,\nu)$ (Waagepetersen and Guan, 2009; Prokešová et al., 2017; Mrkvička et al., 2014). First, $\hat{\beta}$ is obtained by maximizing the first-order log composite likelihood (7); the log-linear form for $\rho$ (see Equation (4)) allows a fast estimation of $\hat{\beta}$ by using spatstat. Second, $\hat{\nu}$ is obtained by maximizing the second-order log composite likelihood $CL_{2}(\hat{\beta},\nu)$ .

2.2 Pros of using log Gaussian Cox processes

The two main classes of Cox processes are log Gaussian Cox processes (LGCPs; see Section 1.3) and shot noise Cox processes (SNCPs; see Møller (2003), Møller and Waagepetersen (2004), and the references therein). For a SNCP $X$ driven by (3), the residual random field is of the form:

[TABLE]

where $C$ is a planar Poisson process with intensity function $\rho_{C}$ , and where $k_{c}$ is a non-negative function such that $\mathrm{E}\{\Lambda_{0}(u)\}=1$ or equivalently

[TABLE]

Assuming $\rho_{c}(u)=\rho(u)k_{c}(u)$ is integrable, then $X$ conditioned on $C$ is distributed as $\cup_{c\in C}X_{c}$ , where $X_{c}$ is a finite Poisson process with intensity function $\rho_{c}$ , and where these Poisson processes for all $c\in C$ are independent. Therefore $C$ is called the centre process and $X_{c}$ the cluster with centre $c\in C$ . Moreover,

[TABLE]

provided the integral exits, and then $g\geq 1$ . In general, unless $\Lambda_{0}$ is stationary or equivalently $\rho_{C}(u)$ is constant and $k_{c}(u)=k_{0}(u-c)$ for all $c,u\in\mathbb{R}^{2}$ , the integrals in (9)–(10) are not expressible on closed form; and even more complicated integrals appear for the third and higher order joint intensities. For example, Mrkvička and Soubeyrand (2017) considered non-stationary SNCPs, for which they needed to evaluate the integrals by numerical methods, and they proposed a Bayesian approach based on a MCMC algorithm. On one hand, their approach provides a detailed analysis of uncertainty and, on the other hand, it requires very time-consuming calculations.

In the aim of proposing fast composite likelihood based estimation procedure for second-order non-stationary Cox processes, we find LGCPs as given by (3) and (6) attractive for the following reasons. First, the pair correlation function is in a simple one-to-one correspondence to the covariance function $c$ of the Gaussian process $\log\Lambda_{0}$ :

[TABLE]

Thus, the distribution of $X$ is uniquely determined by $\rho$ and $g$ . This is in line with the first and second order composite likelihood approach. Second, edge effects are not playing a role as the distribution of $X$ restricted to a bounded observation window $W$ is determined by $\rho$ restricted to $W$ together with the distribution of the Gaussian process $\log\Lambda_{0}$ restricted to $W$ . Finally, a LGCP possesses the following attractive properties, although these will not be exploited in this article: the reduced Palm distributions of a LGCP are also LGCPs with unchanged pair correlation function $g$ (Coeurjolly et al., 2017), and third and higher order intensities are easily expressed in terms of $\rho$ and $g$ (Møller et al., 1998).

3 Parameter estimation and model check for Cox processes driven by non-stationary random fields

Throughout this section, for specificity and simplicity, we assume that $X$ is a LGCP governed by (3) and (6). However, the description of the estimation procedure in Section 3.1 can easily be modified to other cases of parametric models for Cox processes driven by non-stationary random fields, including the case of SNCPs.

3.1 A three step composite likelihood based estimation approach

For parametric models of LGCPs with the non-stationary covariance structure introduced in Section 1.3, particularly in connection to the analysis of the data presented in Sections 4.1 and 5.1, we noticed after various experiments, some of which are discussed in C, that the two step procedure presented above yields strongly biased estimates, in particular for the parameters governing the second-order non-stationarity. This relates to the fact that a nonlinear function $\varphi$ makes it impossible to choose a single value of $R$ that would be appropriate in the whole observation window. In other words, the model is too complex (compared to the size of datasets we consider) to enable reliable estimation of interaction parameters in a single step as performed in the two step estimation procedure. In this section, we therefore suggest a three step procedure based on composite likelihoods that circumvents the non-stationarity issue as detailed below.

We split the dataset with respect to a spatial covariate function, say $f:\mathbb{R}^{2}\to\mathbb{R}$ , which will be used to delineate disjoint areas of $W$ over which the point process $X$ is approximately second-order stationary. The three step procedure has to be adapted to the model of interest and, here, we only provide its skeleton. Let $f_{0}<\ldots<f_{K}$ be a sequence of real numbers, and let $W_{k}=\{u\in W:f_{k-1}\leq f(u)<f_{k}\}$ for $k=1,\ldots,K$ . We have in mind that $f_{0},\ldots,f_{K}$ are chosen such that (i) $X$ is approximately second-order stationary over $W_{k}$ , (ii) $X\cap W_{1},\ldots,X\cap W_{K}$ contain approximately the same numbers of points and (iii) $W=W_{1}\cup\cdots\cup W_{K}$ . The division of the dataset to smaller parts which are assumed to be second-order stationary, which is performed in the proposed procedure, makes the estimation more stable as it can be seen in C. Due to the assumption of second-order stationarity in individual subwindows we can expect that the proposed procedure will be successful in cases of smoothly changing second-order structure.

Now, define the approximate second-order log composite likelihood restricted to $W_{k}$ and based on a second-order stationary assumption by

[TABLE]

Here, each $R_{k}>0$ is a tuning parameter that may depend on $k$ , $g_{0}$ is the pair correlation function obtained by assuming that the covariance structure is stationary (that is, $\sigma$ is constant and $\varphi(u)=u$ ), and $\nu_{0}$ is the corresponding set of parameters (typically the constant value of $\sigma$ and the range parameter arising in the correlation function $r$ ). In addition, suppose the parameter $\nu$ related to the covariance structure is a vector $\nu=(\nu^{\text{split}},\nu^{\text{joint}})$ , where $\nu^{\text{split}}$ contains the components of $\nu$ that will be estimated via the approximate second-order composite likelihood applied to the restricted point processes $X\cap W_{1},\ldots,X\cap W_{K}$ , and $\nu^{\text{joint}}$ contains the components of $\nu$ that will be estimated via the second-order composite likelihood applied to the whole point process $X$ .

Then the estimation procedure consists of the three following steps.

If the first-order intensity $\rho$ is modelled by a function parameterized by $\beta$ (like the log-linear function in Equation (4)), $\hat{\beta}$ is obtained by maximizing the first-order log composite likelihood (7):

[TABLE]

Otherwise, a kernel-smoothing approach is used to estimate $\rho$ . 2. 2.

The part $\nu^{\text{split}}$ of $\nu$ is estimated by firstly estimating $\nu_{0}$ for each $k$ via the maximization of the approximate second-order log composite likelihood (11):

[TABLE]

and by secondly fitting a regression with $\hat{\nu}_{0,k}$ as the response variable and the average value of $f$ in $W_{k}$ , namely $\bar{F}_{k}=\frac{1}{\#{X\cap W_{k}}}\sum_{x\in X\cap W_{k}}f(x_{k})$ , as the explanatory variable to estimate each component of $\nu^{\text{split}}$ . In Sections 4 and 5, we show how these regressions are implemented in practice. 3. 3.

After the transformation of the point pattern and the observation window by $\varphi$ , the remaining part $\nu^{\text{joint}}$ of $\nu$ is estimated by maximizing the second-order log composite likelihood (8) with respect to $\nu$ and by ignoring the estimate obtained for $\nu^{\text{split}}$ :

[TABLE]

Remark 1: If no space deformation is assumed in the model, steps 2 and 3 can be reversed and, in this case, the estimate of $\nu^{\text{joint}}$ might be plugged in for estimating $\nu^{\text{split}}$ .

Remark 2: When applying the presented methodology to a new dataset the user has to make several choices in the modelling step. The case studies in Sections 4 and 5 can provide some guidance. Generally speaking, either a parametric model is assumed for the intensity function (see Section 5) or it is estimated nonparametrically (Section 4); either the autocorrelation function of the GRF is considered to be translation invariant (Section 4) or not (Section 5); either the variance of the GRF is treated as being constant (Section 5) or not (Section 4).

Remark 3: A simulation study assessing the estimation efficiency is generally required to select the components of $\nu$ appearing in $\nu^{\text{joint}}$ and $\nu^{\text{split}}$ , and to select the tuning parameters $K$ , $R_{k}$ and $f_{k}$ .

Remark 4: Using the second-order composite likelihood as if the point pattern is second-order stationary and isotropic provides a computational advantage: the double integral in the composite likelihood criterion (11) is then of the form

[TABLE]

and can be computed as the Lebesgue-Stieltjes integral

[TABLE]

where $H_{k}(t)=\int_{W_{k}}\int_{W_{k}}1\left(\|u-v\|\leq t\right)h(u,v)\,\mathrm{d}u\,\mathrm{d}v$ . Note that $H_{k}(t)$ does not depend on the parameter $\nu_{0}$ . Thus, $H_{k}(t)$ is calculated only once and only the integral $\int_{0}^{R_{k}}g_{0}(t;\nu_{0})\,\mathrm{d}H_{k}(t)$ is computed in each step of the iterative optimization of the composite likelihood criterion.

3.2 Test based on global envelopes of the inhomogeneous $J$ -function

Our fitted models were checked with plots of global envelopes and accompanying tests (Myllymäki et al., 2017; Mrkvička et al., 2017). The global envelopes can be based on various functional summaries of point processes, e.g. topological characteristics, morphological characteristics, and third-order characteristics. These statistics can be viewed as appropriate for model checking because they are not used in the estimation procedure considered above. However, they do not directly reflect the dependence of the point pattern structure on spatial covariates, although such a dependence should be taken into account in model checking of point process models with a spatially changing structure.

To reveal the dependence on a given spatial covariate in connection to a LGCP, we have used the inhomogeneous $J$ -function (Van Lieshout, 2011) computed on several subwindows, which are determined according to the spatial covariate. For example, in the group dispersal model, the subwindows are generated with respect to the distance from the source point emitting the particles. The first subwindow is the inner disk centred around the source point, the second subwindow is the subsequent circular ring, and so on. Note that only the point lying in each subwindow were used for the computation of each inhomogeneous $J$ -function.

4 Fish aggregation

4.1 Data and model

The point pattern studied in this section is part of a larger dataset studied in Mrkvička et al. (2014) and Muška et al. (2016). It consists of the locations of 706 fish observed during a summer day along the trace made by a boat in the middle part of the inland water reservoir Rimov in Czech Republic. The trace is 916 meter long and the boat crossed the river twice. The fish locations were recorded within a distance of 10 to 20 meters from the boat and in a depth of 1 to 1.75 meters. The data we analysed are given by the $(x,y)$ -coordinates of the 3D-locations of fish, where the observation window $W$ is of size $916\times 10$ meters (note that $W$ is larger than the region shown in Figure 2). In fact, as the reservoir is thermally stratified, the majority of fish are in the upper 5 meters of the water column during the summer, and the horizontal distance is considered to be more important than the vertical distance between fish (Jarolím et al., 2010). The aim is to study the effect of a spatial covariate, namely the water depth denoted by $z(u)$ , $u\in W$ , on the distribution of fish shoals and, more exactly, on the dispersion of this distribution.

We used a semiparametric LGCP, with an intensity function estimated in a non-parametric way (as detailed below) and where the ingredients of the covariance function in (5) is specified as follows. The standard-deviation depends linearly on the spatial covariate:

[TABLE]

where $\gamma_{0}$ and $\gamma_{1}$ are real parameters such that $\sigma(u)>0$ for all $u\in W$ . Further, there is no space transformation, that is, $\varphi$ is the identity function. Finally, the auto-correlation function $r$ is exponential with parameter $\alpha>0$ :

[TABLE]

This LGCP can generate point patterns corresponding to non-stationary aggregation of fish, with plenty of small shoals or a few large shoals or something in between depending on the covariate values and the parameter values.

4.2 Implementation of the three step procedure

We adapted the three step estimation procedure of Section 3.1, with $\nu^{\text{split}}=(\gamma_{0},\gamma_{1})$ and $\nu^{\text{joint}}=\alpha$ , and with steps as follows.

Step 1: For the estimation of the intensity function $\rho$ , we used a kernel smoothing method with a large bandwidth (Gaussian kernel with standard deviation equal to 50 meters) in order to allow the estimated $\rho$ function to vary along the boat trace.

Step 2: For the estimation of $\gamma_{0}$ and $\gamma_{1}$ , we started by letting $f\equiv z$ , and defined $K=5$ disjoint subwindows $W_{1},\ldots,W_{5}$ as areas corresponding to different ranges of water depth as determined by the values of $f_{0},\ldots,f_{5}$ . We chose these values such that the subwindows contain approximately the same number of points, e.g. $W_{1}$ contains approximately 141 points lying in the most shallow water. The value of $K$ was chosen as a compromise between the requirement that $f$ is approximately constant on each $W_{k}$ and the necessity to have a reasonable number of points in each $W_{k}$ to allow for reliable estimation. The choices are rather subjective as is the case also for other estimation procedures based on estimating equations and therefore some experimentation is recommended. Then, $\nu_{0,k}=(\sigma_{0,k},\alpha_{0,k})$ was estimated by maximizing the approximate second-order log composite likelihoods based on $X\cap W_{k}$ . Let $\hat{\sigma}_{0,k}$ and $\hat{\alpha}_{0,k}$ denote the estimates of $\sigma_{0,k}$ and $\alpha_{0,k}$ , respectively, and let $\bar{F}_{k}$ denote the average value of $f$ over $W_{k}$ . By assuming that the function $f$ is approximately constant (and approximately equal to $\bar{F}_{k}$ ) over $W_{k}$ , the standard-deviation function $\sigma$ can be approximated by

[TABLE]

Consequently, we used $\hat{\sigma}_{0,k}$ as a natural estimate of $\gamma_{0}+\gamma_{1}\bar{F}_{k}$ . Finally, the linear regression model $\hat{\sigma}_{0,k}=\gamma_{0}+\gamma_{1}\bar{F}_{k}+\epsilon_{k}$ was fitted to data $\{(\hat{\sigma}_{0,k},\bar{F}_{k}):k=1,\ldots,K\}$ and yielded the estimates of $\gamma_{0}$ and $\gamma_{1}$ .

When applying the same value of tuning parameters $R_{k}$ in all subwindows, we observed that the estimates of $\gamma_{0}$ and $\gamma_{1}$ were rather robust with respect to the choice of $R_{k}$ in the range from 6 to 11 meters. In order to include as many pairs of points as possible into the composite likelihood criterion (recall that each subwindow contains only approximately 141 points), we finally decided to use the value $R_{k}=11$ meters for all subwindows.

Step 3: For the estimation of $\alpha$ , we considered $\sigma(u)=\hat{\gamma}_{0}+\hat{\gamma}_{1}Z(u)$ as fixed and maximized the second-order composite likelihood (8) with respect to $\alpha$ only, thereby obtaining an estimate $\hat{\alpha}$ . Note that since $\sigma(u)$ is non-constant, we could not take advantage of the Lebesgue-Stieltjes integral (12) but needed in each iteration step to calculate the double integral in (8). For the choice of the tuning parameter $R$ , preliminary experiments showed that the estimates of $\alpha$ were rather stable for $R$ between 5 and 15 meters. Since the width of the observation window $W$ is only 10 meters, it does not seem reasonable to use a very large value of $R$ . We decided to use the same value as in the second step, that is, $R=11$ meters.

4.3 Results

Having estimated the intensity function in the first step of the estimation procedure, it is straightforward to test the null hypothesis of no interaction between the individual fish by performing the rank envelope test and to construct the envelope based on simulations of the fitted inhomogeneous Poisson process. Figure 3 shows the simultaneous 95% envelope constructed from 9999 simulated curves, when the test statistic $T$ is given by concatenating together the five inhomogeneous $J$ -functions estimated in the subwindows $W_{1},\ldots,W_{5}$ . The data curve (solid black line) leaves the envelope for four of the five subwindows, and the $p$ -value is below 2%, so we reject the null hypothesis of no interaction between the fish. Figure 3 also shows that the level of clustering is bigger for shallower subwindows.

In the second step of the estimation procedure, a linear regression of $\hat{\sigma}_{0,1},\ldots,\hat{\sigma}_{0,5}$ against average water depths, see Figure 4, yields the estimates $\hat{\gamma}_{0}$ and $\hat{\gamma}_{1}$ of the parameters governing the second-order non-stationarity of the point process. By ignoring uncertainties in the estimates $\hat{\sigma}_{0,1},\ldots,\hat{\sigma}_{0,5}$ , the parameter $\gamma_{1}$ is significantly different from zero ( $p$ -value=0.015; value obtained with the R function glm).

Table 1 provides the estimates of the model parameters, including $\hat{\alpha}$ obtained in the third step. The table also specifies central 90%-percentile intervals of the estimates obtained with a parametric bootstrap approach (Efron and Tibshirani, 1993) where we re-estimated the model parameters from each of 1000 simulated point pattern datasets obtained under the fitted LGCP. Using this parametric bootstrap approach here is more demanding than for the particle-dispersal application considered later, since the double integral in (8) needs to be evaluated at each iteration of the maximization procedure required by the 3rd step. A typical computation time took about a minute on a regular desktop computer for datasets with about 700 points.

In addition to Table 1, Figure 5 shows the empirical marginal distributions of the estimates obtained from the 1000 simulated datasets. The regression parameters $\gamma_{0}$ and $\gamma_{1}$ governing the variance structure of the GRF $Y_{0}$ have empirical distributions with a high concentration around their estimated values and a small amount of extremes. The empirical distribution of the scaling parameter $\alpha$ includes a high proportion of values near zero, corresponding to a long-range dependence in the GRF $Q$ and hence also in the LGCP, cf. (6).

Goodness-of-fit of the fitted LGCP was assessed with the global envelope test described in Section 3.2. For this we used 9999 simulations to obtain the envelopes of the concatenated inhomogeneous $J$ -functions computed over the five subwindows $W_{1},\ldots,W_{5}$ . The fitted LGCP model is not rejected by the test at the 5% risk level ( $p$ -interval=[0.083,0.097]); see Figure 6. The central (dotted) curve of the global envelope test reveals a decrease in the level of clustering with respect to increasing water depth.

We investigated the efficiency of our estimation procedure for 8 series of 1000 simulations corresponding to the 8 combinations of parameter values obtained when $\alpha$ , $\gamma_{0}$ , and $\gamma_{1}$ each takes two values, see Table 2. Figure 7 shows examples of the simulated realizations. Note that the function $\sigma(u)$ is constant when $\gamma_{1}=0$ (Series 1, 2, 5, and 6) and varying when $\gamma_{1}=-0.03$ (Series 3, 4, 7, and 8), and our choice of parameter values was in part constrained by the range of the covariate values (5 to 29 meters) and made to avoid negative values for $\sigma(u)$ or values close to zero for $\sigma(0)$ . Further, in the simulations, we used the same observation window $W$ and the same covariate describing the water depth as for the fish aggregation dataset, and also we applied the same estimation procedure. In particular, we used $R_{k}=11$ for all $k$ . Figure 8 provides estimation results for each series of simulations. The regression parameters $\gamma_{0}$ and $\gamma_{1}$ governing the spatially structured variance function of the random field are estimated with negligible bias whatever the value of $\alpha$ , and with rather small variability when the spatial correlation of the random field is small ( $\alpha=0.50$ ; Series 5–8). On the other hand the estimation of $\alpha$ is more difficult. This is not surprising because $\alpha$ is estimated in the third estimation step and relies on the estimated values of $\gamma_{0}$ and $\gamma_{1}$ .

Table 3 shows the results of the test of the hypothesis $\gamma_{1}=0$ based on the linear regression of $\hat{\sigma}_{0,1},\ldots,\hat{\sigma}_{0,5}$ against the covariate used to model $\sigma$ (namely, the average water depths in the application). This test, which is a way to assess the dependence of the second order structure of the LGCP model on the specified covariate, was applied to the 8 series of 1000 simulations with different levels of significance (1%, 5% and 10%). Based on the results for Series 1, 2, 5, and 6, the actual significance level of the test is approximately equal to or lower than the nominal significance level, that is, the test is slightly conservative. The power of the test, assessed with Series 3, 4, 7, and 8, is increased when the spatial auto-correlation is larger at shorter distances (smaller $\alpha$ ), but could certainly be improved with a test better accounting for the propagation of uncertainties in the first two steps of the estimation procedure.

5 Particle dispersal

5.1 Data and model

In this section, we are interested in point patterns generated by the deposit locations of biological windborne particles (such as seeds, pollen grains and pathogenic fungal spores) dispersed from plants. In general, for an isolated plant releasing particles, the point pattern formed by the deposit locations of the particles is denser around the source plant than far from it. In other words, the intensity of points generally decreases with the distance from the source along radial directions (Austerlitz et al., 2004; Rieux et al., 2014; Soubeyrand et al., 2007b; Tufto et al., 1997). This large-scale inhomogeneity may be augmented by small-scale variations due, for instance, to group dispersal (Soubeyrand et al., 2011): groups of particles are transported independently, but the particles of a given group are deposited at dependent locations and form a cluster of points whose spatial extent is smaller at shorter distances (dependencies between the particles of a group vanishes at longer distance because of the accumulation of air turbulence; see A for an extended explanation). In such a case, deposit locations of particles form an inhomogeneous point pattern with spatially varying local aggregation, which may be modelled by a LGCP with a non-stationary auto-correlation function for the GRF. We used this modelling framework and the accompanying estimation procedure to the point pattern in Figure 1 formed by the deposit locations of fungal spores dispersed from a point source located at the origin (details about this data set can be found in Lannou et al., 2008). Here, we assume a log linear intensity as in (4) with $I=1$ , $z_{1}(u)=\|u\|$ , and $(\beta_{0},\beta_{1})\in\mathbb{R}^{2}$ . Such a specification corresponds to the so-called exponential dispersal kernel $u\rightarrow\exp(\beta_{0})\exp(\beta_{1}\|u\|)$ , and it means that the intensity of deposit locations of particles decreases exponentially with distance from the point source when $\beta_{1}<0$ . Let $\sigma(u)\equiv\sigma$ be a constant function, and use the space transformation

[TABLE]

where $\delta$ is a non-negative parameter for modelling an eventual increase in the spatial extent of clusters with the distance from the source point. Finally, assume an exponential correlation function of the GRF:

[TABLE]

where $\alpha$ is a positive parameter.

5.2 Implementation of the three step procedure

We adapted the three step estimation procedure of Section 3.1 to the LGCP described above incorporating the parametric space transformation (14). Here we let $\nu^{\text{split}}=\delta$ and $\nu^{\text{joint}}=(\sigma,\alpha)$ .

Step 1: For the estimation of $\beta_{0}$ and $\beta_{1}$ , we maximized the first-order composite likelihood to obtain estimates $\hat{\beta}_{0}$ and $\hat{\beta}_{1}$ .

Step 2: For the estimation of $\delta$ , we started by defining subwindows $W_{1},\ldots,W_{K}$ based on the distance from the point source by setting $f(u)=\|u\|$ so that $W_{1}$ is the inner disk and $W_{2},\ldots,W_{K}$ are the subsequent rings; more details are given below. Then, for each $k=1,\ldots,K$ , we estimated $\nu_{0,k}=(\sigma_{0,k},\alpha_{0,k})$ by maximizing the approximate second-order log composite likelihoods based on $X\cap W_{k}$ . Let $\hat{\alpha}_{0,k}$ denote the estimate of $\alpha_{0,k}$ . Assuming that the function $f$ is approximately equal to a constant $\bar{F}_{k}$ on each subwindow $W_{k}$ , the covariance function of $Y_{0}$ can be approximated for $u,v\in W_{k}$ by

[TABLE]

Hence, we first estimated $\alpha\bar{F}_{k}^{-1}\log(1+\delta\bar{F}_{k})$ by $\hat{\alpha}_{0,k}$ and second, to obtain an estimate $\hat{\delta}$ of $\delta$ , we fitted the non-linear regression model $\alpha_{0,k}=\alpha\bar{F}_{k}^{-1}\log(1+\delta\bar{F}_{k})+\epsilon_{k}$ to the data $\{(\hat{\alpha}_{0,k},\bar{F}_{k}):k=1,\ldots,K\}$ by minimizing $\sum_{k=1}^{K}|\hat{\alpha}_{0,k}-\alpha\bar{F}_{k}^{-1}\log(1+\delta\bar{F}_{k})|$ with respect to $\alpha$ and $\delta$ . For our dataset, we chose $K=5$ ; since only 5 points then are entering the regression, we have used absolute difference in order to increase robustness of the regression estimate. We applied the following rules of thumb for selecting $f_{k}$ and $R_{k}$ : $f_{0},\ldots,f_{K}$ were selected such that $X\cap W_{1},\ldots,X\cap W_{K}$ contain approximately the same numbers of points and $W=W_{1}\cup\cdots\cup W_{K}$ ; for $k=1,\ldots,K$ , we selected $R_{k}$ such that about 50% of the observed pairs of points in $W_{k}$ were used in the computation of the partial approximate second-order log composite likelihood (11).

Step 3: For the estimation of $\sigma$ and $\alpha$ we used the transformed process $\hat{\varphi}(X)$ where $\hat{\varphi}(u)=\|u\|^{-1}\log(1+\hat{\delta}\|u\|)u$ . The motivation was that, assuming the true value of $\delta$ is known, $\varphi(X)$ is a second-order intensity-reweighted stationary LGCP (Baddeley et al., 2000) with known form of the intensity function and the pair-correlation function. See B for a detailed derivation and Figure 9 for the original pattern and the pattern transformed by $\hat{\varphi}(u)=\|u\|^{-1}\log(1+\hat{\delta}\|u\|)u$ , where $\hat{\delta}$ is given in Table 4. We approximated the intensity function of $\varphi(X)$ in the observation window $\varphi(W)$ by a log-linear function of $\|u\|$ and estimated it by the first-order composite likelihood. We used this estimated intensity function to construct the second-order composite likelihood criterion for estimation of $\sigma$ and $\alpha$ . Again we chose the value of $R$ so that about 50 % of the observed pairs of points were used in the criterion.

5.3 Results

Table 4 provides the estimates of (transformed) model parameters obtained with the three step procedure. It also gives their 90%-percentile intervals obtained with the same parametric bootstrap procedure as in Section 4.3; this was fast: it took typically a few seconds on a regular desktop computer for datasets with about 500 points. Figure 10 shows the marginal distributions of estimates obtained from the 1000 simulated datasets, and it shows the estimated space transformation function $\hat{\varphi}$ and its pointwise 90%- and 50%-confidence envelopes. We notice a rather large estimation variability for second-order parameters $(\sigma,\alpha,\delta)$ . The goodness-of-fit of the model was assessed with the global envelope test described in Section 3.2, using 9999 simulations to obtain the envelopes of the concatenated inhomogeneous $J$ -functions computed over the five subwindows $W_{1},\ldots,W_{5}$ . The fitted LGCP model is not rejected by the test at the 5% risk level ( $p$ -interval=[0.095,0.102]); see Figure 11. The observed curve indicates repulsion in the first two subwindows. This is probably a consequence of the non-negligible size of the particles, an assumption that is not accounted for in our model.

We investigated the efficiency of our estimation procedure for 8 series of 1000 simulations performed with 8 different sets of parameter values. In all series, infection strength and dispersal range were equal to $(2\pi e^{\beta_{0}}/\beta_{1}^{2},-1/\beta_{1})=(1500,0.5)$ , whilst the values of $(\sigma,1/\alpha,\delta)$ are given in Table 5. Figure 12 shows examples of the simulated realizations. Figure 13 provides estimation results and the distribution of the number of points for each simulation series. The estimation of the infection strength and the dispersal range is quite accurate except when the GRF $Y_{0}$ is strongly variable ( $\sigma=2$ ) and its spatial correlation is large ( $1/\alpha=0.2$ ). The latter situation of the GRF can generate large clusters of points in the LGCP away from the source of particles and negatively impact the estimation accuracy of the spatial trend in the intensity of points, which is supposed to be decreasing with the distance from the source of particles. Moreover, if the GRF’s standard-deviation is appropriately estimated, we observe rather large estimation variability and possible biases for $1/\alpha$ and $\delta$ , the parameters arising in the non-stationary auto-correlation function. Overall, the inference is relatively accurate when the GRF’s variance is large ( $\sigma=2$ ) and the spatial deformation is significant ( $\delta=10$ ); i.e., series 6 and 8. In contrast, for a rather flat GRF or when the deformation is not so significant, the estimation of parameters governing the second-order structure is clearly uncertain due to a complex model and the limited amount of information in the data.

In C we compare the performance of the three step estimation method described above with a two step procedure using in the second step the composite likelihood criterion based on the whole observation window $W$ . We conclude that the three-step procedure is in all cases more successful in estimation of $\sigma$ and in almost all cases for $\alpha$ . For the deformation parameter $\delta$ the comparison is less clear but in most cases the three-step procedure results in more precise estimates.

6 Concluding remarks

In this article, we modelled inhomogeneity in spatial point patterns by a Cox process driven by a second-order non-stationary log GRF. We presented a first example where the non-stationarity was obtained by modelling the variance of the GRF as a linear function of an explanatory variable. In a second example, the non-stationarity was obtained via a spatial parametric deformation, which enables the generation of point clusters with spatially varying spatial extents. In this framework, estimating model parameters, including those that govern the second-order non-stationarity, is challenging, especially with a moderate number of points (a few hundreds in our examples). We developed and tested a three step estimation approach based on first- and second-order composite likelihoods. The second-order composite likelihood has to be specifically adapted to the non-stationarity incorporated into the model. The advantages of this estimation procedure are the reduced computation time and the reduced implementation complexity. Reduced computation time allowed us to implement parametric bootstrap for assessing estimation uncertainty, and computationally intensive model check approaches, namely global envelope tests, for assessing goodness-of-fit.

We leave it for future research to examine the theoretical properties of the estimators in the three step estimation procedure. However, the performance of this procedure was evaluated in two simulation studies with settings inspired by the two case studies. Our simulation studies highlight situations where a satisfactory estimation accuracy can be obtained and, conversely, situations where estimation performance is poor. Poor estimation mostly translates into large estimation variance and, from a general point of view, arises when variations in the underlying random field are weak or the spatial deformation is not so significant (due to a complex model and the limited information contained in the data). A part of the estimation poorness can also be attributed to the choice of a three steps procedure, which neglects uncertainty propagation. However, this procedure definitely allows to reduce computation time. Furthermore, it has to be noted that tuning parameters used in the calculation of the second-order composite likelihood have to be appropriately specified to reach satisfactory estimation performance (see Sections 4.2 and 5.2). Generally, one can proceed as follows. Once the model is chosen, the number of subwindows has to be decided. This can be done in order to have in each subwindow sufficiently many points for reliable estimation with the two step procedure for the second-order intensity-reweighted stationary case (on the union of subwindows) and on the other hand to have subwindows small enough to be able to assume second-order intensity-reweighted stationarity within each subwindow. About 100 points in each subwindow could be recommended as a first choice but some experimenting with this choice is wise. Then the maximum distances $R_{k}$ has to be chosen. It is possible to use the same $R_{k}$ values for all subwindows in cases where the interaction distance is homogeneous, cf. Section 4. It is also possible to use $R_{k}$ values such that they take into account a given percentage of the pairs of points in cases where the interaction distance is inhomogeneous, cf. Section 5. Then, in both cases, we recommend to choose the maximum $R_{k}$ which gives stable results, i.e., similar to those obtained with smaller $R_{k}$ . Finally, for the third step we recommend to use the same maximum distance.

In order to graphically explore the dependence of the second order structure on a given covariate we utilized the global envelope test of independence of points locations, which was performed on first order inhomogeneous $J$ -function computed across several subwindows which respect to the value of the given covariate (e.g. Figure 3).

We also introduced an approximate test of dependence of the second order structure on a given covariate based on regression of second order parameters estimated across the subwindows which respect to the value of the given covariate. Our simulation study shows that this test is slightly conservative, which is acceptable in practise.

As discussed in the first paragraph of this section, the two studied examples give two different approaches for modelling complex point process structures. But the way of modelling the second-order non-stationarity is not the only difference. Indeed, the first order intensity is described by a nonparametric model in the fish application, and a parametric model in the particle dispersal application. This choice is related to a generic difficulty in modelling, which has to be carefully investigated for every particular dataset: how to divide data variability between the first order intensity and the second order structure. In the case of a nonparametric first order intensity, this choice is done in the selection of the bandwidth. In the case of a parametric first order intensity, this choice is constrained by the flexibility of the parametric model. For instance, using a small bandwidth in the former case can cause overfitting of the first order intensity with no detection of second order structure; and using a too simple parametric model of the intensity in the latter case can induce a bias, which then impacts the estimation of the second order structure. Such issues generally rely on the usual modelling strategy of the analyst but should also rely on the classical process of statistical analysis consisting of defining a class of models, selecting a model based on the data (including parameter estimation), checking model fit, and possibly re-defining the class of models for starting again the process and finally drawing conclusions (McCullagh and Nelder, 1989, Chapter 12).

Finally, from a prospective point of view, with increasing amount of the gathered data, there is increasing demand for the more complex modelling. The next level for complexity of the presented models is certainly the shift into spatio-temporal modelling framework of e.g. generations of the infection spread or several spatial snapshots of the point pattern gathered in different times. We also leave this issue for future research, noticing that a space time observation window $W\times T$ can be divided in different ways, using sets $W\times T_{k},W_{k}\times T$ or $W_{k}\times T_{j}$ , depending on the properties of the particular dataset.

Appendix A Small-scale variations in particle dispersal

Section 5 deals with point patterns generated by the deposit locations of biological windborne particles dispersed from plants. As indicated in the main text, the intensity of points generally decreases with the distance from the source along radial directions. Nevertheless, in particular cases, the intensity of points along radial directions may be non-monotonic in the vicinity of the source (Stoyan and Wagner, 2001). This leads to a large-scale inhomogeneity in the point pattern formed by the deposit locations of the particles. In addition, the inhomogeneity of the point process may be augmented by small-scale variations due, for instance, to dependencies in the dispersal of particles (Soubeyrand et al., 2011) or to a spatial heterogeneity of the environment (Soubeyrand et al., 2007a).

Consider a situation where aggregation is due to dependencies in the dispersal of particles, cf. Section 5. Soubeyrand et al. (2011) investigated such a situation, in which (i) the particles are released as groups of particles, (ii) the particles of each group are transported together by the wind but the accumulation of air turbulences progressively separate the particles, and (iii) the particles of each group are deposited at nearby locations but the proximity of the deposit locations depends on the accumulation of air turbulences. Typically, the proximity of the deposit locations decreases with the distance separating the source and the final group center location (the larger this distance is, the larger the accumulation of air turbulences and larger the separation of the particles is). A more complete bio-physical justification for such a process is provided in Soubeyrand et al. (2011). In this situation, one expects that the deposit locations of the particles of a group form a cluster whose spatial extent increases with the distance from the source. Consequently, aggregates of points at longer distances from the source should have wider spatial extent. To model this, we can consider a non-stationary auto-correlation function for the zero-mean and unit-variance GRF $Q$ in (6).

In contrast, consider a situation where aggregation is due to a spatial heterogeneity of the environment. If we assume in addition that the spatial heterogeneity is stationary, then the probabilistic properties of aggregates should be the same across space and the auto-correlation function for $Q$ may be stationary.

In the application, the bootstrap analysis is only moderately in favour of the presence of deformation (see bottom right panel of Figure 10, where the 90%-confidence envelope is quite wide). Thus, we cannot conclude that ‘group dispersal occurs’ or that ‘the spatial extent of groups varies with distance at the scale of the study domain if group dispersal occurs’. Replicated data could be helpful for more deeply investigating this issue.

Appendix B Properties of $\varphi(X)$

Let the situation be as in Section 5. This appendix shows that $\varphi(X)$ is a second-order intensity-reweighted stationary LGCP.

Let $\Lambda(u),u\in\mathbb{R}^{2},$ denote the random intensity function of the process $X$ given by $\log\Lambda(u)=\zeta(u)+Y_{0}(u),u\in\mathbb{R}^{2}$ , and let $M$ be the corresponding random intensity measure, that is, $M(A)=\int_{A}\Lambda(u)\,\mathrm{d}u$ for any Borel set $A\subset\mathbb{R}^{2}$ . In the third step of the estimation procedure for the particle dispersal dataset, we consider the transformed process $\varphi(X)$ . To find the distribution of $\varphi(X)$ , we compute void probabilities: For any Borel set $A\subset\mathbb{R}^{2}$ ,

[TABLE]

where $\varphi^{-1}(u)=(\delta\|u\|)^{-1}(e^{\|u\|}-1)u$ for $\varphi$ given by (14). Applying the substitution theorem, we get

[TABLE]

where $D_{\varphi^{-1}}(u)$ is the Jacobian. Hence

[TABLE]

Since the distribution of a locally finite simple point process is uniquely determined by the void probabilities, it follows that $\varphi(X)$ is a Cox process with random intensity function

[TABLE]

The argument of the exponential above is the sum of the stationary and isotropic Gaussian random field $\sigma Q$ and the deterministic trend given by the other terms, i.e., it is a Gaussian random field with non-constant mean but stationary and isotropic covariance structure. Hence $\varphi(X)$ is a second-order intensity-reweighted stationary LGCP (Baddeley et al., 2000) with a tractable form of the intensity function and the pair-correlation function.

Appendix C Comparison of the 2- and 3-steps approaches

Table 6 provides a comparison between the two step procedure from Section 2.1 and our three step procedure from Section 3.1 in the context of the simulation study performed for the particle dispersal application in Section 5.3. The comparison is based on the root mean squared logarithmic error (RMSLE): For a series of estimates $\{\hat{\theta}_{i}:i=1,\ldots,n\}$ of $\theta$ , the RMSLE is equal to $\sqrt{(1/n)\sum_{i=1}^{n}(\log\hat{\theta}_{i}-\log\theta)^{2}}$ , and as in Figure 13, in order to obtain robust assessments, each RMSLE was computed by removing the lower and upper 5% estimates. Here, the logarithmic transformation is used to account for the asymmetric distribution of the estimates of $\sigma$ , $1/\alpha$ , and $\delta$ .

In both procedures, the first step is the same for estimating the first-order parameters $\beta_{0}$ and $\beta_{1}$ (hence, the RMSLE ratio is equal to 1). For estimation of the second-order parameters $\sigma$ , $\alpha$ , and $\delta$ , we notice that the three-steps procedure is more accurate for $\sigma$ , overall for $1/\alpha$ , and to some extent also for $\delta$ .

Acknowledgements

This work was supported by The Danish Council for Independent Research – Natural Sciences, grant DFF – 701400074 ‘Statistics for point processes in space and beyond’, by the ‘Centre for Stochastic Geometry and Advanced Bioimaging’, funded by grant 8721 from the Villum Foundation, by Grant Agency of Czech Republic (Project No. 19-04412S).

Bibliography43

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Austerlitz et al. (2004) Austerlitz, F., Dick, C. W., Dutech, C., Klein, E. K., Oddou-Muratorio, S., Smouse, P. E. & Sork, V. L. (2004). Using genetic markers to estimate the pollen dispersal curve. Molecular Ecology 13 , 937–954.
2Baddeley et al. (2000) Baddeley, A. J., Møller, J. & Waagepetersen, R. (2000). Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Statistica Neerlandica 54 , 329–350.
3Chilès and Delfiner (1999) Chilès, J.-P. & Delfiner, P. (1999). Geostatistics. Modeling Spatial Uncertainty . Wiley, New York.
4Coeurjolly et al. (2017) Coeurjolly, J.-F., Møller, J. & Waagepetersen, R. (2017). Palm distributions for log Gaussian Cox processes. Scandinavian Journal of Statistics 44 , 192–203.
5Cox (1955) Cox, D. R. (1955). Some statistical models related with series of events. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 17 , 129–164.
6Cressie (1993) Cressie, N. A. C. (1993). Statistics for Spatial Data . Wiley Series in Probability and Mathematical Statistics, Wiley, revised edition.
7Diggle (2013) Diggle, P. J. (2013). Statistical Analysis of Spatial and Spatio-temporal Point Patterns . Chapman and Hall/CRC, Boca Raton, 3rd edition.
8Efron and Tibshirani (1993) Efron, B. & Tibshirani, R. J. (1993). An Introduction to the Bootstrap . Chapman & Hall, New York.

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Quick inference for log Gaussian Cox processes with non-stationary underlying random fields

1 Introduction

1.1 Motivation and objective

1.2 Spatial Cox processes

1.3 Spatial Cox processes with non-stationary underlying random fields

1.4 Outline and software

2 Background

2.1 Composite likelihood estimation

2.2 Pros of using log Gaussian Cox processes

3 Parameter estimation and model check for Cox processes driven by non-stationary random fields

3.1 A three step composite likelihood based estimation approach

3.2 Test based on global envelopes of the inhomogeneous JJJ-function

4 Fish aggregation

4.1 Data and model

4.2 Implementation of the three step procedure

4.3 Results

5 Particle dispersal

5.1 Data and model

5.2 Implementation of the three step procedure

5.3 Results

6 Concluding remarks

Appendix A Small-scale variations in particle dispersal

Appendix B Properties of φ(X)\varphi(X)φ(X)

Appendix C Comparison of the 2- and 3-steps approaches

Acknowledgements

3.2 Test based on global envelopes of the inhomogeneous $J$ -function

Appendix B Properties of $\varphi(X)$