stopp: Methods for spatio-temporal point pattern analysis, simulation,   model fitting, diagnostics, and local analyses

Nicoletta D'Angelo; Giada Adelfio

arXiv:2302.13684·stat.ME·February 28, 2023

stopp: Methods for spatio-temporal point pattern analysis, simulation, model fitting, diagnostics, and local analyses

Nicoletta D'Angelo, Giada Adelfio

PDF

Open Access

TL;DR

The stopp R package provides comprehensive tools for analyzing, modeling, and simulating spatio-temporal point processes on Euclidean spaces and networks, focusing on local characteristics and integrating recent methodological advances.

Contribution

It consolidates various state-of-the-art methods for spatio-temporal point process analysis into a single, extensible R package, facilitating broader application and further development.

Findings

01

Includes functions for summarizing and plotting point processes.

02

Supports modeling, inference, and simulation on networks and Euclidean space.

03

Focuses on local characteristics of spatio-temporal point processes.

Abstract

The stopp R package deals with spatio-temporal point processes which might have occurred on the Euclidean space or on some specific linear networks such as roads of a city. The package contains functions to summarize, plot, and perform different kinds of analyses on point processes, mainly following the methods proposed in some recent papers in the stream of scientific literature. The main topics of such works, and of the package in turn, include modeling, statistical inference, and simulation issues on spatio-temporal point processes on Euclidean space and linear networks, with a focus on their local characteristics. We contribute to the existing literature by collecting many of the most widespread methods for the analysis of spatio-temporal point processes into a unique package, which is intended to welcome many further proposals and extensions.

Equations82

b ((u, t), r, h) = {(v, s) : ∣∣ u - v ∣∣ \leq r, ∣ t - s ∣ \leq h}, (u, t), (v, s) \in W \times T,

b ((u, t), r, h) = {(v, s) : ∣∣ u - v ∣∣ \leq r, ∣ t - s ∣ \leq h}, (u, t), (v, s) \in W \times T,

\mathbb{E}\Bigg{[}\sum_{\zeta_{1},\dots,\zeta_{k}\in X}^{\neq}f(\zeta_{1},\dots,\zeta_{k})\Bigg{]}=\int_{\mathbb{R}^{2}\times\mathbb{R}}\dots\int_{\mathbb{R}^{2}\times\mathbb{R}}f(\zeta_{1},\dots,\zeta_{k})\lambda^{(k)}(\zeta_{1},\dots,\zeta_{k})\prod_{i=1}^{k}\text{d}\zeta_{i},

\mathbb{E}\Bigg{[}\sum_{\zeta_{1},\dots,\zeta_{k}\in X}^{\neq}f(\zeta_{1},\dots,\zeta_{k})\Bigg{]}=\int_{\mathbb{R}^{2}\times\mathbb{R}}\dots\int_{\mathbb{R}^{2}\times\mathbb{R}}f(\zeta_{1},\dots,\zeta_{k})\lambda^{(k)}(\zeta_{1},\dots,\zeta_{k})\prod_{i=1}^{k}\text{d}\zeta_{i},

λ (u, t) = ∣ d u \times d t ∣ \to 0 lim \frac{E [ N ( d u \times d t )]}{∣ d u \times d t ∣},

λ (u, t) = ∣ d u \times d t ∣ \to 0 lim \frac{E [ N ( d u \times d t )]}{∣ d u \times d t ∣},

λ^{(2)} ((u, t), (v, s)) = ∣ d u \times d t ∣, ∣ d v \times d s ∣ \to 0 lim \frac{E [ N ( d u \times d t ) N ( d v \times d s )]}{∣ d u \times d t ∣∣ d v \times d s ∣} .

λ^{(2)} ((u, t), (v, s)) = ∣ d u \times d t ∣, ∣ d v \times d s ∣ \to 0 lim \frac{E [ N ( d u \times d t ) N ( d v \times d s )]}{∣ d u \times d t ∣∣ d v \times d s ∣} .

K (r, h) = 2 π \int_{- r}^{r} \int_{0}^{h} g (r^{'}, h^{'}) r^{'} d r^{'} d h^{'}

K (r, h) = 2 π \int_{- r}^{r} \int_{0}^{h} g (r^{'}, h^{'}) r^{'} d r^{'} d h^{'}

\hat{K} (r, h) = \frac{1}{∣ W ∣∣ T ∣} i = 1 \sum n j > i \sum I (∣∣ u_{i} - u_{j} ∣∣ \leq r, ∣ t_{i} - t_{j} ∣ \leq h) .

\hat{K} (r, h) = \frac{1}{∣ W ∣∣ T ∣} i = 1 \sum n j > i \sum I (∣∣ u_{i} - u_{j} ∣∣ \leq r, ∣ t_{i} - t_{j} ∣ \leq h) .

\mathbb{E}\Bigg{[}\sum_{\zeta_{1},\dots,\zeta_{k}\in X}^{\neq}f(\zeta_{1},\dots,\zeta_{k})\Bigg{]}=\int_{L^{k}}f(\zeta_{1},\dots,\zeta_{k})\lambda^{(k)}(\zeta_{1},\dots,\zeta_{k})\prod_{i=1}^{k}\text{d}\zeta_{i}.

\mathbb{E}\Bigg{[}\sum_{\zeta_{1},\dots,\zeta_{k}\in X}^{\neq}f(\zeta_{1},\dots,\zeta_{k})\Bigg{]}=\int_{L^{k}}f(\zeta_{1},\dots,\zeta_{k})\lambda^{(k)}(\zeta_{1},\dots,\zeta_{k})\prod_{i=1}^{k}\text{d}\zeta_{i}.

\mathbb{E}\Bigg{[}\sum_{(\textbf{u},t),(\textbf{v},s)\in X}^{\neq}f\big{(}(\textbf{u},t),(\textbf{v},s)\big{)}\Bigg{]}=\int_{L\times T}\int_{L\times T}f\big{(}(\textbf{u},t),(\textbf{v},s)\big{)}\lambda^{(2)}\big{(}(\textbf{u},t),(\textbf{v},s)\big{)}\text{d}_{2}(\textbf{u},t)\text{d}_{2}(\textbf{v},s).

\mathbb{E}\Bigg{[}\sum_{(\textbf{u},t),(\textbf{v},s)\in X}^{\neq}f\big{(}(\textbf{u},t),(\textbf{v},s)\big{)}\Bigg{]}=\int_{L\times T}\int_{L\times T}f\big{(}(\textbf{u},t),(\textbf{v},s)\big{)}\lambda^{(2)}\big{(}(\textbf{u},t),(\textbf{v},s)\big{)}\text{d}_{2}(\textbf{u},t)\text{d}_{2}(\textbf{v},s).

E [N (X \cap A) N (X \cap B)] = \int_{A} \int_{B} λ^{(2)} ((u, t), (v, s)) d_{2} (u, t) d_{2} (v, s), A, B \subseteq L \times T .

E [N (X \cap A) N (X \cap B)] = \int_{A} \int_{B} λ^{(2)} ((u, t), (v, s)) d_{2} (u, t) d_{2} (v, s), A, B \subseteq L \times T .

\int_{L \times T} f (u, t) d_{2} (u, t) = \int_{0}^{\infty} \int_{0}^{\infty} (u, t) \in L \times T : d_{L} (u, v) = r, ∣ t - s ∣ = h \sum f (u, t) d r d h .

\int_{L \times T} f (u, t) d_{2} (u, t) = \int_{0}^{\infty} \int_{0}^{\infty} (u, t) \in L \times T : d_{L} (u, v) = r, ∣ t - s ∣ = h \sum f (u, t) d r d h .

\int_{L \times T} η (d_{L} (u, v), ∣ t - s ∣) d_{2} (u, t) = \int_{0}^{\infty} \int_{0}^{\infty} η (r, h) M ((u, t), r, h) d r d h

\int_{L \times T} η (d_{L} (u, v), ∣ t - s ∣) d_{2} (u, t) = \int_{0}^{\infty} \int_{0}^{\infty} η (r, h) M ((u, t), r, h) d r d h

\hat{λ}_{ϵ, δ}^{(2)} (r, h) = \frac{1}{n - 1} i = 1 \sum n \hat{λ}_{ϵ, δ}^{(2) i} (r, h),

\hat{λ}_{ϵ, δ}^{(2)} (r, h) = \frac{1}{n - 1} i = 1 \sum n \hat{λ}_{ϵ, δ}^{(2) i} (r, h),

\hat{K}^{i} (r, h) = \frac{1}{λ ^ ^{2} ∣ W ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum I (∣∣ u_{i} - v ∣∣ \leq r, ∣ t_{i} - s ∣ \leq h)

\hat{K}^{i} (r, h) = \frac{1}{λ ^ ^{2} ∣ W ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum I (∣∣ u_{i} - v ∣∣ \leq r, ∣ t_{i} - s ∣ \leq h)

\hat{K}_{I}^{i} (r, h) = \frac{1}{∣ W ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{I ( ∣∣ u _{i} - v ∣∣ \leq r , ∣ t _{i} - s ∣ \leq h )}{λ ^ ( u _{i} , t _{i} ) λ ^ ( v , s )},

\hat{K}_{I}^{i} (r, h) = \frac{1}{∣ W ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{I ( ∣∣ u _{i} - v ∣∣ \leq r , ∣ t _{i} - s ∣ \leq h )}{λ ^ ( u _{i} , t _{i} ) λ ^ ( v , s )},

\hat{K}_{L, I}^{i} (r, h) = \frac{1}{∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{I { d _{L} ( u _{i} , v ) < r , ∣ t _{i} - s ∣ < h }}{λ ^ ( u _{i} , t _{i} ) λ ^ ( v , s ) M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )},

\hat{K}_{L, I}^{i} (r, h) = \frac{1}{∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{I { d _{L} ( u _{i} , v ) < r , ∣ t _{i} - s ∣ < h }}{λ ^ ( u _{i} , t _{i} ) λ ^ ( v , s ) M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )},

\overset{g}{^}_{L, I}^{i} (r, h) = \frac{1}{∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{κ ( d _{L} ( u _{i} , v ) - r ) κ ( ∣ t _{i} - s ∣ - h )}{λ ^ ( u _{i} , t _{i} ) λ ^ ( v , s ) M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )},

\overset{g}{^}_{L, I}^{i} (r, h) = \frac{1}{∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{κ ( d _{L} ( u _{i} , v ) - r ) κ ( ∣ t _{i} - s ∣ - h )}{λ ^ ( u _{i} , t _{i} ) λ ^ ( v , s ) M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )},

D (X) = \frac{n - 1}{∣ L ∣∣ T ∣} i = 1 \sum n i \neq = j \sum \frac{1}{λ ^ ( u _{i} , t _{i} ) λ ^ ( u _{j} , t _{j} )}

D (X) = \frac{n - 1}{∣ L ∣∣ T ∣} i = 1 \sum n i \neq = j \sum \frac{1}{λ ^ ( u _{i} , t _{i} ) λ ^ ( u _{j} , t _{j} )}

\hat{K}_{L}^{i} (r, h) = \frac{1}{λ ^ ^{2} ∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{I { d _{L} ( u _{i} , v ) < r , ∣ t _{i} - s ∣ < h }}{M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )},

\hat{K}_{L}^{i} (r, h) = \frac{1}{λ ^ ^{2} ∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{I { d _{L} ( u _{i} , v ) < r , ∣ t _{i} - s ∣ < h }}{M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )},

\overset{g}{^}_{L}^{i} (r, h) = \frac{1}{λ ^ ^{2} ∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{κ ( d _{L} ( u _{i} , v ) - r ) κ ( ∣ t _{i} - s ∣ - h )}{M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )} .

\overset{g}{^}_{L}^{i} (r, h) = \frac{1}{λ ^ ^{2} ∣ L ∣∣ T ∣} (u_{i}, t_{i}) \neq = (v, s) \sum \frac{κ ( d _{L} ( u _{i} , v ) - r ) κ ( ∣ t _{i} - s ∣ - h )}{M (( u _{i} , t _{i} ) , d _{L} ( u _{i} , v ) , ∣ t _{i} - s ∣ )} .

{H_{0} : H_{1} : no difference in the second-order local structure of (u, t) w.r.t {{x ∖ (u, t)} \cup z} significant difference in the second-order local structure of (u, t) w.r.t {{x ∖ (u, t)} \cup z}

{H_{0} : H_{1} : no difference in the second-order local structure of (u, t) w.r.t {{x ∖ (u, t)} \cup z} significant difference in the second-order local structure of (u, t) w.r.t {{x ∖ (u, t)} \cup z}

T^{i}=\int_{0}^{t_{0}}\int_{0}^{r_{0}}\Big{(}\hat{L}^{(i)}(r,h)-\hat{L}^{-(i)}_{H_{0}}(r,h)\Big{)}^{2}\text{d}r\text{d}h,

T^{i}=\int_{0}^{t_{0}}\int_{0}^{r_{0}}\Big{(}\hat{L}^{(i)}(r,h)-\hat{L}^{-(i)}_{H_{0}}(r,h)\Big{)}^{2}\text{d}r\text{d}h,

λ (u, t) = λ (u) λ (t)

λ (u, t) = λ (u) λ (t)

\mathbb{E}\bigg{[}\int_{W\times T}\hat{\lambda}(\textbf{u},t)d_{2}(\textbf{u},t)\bigg{]}=\int_{W\times T}\lambda(\textbf{u},t)d_{2}(\textbf{u},t)=n,

\mathbb{E}\bigg{[}\int_{W\times T}\hat{\lambda}(\textbf{u},t)d_{2}(\textbf{u},t)\bigg{]}=\int_{W\times T}\lambda(\textbf{u},t)d_{2}(\textbf{u},t)=n,

\hat{λ} (u, t) = \frac{λ ^ ( u ) λ ^ ( t )}{\int _{W \times T} λ ^ ( u , t ) d _{2} ( u , t )} .

\hat{λ} (u, t) = \frac{λ ^ ( u ) λ ^ ( t )}{\int _{W \times T} λ ^ ( u , t ) d _{2} ( u , t )} .

λ (u, t; θ), u \in W, t \in T, θ \in Θ.

λ (u, t; θ), u \in W, t \in T, θ \in Θ.

lo g L (θ) = i \sum λ (u_{i}, t_{i}; θ) - \int_{W} \int_{T} λ (u, t; θ) d t d u

lo g L (θ) = i \sum λ (u_{i}, t_{i}; θ) - \int_{W} \int_{T} λ (u, t; θ) d t d u

λ (u, t; θ) = exp (θ Z (u, t) + B (u, t)), u \in W, t \in T

λ (u, t; θ) = exp (θ Z (u, t) + B (u, t)), u \in W, t \in T

\int_{W} \int_{T} λ (u, t; θ) d t d u \approx k = 1 \sum n + m a_{k} λ (u_{k}, t_{k}; θ)

\int_{W} \int_{T} λ (u, t; θ) d t d u \approx k = 1 \sum n + m a_{k} λ (u_{k}, t_{k}; θ)

lo g L (θ) \approx i \sum lo g λ (x_{i}; θ) + j \sum (1 - λ (u_{j}, t_{j}; θ)) a_{j} = j \sum e_{j} lo g λ (u_{j}, t_{j}; θ) + (1 - λ (u_{j}, t_{j}; θ)) a_{j}

lo g L (θ) \approx i \sum lo g λ (x_{i}; θ) + j \sum (1 - λ (u_{j}, t_{j}; θ)) a_{j} = j \sum e_{j} lo g λ (u_{j}, t_{j}; θ) + (1 - λ (u_{j}, t_{j}; θ)) a_{j}

lo g L (θ) \approx j \sum a_{j} (y_{j} lo g λ (u_{j}, t_{j}; θ) - λ (u_{j}, t_{j}; θ)) + j \sum a_{j} .

lo g L (θ) \approx j \sum a_{j} (y_{j} lo g λ (u_{j}, t_{j}; θ) - λ (u_{j}, t_{j}; θ)) + j \sum a_{j} .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsPoint processes and geometric inequalities

Full text

stopp: Methods for spatio-

temporal point pattern analysis, simulation, model fitting, diagnostics, and local analyses

by Nicoletta D’Angelo and Giada Adelfio

1 Abstract

The stopp R package deals with spatio-temporal point processes which might have occurred on the Euclidean space or on some specific linear networks such as roads of a city. The package contains functions to summarize, plot, and perform different kinds of analyses on point processes, mainly following the methods proposed in some recent papers in the stream of scientific literature. The main topics of such works, and of the package in turn, include modeling, statistical inference, and simulation issues on spatio-temporal point processes on Euclidean space and linear networks, with a focus on their local characteristics. We contribute to the existing literature by collecting many of the most widespread methods for the analysis of spatio-temporal point processes into a unique package, which is intended to welcome many further proposals and extensions.

2 Introduction

Modelling real problems through space-time point processes is crucial in many scientific and engineering fields such as environmental sciences, meteorology, image analysis, seismology, astronomy, epidemiology and criminology. The growing availability of data is a challenging opportunity for the scientific research, aiming at more detailed information through the application of statistical methodologies suitable for describing complex phenomena.

The aim of the present work is to contribute to the existing literature by gathering many of the most widespread methods for the analysis of spatio-temporal point processes into a unique package, which is intended to host many further extensions. The stopp (R Core Team,, 2022) package provides codes, related to methods and models, for analysing complex spatio-temporal point processes, proposed in the papers Siino et al., 2018a ; Siino et al., 2018b ; Adelfio et al., (2020); D’Angelo et al., (2021); D’Angelo et al., 2022b ; D’Angelo et al., 2022c . The main topics include modelling, statistical inference, and simulation issues on spatial and spatio-temporal point processes, point processes on linear networks, non-Euclidean spaces. The context of application is very broad, as the proposed methods are of interest in describing any phenomenon with a complex spatio-temporal dependence. Some examples, include seismic events (D’Angelo et al., 2022e, ), GPS data (D’Angelo et al., 2022a, ), crimes (D’Angelo et al., 2022d, ), and traffic accidents. Moreover, local methods and models can be applied to different scientific fields and could be suitable for all those phenomena for which it makes sense to hypothesize interdependence in space and time.

The main dependencies of the stopp package are spatstat Baddeley and Turner, (2005), stpp Gabriel and Diggle, (2009), and stlnpp Moradi and Mateu, (2020). In the purely spatial context, spatstat is by far the most comprehensive open-source toolbox for analysing spatial point patterns, focused mainly on two-dimensional point patterns. We exploit many functions from this package when needing purely spatial tools while performing spatio-temporal analyses. Turning to the spatio-temporal context, stpp represents the main reference of statistical tools for analyzing the global and local second-order properties of spatio-temporal point processes, including estimators of the space-time inhomogeneous $K$ -function and pair correlation function. The package is documented in the paper Gabriel et al., (2013). While stpp allows for the simulation of Poisson, inhibitive and clustered patterns, the stppSim (Adepeju,, 2022) package generates artificial spatio-temporal point patterns through the integration of microsimulation and agent-based models. Moreover, splancs (Rowlingson and Diggle,, 2022) fosters many tools for the analysis of both spatial and spatio-temporal point patterns (Rowlingson and Diggle,, 1993; Bivand and Gebhardt,, 2000). Moving to spatio-temporal point patterns on linear networks, the package stlnpp provides tools to visualise and analyse such patterns using the first- and second-order summary statistics developed in Moradi and Mateu, (2020); Mateu et al., (2020). Other worth-to-mention packages dealing with spatio-temporal point pattern analysis include etasFLP Chiodi and Adelfio, (2014), mainly devoted to the estimation of the components of an ETAS (Epidemic Type Aftershock Sequence) model for earthquake description with the non-parametric background seismicity estimated through FLP (Forward Likelihood Predictive) Adelfio and Chiodi, (2020), and SAPP, the Institute of Statistical Mathematics package (Ogata et al.,, 2006; Ogata,, 2006), which provides functions for the statistical analysis of series of events and seismicity. Finally, we highlight some R packages that implement routines to simulate and fit log-Gaussian Cox processes (LGCPs). In particular, the package stpp implements code to simulate spatio-temporal LGCP with a separable and non-separable covariance structure for the Gaussian Random Field (GRF). Instead, the package lgcp Taylor et al., (2015) implements code to fit LGCP models using methods of the moments and a Bayesian inference for spatial, spatio-temporal, multivariate and aggregated point processes. Furthermore, the minimum contrast method is used to estimate parameters assuming a separable structure of the covariance of the GRF. Both packages do not handle for non-separable (and anisotropic) correlation structures of the covariance structure of the GRF.

The outline of the paper is as follows. First, we set the notation of spatio-temporal point processes, both occurring on Euclidean space and on linear networks. Then, we introduce the main functions for handling point processes objects, data, and simulations from different point process models. We then move to the Local Indicators of Spatio-Temporal Association functions, recalling their definition on the spatio-temporal Euclidean space and introducing the new functions to compute the LISTA functions on linear networks. Then, we illustrate how to perform a local test for assessing the local differences in two point patterns occurring on the same metric space. Hence, the functions available in the package for fitting models are illustrated, including separable Poisson process models on both the Euclidean space and networks, global and local non-separable inhomogeneous Poisson processes and LGCPs. Then, methods to perform global and local diagnostics on both models for point patterns on planar and linear network spaces are presented. The paper ends with some conclusions.

3 Spatio-temporal point processes and their second-order properties

We consider a spatio-temporal point process with no multiple points as a random countable subset $X$ of $\mathbb{R}^{2}\times\mathbb{R}$ , where a point $(\textbf{u},t)\in X$ corresponds to an event at $\textbf{u}\in\mathbb{R}^{2}$ occurring at time $t\in\mathbb{R}$ . A typical realisation of a spatio-temporal point process $X$ on $\mathbb{R}^{2}\times\mathbb{R}$ is a finite set $\{(\textbf{u}_{i},t_{i})\}^{n}_{i=1}$ of distinct points within a bounded spatio-temporal region $W\times T\subset\mathbb{R}^{2}\times\mathbb{R}$ , with area $|W|>0$ and length $|T|>0$ , where $n\geq 0$ is not fixed in advance. In this context, $N(A\times B)$ denotes the number of points of a set $(A\times B)\cap X$ , where $A\subseteq W$ and $B\subseteq T$ . As usual (Daley and Vere-Jones,, 2007), when $N(W\times T)<\infty$ with probability 1, which holds e.g. if $X$ is defined on a bounded set, we call $X$ a finite spatio-temporal point process.

For a given event $(\textbf{u},t)$ , the events that are close to $(\textbf{u},t)$ in both space and time, for each spatial distance $r$ and time lag $h$ , are given by the corresponding spatio-temporal cylindrical neighbourhood of the event $(\textbf{u},t)$ , which can be expressed by the Cartesian product as

[TABLE]

where $||\cdot||$ denotes the Euclidean distance in $\mathbb{R}^{2}$ . Note that $b((\textbf{u},t),r,h)$ is a cylinder with centre (u, t), radius $r$ , and height $2h$ .

Product densities $\lambda^{(k)},k\in\mathbb{N}\text{ and }k\geq 1$ , arguably the main tools in the statistical analysis of point processes, may be defined through the so-called Campbell Theorem (see Daley and Vere-Jones, (2007)), that constitutes an essential result in spatio-temporal point process theory, stating that, given a spatio-temporal point process $X$ , for any non-negative function $f$ on $(\mathbb{R}^{2}\times\mathbb{R})^{k}$

[TABLE]

where $\neq$ indicates that the sum is over distinct values. In particular, for $k=1$ and $k=2$ , these functions are respectively called the intensity function $\lambda$ and the (second-order) product density $\lambda^{(2)}$ . Broadly speaking, the intensity function describes the rate at which the events occur in the given spatio-temporal region, while the second-order product densities are used for describing spatio-temporal variability and correlations between pair of points of a pattern. They represent the point process analogues of the mean function and the covariance function of a real-valued process, respectively. Then, the first-order intensity function is defined as

[TABLE]

where $\text{d}\textbf{u}\times\text{d}t$ defines a small region around the point $(\textbf{u},t)$ and $|\text{d}\textbf{u}\times\text{d}t|$ is its volume. The second-order intensity function is given by

[TABLE]

Finally, the pair correlation function $g((\textbf{u},t),(\textbf{v},s))=\frac{\lambda^{(2)}((\textbf{u},t),(\textbf{v},s))}{\lambda(\textbf{u},t)\lambda(\textbf{v},s)}$ can be interpreted formally as the standardised probability density that an event occurs in each of two small volumes, $\text{d}\textbf{u}\times\text{d}t$ and $\text{d}\textbf{v}\times\text{d}s$ , in the sense that for a Poisson process, $g((\textbf{u},t),(\textbf{v},s))=1.$

In this package, the focus is on second-order characteristics of spatio-temporal point patterns, with an emphasis on the $K$ -function (Ripley,, 1976). This is a measure of the distribution of the inter-point distances and captures the spatio-temporal dependence of a point process. A spatio-temporal point process is second-order intensity reweighted stationary and isotropic if its intensity function is bounded away from zero and its pair correlation function depends only on the spatio-temporal difference vector $(r,h)$ , where $r=||\textbf{u}-\textbf{v}||$ and $h=|t-s|$ (Gabriel and Diggle,, 2009). For a second-order intensity reweighted stationary, isotropic spatio-temporal point process, the space-time inhomogeneous $K$ -function takes the form

[TABLE]

where $g(r,h)=\lambda^{(2)}(r,h)/(\lambda(\textbf{u},t)\lambda(\textbf{v},s)),r=||\textbf{u}-\textbf{v}||,h=|t-s|$ (Gabriel and Diggle,, 2009). The simplest expression of an estimator of the spatio-temporal $K$ -function is given as

[TABLE]

For a homogeneous Poisson process $\mathbb{E}[\hat{K}(r,h)]=\pi r^{2}h$ , regardless of the intensity $\lambda$ . The $K$ -function can be used as a measure of spatio-temporal clustering and interaction (Gabriel and Diggle,, 2009; Møller and Ghorbani,, 2012). Usually, $\hat{K}(r,h)$ is compared with the theoretical $\mathbb{E}[\hat{K}(r,h)]=\pi r^{2}h$ . Values $\hat{K}(r,h)>\pi r^{2}h$ suggest clustering, while $\hat{K}(r,h)<\pi r^{2}h$ points to a regular pattern.

Point processes on linear networks are recently considered to analyse events occurring on particular network structures such as the traffic accidents on a road network. Spatial patterns of points along a network of lines are indeed found in many applications. The network might reflect a map of railways, rivers, electrical wires, nerve fibres, airline routes, irrigation canals, geological faults or soil cracks (Baddeley et al.,, 2020). Observations of interest could be the locations of traffic accidents, bicycle incidents, vehicle thefts or street crimes, and many others. A linear network $L=\cup_{i=1}^{n}l_{i}\subset\mathbb{R}^{2}$ is commonly taken as a finite union of line segments $l_{i}\subset\mathbb{R}^{2}$ of positive length. A line segment is defined as $l_{i}=[u_{i},v_{i}]=\{ku_{i}+(1-k)v_{i}:0\leq k\leq 1\}$ , where $u_{i},v_{i}\in\mathbb{R}^{2}$ are the endpoints of $l_{i}$ . For any $i\neq j$ , the intersection of $l_{i}$ and $l_{j}$ is either empty or an endpoint of both segments.

A spatio-temporal linear network point process is a point process on the product space $L\times T$ , where $L$ is a linear network and $T$ is a subset (interval) of $\mathbb{R}$ . We hereafter focus on a spatio-temporal point process $X$ on a linear network $L$ with no overlapping points $(\textbf{u},t)$ , where $\textbf{u}\in L$ is the location of an event and $t\in T(T\subseteq\mathbb{R}^{+})$ is the corresponding time occurrence of u. Note that the temporal state-space $T$ might be either a continuous or a discrete set. A realisation of $X$ with $n$ points is represented by $\textbf{x}={(\textbf{u}_{i},t_{i}),i=1,\dots,n}$ where $(\textbf{u}_{i},t_{i})\in L\times T$ . A spatio-temporal disc with centre $(\textbf{u},t)\in L\times T$ , network radius $r>0$ and temporal radius $h>0$ is defined as $b((\textbf{u},t),r,h)=\{(\textbf{v},s):d_{L}(\textbf{u},\textbf{v})\leq r,|t-s|\leq h\},(\textbf{u},t),(\textbf{v},s)\in L\times T$ where $|\cdot|$ is a numerical distance, and $d_{L}(\cdot,\cdot)$ stands for the appropriate distance in the network, typically taken as the shortest-path distance between any two points. The cardinality of any subset $A\subseteq L\times T,N(X\cap A)\in{0,1,\dots}$ , is the number of points of $X$ restricted to $A$ , whose expected value is denoted by $\nu(A)=\mathbb{E}[N(X\cap A)],A\subseteq L\times T,$ where $\nu$ , the intensity measure of $X$ , is a locally finite product measure on $L\times T$ (Baddeley et al.,, 2006). We now recall Campbell’s theorem for point processes on linear networks (Cronie et al.,, 2020). Assuming that the product densities/intensity functions $\lambda^{(k)}$ exist, for any non-negative measurable function $f(\cdot)$ on the product space $L^{k}$ , we have

[TABLE]

Assume that $X$ has an intensity function $\lambda(\cdot,\cdot)$ , hence Equation (3) reduces to $\mathbb{E}[N(X\cap A)]=\int_{A}\nu(d(\textbf{u},t))=\int_{A}\lambda(\textbf{u},t)d_{2}(\textbf{u},t),A\subseteq L\times T,$ where $d_{2}(\textbf{u},t)$ corresponds to integration over $L\times T$ . The second-order Campbell’s theorem is obtained from (3) with $k=2$

[TABLE]

Assuming that $X$ has a second-order product density function $\lambda^{(2)}(\cdot,\cdot)$ , we then obtain

[TABLE]

Finally, an important result concerns the conversion of the integration over $L\times T$ to that over $\mathbb{R}\times\mathbb{R}$ (Rakshit et al.,, 2017). For any measurable function $f:L\times T\rightarrow\mathbb{R}$

[TABLE]

Letting $f(\textbf{u},t)=\eta(d_{L}(\textbf{u},\textbf{v}),|t-s|)$ then

[TABLE]

where $M((\textbf{u},t),r,h)$ is the number of points lying exactly at the shortest-path distance $r\geq 0$ and the time distance $h\geq 0$ away from $(\textbf{u},t)$ .

4 Main functions for handling point processes objects, data, and simulations

The stp function creates a stp object as a dataframe with three columns: x, y, and t. If the linear network L, of class linnet, is also provided, a stlp object is created instead. The methods for this class of objects: (1) print the main information on the spatio-temporal point pattern stored in the stp object: the number of points, the enclosing spatial window, the temporal time period; (2) print the summary statistics of the spatial and temporal coordinates of the spatio-temporal point pattern stored in the stp object; (3) plot the point pattern stored in the stp object given in input, in a three-panel plot representing the 3Dplot of the coordinates, and the marginal spatial and temporal coordinates.

set.seed(12345) rpp1 <- stpp::rpp(lambda = 200, replace = FALSE) is.stp(rpp1) [1] FALSE

stp1 <- stp(cbind(rpp1 $xyt[, 1], rpp1$ xyt[, 2], rpp1$xyt[, 3])) is.stp(stp1) [1] TRUE

stp1 Spatio-temporal point pattern 208 points Enclosing window: rectangle = [0.0011366, 0.9933775] x [0.0155277, 0.9960438] units Time period: [0.004, 0.997]

Some functions are implemented to convert the stp and stlp classes to those of the stpp and stlnpp packages, and vice-versa.

Moreover, the package is furnished with the greececatalog dataset in the stp format containing the catalog of Greek earthquakes of magnitude at least 4.0 from year 2005 to year 2014, analysed by mean of local log-Gaussian Cox processes in D’Angelo et al., 2022e and D’Angelo et al., 2022c . Data come from the Hellenic Unified Seismic Network (H.U.S.N.). The same data have been analysed in Siino et al., (2017) by hybrids of Gibbs models, and more recently by Gabriel et al., (2022).

plot(greececatalog, tcum = TRUE)

A dataset of crimes occurred in Valencia, Spain, in 2019 is available, together with the linear network of class linnet of the Valencian roads, named valenciacrimes and valencianet, respectively.

Finally, the linear network of class linnet of the roads of Chicago (Illinois, USA) close to the University of Chicago, is also available. It represents the linear network of the Chicago dataset published and analysed in Ang et al., (2012). The network adjacency matrix is stored as a sparse matrix.

Moving to simulations, the rstpp function creates a stp object, simulating a spatio-temporal Poisson point pattern, following either a homogeneous or inhomogeneous intensity.

h1 <- rstpp(lambda = 500, nsim = 1, seed = 2, verbose = TRUE) plot(h1, tcum = TRUE)

inh <- rstpp(lambda = function(x, y, t, a) {exp(a[1] + a[2]*x)}, par = c(2, 6), nsim = 1, seed = 2, verbose = TRUE) plot(inh, tcum = TRUE)

The rstlpp function creates a stlp object instead, simulating a spatio-temporal Poisson point pattern on a linear network. Furthermore, the rETASp function creates a stp object, simulating a spatio-temporal ETAS (Epidemic Type Aftershock Sequence) process. It follows the generating scheme for simulating a pattern from an ETAS process (Ogata and Katsura,, 1988) with conditional intensity function (CIF) as in Adelfio and Chiodi, (2020). The rETAStlp function creates a stlp object, simulating a spatio-temporal ETAS process on a linear network. The simulation scheme previously introduced is adapted for the space location of events to be constrained on a linear network, being firstly introduced and employed for simulation studies in D’Angelo et al., (2021).

5 Local Indicators of Spatio-Temporal Association functions

Local Indicators of Spatio-Temporal Association (LISTA) are a set of functions that are individually associated with each one of the points of the point pattern, and can provide information about the local behaviour of the pattern. This operational definition of local indicators was introduced by Anselin, (1995) for the spatial case, and extended by Siino et al., 2018b to the spatio-temporal context.

If $\lambda^{(2)i}(\cdot,\cdot)$ denotes the local version of the spatio-temporal product density for the event $(\textbf{u}_{i},t_{i})$ , then, for fixed $r$ and $h$ , it holds that

[TABLE]

where $\hat{\lambda}^{(2)i}_{\epsilon,\delta}(r,h)=\frac{n-1}{4\pi r|W\times T|}\sum_{j\neq i}\kappa_{\epsilon,\delta}(||\textbf{u}_{i}-\textbf{v}_{j}||-r,|t_{i}-s_{j}|-h),$ with $r>\epsilon>0$ and $h>\delta>0$ , and $\kappa$ a kernel function with spatial and temporal bandwidths $\epsilon$ and $\delta$ , respectively. Any second-order spatio-temporal summary statistic that satisfies the operational definition in (6), which means that the sum of spatio-temporal local indicator functions is proportional to the global statistic, can be called a LISTA statistic (Siino et al., 2018b, ).

In Adelfio et al., (2020), local versions of both the homogeneous and inhomogeneous spatio-temporal $K$ -functions on the Euclidean space are introduced. Defining an estimator of the overall intensity by $\hat{\lambda}=n/(|W||T|)$ , they propose the local version of (2) for the i-th event $(\textbf{u}_{i},t_{i})$

[TABLE]

and the inhomogeneous version

[TABLE]

with $(\textbf{v},s)$ being the spatial and temporal coordinates of any other point. The authors extended the spatial weighting approach of Veen and Schoenberg, (2006) to spatio-temporal local second-order statistics, proving that the inhomogeneous second-order statistics behave as the corresponding homogeneous ones, basically proving that the expectation of both (7) and (8) is equal to $\pi r^{2}h$ .

5.1 LISTA on linear networks

The functions localSTLKinhom and localSTLginhom implement the inhomogeneous LISTA functions proposed in D’Angelo et al., 2022b . The local spatio-temporal inhomogeneous K-function for the i-th event $(\boldsymbol{u}_{i},t_{i})$ on a linear network is

[TABLE]

and the corresponding local pair correlation function (pcf)

[TABLE]

with

[TABLE]

normalization factor. This leads to the unbiased estimators $\frac{1}{D(X)}\hat{K}^{i}_{L,I}(r,h)$ and $\frac{1}{D(X)}\hat{g}^{i}_{L,I}(r,h)$ .

The homogeneous versions (D’Angelo et al.,, 2021) can be obtained by weighting the second-order summary statistics (either K or pcf) by a constant intensity $\hat{\lambda}=n/(|L||T|)$ , giving

[TABLE]

and

[TABLE]

These can be computed easily with the functions localSTLKinhom and localSTLKginhom, by imputing a lambda vector of constant intensity values, the same for each point.

The proposed functions are the local counterparts of STLKinhom and STLKginhom by Moradi and Mateu, (2020), available in the stlnpp package (Moradi et al.,, 2020).

set.seed(10) X <- stlnpp::rpoistlpp(.2, a = 0, b = 5, L = stlnpp::easynet) lambda <- density(X, at = "points") x <- as.stlp(X) k <- localSTLKinhom(x, lambda = lambda, normalize = TRUE)

select an individual point

j = 1 k[[j]]

plot the lista function and compare it with its theoretical value

inhom <- list(x = k[[j]] $r, y = k[[j]]$ t, z = k[[j]] $Kinhom) theo <- list(x = k[[j]]$ r, y = k[[j]] $t, z = k[[j]]$ Ktheo) diff <- list(x = k[[j]] $r, y = k[[j]]$ t, z = k[[j]] $Kinhom - k[[j]]$ Ktheo) oldpar <- par(no.readonly = TRUE) par(mfrow = c(1, 3)) fields::image.plot(inhom, main= "Kinhom", col = hcl.colors(12, "YlOrRd", rev = FALSE), xlab = "Spatial distance", ylab = "Temporal distance") fields::image.plot(theo, main = "Ktheo", col = hcl.colors(12, "YlOrRd", rev = FALSE), xlab = "Spatial distance", ylab = "Temporal distance") fields::image.plot(diff, main = "Kinhom - Ktheo", col = hcl.colors(12, "YlOrRd", rev = FALSE), xlab = "Spatial distance", ylab = "Temporal distance") par(oldpar)

5.2 Local test for assessing the second-order differences between of two point patterns

The function localtest performs the permutation test of the local structure of spatio-temporal point pattern data, proposed in Siino et al., 2018b . The network counterpart is also implemented, following D’Angelo et al., (2021). This test detects local differences in the second-order structure of two observed point patterns x and z occurring on the same space-time region. This procedure was firstly introduced in Moraga and Montes, (2011) for the purely spatial case, and then extended in the spatio-temporal context by Siino et al., 2018b . Finally, test has been made suitable also for spatio-temporal point patterns with spatial domain coinciding with a linear network by D’Angelo et al., (2021). In general, for each point $(\textbf{u},t)$ in the spatio-temporal observed point pattern x, we test

[TABLE]

The sketch of the test is as follows:

Set $k$ as the number of permutations 2. 2.

For each point $(\textbf{u}_{i},t_{i})\in\textbf{x},i=1,\ldots,n$ :

•

Estimate the LISTA function $\hat{L}^{(i)}(r,h)$

•

Compute the local deviation test

[TABLE]

where $\hat{L}^{-(i)}_{H_{0}}(r,h)$ is the LISTA function for the $i^{th}$ point, averaged over the $j=1,\dots,k$ permutations

•

Compute a $p$ -value as $p^{i}=\sum_{j=1}^{k}\textbf{1}(T^{i,j}_{H_{0}}\geq T^{i})/k$

The test ends providing a vector $p$ of $p$ - values, one for each point in x.

If the test is performed for spatio-temporal point patterns as in Siino et al., 2018b , that is, on an object of class stp, the LISTA functions $\hat{L}^{(i)}$ employed are the local $K$ -functions of Adelfio et al., (2020), computed by the function KLISTAhat of the stpp package (Gabriel et al.,, 2013) . If the function is applied to a stlp object, that is, on two spatio-temporal point patterns observed on the same linear network L, the local $K$ -functions used are the ones proposed in D’Angelo et al., (2021), documented in localSTLKinhom. Details on the performance of the test are found in Siino et al., 2018b and D’Angelo et al., (2021) for Euclidean and network spaces, respectively. Alternative LISTA functions that can be employed to run the test are LISTAhat of stpp and localSTLginhom of stopp, that is, the pcfs on Euclidean space and linear networks respectively.

The methods for this class of objects: (1) print the main information on the result of the local permutation test performed with localtest on either a stp or stlp object: whether the local test was run on point patterns lying on a linear network or not; the number of points in the background X and alternative Z patterns; the number of points in X which exhibit local differences in the second-order structure with respect to Z, according to the performed test; (2) plot the result of the local permutation test performed with localtest: it highlights the points of the background pattern X, which exhibit local differences in the second-order structure with respect to Z, according to the previously performed test. The remaining points of X are also represented; it also shows the underlying linear network, if the local test has been applied to point patterns occurring on the same linear network, that is, if localtest has been applied to a stlp object. In the following, we provide an example of two point processes, both occurring on the unit cube.

background pattern

set.seed(12345) X <- rstpp(lambda = function(x, y, t, a) {exp(a[1] + a[2]*x)}, par = c(.05, 4), nsim = 1, seed = 2, verbose = TRUE)

alternative pattern

set.seed(12345) Z <- rstpp(lambda = 25, nsim = 1, seed = 2, verbose = TRUE)

run the local test

test <- localtest(X, Z, method = "K", k = 9, verbose = FALSE)

test

Test for local differences between two spatio-temporal point patterns

Background pattern X: 17 Alternative pattern Z: 20

1 significant points at alpha = 0.05

plot(test)

6 Model fitting

The description of the observed point pattern intensity is a crucial issue dealing with spatio-temporal point pattern data, and specifying a statistical model is a very effective way compared to analyzing data by calculating summary statistics. Formulating and adapting a statistical model to the data allows taking into account effects that otherwise could introduce distortion in the analysis (Baddeley et al.,, 2015). In this section, we outline the main functions to fit different specifications of inhomogeneous spatio-temporal Poisson process models.

6.0.1 Spatio-temporal Poisson point processes with separable intensity

When dealing with intensity estimation for spatio-temporal point processes, it is quite common to assume that the intensity function $\lambda(\textbf{u},t)$ is separable (Diggle,, 2013; Gabriel and Diggle,, 2009). Under this assumption, the intensity function is given by the product

[TABLE]

where ${\lambda}(\textbf{u})$ and ${\lambda}(t)$ are non-negative functions on $W$ and $T$ , respectively (Gabriel and Diggle,, 2009). Under this assumption, any non-separable effects are interpreted as second-order, rather than first-order. Suitable estimates of $\lambda(\textbf{u})$ and $\lambda(t)$ in (9) depend on the characteristics of each application. The functions here implemented use a combination of a parametric spatial point pattern model, potentially depending on the spatial coordinates and/or spatial covariates, and a parametric log-linear model for the temporal component. Also, non-parametric kernel estimate form(s) are legit but still not implemented. The spatio-temporal intensity is therefore obtained by multiplying the purely spatial and purely temporal intensities, previously fitted separately. The resulting intensity is normalised, to make the estimator unbiased, making the expected number of points

[TABLE]

and the final intensity function is obtained as

[TABLE]

The function sepstppm fits such a separable spatio-temporal Poisson process model. The function plot.sepstppm shows the fitted intensity, displayed both in space and in space and time.

df1 <- valenciacrimes[valenciacrimes $x < 210000 & valenciacrimes$ x > 206000 & valenciacrimes $y < 4377000 & valenciacrimes$ y > 4373000, ]

mod1 <- sepstppm(df1, spaceformula = ~x * y, timeformula = ~ crime_hour + week_day)

For linear network point patterns, non-parametric estimators of the intensity function $\lambda(\cdot,\cdot)$ have been proposed (Mateu et al.,, 2020), suggesting any variation of the distribution of the process over its state-space $L\times T$ . A kernel-based intensity estimator for spatio-temporal linear network point processes, based on the first-order separability assumption, considered in Moradi and Mateu, (2020), is obtainable with the package stnlpp. The functions sepstlppm and plot.sepstlppm implement the network counterparts of the spatio-temporal Poisson point process with separable intensity and fully parametric specification.

mod1 <- sepstlppm(valenciacrimes[1:2500, ], spaceformula = ~x, timeformula = ~ crime_hour + week_day, L = valencianet)

6.0.2 Global inhomogeneous spatio-temporal Poisson processes trough quadrature scheme

For a non-separable spatio-temporal specification, we assume that the template model is a Poisson process, with a parametric intensity or rate function

[TABLE]

The log-likelihood of the template model is

[TABLE]

up to an additive constant, where the sum is over all points $\textbf{u}_{i}$ in the spatio-temporal point process $X$ . We might consider intensity models of log-linear form

[TABLE]

where $Z(\textbf{u},t)$ is a vector-valued covariate function, and $B(\textbf{u},t)$ is a scalar offset. In point process theory, the variables $Z(\textbf{u},t)$ are referred to as spatio-temporal covariates. Their observable values are assumed to be knowable, at least in principle, at each location in the spatio-temporal window. For inferential purposes, their values must be known at each point of the data point pattern and at least at some other locations. This is the reason why we first implmented the dependence of the intensity function $\lambda(\textbf{u},t;\theta)$ on the space and time coordinates first.

The stppm function fits a Poisson process model to an observed spatio-temporal point pattern stored in a stp object, assuming the template model (10).

Estimation is performed by fitting a glm using a spatio-temporal version of the quadrature scheme by Berman and Turner, (1992). We use a finite quadrature approximation to the log-likelihood. Renaming the data points as $\textbf{x}_{1},\dots,\textbf{x}_{n}$ with $(\textbf{u}_{i},t_{i})=\textbf{x}_{i}$ for $i=1,\dots,n$ , then generate $m$ additional ’dummy points’ $(\textbf{u}_{n+1},t_{n+1})\dots,(\textbf{u}_{m+n},t_{m+n})$ to form a set of $n+m$ quadrature points (where $m>n$ ). Then we determine quadrature weights $a_{1},\dots,a_{m}$ so that a Riemann sum can approximate integrals in the log-likelihood

[TABLE]

where $a_{k}$ are the quadrature weights such that $\sum_{k=1}^{n+m}a_{k}=l(W\times T)$ where $l$ is the Lebesgue measure. Then the log-likelihood of the template model can be approximated by

[TABLE]

where $e_{j}=1\{j\leq n\}$ is the indicator that equals $1$ if $u_{j}$ is a data point. Writing $y_{j}=e_{j}/a_{j}$ this becomes

[TABLE]

Apart from the constant $\sum_{j}a_{j}$ , this expression is formally equivalent to the weighted log-likelihood of a Poisson regression model with responses $y_{j}$ and means $\lambda(\textbf{u}_{j},t_{j};\theta)=\exp(\theta Z(\textbf{u}_{j},t_{j})+B(\textbf{u}_{j},t_{j}))$ . This is maximised by this function by using standard GLM software. In detail, we define the spatio-temporal quadrature scheme by considering a spatio-temporal partition of $W\times T$ into cubes $C_{k}$ of equal volume $\nu$ , assigning the weight $a_{k}=\nu/n_{k}$ to each quadrature point (dummy or data) where $n_{k}$ is the number of points that lie in the same cube as the point $u_{k}$ (Raeisi et al.,, 2021). The number of dummy points should be sufficient for an accurate estimate of the likelihood. Following Baddeley et al., (2000) and Raeisi et al., (2021), we start with a number of dummy points $m\approx 4n$ , increasing it until $\sum_{k}a_{k}=l(W\times T)$ .

The AIC.stppm and BIC.stppm functions return the $AIC=2k-2\log(\hat{L})$ and $BIC=k\log{n}-2\log(\hat{L})$ of a point process model fitted through the function stppm applied to an observed spatio-temporal point pattern of class stp. As the model returned by stppm is fitted through a quadrature scheme, the log-likelihood is computed through the quantity

[TABLE]

Homogeneous

set.seed(2) ph <- rstpp(lambda = 200, nsim = 1, seed = 2, verbose = TRUE) hom1 <- stppm(ph, formula = ~ 1)

hom1 Homogeneous Poisson process with Intensity: 202.093

Estimated coefficients: (Intercept) 5.309

plot(hom1) won’t show any plot, due to the constant intensity

coef(hom1) (Intercept) 5.308728

Inhomogeneous

set.seed(2) pin <- rstpp(lambda = function(x, y, t, a) {exp(a[1] + a[2]*x)}, par = c(2, 6), nsim = 1, seed = 2, verbose = TRUE)

inh1 <- stppm(pin, formula = ~ x)

inh1 Inhomogeneous Poisson process with Trend: ~x

Estimated coefficients: (Intercept) x 2.180 5.783

plot(inh1)

6.0.3 Local inhomogeneous spatio-temporal Poisson processes trough local log-likelihood

The locstppm function fits a Poisson process model to an observed spatio-temporal point pattern stored in a stp object, that is, a Poisson model with a set of parameters $\theta_{i}$ for each point $i$ . We assume that the template model is a Poisson process, with a parametric intensity or rate function $\lambda(\textbf{u},t;\theta_{i})$ with space and time localtions $\textbf{u}\in W,t\in T$ and parameters $\theta_{i}\in\Theta.$ Estimation is performed through the fitting of a glm using a localised version of the quadrature scheme by Berman and Turner, (1992), firstly introduced in the purely spatial context by (Baddeley,, 2017), and in the spatio-temporal framework by D’Angelo et al., 2022c .

The local log-likelihood associated with the spatio-temporal location $(\textbf{v},s)$ is given by

[TABLE]

where $w_{\sigma_{s}}$ and $w_{\sigma_{t}}$ are weight functions, and $\sigma_{s},\sigma_{t}>0$ are the smoothing bandwidths. It is not necessary to assume that $w_{\sigma_{s}}$ and $w_{\sigma_{t}}$ are probability densities. For simplicity, we shall consider only kernels of fixed bandwidth, even though spatially adaptive kernels could also be used. Note that if the template model is the homogeneous Poisson process with intensity $\lambda$ , then the local likelihood estimate $\hat{\lambda}(\textbf{v},s)$ reduces to the kernel estimator of the point process intensity with kernel proportional to $w_{\sigma_{s}}w_{\sigma_{t}}$ . We now use an approximation similar to $\log L(\theta)\approx\sum_{j}a_{j}(y_{j}\log\lambda(\textbf{u}_{j},t_{j};\theta)-\lambda(\textbf{u}_{j},t_{j};\theta))+\sum_{j}a_{j},$ but for the local log-likelihood associated with each desired location $(\textbf{v},s)\in W\times T$ , that is:

[TABLE]

where $w_{j}(\textbf{v},s)=w_{\sigma_{s}}(\textbf{v}-\textbf{u}_{j})w_{\sigma_{t}}(s-t_{j})$ . Basically, for each desired location $(\textbf{v},s)$ , we replace the vector of quadrature weights $a_{j}$ by $a_{j}(\textbf{v},s)=w_{j}(\textbf{v},s)a_{j}$ where $w_{j}(\textbf{v},s)=w_{\sigma_{s}}(\textbf{v}-\textbf{u}_{j})w_{\sigma_{t}}(s-t_{j})$ , and use the GLM software to fit the Poisson regression. The local likelihood is defined at any location $(\textbf{v},s)$ in continuous space. In practice, it is sufficient to consider a grid of points $(\textbf{v},s)$ . We refer to D’Angelo et al., 2022c for further discussion on bandwidth selection and on computational costs.

inh00_local <- locstppm(pin, formula = ~ 1)

inh00_local

Homogeneous Poisson process with median Intensity: 7.564067

Summary of estimated coefficients V1 Min. :3.981 1st Qu.:7.291 Median :7.564 Mean :7.316 3rd Qu.:7.669 Max. :7.854

inh01_local <- locstppm(pin, formula = ~ x)

inh01_local

Inhomogeneous Poisson process with Trend: ~x

Summary of estimated coefficients V1 V2 Min. :1.282 Min. :0.7667 1st Qu.:2.634 1st Qu.:4.5470 Median :3.059 Median :5.0662 Mean :3.082 Mean :5.0373 3rd Qu.:3.528 3rd Qu.:5.5636 Max. :4.709 Max. :6.9729

6.0.4 Log-Gaussian Cox processes estimation trough (locally weighted) joint minimum contrast

In the Euclidean context, LGCPs are one of the most prominent clustering models. By specifying the intensity of the process and the moments of the underlying GRF, it is possible to estimate both the first and second-order characteristics of the process. Following the inhomogeneous specification in Diggle et al., (2013), a LGCP for a generic point in space and time has the intensity

[TABLE]

where $S$ is a Gaussian process with $\mathbb{E}(S(\textbf{u},t))=\mu=-0.5\sigma^{2}$ and so $\mathbb{E}(\exp{S(\textbf{u},t)})=1$ and with variance and covariance matrix $\mathbb{C}(S(\textbf{u}_{i},t_{i}),S(\textbf{u}_{j},t_{j}))=\sigma^{2}\gamma(r,h)$ under the stationary assumption, with $\gamma(\cdot)$ the correlation function of the GRF, and $r$ and $h$ some spatial and temporal distances. Following Møller et al., (1998), the first-order product density and the pair correlation function of an LGCP are $\mathbb{E}(\Lambda(\textbf{u},t))=\lambda(\textbf{u},t)$ and $g(r,h)=\exp(\sigma^{2}\gamma(r,h))$ , respectively.

The stlgcppm function estimates a local log-Gaussian Cox process (LGCP), following the locally weighted minimum contrast procedure introduced in D’Angelo et al., 2022c . Three covariances are available: separable exponential, Gneiting, and DeIaco-Cesare. If both the first and second arguments are set to global, a log-Gaussian Cox process is fitted by means of the joint minimum contrast procedure proposed in Siino et al., 2018a .

We may consider a separable structure for the covariance function of the GRF (Brix and Diggle,, 2001) that has exponential form for both the spatial and the temporal components,

[TABLE]

where $\sigma^{2}$ is the variance, $\alpha$ is the scale parameter for the spatial distance and $\beta$ is the scale parameter for the temporal one. The exponential form is widely used in this context and nicely reflects the decaying correlation structure with distance or time.

Moreover, we may consider a non-separable covariance of the GRF useful to describe more general situations. Following the parametrisation in Schlather et al., (2015), Gneiting covariance function (Gneiting et al.,, 2006) can be written as

[TABLE]

where $\varphi(\cdot)$ is a complete monotone function associated to the spatial structure, and $\psi(\cdot)$ is a positive function with a completely monotone derivative associated to the temporal structure of the data. For example, the choice $d=2$ , $\varphi(r)=\sigma^{2}\exp(-(\frac{r}{\alpha})^{\gamma_{s}})$ and $\psi(h)=((\frac{h}{\beta})^{\gamma_{t}}+1)^{\delta/\gamma_{t}}$ yields to the parametric family

[TABLE]

where $\alpha>0$ and $\beta>0$ are scale parameters of space and time, $\delta$ takes values in $(0,2]$ , and $\sigma^{2}$ is the variance.

Another parametric covariance implemented belongs to the Iaco-Cesare family (De Cesare et al.,, 2002; De Iaco et al.,, 2002), and there is a wealth of covariance families that could well be used for our purposes.

Following Siino et al., 2018a , the second-order parameters $\boldsymbol{\psi}$ are found by minimising

[TABLE]

where $\phi(r,h)$ is a weight that depends on the space-time distance and $\nu$ is a transformation function. They suggest $\phi(r,h)=1$ and $\nu$ as the identity function, while $r_{max}$ and $h_{max}$ are selected as 1/4 of the maximum observable spatial and temporal distances.

Following D’Angelo et al., 2022c , we can fit a localised version of the LGCP, that is, obtain a vector of parameters $\boldsymbol{\psi}_{i}$ for each point $i$ , by minimising

[TABLE]

is the average of the local functions $\hat{J}_{i}(r,h)$ , weighted by some point-wise kernel estimates. In particular, we consider $\hat{J}_{i}(\cdot)$ as the local spatio-temporal pair correlation function (Gabriel et al.,, 2013) documented in LISTAhat.

The print and summary functions give the main information on the fitted model. In case of local parameters (both first- and second-order), the summary function contains information on their distributions. Next, we perform and example with a complex seismic point pattern.

data("greececatalog")

If both first and second arguments are set to "global", a log-Gaussian Cox process is fitted by means of the joint minimum contrast.

lgcp1 <- stlgcppm(greececatalog, formula = ~ 1, first = "global", second = "global") lgcp1

Joint minimum contrast fit for a log-Gaussian Cox process with global first-order intensity and global second-order intensity

Homogeneous Poisson process with Intensity: 0.00643

Estimated coefficients of the first-order intensity: (Intercept) -5.046

Covariance function: separable

Estimated coefficients of the second-order intensity: sigma alpha beta 6.989 0.225 156.353

Model fitted in 1.014 minutes

If first = "local", local parameters for the first-order intensity are provided. In this case, the summary function contains information on their distributions.

lgcp2 <- stlgcppm(greececatalog, formula = ~ x, first = "local", second = "global") lgcp2

Joint minimum contrast fit for a log-Gaussian Cox process with local first-order intensity and global second-order intensity

Inhomogeneous Poisson process with Trend: ~x

Summary of estimated coefficients of the first-order intensity (Intercept) x Min. :-6.400 Min. :-0.90689 1st Qu.:-2.526 1st Qu.:-0.38710 Median : 2.333 Median :-0.26876 Mean : 2.153 Mean :-0.26744 3rd Qu.: 5.070 3rd Qu.:-0.06707 Max. :16.323 Max. : 0.10822

Covariance function: separable

Estimated coefficients of the second-order intensity: sigma alpha beta 2.612 0.001 36.415

Model fitted in 3.634 minutes

The plot function shows the fitted intensity, displayed both in space (by means of a density kernel smoothing) and in space and time. In the case of local covariance parameters, the function returns the mean of the random intensity, displayed both in space (by means of a density kernel smoothing) and in space and time. The localsummary.stlgcppm function breaks up the contribution of the local estimates to the fitted intensity, by plotting the overall intensity and the density kernel smoothing of some artificial intensities obtained by imputing the quartiles of the local parameters’ distributions. Finally, the function localplot.stlgcppm function plots the local estimates. In the case of local covariance parameters, the function displays the local estimates of the chosen covariance function.

7 Diagnostics

Inhomogeneous second-order statistics can be constructed and used for assessing the goodness-of-fit of fitted first-order intensities. Nevertheless, it is a widespread practice in the statistical analysis of spatial and spatio-temporal point pattern data primarily comparing the data with a homogeneous Poisson process, which is generally the null model in applications for the fitted model. Indeed, when dealing with diagnostics in point processes, often two steps are needed: the transformation of data into residuals (thinning or rescaling (Schoenberg,, 2003)) and the use of tests to assess the consistency of the residuals with the homogeneous Poisson process (Adelfio and Schoenberg,, 2009). Usually, second-order statistics estimated for the residual process (i.e. the result of a thinning or rescaling procedure) are analysed. Essentially, to each observed point a weight inversely proportional to the conditional intensity at that point is given. This method was adopted by Veen and Schoenberg, (2006) in constructing a weighted version of the $K$ -function of Ripley and Kelly, (1977); the resulting weighted statistic is in many cases more powerful than residual methods (Veen and Schoenberg,, 2006).

The spatio-temporal inhomogeneous version of the $K$ -function in (2) is given by Gabriel and Diggle, (2009) as

[TABLE]

where $\lambda(\cdot,\cdot)$ is the first-order intensity at an arbitrary point. We know that $\mathbb{E}[\hat{K}_{I}(r,h)]=\pi r^{2}h$ , that is the same as the expectation of $\hat{K}(r,h)$ in (2), when the intensity used for the weighting is the true generator model. This is a crucial result that allows the use of the weighted estimator $\hat{K}_{I}(r,h)$ as a diagnostic tool, for assessing the goodness-of-fit of spatio-temporal point processes with generic first-order intensity functions. Indeed, if the weighting intensity function is close to the true one $\lambda(\textbf{u},t)$ , the expectation of $\hat{K}_{I}(r,h)$ should be close to $\mathbb{E}[\hat{K}(r,h)]=\pi r^{2}h$ for the Poisson process. For instance, values $\hat{K}_{I}(r,h)$ greater than $\pi r^{2}h$ indicates that the fitted model is not appropriate, since the distances computed among points exceed the Poisson theoretical ones.

The globaldiag function performs global diagnostics of a model fitted for the first-order intensity of an spatio-temporal point pattern, using the spatio-temporal inhomogeneous K-function (Gabriel and Diggle,, 2009) documented by the function STIKhat of the stpp package (Gabriel et al.,, 2021). It can also perform global diagnostics of a model fitted for the first-order intensity of an spatio-temporal point pattern on a linear network, by means of the spatio-temporal inhomogeneous K-function on a linear network (Moradi and Mateu,, 2020) documented by the function STLKinhom of the stlnpp package (Moradi et al.,, 2020). They both return the plots of the inhomogeneous K-function weighted by the provided intensity to diagnose, its theoretical value, and their difference.

globaldiag(greececatalog, lgcp1$l) [1] "Sum of squared differences = 318213525081.852"

globaldiag(greececatalog, lgcp2$l) [1] "Sum of squared differences = 147029066885.741"

Moving to the local diagnostics, Adelfio et al., (2020) derived the expectation of the local inhomogeneous spatio-temporal K-function, under the Poisson case: $\mathbb{E}[\hat{K}^{i}(r,h)]=\pi r^{2}h.$ Moreover, they found that when the local estimator is weighted by the true intensity function, its expectation, $\mathbb{E}[\hat{K}_{I}^{i}(r,h)]$ , is the same as the expectation of $\hat{K}^{i}(r,h)$ . These results motivate the usage of such local estimator $\hat{K}_{I}^{i}(r,h)$ as a diagnostic tool for general spatio-temporal point processes for assessing the goodness-of-fit of spatio-temporal point processes of any generic first-order intensity function $\lambda$ . Indeed, if the estimated intensity function used for weighting in our proposed LISTA functions is the true one, then the LISTA functions should behave as the corresponding ones of a homogeneous Poisson process, resulting in small discrepancies between the two. Therefore, this function computes such discrepancies by means of the $\chi_{i}^{2}$ values, obtained following the expression

[TABLE]

one for each point in the point pattern. Basically, departures of the LISTA functions $\hat{K}_{I}^{i}(r,h)$ from the Poisson expected value $rh$ directly suggest the unsuitability of the intensity function $\lambda(\cdot)$ used in the weighting of the LISTA functions for that specific point. This can be referred to as an outlying point. Given that D’Angelo et al., 2022b proved the same results for the network case, that is, $\mathbb{E}[\hat{K}_{L}^{i}(r,h)]=rh$ and $\mathbb{E}[\hat{K}_{L,I}^{i}(r,h)]=\mathbb{E}[\hat{K}_{L}^{i}(r,h)]$ when $\hat{K}_{L,I}^{i}(r,h)$ is weighted by the true intensity function, we implemented the same above-mentioned diagnostics procedure to work on intensity functions fitted on spatio-temporal point patterns occurring on linear networks. Note that the Euclidean procedure is implemented by means of the local K-functions of Adelfio et al., (2020), documented in KLISTAhat of the stpp package (Gabriel et al.,, 2013). The network case uses the local K-functions on networks (D’Angelo et al., 2022b, ), documented in localSTLKinhom.

The localdiag function performs local diagnostics of a model fitted for the first-order intensity of an spatio-temporal point pattern, by means of the local spatio-temporal inhomogeneous K-function (Adelfio et al.,, 2020) documented by the function KLISTA of the stpp package (Gabriel et al.,, 2013). It returns the points identified as outlying following the diagnostics procedure on individual points of an observed point pattern, as introduced in Adelfio et al., (2020). The points resulting from the local diagnostic procedure provided by this function can be inspected via the plot, print, summary, and infl functions. localdiag is also able to perform local diagnostics of a model fitted for the first-order intensity of an spatio-temporal point pattern on a linear network, by means of the local spatio-temporal inhomogeneous K-function on linear networks D’Angelo et al., (2021) documented by the function localSTLKinhom. It returns the points identified as outlying following the diagnostics procedure on individual points of an observed point pattern, as introduced in Adelfio et al., (2020), and applied in D’Angelo et al., 2022b for the linear network case.

set.seed(12345) stlp1 <- rETASlp(cat = NULL, params = c(0.078915 / 2, 0.003696, 0.013362, 1.2, 0.424466, 1.164793), betacov = 0.5, m0 = 2.5, b = 1.0789, tmin = 0, t.lag = 200, xmin = 600, xmax = 2200, ymin = 4000, ymax = 5300, iprint = TRUE, covdiag = FALSE, covsim = FALSE, L = chicagonet)

res <- localdiag(stlp1, intensity = density(as.stlpp(stlp1), at = "points")) res

Points outlying from the 0.95 percentile of the anaysed spatio-temporal point pattern on a linear network

Analysed pattern X: 65 points 4 outlying points

plot(res) infl(res)

8 Conclusions

This work has introduced the stopp R package, which deals with spatio-temporal point processes occurring either on the Euclidean space or on some specific linear networks, such as streets of a city.

The package includes functions for summarizing, plotting, and performing various analyses on point processes; these functions mostly use the approaches suggested in a few recent works in scientific literature. Modelling, statistical inference, and simulation difficulties on spatio-temporal point processes on Euclidean space and linear networks, with a focus on their local properties, are the core topics of such research and the package in turn.

To start with, we set the notation for spatio-temporal point processes that can occur in both linear networks and Euclidean spaces. After that, we went over the main methods implemented in the stopp package for dealing with simulations, data, and objects in point processes. After having recalled the definition of Local Indicators of Spatio-Temporal Association (LISTA) functions, we have moved to introduce the new functions that compute the LISTAs on linear networks. We then illustrated functions to run a local test to evaluate the local differences between two point patterns occurring on the same metric space. Moreover, many examples of the models included in the package are provided. These examples include: models for separable Poisson processes on both Euclidean space and networks, global and local non-separable inhomogeneous Poisson processes, and LGCPs. Then, techniques for performing both global and local diagnostics on such models (but not limited to those only) for point patterns on linear networks and planar spaces are provided.

The package tools are not exhaustive. This work represents the creation of a toolbox for different kinds of spatio-temporal analyses to be performed on observed point patterns, following the growing stream of literature on point process theory. The presented work contributes to the existing literature by framing many of the most widespread methods for the analysis of spatio-temporal point processes into a unique package, which is intended to foster many further extensions.

Bibliography56

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Adelfio and Chiodi, (2020) Adelfio, G. and Chiodi, M. (2020). Including covariates in a space-time point process with application to seismicity. Statistical Methods & Applications , pages 1–25.
2Adelfio and Schoenberg, (2009) Adelfio, G. and Schoenberg, F. P. (2009). Point process diagnostics based on weighted second-order statistics and their asymptotic properties. Annals of the Institute of Statistical Mathematics , 61(4):929–948.
3Adelfio et al., (2020) Adelfio, G., Siino, M., Mateu, J., and Rodríguez-Cortés, F. J. (2020). Some properties of local weighted second-order statistics for spatio-temporal point processes. Stochastic Environmental Research and Risk Assessment , 34(1):149–168.
4Adepeju, (2022) Adepeju, M. (2022). stpp Sim: Spatiotemporal Point Patterns Simulation . R package version 1.2.7.
5Ang et al., (2012) Ang, Q. W., Baddeley, A., and Nair, G. (2012). Geometrically corrected second order analysis of events on a linear network, with applications to ecology and criminology. Scandinavian Journal of Statistics , 39(4):591–617.
6Anselin, (1995) Anselin, L. (1995). Local indicators of spatial association-lisa. Geographical analysis , 27(2):93–115.
7Baddeley, (2017) Baddeley, A. (2017). Local composite likelihood for spatial point processes. Spatial Statistics , 22:261–295.
8Baddeley et al., (2006) Baddeley, A., Bárány, I., and Schneider, R. (2006). Stochastic Geometry: Lectures Given at the CIME Summer School Held in Martina Franca, Italy, September 13-18, 2004 . Springer.