Semiparametric estimation for space-time max-stable processes: F -madogram-based estimation approach
Abdul-Fattah Abu-Awwad (ICJ, PSPM), V\'eronique Maume-Deschamps (ICJ,, PSPM), Pierre Ribereau (PSPM, ICJ)

TL;DR
This paper introduces a semiparametric estimation method based on the F-madogram for modeling extremal dependence in space-time max-stable processes, with applications to rainfall data analysis.
Contribution
It proposes a novel F-madogram-based inference approach for space-time max-stable processes, addressing complex spatio-temporal dependence modeling.
Findings
Method performs well in simulation studies.
Successfully applied to real rainfall data.
Provides reliable extremal dependence estimates.
Abstract
Max-stable processes have been expanded to quantify extremal dependence in spatio-temporal data. Due to the interaction between space and time, spatio-temporal data are often complex to analyze. So, characterizing these dependencies is one of the crucial challenges in this field of statistics. This paper suggests a semiparametric inference methodology based on the spatio-temporal F-madogram for estimating the parameters of a space-time max-stable process using gridded data. The performance of the method is investigated through various simulation studies. Finally, we apply our inferential procedure to quantify the extremal behavior of radar rainfall data in a region in the State of Florida.
| Scheme 1 | Scheme 1, [7] | Scheme 2 | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| True | Mean estimate | RMSE | MAE | Mean estimate | RMSE | MAE | Mean estimate | RMSE | MAE | ||
| Purely Spatial | |||||||||||
| 0.3998 | 0.0191 | 0.0162 | 0.4033 | 0.0678 | 0.0559 | 0.4093 | 0.0389 | 0.0307 | |||
| 1.5019 | 0.0289 | 0.0243 | 1.4984 | 0.0521 | 0.0400 | 1.4921 | 0.1399 | 0.1083 | |||
| Purely temporal | |||||||||||
| 0.1944 | 0.0314 | 0.0246 | 0.2249 | 0.0649 | 0.0526 | 0.1909 | 0.0251 | 0.0201 | |||
| 0.9969 | 0.0831 | 0.0657 | 0.9563 | 0.0939 | 0.0767 | 1.0278 | 0.0785 | 0.0619 | |||
| Scheme 1 | Scheme 2 | ||||||
|---|---|---|---|---|---|---|---|
| True | Mean estimate | RMSE | MAE | Mean estimate | RMSE | MAE | |
| Purely Spatial | |||||||
| 0.9973 | 0.0331 | 0.0259 | 0.9929 | 0.0888 | 0.0727 | ||
| 0.0081 | 0.0470 | 0.0369 | 0.0357 | 0.0770 | 0.0609 | ||
| 0.9848 | 0.0440 | 0.0346 | 1.0295 | 0.0805 | 0.0647 | ||
| Purely temporal | |||||||
| 1.0021 | 0.0549 | 0.0426 | 1.0261 | 0.0747 | 0.0591 | ||
| 1.0107 | 0.0646 | 0.0505 | 0.9962 | 0.0620 | 0.0516 | ||
| 0.7012 | 0.0595 | 0.0482 | 0.6939 | 0.0510 | 0.0400 | ||
| Scheme 1 | Scheme 2 | ||||||
| True | Mean estimate | RMSE | MAE | Mean estimate | RMSE | MAE | |
| Purely Spatial | |||||||
| 1.9841 | 0.0368 | 0.0309 | 2.0357 | 0.0812 | 0.0599 | ||
| 1.4967 | 0.0407 | 0.0327 | 1.4814 | 0.0771 | 0.0557 | ||
| Purely temporal | |||||||
| 0.9852 | 0.0442 | 0.0346 | 1.0036 | 0.0556 | 0.0393 | ||
| 0.0177 | 0.0636 | 0.0512 | 0.0053 | 0.0427 | 0.0353 | ||
| 0.3031 | 0.0473 | 0.0383 | 0.2913 | 0.0393 | 0.0318 | ||
| Distance | Number of spatio-temporal pairs of points |
|---|---|
| Model | Purely spatial parameters | Purely temporal parameters | |||
|---|---|---|---|---|---|
| A1 | |||||
| A2 | 43.0257 | ||||
| B1 | , | ||||
| B2 | , | , | 21.4420 | ||
| B3 | , | , | |||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHydrology and Drought Analysis · Soil Geostatistics and Mapping · Financial Risk and Volatility Modeling
\floatsetup
[table]capposition=top
Semiparametric estimation for space-time max-stable processes: -madogram-based estimation approach
Abdul-Fattah Abu-Awwad, [email protected] &
Véronique Maume-Deschamps, [email protected] &
Pierre Ribereau, [email protected]
Université de Lyon, Université Claude Bernard Lyon 1,
Institut Camille Jordan ICJ UMR 5208 CNRS, France
Abstract
Max-stable processes have been expanded to quantify extremal dependence in spatio-temporal data. Due to the interaction between space and time, spatio-temporal data are often complex to analyze. So, characterizing these dependencies is one of the crucial challenges in this field of statistics. This paper suggests a semiparametric inference methodology based on the spatio-temporal -madogram for estimating the parameters of a space-time max-stable process using gridded data. The performance of the method is investigated through various simulation studies. Finally, we apply our inferential procedure to quantify the extremal behavior of radar rainfall data in a region in the State of Florida.
1 Introduction
Typically, extremes of environmental and climate processes like extreme wind speeds or heavy precipitation are modelled using extreme value theory. Max-stable processes are ideally suited for the statistical modeling of spatial extremes as they form the natural extension of multivariate extreme value distributions to infinite dimensions. Various families of max-stable models and estimation procedures have been proposed for extremal data. For a detailed overview of max-stable processes, we refer the reader to [21]. For statistical inference, it is then often assumed that the observations at spatial locations are independent in time, see, e.g., [29, 19, 18]. However, many extreme environmental processes observations exhibit a spatial dependence structure, meaning that neighboring locations within some distance show similar patterns, as well as a temporal dependence, which can be seen from high values for two consecutive time moments (e.g., within hours). As an illustration, Figure 1 depicts the daily rainfall maxima for the wet seasons (June-September) from the years 2007-2012 at one fixed grid location in Florida. We observe that it is likely that a high value is followed by a value of a similar magnitude. So, the temporal dependence may be present. Accordingly, the temporal dependence structure should be considered in an appropriate way. More details on the rainfall data in Florida are given in Section 5.
Currently space-time models are still taking up little space in the literature. Only a few papers have introduced space-time max-stable models. For instance, [13] extended the construction of spatial Brown-Resnick (BR) model [6, 26] and Smith’s storm profile model [34] to the space-time domain, whereas [8] extended the space-time BR model [13] to an anisotropic setting. Additionally, [25] introduced an extension of spatial Schather model [31], which comprises a truncated Gaussian random process, so that storm shapes are stochastic, and includes a compact random set, that allows the process to be mixing in space as well as to exhibit a spatial diffusions, see also [17]. A common feature for these models is that the major emphasis is in modeling asymptotic dependence treating the time just as additional dimension of the space. So, these models do not allow any interaction between the spatial components and temporal component in the underlying dependence function. However it seems reasonable to suppose that the spatial and temporal components behave asymptotically in a different way. Therefore, a new class of space-time max-stable models have been proposed by [23] in which the influence of time and space are partly decoupled, where the time infuences space through a bijective operator on space.
The inference on max-stable processes in both spatial and spatiotemporal contexts is an open field that is still in development. Many techniques have been proposed for parameter estimation in spatial extreme models. Each technique has its pros and cons. As with spatial max-stable processes, the pairwise likelihood estimation has been found useful to estimate the parameters of space-time max-stable processes due to its theoretical properties, see, e.g. [14, 25, 23]. Recently, various semiparametric estimation approaches have been proposed to fit such processes. For instance [7] introduced a new semiparametric estimation procedure based on a closed form expression of the so-called extremogram [15] to estimate the parameters of space-time max-stable BR process. The extremogram has been estimated nonparametrically by its empirical version, where space and time are separated. A constrained weighted linear regression is then applied in order to produce parameter estimates. While in [1] a semiparametric estimation procedure has been developed for spatial max-mixture processes [35] based on the -madogram [12]. A non-linear least squares (NLS) is then applied to minimize the squared difference between the empirical -madogram and its model-based counterpart. A major advantage of the semiparametric methods is the substantial reduction of computation time compared to the pairwise likelihood estimation. Hence, these methods can be applied as an alternative or a prerequisite to the widely-adopted pairwise likelihood inference, which suffers from some defects; first, it can be onerous, since the computation and subsequent optimization of the objective function is time-consuming. Second, the choice of good initial values for optimization of the composite likelihood is essential.
An implicit difficulty in any extreme value analysis is the limited amount of data for model estimation, see, e.g. [10]. Hence, inference based on the extremogram is difficult because few observations are available as the threshold increases. Consequently, the semiparametric estimates obtained by [7] showed a larger bias than the pairwise likelihood estimates and are sensible to the choice of the threshold used for the extremogram. Accordingly, the surrogates of existing estimation techniques should be welcomed.
In the present paper, we are interested in statistical inference for space-time max-stable processes. Motivated by deficiencies in existing inference approaches, we propose two novel and flexible semiparametric estimation schemes to fit space-time max-stable processes:
- (i)
Scheme 1: we estimate spatial and temporal parameters separately. Based on NLS, we minimize the squared difference between the empirical estimates of spatial/temporal -madograms and their model-based counterparts. Our inferential methodology is close to the one that has been proposed by [7] as an alternative or a preliminary analysis to the pairwise likelihood approach in [14], where only isotropic space-time max-stable BR process has been fitted via the two approaches. 2. (ii)
Scheme 2: we generalize the NLS to estimate spatial and temporal parameters simultaneously.
The remainder of the paper is organized as follows. Section 2 defines the space-time max-stable models. The two semiparametric estimation schemes are described in Section 3. Section 4 illustrates the performance of our method through various simulation studies, where also a comparison with the semiparametric estimation [7] is performed. In Section 5, we apply our method to radar rainfall data in a region in Florida by using spatial and temporal block maxima design. The concluding remarks in Section 6 address some remaining issues and perspectives.
2 Space-time max-stable models
Throughout the paper, , (generally, ) is a spatiotemporal process, where the space is the spatiotemporal domain. The points denote the spatial coordinates and are called “sites” or “locations” or “stations” and the points denote the temporal coordinates and are called “times” or “moments”. The space index and time index will respectively belong to the sets and . In addition, we will denote by (respectively ) the spatial (respectively temporal) lag.
2.1 Space-time max-stable models without spectral separability
According to [20], the simple space-time max-stable process , where simple means that the margins are standard Fréchet, i.e., , , has the following spectral representation
[TABLE]
where denotes the max-operator, are independent and identically distributed (i.i.d.) points of a Poisson process on with intensity and is a sequence of independent replications of some space-time process with for each , and , which are also independent of .
For , , and , the finite -dimensional distributions of the space-time max-stable process are given by
[TABLE]
Hence, all finite-dimensional distributions are multivariate extreme value distributions with unit Fréchet margins. In particular, for , the bivariate cumulative distribution function (c.d.f.) of the space-time max-stable process in (2.1) can be expressed in terms of the underlying bivariate spatio-temporal exponent function as
[TABLE]
Below, we will consider stationary space-time processes, so that depends only on and . We will write for and for .
2.1.1 Spatio-temporal extremal dependence summary measures
In order to measure the spatio-temporal extremal dependence, we provide in the next Definition, extensions to the spatio-temporal setting of some quantities that have been introduced in the spatial context. For a stationary spatio-temporal max-stable process with univariate margin c.d.f. , we have
- (i)
**(Spatio-temporal extremal dependence function, originally due to [33]) **
[TABLE] 2. (ii)
(Spatio-temporal upper tail dependence function, originally due to [11])
[TABLE]
and , . Similarly to spatial setting, we have the simple link: . 3. (iii)
(Spatio-temporal -madogram, originally due to [12])
[TABLE]
Furthermore, the -madogram (originally due to [5]) and -madogram (originally due to [28]) are defined analogously. 4. (iv)
(Spatio-temporal extremogram dependence function, originally due to [15])
[TABLE]
Clearly, setting the Borel sets yields . The two cases and correspond to the boundary cases of asymptotic independence and complete dependence.
Both dependence functions and provide simple measures of extremal dependence within the class of asymptotic dependence distributions.
Example 2.1**.**
(Stationary BR spatio-temporal process without spectral separability) [13] introduced the spatial BR model [6, 26] in space and time. A strictly stationary spatio-temporal BR process has the following spectral representation
[TABLE]
where are points of a Poisson process on with intensity , the processes are independent replications of a Gaussian process with stationary increments, , and covariance function
[TABLE]
for all . The dependence function which is termed the spatio-temporal semivariogram of the process , is non-negative and conditionally negative definite, that is, for any , and ,
[TABLE]
The process in (2.8) is fully characterized by the dependence function . In geostatistics, the function is given by
[TABLE]
Let denote the standard normal distribution function. For , the bivariate c.d.f. () of in the stationary case is given by
[TABLE]
Recall that if is assumed to depend only on the norm of , the associated process is spatially isotropic. The pairwise spatio-temporal extremal dependence function for this model is . This model has been used in [7] to quantify the extremal behavior of radar rainfall data in a region of Florida, where a new semiparametric procedure based on the extremogram is applied to estimate the model parameters.
2.2 Space-time max-stable models with spectral separability
The fundamental advantages of the spectral representation in (2.1) are (i) the construction of spatio-temporal processes from widely studied max-stable processes (ii) the huge literature available on spatio-temporal correlation functions for Gaussian processes, allows for considerable diversity of spatio-temporal behavior. However, an important modeling issue is that they do not allow any interaction between the spatial and the temporal components in the underlying dependence function. Thus, the time has no specific role but is equivalent to an additional spatial dimension; the spatial and temporal distributions belong to a similar family of models. Hence, alternatively, a new class of space-time max-stable models with spectral separability has been suggested in [23]. More precisely,
[TABLE]
where are the points of a Poisson process on , and with intensity for some Polish measure spaces and . The spectral function is measurable such that for each and contributes to the temporal dynamic of the process, whereas the spectral function is measurable such that for each and drives the shape of the main spatial patterns. The operators are bijective from to for each and describes how the spatial patterns move in space.
The construction (2.10) allows one to deal with the temporal and spatial aspects separately. So, the estimation procedure can be simplified by estimating in a first step the spatial parameters independently from the temporal ones. Several examples of subclasses of the general class of space-time process (2.10) were introduced by [23], where the operator is either a translation or a rotation. The authors in that paper focused mainly on a special case of models where the function corresponding to the time in the spectral representation is the exponential density (continuous-time case) or the probability values of a geometric random variable (discrete-time case). So, the corresponding models become Markovian and have a useful max-autoregressive representation, i.e.,
[TABLE]
where the parameter measures the influence of the past, the parameter represents some kind of specific direction of propagation/contagion in space and is a time-independent process and is derived from independent replications of a spatial max-stable process . This model can be seen as an extension of the real-valued max-autoregressive moving-average process MARMA(1,0) to the spatial context, see [16]. The value at location and time is either related to the value at location at time or to the value of another process (the innovation), , that characterizes a new event happening at location . This model may be useful for phenomena that propagate in space.
In the following, we will focus on the processes satisfying (2.11). Let denote the exponent function characterizing the spatial distribution of the process , then the bivariate c.d.f. of can be expressed for as
[TABLE]
Moreover, the spatio-temporal extremal dependence function in (2.4) can be easily deduced in this case by setting in (2.12),
[TABLE]
Clearly, space and time are not fully separated in the extremal dependence function, even if (space and time are completely separated in the spectral representation). Asymptotic time independence is achieved when . In the sequel, we give two examples of a bivariate space-time max-stable process satisfying (2.11).
- (i)
Spectrally separable space-time max-stable Smith process
If the innovation process is derived from independent replications of a spatial Smith process [34] with a covariance matrix . Then the bivariate c.d.f. of the resulting spatio-temporal model in (2.11) has the form
[TABLE]
where . The associated spatio-temporal extremal dependence function with this model is
[TABLE] 2. (ii)
Spectrally separable space-time max-stable Schlather process
The spatio-temporal model in (2.11) with an innovation process derived from independent replications of a spatial Schlather process [31], has a bivariate c.d.f. of the form
[TABLE]
where is the spatio-temporal exponential correlation function related to this model. The associated spatio-temporal extremal coefficient with this model is
If the time lag , the formulas in (2.14) and (2.16) reduce to the bivariate distributions of the max-stable spatial fields.
3 Statistical inference for space-time max-stable processes
In what follows, we shall denote, respectively, by and the Euclidean norm of spatial lag and the absolute value of temporal lag .
We now describe two semiparametric estimation schemes for space-time max-stable processes based on the spatio-temporal -madogram in (2.6), which stems from a classical geostatistical tool; the madogram [27]. It has a clear link with extreme value theory throughout the spatio-temporal extremal dependence function , i.e.,
[TABLE]
In practice, measurements are typically taken at various locations, sometimes on a grid, and at regularly spaced time intervals. In the following, the process is assumed to be a stationary space-time max-stable process. It is observed on locations assumed to lie on a regular 2-dimensional (2D) grid, i.e.,
[TABLE]
and at equidistant time moments, given by . This sampling scheme has been adopted in various studies in the literature, see e.g., [14, 8, 7]. For statistical inference on the process , we develop the following two semiparametric estimation schemes.
3.1 Scheme 1
Let denotes the vector gathering the parameters of the process to be estimated, where and denote, respectively, the vectors gathering the spatial and temporal parameters. In this scheme, we consider how the process evolves at given time of reference (a merely spatial process), and its evolution over time at a given location (a merely temporal process). So, and can be estimated separately. More precisely, denote by and finite sets of spatial and temporal lags on which the estimation is performed. Let the set summarizes all pairs of which give rise to the same spatial lag , i.e.,
[TABLE]
The inferential methodology is summarized in the following steps:
- (i)
As a first step, we estimate the purely spatial/temporal -madogram nonparametrically by the empirical version. Denote by the nonparametric estimate of the purely spatial (respectively temporal) -madogram. As is standard in geostatistics, we compute from the empirical spatio-temporal -madogram at spatio-temporal distances , that is for all ,
[TABLE]
where denotes the cardinality of the set and is the standard Fréchet probability distribution function. Let us remark that, a similar estimator in the framework of -madogram has been adopted by [28] in an analysis of Bourgogne (France) annual maxima of daily rainfall measurements. On the other hand, is computed from the empirical spatio-temporal -madogram at spatio-temporal distances , that is for all
[TABLE] 2. (ii)
Then, the overall purely spatial (respectively temporal) -madogram estimates (respectively ) are computed from the means over the temporal moments (respectively the spatial locations). More precisely,
[TABLE]
[TABLE] 3. (iii)
Finally, a NLS procedure is applied to estimate the parameters of interest.
[TABLE]
[TABLE]
where and denote, respectively, the spatial and temporal model-based -madogram counterparts. and denote, respectively, the spatial and temporal weights. Since it is expected that the spatio-temporal pairs which are far away in space or in time, have only little influence on the dependence parameters to be estimated, a simple choice for these weights is , , where denotes the indicator function and is fixed.
Note that the setup of the inferential methodology in Scheme 1 is close to the one proposed in [7], in which the spatio-temporal extremogram in (2.7) was adopted.
3.2 Scheme 2
We now generalize Scheme 1 in order to estimate temporal and spatial parameters simultaneously. Thus, we consider how the process evolves in both space and time. In the classical geostatistics, for a stationary spatio-temporal process , the spatio-temporal empirical classical semivariogram is defined by
[TABLE]
where , see e.g., [24]. By adapting this estimator to our framework, we consider the following estimation procedure:
- (i)
First, the spatio-temporal -madogram is estimated nonparametrically by its empirical version. Assume the set summarizes all pairs of which give rise to the same spatial lag and the same temporal lag . In other words, combining the spatial and the temporal lags from Scheme 1, i.e.,
[TABLE]
We estimate by
[TABLE]
where denotes the cardinality of the set and . 2. (ii)
Then, we apply a NLS fitting to obtain the estimates of the process parameters; , i.e.,
[TABLE]
where denotes the spatio-temporal weights and is the model-based spatio-temporal -madogram.
The idea underlying the construction of Scheme 2 is that when modeling and predicting a given phenomenon, significant benefits may be obtained by considering how it evolves in both space and time rather than only considering its spatial distribution at a given time of reference (a merely spatial process), or its evolution over time at a given location (a merely temporal process), such as those described in Scheme 1. Lastly, the establishment of the asymptotic properties of the resulting pairwise dependence estimates is deferred to future work. The derived asymptotic properties of the unbinned empirical -madogram in the spatial context, see [28] (Proposition 3 and 4) might provide a starting point. Nevertheless, this setting is more specialized. In the real data example of that study, a binned version of the empirical -madogram is adopted and deriving the convergence of this estimator as the cardinality of the distance class (i.e., ) increases is still challenging. Therefore, we will provide some numerical indications for the asymptotic properties of our pairwise dependence estimates.
3.3 Illustration examples
In order to illustrate how the proposed estimation schemes perform, we consider the following two examples, which we will revisit in Section 4.
Example 3.1**.**
(Estimation of isotropic space-time max-stable BR) Let us consider the space-time max-stable BR process in (2.8) with bivariate c.d.f. (2.9), where the dependence structure is given by the following stationary isotropic fractional Brownian motion (FBM) spatio-temporal semivariogram
[TABLE]
where the scalar distance , , determine spatial and temporal scale parameters and relate to the smoothness of the underlying Gaussian process in space and time. The associated spatio-temporal -madogram with this process is
[TABLE]
where is the associated spatio-temporal extremal dependence function. Figure 2 visualizes a 3D representation of the spatio-temporal FBM semivariogram in (3.8) and the associated dependence summary measures: the spatio-temporal extremal dependence function and the spatio-temporal -madogram . Complete dependence (respectively complete independence) is achieved at lower boundaries (respectively upper boundaries). Moreover, Figure 3 displays the theoretical behaviors of the purely spatial FBM semivariogram and the related purely spatial -madogram . Obviously, depending on the value of the smoothness parameter , these measures exhibit a large variety of dependence behaviors.
With this construction, based on Scheme 1, the NLS optimization problems in (3.4) and (3.5) can be expressed as
[TABLE]
[TABLE]
Lastly, with and on the basis of Scheme 2, the NLS estimation problem in (3.7) has the form
[TABLE]
Example 3.2**.**
(Estimation of spectrally separable space-time max-stable Smith process) We now describe the way to fit the spectrally separable space-time max-stable Smith process. Indeed, the estimation procedure can be simplified since the purely spatial parameters can be estimated independently of the purely temporal parameters. Formally, we consider the process in (2.11), where the innovation process is derived from independent replications of a spatial Smith process with covariance matrix
[TABLE]
We donte by the vector gathering the parameters to be estimated, i.e., . It is possible to separate the estimation. Firstly, the estimation of the spatial parameters is carried out. Secondly, once is known, it is held fixed and we estimate the temporal parameters . Subsequently, under Scheme 1, the NLS optimization problems in (3.4) and (3.5) can be expressed as
[TABLE]
[TABLE]
where,
[TABLE]
with
In order to figure out the role of the temporal parameter for this process. For a fixed site , Figure 4 displays the temporal extremal function and the associated temporal -madogram for . We set 10 Id2 and (translation to the top right). Clearly, as the value of increases, the independece (i.e., ) occurs at larger time lags .
Lastly, based on Scheme 2, the NLS estimator is given by
[TABLE]
where
4 Simulation study
Throughout this section, we investigate the performance of the semiparametric estimation procedures introduced in Section 3 with three simulation studies.
4.1 Simulation study 1: Fitting space-time max-stable BR process
In this study, we adopt the same experiment plan that has been proposed in [7] (Section 5), in order to make the results obtained there comparable with the results here.
4.1.1 Setup for a simulation study
We simualte the space-time BR process with spectral representation (2.8) and dependence function modeled as in (3.8). Namely,
[TABLE]
The simulations have been carried out using the function RFsimulate of the R package RandomFields [32] and based on the exact method proposed by [22]. The space-time observation area is assumed to be on a spatial grid and the time moments are equidistantly, i.e.,
[TABLE]
Figure 5 visualizes a realization simulated from space-time BR process with a spatio-temporal FBM semivariogram model (4.1) at six consecutive time points.
As in [7], we choose the sets and , where permutation tests show that these lags are enough to capture the relevant extremal dependence structure, see Figure 6. Equal weights are assumed. We repeat this experiment 100 times to obtain summary plots of the resulting estimates and to compute performance metrics: the mean estimate, the root mean squared error (RMSE) and the mean absolute error (MAE).
4.1.2 Estimation using Scheme 1
Simulation of space-time max-stable BR processes based on the exact method proposed in [22] can be time-consuming. Hence, for the sake of time-saving and due to the fact that the estimation of the purely spatial (respectively purely temporal) parameters depends on a large number of spatial observations (respectively a large number of observed time instants), we examine the performance of the purely spatial (respectively purely temporal) estimates using two different space-time observation areas, i.e.,
- •
- •
We assess the quality of the fit between the theoretical values of spatial/temporal -madograms and their estimates. Figure 7 compares empirical estimates of purely spatial/temporal -madograms with their asymptotic counterparts. Overall, both the purely spatial/temporal empirical versions are consistent, with a relatively higher variability for the temporal estimates. This is probably due to the fairly low number of time instants (300) used for the estimation of the purely temporal parameters compared to the number of spatial locations (2500) used for the estimation of the purely spatial parameters.
Next, we present results for the semiparametric estimation with Scheme1. Figure 8 displays the resulting estimates of the purely spatial parameters and the purely temporal parameters . Generally, the estimation procedure appears to work well. Moreover, we observe that the estimation of the purely spatial parameters is more accurate (the RMSE and MAE are lower), see Table 1. Again this probably stems from the large number of spatial locations used in the estimation which is () times higher than the time points.
As the last step in this study, we compare the statistical efficiency of our method and the one proposed in [7]. Table 1 reports the performance metrics for both methods. Although in that study, the authors used a larger grid size to estimate the purely spatial parameters, clearly, the -madogram semiparametric estimation outperforms their approach which based on the extremogram as an inferential tool (their semiparametric estimates show a larger bias than ours; the RMSE and MAE are higher). This is probably due to the fact that the estimates obtained in [7] are sensitive to the choice of the threshold used for computing (possibly bias corrected) empirical estimates of the extremogram.
4.1.3 Estimation using Scheme 2
Based on Scheme 2, we estimate the parameters of the space-time max-stable BR process with a similar simulation setting which is previously described in Section 4.1.1. We consider the space-time observation area where the spatial locations consisted of a grid and equidistantly time points, . Figure 9 compares the empirical spatio-temporal -madogram estimates with their model-based counterparts over the spatio-temporal lags . There is a good agreement overall. These diagnostic plots provide a satisfactory representation of the empirical spatio-temporal -madogram estimates. Generally, the results lend support to the agreement between the empirical spatio-temporal -madogram estimates and model-based counterparts, especially once sampling variability is taken into account.
Figure 10 shows the estimation performance of the estimated parameters. Overall, the parameters are well estimated. Moreover, we observe that the estimation of the scale parameters , is more accurate than the smoothness parameters (the RMSE and MAE are lower), see Table 1.
To sum up, for both schemes, Table 1 reports the mean estimate, RMSE, and MAE of the estimated parameters . Let us remark that the comparison between the resulting parameter estimates from the two estimation schemes is not completely straightforward because we consider non-unified space-time observation areas due to the above-mentioned computational reasons. However, with the above sampling schemes, we observe that the estimation of the purely spatial parameters is more accurate when using Scheme 1 (the RMSE and MAE are lower). On the other hand, we notice a slight outperformance for Scheme 2 in estimating purely temporal parameters. Finally, the QQ-plots against a normal distribution in Figure 11 provide an indication for asymptotic normality of the resulting estimates.
4.2 Simulation study 2: Fitting spectrally separable space-time max-stable Smith process
4.2.1 Setup for a simulation study
We simulate data from the spatio-temporal Smith process considered in Example 3.2, with parameter vector . As a reasonable compromise between accuracy and computation time, the locations are assumed to lie on a regular 2D grid of size . The time points are equidistant, given by the set . The simulations have been carried out using R SpatialExtremes package with rmaxstab function, see [30]. The spatial lags set and temporal lags set are fixed as before, recall Section 4.1. Equal weights are assumed. We repeat this experiment 100 times.
4.2.2 Results for the two estimation schemes
The top row of Figure 12 displays the density of the errors between the empirical estimates of the purely spatial/temporal -madograms and their model-based counterparts, whereas the bottom row displays the density of the errors between empirical spatio-temporal -madogram estimates and model-based counterparts. Generally, all of the empirical versions are congruous with their asymptotic counterparts. Clearly, the density of the errors is close to a centered Gaussian distribution.
Figure 13 displays boxplots the errors of the resulting estimates from both schemes: (). The top row displays the estimation errors of purely spatial parameters and purely temporal parameters resulting from Scheme 1, whereas the bottom row displays the estimation errors resulting form Scheme 2. Overall, the inference procedures perform well. Altogether, we observe that the estimates are close to the true values.
To sum up, for both schemes, Table 2 reports the mean estimate, RMSE, and MAE of the estimated parameters . Contrary to Scheme 2, we observe that the estimation of purely spatial parameters is more accurate than the estimation of purely temporal parameters ( and ) when using Scheme 1 (RMSE and MAE are lower). This probably can be justified by the fact that in Scheme 1 the number of spatial locations used is higher than time moments. Additionally, there is probably an impact of the estimated covariance matrix on the estimation efficiency of the purely temporal parameters, whereas, the purely temporal parameters are estimated independently of purely spatial parameters when using Scheme 2. Moreover, we notice that the estimation of purely spatial parameters is less accurate when using Scheme 2 (RMSE and MAE are higher). This is probably owing to the fact that in Scheme 2 the number of pairs used is higher than in Scheme 1, leading more variability. Whereas, both schemes seem to have the same performance order in estimating purely temporal parameters.
We also show QQ-plots against a normal distribution for all parameters in Figure 14. For both schemes, it seems that the semiparametric estimates are approximately normally distributed.
Finally, let us remark that a simulation study has been carried out in [23], where only the spectrally separable spatio-temporal Smith process has been fitted. Irregularly spaced locations have been considered. Two estimation schemes based on pairwise likelihood have been adopted (a two-step approach and a one-step approach). The obtained results have shown that, the estimation of purely spatial parameters is more accurate with a two-step approach.
4.3 Simulation study 3: Fitting spectrally separable STMS Schlather process
Finally, we perform a third simulation study to fit spectrally separable space-time max-stable Schlather process. The innovation process is derived from independent replications of a spatial Schlather process with correlation function of powered exponential type defined, for all , by , and , where and denote, respectively, the range and the smoothing parameters. We denote by the vector gathering the model parameters. We take , , and . As previously, we consider the same simulation setup used in Section 4.2.1. The results are summarized in Figure 15 and Table 3. Generally, we obtain equally satisfying results.
5 Real data analysis
In this section, we aim to quantify the extremal behavior of radar rainfall data in a region in the State of Florida. Our approach is to fit the data by different space-time max-stable classes based on a space-time block maxima design using the proposed semiparametric estimation procedure.
5.1 Description of the dataset
The dataset analyzed in this section is composed of radar rainfall values (in inches) measured on a square of 140 140 km region containing 4900 grid locations in the State of Florida. The database consists of radar hourly rainfall values measured on a regular grid with squared cells of size 2 km covering a region of 70 70 cells in the State of Florida. A map of the study area is shown in Figure 16. We only consider the wet season (June-September) over the years 2007-2012. The data were collected by the Southwest Florida Water Management District (SWFWMD) and freely available on ftp://ftp.swfwmd.state.fl.us/pub/radar_rainfall. Moreover, the dataset is available in the Supplementary Material: http://math.univ-lyon1.fr/homes-www/abuawwad/Florida_RadarRainfall/.
5.2 Data fitting
We perform a block maxima design in space and time as follows: we take block maxima over 24 consecutive hours and over 10 km 10 km areas (the daily maxima over 25 grid locations), resulting in grid in space for all days of the wet seasons. So, this gives a time series of dimension and of length 732. For the sake of notational simplicity, we denote the set of resulting grid locations by and the spacetime realizations by . This setup has been also considered in [7, 14] for analyzing radar rainfall measurements in a region in the State of Florida over the years 1999-2004, where only space-time max-stable BR process has been fitted to the data by a semiparametric approach in [7] and a pairwise likelihood approach in [14]. Let us remark that both regions here and in the above-mentioned two studies are located in the central portion of Florida District, which would probably have the best square area of coverage. Having larger grid size will lead to some cells missing in the southwestern ‘corner’ due to the coastline. Figure 17 shows the obtained time series for daily maxima observations at four grid locations.
According to this sampling scheme of the process , there are spatio-temporal pairs of points at distance , that is,
[TABLE]
[TABLE]
[TABLE]
Analogously, there are spatio-temporal pairs of points at distance , and so forth. Generally, for a set of spatio-temporal data measured in the time moments , on a regular spatial grid, we have spatio-temporal pairs of points at distance . Computing the -madogram values corresponding to the above spatio-temporal distances, we obtain the purely temporal empirical -madogram. It is also easy to check that there are spatio-temporal pairs of points at distance , at distance , and so forth, see Table 4. Computing the -madogram values for the spatio-temporal distances , we obtain the purely spatial empirical -madogram.
Since we are interested in modeling the joint occurrence of extremes over a region, then the dependence structure of a multivariate variable has to be explicitly stated. The usual modeling strategy consists of two steps: firstly, estimating the marginal distribution. Secondly, characterizing the dependence via a model issued by the multivariate extreme value theory, see e.g., [4, 29]. For marginal modeling, we explain the procedure as follows:
- (i)
We transform the data to stationarity by removing possible seasonal effects using a simple moving average with a period of 122 days (the number of days in the wet season considered in one particular year). More precisely, for each fixed location , we deseasonalize the time series by computing for
[TABLE] 2. (ii)
For each fixed location , the deseasonalized observations are fitted to the generalized extreme value distribution,
[TABLE]
for some location , scale , and shape . Let us remark that the estimated shape parameters are sufficiently close to zero with confidence interval containing zero, see Figure 18. This suggests a Gumbel distribution (GEV with ) as appropriate model. Therefore, we fit directly a Gumbel distribution
[TABLE]
For each spatial location, we assess the goodness of the marginal fits by QQ-plots of deseasonalized rain series versus the fitted Gumbel distribution. The results at four spatial locations are summarized in Figure 19. All plots provide a reasonable fit. 3. (iii)
The deseasonalized observations may be transformed either to standard Gumbel or standard Fréchet margins. More precisely, let , are the parameter estimates obtained from (ii), then we may use:
- (a)
to transform the deseasonalized observations to standard Gumbel margins; 2. (b)
to transform the deseasonalized observations to standard Fréchet margins. This transformation is called the probability integral transformation. In this study, we adopt this case.
In [7, 14], the authors assume that the observations are realizations from the space-time max-stable BR process. The contribution of the present section is to broaden the dependence structure by considering the spectrally separable space-time max-stable processes, that allow interactions between spatial and temporal components.
In the sequel, we estimate the extremal dependence structure for the daily maxima of rainfall measurements. Based on our findings in the simulation studies, we notice that Scheme 1 outperforms Scheme 2 generally. So, one may first estimate the extremal dependence parameters using Scheme 2. Afterward, re-estimating the parameters using Scheme 1, where the estimates resulting from Scheme 2 serve as starting values for the optimization routine used in Scheme 1. To that aim, we consider the following five spatio-temporal max-stable models:
- (i)
Class A: consists of two non-spectrally separable models A1 and A2.
- •
A1: a space-time max-stable BR model (2.8), with dependence function , where , recall Example 3.1.
- •
A2: a space-time max-stable Schlather model. The space-time correlation function is chosen to be separable such that
[TABLE]
where the range parameters and the smoothing parameters . 2. (ii)
Class B: consists of spectrally separable models B1, B2 and B3.
- •
B1: a spectrally separable space-time max-stable model (2.11), where the innovation process is derived from independent replications of a spatial BR process with semivariogram , for some range parameter and smoothness parameter . Obviously, models A1 and B1 are equivalent when the time lag .
- •
B2: a spectrally separable space-time max-stable model (2.11), where the innovation process is derived from independent replications of a spatial Smith process with covariance matrix recall Example 3.2.
- •
B3: a spectrally separable space-time max-stable model (2.11), where the innovation process is derived from independent replications of a spatial extremal- process with degrees of freedom and correlation function of type powered exponential defined, for all , by , and , where and denote, respectively, the range and the smoothing parameters.
To select the best-fitting model, we use the Akaike Information Criterion (AIC) which was first developed by [2] under the framework of maximum likelihood estimation. The AIC is one of the most widely used methods for selecting a best-fitting model from several competing models given a particular dataset. A concise formulation of the AIC under the framework of least squares estimation has been derived by [3]. The AIC under Scheme 1 is defined as
[TABLE]
where and are the estimated objective functions in space and time with , i.e.,
[TABLE]
[TABLE]
denotes the cardinality of the set , and and are respectively the total number of purely spatial and purely temporal parameters in the underlying model. If and , it is suggested to use an adjusted corrected version of (5.3), see [3], i.e.,
[TABLE]
Our results are summarized in Table 5. Model A1 has the lowest AIC value and therefore would be considered as the best candidate for this dataset, closely followed by model B3. Obviously, the temporal estimates ( and ) in the best-fitting model A1 indicate that there is a weak temporal extremal dependence. Recall that the purely temporal -madogram for this model is given by
[TABLE]
Accordingly, is close to zero for large values of , indicating asymptotic independence. On the other hand, is approximately constant when is small, indicating that the extremal dependence is the same for all . So, both large and small lead to temporal asymptotic independence.
For comparison, we present the semiparametric estimates obtained by [7]; . On the other hand, the pairwise likelihood estimates obtained by [14] are . Obviously, these estimates are close to our estimates, except the temporal smoothness estimate which is relatively large.
Figure 20 shows the empirical values of and , and their model-based counterparts from the three best-fitting models according to the AIC. It seems that the three models give a quite reasonable fit with a little outperformance for model A1. So, considering these plots and the AIC values there is overall evidence in favor of model A1.
Lastly, permutation tests can be useful to determine the range of clear dependence. So, in order to examine whether the extremal dependence in space and time is significant, we perform a permutation test. We randomly permute the space-time data and compute the empirical spatial/temporal -madograms. More precisely, to check how the extremal dependence lasts in space, for each fixed time point we permute the spatial locations. Afterward, the spatial -madogram is computed and the procedure is repeated 1000 times. From the obtained spatial -madogram sample, we compute 97.5% and 2.5% empirical quantiles which form a 95% confidence region for spatial extremal independence. On the other hand, to test the presence of temporal extremal independence, the analog procedure is done for the temporal -madogram. In particular, for each fixed location we sample without replacement from the corresponding time series and compute the empirical temporal -madogram. Our findings are shown in Figure 21 together with the fitted values of spatial/temporal -madograms derived from the best-fitting models A1. Inspecting these plots, it appears that the spatial extremal dependence vanishes for spatial lags larger than four (the fitted values for the spatial -madogram lies within the obtained independence confidence region), whereas the temporal extremal dependence vanishes for time lags larger than three. Let remark that the same conclusions are obtained in [7], where the permutation tests have been carried out based on the extremogram.
6 Concluding remarks
In summary, motivated by shortcomings in existing inferential methods, we proposed two novel and flexible semiparametric estimation schemes for space-time max-stable processes based on the spatio-temporal -madogram, . Working with the madogram has a few advantages. In addition to its simple definition and the computational facility, it has a clear link with extreme value theory throughout the extremal dependence function. The new estimation procedure may be considered as an alternative or a prerequisite to the widely used pairwise likelihood; the semiparametric estimates could serve as starting values for the optimization routine used to maximize the pairwise log-likelihood function to decrease the computational time and also improve the statistical efficiency, see [9].
A simulation study has shown that the inference procedure performs well. Moreover, our estimation methodology outperforms the semiparametric estimation procedure suggested by [7] which was based on the dependence measure extremogram. The introduced method is applied to radar rainfall measurements in a region in the State of Florida (Section 5) in order to quantify the extremal properties of the space-time observations.
Our attention is concentrated on fitting space-time max-stable processes based on gridded datasets. In the future, we plan to generalize our method in order to fit space-time max-stable processes with extensions to irregularly spaced locations that may have a fundamental interest in practice. In addition, equally weighted inference approaches have been widely implemented. However, using non-constant weights seems appealing for at least two reasons. First from a computational point of view, for example discarding distant pairs, the CPU load for the evaluation might be smaller and the fitting procedure would be less time-consuming. On the other hand, as neighboring pairs are expected to be strongly dependent, thus providing valuable information for the estimation of dependence parameters, this may improve the statistical efficiency. Therefore, it could be interesting to investigate the gain in statistical efficiency of estimators as well as computational efficiency by adopting different weighting strategies. Since the number of spatial and temporal lags are limited, we could consider weights such that locations and time points which are further apart from each other have less influence on the estimation, i.e.,
[TABLE]
where
Finally, it could be interesting to extend the spatial -madogram approach proposed by [28] to estimate the spatio-temporal extremal dependence function . For example, in the case of (2.11), it is easy to verify that for and , the spatio-temporal -madogram for any is given by
[TABLE]
where .
Acknowledgements
We acknowledge the Southwest Florida Water Management District (SWFWMD) for providing the data. Especially, we would like to thank Margit L. Crowell for the help in finding the data and the related details. We also thank the authors of paper [7] for providing their space-time max-stable BR process simulation R code, that used to simulate data in Section 4.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Ahmed, M., Maume-Deschamps, V., Ribereau, P. and Vial, C. (2017). A semi-parametric estimation for max-mixture spatial processes. submitted for publication- arxiv.org/abs/1710.08120.
- 2[2] Akaike, H. (1974). A new look at the statistical model identification. IEEE transactions on automatic control . 19 (6) 716–723.
- 3[3] Banks, H.T. and Joyner, M.L. (2017). AIC under the framework of least squares estimation. Applied Mathematics Letters . 74 33–45.
- 4[4] Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. L. (2004). Statistics of extremes: theory and applications . John Wiley & Sons.
- 5[5] Bel, L., Bacro, J-N. and Lantuéjoul, C. (2008). Assessing extremal dependence of environmental spatial fields. Environmetrics , 19 (2) 163–182.
- 6[6] Brown, B. M. and Resnick, S. I. (1977). Extreme values of independent stochastic processes. Journal of Applied Probability . 14 (4) 1732–739.
- 7[7] Buhl, S., Davis, R. A., Klüppelberg, C., and Steinkohl, C. (2016). Semiparametric estimation for isotropic max-stable space-time processes. submitted for publication- arxiv.org/abs/1609.04967 .
- 8[8] Buhl, S. and Klüppelberg, C. (2016). Anisotropic Brown-Resnick space-time processes: estimation and model assessment. Extremes . 19 (4) 627–660.
