Climate extreme event attribution using multivariate   peaks-over-thresholds modeling and counterfactual theory

Anna Kiriliouk; Philippe Naveau

arXiv:1908.03107·stat.AP·May 19, 2020

Climate extreme event attribution using multivariate peaks-over-thresholds modeling and counterfactual theory

Anna Kiriliouk, Philippe Naveau

PDF

TL;DR

This paper develops a statistical framework combining multivariate peaks-over-thresholds modeling and counterfactual causality theory to attribute climate extreme events, like heavy rainfall, to anthropogenic causes, especially in high-dimensional atmospheric data.

Contribution

It introduces a novel approach that integrates extreme-value theory with counterfactual causality for multivariate climate event attribution, including a dimension reduction strategy for high-dimensional data.

Findings

01

Effective modeling of joint extremes using multivariate generalized Pareto distribution.

02

Successful application to French winter precipitation data from CMIP6.

03

Enhanced causal attribution evidence through optimal linear projection.

Abstract

Numerical climate models are complex and combine a large number of physical processes. They are key tools in quantifying the relative contribution of potential anthropogenic causes (e.g., the current increase in greenhouse gases) on high impact atmospheric variables like heavy rainfall. These so-called climate extreme event attribution problems are particularly challenging in a multivariate context, that is, when the atmospheric variables are measured on a possibly high-dimensional grid. In this paper, we leverage two statistical theories to assess causality in the context of multivariate extreme event attribution. As we consider an event to be extreme when at least one of the components of the vector of interest is large, extreme-value theory justifies, in an asymptotical sense, a multivariate generalized Pareto distribution to model joint extremes. Under this class of distributions,…

Tables1

Table 1. Table 1: Return periods T 𝑇 T in years of the event { w 1 X 1 + w 2 X 2 > u T } subscript 𝑤 1 subscript 𝑋 1 subscript 𝑤 2 subscript 𝑋 2 subscript 𝑢 𝑇 \{w_{1}X_{1}+w_{2}X_{2}>u_{T}\} with u T = log ⁡ T subscript 𝑢 𝑇 𝑇 u_{T}=\log T for T ∈ 𝑇 absent T\in { 10 , 50 , 100 } 10 50 100 \{10,50,100\} for w 1 = w 2 = 0.5 subscript 𝑤 1 subscript 𝑤 2 0.5 w_{1}=w_{2}=0.5 (top) and w 1 = 0.2 , w 2 = 0.8 formulae-sequence subscript 𝑤 1 0.2 subscript 𝑤 2 0.8 w_{1}=0.2,w_{2}=0.8 (bottom). The marginal distributions of X 1 subscript 𝑋 1 X_{1} and X 2 subscript 𝑋 2 X_{2} are unit exponential satisfying P ( X 1 > u T ) = P ( X 2 > u T ) = 1 / T 𝑃 subscript 𝑋 1 subscript 𝑢 𝑇 𝑃 subscript 𝑋 2 subscript 𝑢 𝑇 1 𝑇 P(X_{1}>u_{T})=P(X_{2}>u_{T})=1/T . The multivariate GPD has tail dependence coefficient χ = 0.5 𝜒 0.5 \chi=0.5 , see Section 2.3 .

	$w_{1} = w_{2} = 0.5$
	$T = 10$	$T = 50$	$T = 100$
Complete dependence	10	50	100
Intermediate dependence (MGPD)	19	96	191
Independence	18	283	979
	$w_{1} = 0.2, w_{2} = 0.8$
	$T = 10$	$T = 50$	$T = 100$
Complete dependence	10	50	100
Intermediate dependence (MGPD)	18	88	175
Independence	13	100	237

Equations46

\mbox F A R = 1 - \frac{p _{0}}{p _{1}},

\mbox F A R = 1 - \frac{p _{0}}{p _{1}},

= P [\overline{E} ∣ d o (\overline{C}), C, E], PS = P [E ∣ d o (C), \overline{C}, \overline{E}],

= P [\overline{E} ∣ d o (\overline{C}), C, E], PS = P [E ∣ d o (C), \overline{C}, \overline{E}],

= P [E ∣ d o (C), \overline{E} ∣ d o (\overline{C})],

= max (1 - \frac{p _{0}}{p _{1}}, 0), PS = max (1 - \frac{1 - p _{1}}{1 - p _{0}}, 0),

= max (1 - \frac{p _{0}}{p _{1}}, 0), PS = max (1 - \frac{1 - p _{1}}{1 - p _{0}}, 0),

= max (p_{1} - p_{0}, 0),

E = {w^{T} X > v},

E = {w^{T} X > v},

p_{i} = P [w^{T} X^{(i)} > v], i \in {0, 1} .

p_{i} = P [w^{T} X^{(i)} > v], i \in {0, 1} .

Z^{*} = d E + T - 1 \leq j \leq d max T_{j},

Z^{*} = d E + T - 1 \leq j \leq d max T_{j},

\mathbb{P}[Z^{*}_{j}>z\mid Z^{*}_{j}>0]=\exp(-z),\quad\mbox{ for any $z>0$ and $j\in\{1,\ldots,d\}$.}

\mathbb{P}[Z^{*}_{j}>z\mid Z^{*}_{j}>0]=\exp(-z),\quad\mbox{ for any $z>0$ and $j\in\{1,\ldots,d\}$.}

Z = d \frac{σ}{γ} (exp (γ Z^{*}) - 1),

Z = d \frac{σ}{γ} (exp (γ Z^{*}) - 1),

\displaystyle\mathbb{P}[Z_{j}>z\mid Z_{j}>0]=\overline{H}(z;\sigma_{j},\gamma_{j}),\quad\mbox{ for any $z>0$ and $j\in\{1,\ldots,d\}$,}

\displaystyle\mathbb{P}[Z_{j}>z\mid Z_{j}>0]=\overline{H}(z;\sigma_{j},\gamma_{j}),\quad\mbox{ for any $z>0$ and $j\in\{1,\ldots,d\}$,}

χ

χ

= E [min (\frac{e ^{T_{1} - m a x (T_{1}, T_{2})}}{E [ e ^{T_{1} - m a x (T_{1}, T_{2})} ]}, \frac{e ^{T_{2} - m a x (T_{1}, T_{2})}}{E [ e ^{T_{2} - m a x (T_{1}, T_{2})} ]})] .

[w^{T} Z ∣ w^{T} Z > 0] \sim GPD (w^{T} σ, γ) .

[w^{T} Z ∣ w^{T} Z > 0] \sim GPD (w^{T} σ, γ) .

p_{i} (v) \approx P [X^{(i)} > u^{(i)}] \overline{H} (v - u^{(i)}; σ^{(i)}, γ^{(i)}), for i \in {0, 1} .

p_{i} (v) \approx P [X^{(i)} > u^{(i)}] \overline{H} (v - u^{(i)}; σ^{(i)}, γ^{(i)}), for i \in {0, 1} .

⎩ ⎨ ⎧ \mathbbm 1 {γ^{(0)} < γ^{(1)}} 1 - \frac{p _{0} ( u ^{(0)} )}{p _{1} ( u ^{(1)} )} (\frac{σ ^{(0)}}{σ ^{(1)}})^{1/ γ} \mathbbm 1 {σ^{(0)} < σ^{(1)}} if γ^{(1)} \neq = γ^{(0)}, if γ^{(1)} = γ^{(0)} =: γ, γ \neq = 0, if γ^{(1)} = γ^{(0)} = 0.

⎩ ⎨ ⎧ \mathbbm 1 {γ^{(0)} < γ^{(1)}} 1 - \frac{p _{0} ( u ^{(0)} )}{p _{1} ( u ^{(1)} )} (\frac{σ ^{(0)}}{σ ^{(1)}})^{1/ γ} \mathbbm 1 {σ^{(0)} < σ^{(1)}} if γ^{(1)} \neq = γ^{(0)}, if γ^{(1)} = γ^{(0)} =: γ, γ \neq = 0, if γ^{(1)} = γ^{(0)} = 0.

P [w^{T} X > v]

P [w^{T} X > v]

\approx P [w^{T} X > w^{T} u] \overline{H} (v - w^{T} u; w^{T} σ, γ),

PN (v, w) = max (1 - \frac{P [ w ^{T} X ^{(0)} > v ]}{P [ w ^{T} X ^{(1)} > v ]}, 0),

PN (v, w) = max (1 - \frac{P [ w ^{T} X ^{(0)} > v ]}{P [ w ^{T} X ^{(1)} > v ]}, 0),

R = \frac{( σ _{1}^{(0)} σ _{2}^{(1)} - σ _{2}^{(0)} σ _{1}^{(1)} ) ( σ _{1}^{(0)} σ _{2}^{(1)} - σ _{2}^{(0)} σ _{1}^{(1)} + γ v { ( σ _{2}^{(1)} - σ _{1}^{(1)} ) - ( σ _{2}^{(0)} - σ _{1}^{(0)} ) } )}{( σ _{2}^{(1)} - σ _{1}^{(1)} ) ( σ _{2}^{(0)} - σ _{1}^{(0)} ) { ( σ _{2}^{(1)} - σ _{1}^{(1)} ) - ( σ _{2}^{(0)} - σ _{1}^{(0)} ) } ^{2}},

R = \frac{( σ _{1}^{(0)} σ _{2}^{(1)} - σ _{2}^{(0)} σ _{1}^{(1)} ) ( σ _{1}^{(0)} σ _{2}^{(1)} - σ _{2}^{(0)} σ _{1}^{(1)} + γ v { ( σ _{2}^{(1)} - σ _{1}^{(1)} ) - ( σ _{2}^{(0)} - σ _{1}^{(0)} ) } )}{( σ _{2}^{(1)} - σ _{1}^{(1)} ) ( σ _{2}^{(0)} - σ _{1}^{(0)} ) { ( σ _{2}^{(1)} - σ _{1}^{(1)} ) - ( σ _{2}^{(0)} - σ _{1}^{(0)} ) } ^{2}},

w_{\pm} (v) = \frac{σ _{2}^{(1)} - σ _{2}^{(0)}}{( σ _{2}^{(1)} - σ _{1}^{(1)} ) - ( σ _{2}^{(0)} - σ _{1}^{(0)} )} \pm R .

w_{\pm} (v) = \frac{σ _{2}^{(1)} - σ _{2}^{(0)}}{( σ _{2}^{(1)} - σ _{1}^{(1)} ) - ( σ _{2}^{(0)} - σ _{1}^{(0)} )} \pm R .

⎩ ⎨ ⎧ 1 - \frac{H ( v ; w σ _{1}^{(0)} + ( 1 - w ) σ _{2}^{(0)} , γ )}{H ( v ; w σ _{1}^{(1)} + ( 1 - w ) σ _{2}^{(1)} , γ )} ⎭ ⎬ ⎫ .

⎩ ⎨ ⎧ 1 - \frac{H ( v ; w σ _{1}^{(0)} + ( 1 - w ) σ _{2}^{(0)} , γ )}{H ( v ; w σ _{1}^{(1)} + ( 1 - w ) σ _{2}^{(1)} , γ )} ⎭ ⎬ ⎫ .

p_{i}^{(emp)} (v; w) = \frac{1}{n} t = 1 \sum n \mathbbm 1 {w^{T} X_{t}^{(i)} > v} .

p_{i}^{(emp)} (v; w) = \frac{1}{n} t = 1 \sum n \mathbbm 1 {w^{T} X_{t}^{(i)} > v} .

p_{i} (v; w) = {p_{i}^{(emp)} (v; w) p_{i}^{(emp)} (w^{T} u^{(i)}; w) \overline{H} (v - w^{T} u^{(i)}; w^{T} σ^{(i)}, γ^{(i)}) if v \leq w^{T} u^{(i)}, if v \leq w^{T} u^{(i)} .

p_{i} (v; w) = {p_{i}^{(emp)} (v; w) p_{i}^{(emp)} (w^{T} u^{(i)}; w) \overline{H} (v - w^{T} u^{(i)}; w^{T} σ^{(i)}, γ^{(i)}) if v \leq w^{T} u^{(i)}, if v \leq w^{T} u^{(i)} .

d_{s t} = \frac{1 - χ _{s t}}{2 ( 3 - χ _{s t} )},

d_{s t} = \frac{1 - χ _{s t}}{2 ( 3 - χ _{s t} )},

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Climate extreme event attribution using

multivariate peaks-over-thresholds modeling and counterfactual theory

Anna Kiriliouklabel=e1][email protected] [

Philippe Naveaulabel=e2][email protected] [ Université de Namur\thanksmarkm1 and CNRS (Gif-sur-Yvette) \thanksmarkm2

Abstract

Numerical climate models are complex and combine a large number of physical processes. They are key tools in quantifying the relative contribution of potential anthropogenic causes (e.g., the current increase in greenhouse gases) on high impact atmospheric variables like heavy rainfall. These so-called climate extreme event attribution problems are particularly challenging in a multivariate context, that is, when the atmospheric variables are measured on a possibly high-dimensional grid.

In this paper, we leverage two statistical theories to assess causality in the context of multivariate extreme event attribution. As we consider an event to be extreme when at least one of the components of the vector of interest is large, extreme-value theory justifies, in an asymptotical sense, a multivariate generalized Pareto distribution to model joint extremes. Under this class of distributions, we derive and study probabilities of necessary and sufficient causation as defined by the counterfactual theory of Pearl. To increase causal evidence, we propose a dimension reduction strategy based on the optimal linear projection that maximizes such causation probabilities. Our approach is tested on simulated examples and applied to weekly winter maxima precipitation outputs of the French CNRM from the recent CMIP6 experiment.

multivariate generalized Pareto distribution,

necessary and sufficient causation,

heavy rainfall,

climate change,

keywords:

\startlocaldefs\endlocaldefs

and

t1Part of this work was supported by the French national program FRAISE-LEFE/INSU, MELODY-ANR, Eupheme, FUSIMET (PEPS I3A) and the DAMOCLES-COST-ACTION on compound events.

1 Introduction

Quantifying human influence on climate change and identifying potential causes of climate extremes is often referred to as extreme event attribution (EEA), which falls within the research field of detection and attribution (D&A) (see, e.g. the report of National Academies of Sciences, Engineering and Medicine, 2016; Chen et al., 2018). In such studies, the main inferential objective is estimation of extreme quantiles (also called return levels), well-used in finance, hydrology and other fields of risk analysis (e.g. Embrechts, Klüppelberg and Mikosch, 1997). In EEA, return levels are computed under two scenarios that differ according to the causal link of interest, often increases in greenhouse gases (GHG) concentrations (see, e.g. Angélil, Stone and Wehner, 2017; Stott et al., 2016; Fischer and Knutti, 2015). Typically, such an approach compares probabilities under a factual scenario of conditions that occurred around the time of the event against probabilities under a counterfactual scenario in which anthropogenic emissions never occurred. More specifically, one compares the probability of an extreme event in the factual world, denoted $p_{1}$ , to the probability of an extreme event $p_{0}$ in a counterfactual world, i.e., a world that might have been if no anthropogenic forcing would have been present. The definition of the so-called extreme event is by itself a non-trivial task and depends on the application at hand. A common choice is to take some climatological index exceeding a high threshold. In their seminal paper, Stott, Stone and Allen (2004) studied mean June–August temperatures in Europe in order to quantify by how much human activities may have increased the risk of European heatwaves. In this example, a one dimensional sample mean, say $X$ , summarized a complex random field that varied in time and space. The set $\{X>v\}$ where $v=1.6$ Kelvin was chosen to resemble the 2003 mean European summer anomaly temperatures. The probabilities $p_{0}$ and $p_{1}$ were then inferred from numerical counterfactual and factual runs respectively, using nonparametric inference techniques (for bootstrap counting methods in EEA, see Paciorek, Stone and Wehner, 2018) and univariate extreme-value theory (EVT); see, e.g. Coles (2001) for an introduction.

In other environmental research areas, complex multivariate EVT models are commonly used (see, e.g. Davison and Huser, 2015; Cooley, Hunter and Smith, 2017; de Fondeville and Davison, 2018; Engelke, De Fondeville and Oesting, 2018; Reich, Shaby and Cooley, 2014; Shaby and Reich, 2012). Bayesian hierarchical modeling (see, e.g. Hammerling, Katzfuss and Smith, 2017; Katzfuss, Hammerling and Smith, 2017) also offers a flexible way to insert different layers of complexity present in climate D&A problems (internal natural variability, numerical model uncertainty, observational errors, sampling uncertainty in space and time, etc.). Despite these advances, the EEA domain remains a fairly untouched territory in terms of multivariate EVT. Even recent papers like Kew et al. (2019); Luu et al. (2018); Otto et al. (2018) and King (2017) are based on univariate EVT only.

Still, in climatological studies, there is no reason to assume that the dependence structure remains identical over space. For example, heavy rainfall in convective prone regions will spatially differ from regions driven by frontal storms. Examples like the one shown in Table 1 below underscore that the practitioner choice of the region under study and the risk measure used can lead to very different return periods. In addition, the essence of EEA studies is to detect changes between return periods in factual and counterfactual worlds (see, e.g., Naveau, Hannart and Ribes, 2020, for a recent review on the statistical aspects of this problem). If the spatial dependence structure differs between the factual and counterfactual worlds (e.g., a region could be more prone to convective storms in its factual version), it has an influence on the return level. Hence, past studies based on univariate EVT measures may have overlooked some causal signals by not taking into account the underlying multivariate structure.

Our first objective is to investigate how multivariate EVT could be used for EEA. As extreme events in D&A are mostly expressed in terms of threshold exceedances, like $\{X>v\}$ in Stott, Stone and Allen (2004), this naturally leads to the question of how to integrate the multivariate generalized Pareto distribution (GPD) introduced by Tajvidi (1996) into the EEA framework. This distribution has been tailored to represent extremal behaviors when at least one of the components of the vector of interest is large. The probabilistic properties of the multivariate GPD have been well studied by, among others, Beirlant et al. (2004); Rootzén and Tajvidi (2006); Falk and Guillou (2008); Ferreira and de Haan (2014); Rootzén, Segers and Wadsworth (2018a) and Rootzén, Segers and Wadsworth (2018b), while statistical modeling is more recent (Huser, Davison and Genton, 2016; Kiriliouk et al., 2019).

In most univariate EEA studies (see Stott et al., 2016, and references therein), two types of probability ratios are considered: the Risk Ratio $\frac{p_{1}}{p_{0}}$ and the so-called Fraction of Attributable Risk (FAR), defined by

[TABLE]

where $p_{0}=\mathbb{P}(X>v)$ corresponds to the probability of exceeding the threshold $v$ in the counterfactual world, while $p_{1}$ represents the same quantity in the factual world. Using the counterfactual theory of Pearl (2000), the FAR corresponds to the probability of necessary causation, i.e., anthropogenic forcings are necessary for the extreme event to occur, but might not be sufficient. Within the Gaussian set-up, Hannart et al. (2016) and Hannart and Naveau (2018) highlighted the link between causality theory and event attribution studies. The second objective of our work is to explain how Pearl’s counterfactual theory can be applied within a multivariate GPD framework, and to identify conditions that maximize the probability of causality, a fundamental feature in any EEA analysis.

The rest of the paper is structured as follows. Section 2 summarizes the relevant background of EEA and the multivariate GPD. Section 3 discusses the behaviour of univariate probabilities of causation as a function of the threshold $v$ . In Section 4, we make the link between multivariate GPDs and causality by maximizing necessary causation for any linear projection and we discuss the inference strategy. Finally, in Section 5, the proposed methods are applied to weekly winter maxima of precipitation outputs from the French CNRM model that are part of the recent CMIP6 experiment. A discussion and outlook is given in Section 6. Technical details are deferred to the supplementary material (Kiriliouk and Naveau, 2020).

2 Background

2.1 Climate event attribution and counterfactual theory

The question of attribution in EEA is inherently rooted in causality assessment. Pearl (2000) proposed a framework to connect the probabilities $p_{0}$ and $p_{1}$ to different forms of causality. If $E$ denotes an event (e.g. the 2003 European heatwave) and $C$ its potential cause (e.g. the increase of GHG emissions), three distinct forms of causality are of interest:

probability of necessary causation (PN): $C$ is required for $E$ but other factors might be required as well; 2. 2.

probability of sufficient causation (PS): $C$ always triggers $E$ but $E$ might occur without $C$ ; 3. 3.

probability of necessary and sufficient causation (PNS): both of the above hold.

Hannart et al. (2016) recalled the mathematical definition of these probabilities,

[TABLE]

where $do(C)$ means that the cause is interventionnaly forced to occur. For climate EEA, the cause $C$ can be defined as the presence of anthropogenic forcings. In this setting, Hannart et al. (2016) exploited that $E$ is monotonous with respect to $C$ (the event is more likely when $C$ is present) and $C$ is exogenous (i.e., it is not influenced by any other observed variables). The causation probabilities then simplify to

[TABLE]

where $p_{0}=\mathbb{P}[E\mid\overline{C}]$ corresponds to the probability of $E$ in the counterfactual world and $p_{1}=\mathbb{P}[E\mid C]$ to the probability of the same event in the factual world. If $p_{0}<p_{1}$ , the PN coincides with the FAR used by Stott, Stone and Allen (2004). In the remainder of this paper, we will use the notation PN (instead of FAR) to highlight its causal interpretation. By construction, one has PNS $\leq$ $\min($ PN,PS $)$ and hence it is worth noticing that a low PNS does not imply the absence of a causal relationship.

A fundamental step in any causality analysis is the definition of the event $E$ . In EEA, the event $E$ is classically defined as the occurrence of some climatological index (e.g. a spatial average over a given region) exceeding a high threshold $v$ ,

[TABLE]

where $\bm{X}=(X_{1},\dots,X_{d})^{T}$ is a random vector defined on $d$ grid points and $\bm{w}=(w_{1},\ldots,w_{d})^{T}$ are non-negative weights. Let $\bm{X}^{(0)}$ and $\bm{X}^{(1)}$ denote the climatological vector $\bm{X}$ in the counterfactual and the factual world respectively. Hence, the probabilities $p_{i}$ in (2.1) become

[TABLE]

Generally, $\bm{w}^{T}\bm{X}^{(i)}$ is modelled as a univariate random variable (see, e.g. Kew et al., 2019; Luu et al., 2018; Otto et al., 2018; King, 2017) and the dependence among the components of $\bm{X}^{(i)}$ is ignored. One objective of this paper is to explore how the multivariate dependence among the $\bm{X}^{(i)}$ ’s affects the values $p_{i}$ , and consequently the causal evidence expressed with PN, PS or PNS, especially if the weights are well chosen. In the next section, we present a simple example to gain some intuition on the difference in return levels between univariate and multivariate modeling.

2.2 Impact of multivariate extremal modeling on return periods

In this section, we illustrate how the dependence structure of $\bm{X}$ impacts the return periods of $\bm{w}^{T}\bm{X}$ . For clarity’s sake, we assume that the margins $X_{i}$ of $\bm{X}$ follow unit exponential distributions (in line with the multivariate EVT model of Section 2.3). Hence, the associated return level at each grid point is simply $u_{T}=\log T$ for a given return period $T$ , i.e., $\mathbb{P}(X_{i}>u_{T})=1/T$ for $i=1,\ldots,d$ . If the $X_{i}$ ’s exhibit complete dependence, i.e., $X_{i}=X_{j}$ with probability 1 for all $i,j$ , the return time of $\bm{w}^{T}\bm{X}$ is identical to that of $X_{i}$ , and is simply equal to $T$ . In the opposite case, the margins of $\bm{X}$ are independent and by construction, $\bm{w}^{T}\bm{X}$ has unit mean and variance $w^{2}_{1}+\dots+w^{2}_{d}$ . If $w_{i}=1/d$ for all $i$ , the return time of $\bm{w}^{T}\bm{X}$ associated with $u_{T}$ corresponds to the quantile of a gamma distribution. The first and third rows in Table 1 show that the return periods of the event $\{w_{1}X_{1}+w_{2}X_{2}>u_{T}\}$ are much larger than the ones obtained in the complete dependence case. This effect increases as the time period increases and/or the dimension $d$ increases (tables available upon request). Hence, given the same univariate variable $\bm{w}^{T}\bm{X}$ and the same return level $u_{T}=\log T$ , the degree of dependence in the original data greatly influences the return period of the same event. As the field of EEA is rooted in the computation of events like $\{\bm{w}^{T}\bm{X}>u_{T}\}$ for large $T$ , modeling of multivariate extremal dependence becomes paramount. In the last decades, Rootzén and his colleagues have proposed models and inferential schemes for high return levels that take this dependence into account. The row “Intermediate dependence (MGPD)” in Table 1 concerns such a multivariate model (to be detailed in Section 2.3).

Comparing the upper part of Table 1 with the lower part, we see that the choice of the weights also plays a non-negligible role in the return times values. In EEA, one wishes to contrast the factual and counterfactual worlds. Table 1 shows that differences between $p_{0}=\mathbb{P}[\bm{w}^{T}\bm{X}^{(0)}>v]$ and $p_{1}=\mathbb{P}[\bm{w}^{T}\bm{X}^{(1)}>v]$ not only stem from the spatial dependences structures of $\bm{X}^{(0)}$ and $\bm{X}^{(1)}$ , but also from the weight selection.

2.3 The multivariate generalized Pareto distribution

When $\bm{X}=X\in\mathbb{R}$ , univariate peaks-over-thresholds approaches (Davison and Smith, 1990) consist of choosing a large threshold $u$ and fitting $\left[X-u\mid X>u\right]$ to a univariate GPD. Hence, regardless of the underlying distribution of the climatological index $X$ , the GPD can be used to model causation probabilities of the event $\{X>v\}$ as long as $v>u$ . Similarly, a multivariate GPD approximates the tail behavior of $\left[\bm{X}-\bm{u}\mid\bm{X}\nleq\bm{u}\right]$ , where $\bm{X}\nleq\bm{u}$ means that at least one component of $\bm{X}$ exceeds the corresponding component of the threshold $\bm{u}\in\mathbb{R}^{d}$ . In the following, we will see how it can be used to model causation probabilities of the event $\{\bm{w}^{T}\bm{X}>v\}$ for $v>\bm{w}^{T}\bm{u}$ .

From a mathematical point of view, multivariate GPD vectors can be viewed as the limiting solution of any linearly rescaled multivariate vector given that at least one component is large. This asymptotic result can be interpreted as a multivariate version of the Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974). For the sake of clarity and concision, we will only recall how multivariate GPDs can be simulated and how basic principles are derived from this stochastic definition. The reader interested by theoretical aspects of multivariate GPDs is referred to Rootzén and Tajvidi (2006).

The basic building block to construct standardized multivariate GPD vectors (Rootzén, Segers and Wadsworth, 2018a) is the stochastic representation

[TABLE]

where $E$ is a unit exponential random variable and $\bm{T}=(T_{1},\dots,T_{d})^{T}$ represents any $d$ -dimensional random vector independent of $E$ . One can check that each positive conditional margin has a unit exponential survival function,

[TABLE]

Model (2.2) can be generalized by setting, for $\bm{\sigma}>\bm{0}$ and $\bm{\gamma}\in\mathbb{R}^{d}$ ,

[TABLE]

where operations like $\frac{\bm{\sigma}}{\bm{\gamma}}$ have to be understood componentwise. We then denote $\bm{Z}\sim\textnormal{MGPD}(\bm{T},\bm{\sigma},\bm{\gamma})$ . Equation (2.3) implies

[TABLE]

where $\overline{H}(z;\sigma,\gamma)=(1+\gamma z/\sigma)^{-1/\gamma}_{+}$ denotes the survival function of the univariate GPD with scale parameter $\sigma>0$ and shape parameter $\gamma\in\mathbb{R}$ . Hence, the conditional margins $\left[Z_{j}\mid Z_{j}>0\right]$ follow univariate GPDs111although the random variables $Z_{1},\ldots,Z_{d}$ may not follow GPDs themselves. .

The random “generator” $\bm{T}$ in (2.2) drives the extremal dependence of $\bm{Z}$ , often summarized by the tail dependence coefficient $\chi\in[0,1]$ (for $d=2$ ). If $F_{1}$ , $F_{2}$ denote the unconditional marginal distribution functions of $Z_{1},Z_{2}$ , then $\chi$ measures the probability of $F_{1}(Z_{1})$ being large given that $F_{2}(Z_{2})$ is large as the threshold increases,

[TABLE]

A large value of $\chi$ corresponds to strong extremal dependence between $Z_{1}$ and $Z_{2}$ , whereas $\chi=0$ corresponds to tail independence. For more details on $\chi$ in the context of multivariate GPDs, see the supplementary material in Kiriliouk et al. (2019). As an example, Figure 1 displays 500 bivariate random draws from a multivariate GPD model where $\bm{T}$ is zero-mean bivariate Gaussian with unit covariance matrix $I_{2}$ , corresponding to $\chi=0.6$ .

To make the link with the probabilities $p_{0}$ and $p_{1}$ used for EEA, we need a tool to project the information contained in a possibly complex multivariate GPD structure into a single valued summary. The following proposition provides this key tool.

Proposition 2.1 (Linear-projection, Rootzén, Segers and

Wadsworth (2018a)).

If $\bm{Z}\sim\textnormal{MGPD}(\bm{T},\bm{\sigma},\bm{\gamma})$ with $\bm{\gamma}=\gamma\bm{1}_{d}$ , then for any non-negative weights $\bm{w}=(w_{1},\ldots,w_{d})^{T}$ such that $\mathbb{P}[\bm{w}^{T}\bm{Z}>0]>0$ , the linear projection of $\bm{Z}$ , conditioned on being positive, follows a univariate GPD, i.e.,

[TABLE]

As explained in Section 2.1, our main inferential objective will be to estimate the probability $\mathbb{P}[\bm{w}^{T}\bm{X}>v]$ . A priori, if $\bm{X}$ represents a climatological index on $d$ grid points, it does not follow a multivariate GPD, but, since $\bm{X}-\bm{u}\mid\bm{X}\nleq\bm{u}\approx\bm{Z}$ , we will use Proposition 2.1 to deduce a suitable model for $\bm{X}$ in Section 4.

3 Causation probabilities for univariate extremes

To understand first how PN, PS and PNS behave for univariate extremes, we take $p_{0}(v)=\mathbb{P}[X^{(0)}>v]$ and $p_{1}(v)=\mathbb{P}[X^{(1)}>v]$ when $X^{(0)}$ and $X^{(1)}$ are either Gaussian or GPD random variables. The left panel of Figure 2 shows the case where the counterfactual world corresponds to a standardized Gaussian variable, $X^{(0)}\sim N(0,1)$ , and the factual world is one Kelvin warmer with a higher variability, $X^{(1)}\sim N(1,1.5)$ . This artificial design mimics the typical behaviour of mean temperature anomalies. Two features can be highlighted from this example: PN goes to one as $v$ increases, and the maximum of PNS is around $0.35$ . In other words, the probability of necessary causation becomes certain for extremes (large $v$ ), and the probability of necessary and sufficient causation can be reasonably high in the Gaussian case.

To contrast these remarks with other types of tail behaviors, the middle panel of Figure 2 displays a GPD case with equal shape parameter $\gamma=0.2$ in the counterfactual and factual worlds, but different scale parameters, $\sigma^{(0)}=1$ and $\sigma^{(1)}=1.5$ . One can see that, as $v$ increases, PN converges to a constant around $0.7<1$ , and PNS remains small for any value of $v$ . Hence, causal evidencing is much more difficult than in the Gaussian set-up, where a rare event in the factual world ( $p_{1}$ small) would be nearly impossible in the counterfactual world ( $p_{0}$ almost zero). In contrast, even a very rare event in the factual world will not be impossible in a GPD counterfactual world. Concerning PNS, it is small in the second panel and this phenomenon is even more pronounced when the shape parameter changes between the two worlds; see the right panel where $\gamma^{(0)}=0.1$ , $\gamma^{(1)}=0.3$ , and $\sigma^{(0)}=\sigma^{(0)}=1$ . As PNS is near zero for all $v$ , there is no reason to maximize it. Instead, maximizing causality will correspond to maximizing PN in the remainder of this work.

In practice, $X^{(0)}$ and $X^{(1)}$ do not follow GPDs. Using a classical peaks-over-thresholds approach, we can condition on some high threshold $u^{(i)}$ and approximate the probabilities $p_{i}(v)=\mathbb{P}[X^{(i)}>v]$ for $v>u^{(i)}$ by

[TABLE]

We can now formalize the tail behaviour observed in Figure 2. Whenever the limit of PN( $v$ ) for large $v$ is finite222Degenerate cases can occur when $p_{1}(v)=0$ . For example, if $\gamma^{(1)}<0$ , the PN is not defined for $v\geq u^{(1)}-\sigma^{(1)}/\gamma^{(1)}$ , which is visible for the dashed line in the left panel of Figure 3., it has to be equal to

[TABLE]

where $\mathbbm{1}(A)$ represents the indicator function, equal to one if $A$ is true and zero otherwise.

The left panel of Figure 3 shows how three different GPD shape parameters, $\gamma=-0.4$ , $\gamma=0$ and $\gamma=0.4$ (dashed, solid and dotted lines respectively) influence the increase of the PN with respect to the threshold $v$ . The right panel of Figure 3 points out possible atypical behaviours of the PN, highlighting that PN $(v)$ is not always increasing as $v$ increases. Here, the solid line corresponds to a counterfactual world with $(\gamma^{(0)},\sigma^{(0)})=(0,2)$ compared to a factual world with $(\gamma^{(1)},\sigma^{(1)})=(0.4,1)$ , while the dotted line represents the converse change: from $(\gamma^{(0)},\sigma^{(0)})=(0.4,1)$ to $(\gamma^{(1)},\sigma^{(1)})=(0,2$ ).

4 Necessary causation in a multivariate set-up

4.1 The multivariate GPD and necessary causation

Let $\bm{X}$ be any $d$ -dimensional random vector such that $\left[\bm{X}-\bm{u}\mid\bm{X}\nleq\bm{u}\right]\approx\bm{Z}$ for some high threshold $\bm{u}\in\mathbb{R}^{d}$ , where $\bm{Z}\sim\textnormal{MGPD}(\bm{T},\bm{\sigma},\bm{\gamma})$ . Then, according to Proposition 2.1, the extremal information contained in any linear projection $\bm{w}^{T}\bm{X}$ can be approximated, up to a normalizing constant, by a univariate GPD survival function. More precisely, for any $v>\bm{w}^{T}\bm{u}$ , we can write

[TABLE]

for $w_{1}+\dots+w_{d}=1$ and $\bm{\gamma}=\gamma\bm{1}_{d}$ . Constraining the weights to sum to a constant is necessary to ensure identification of $\bm{\sigma}$ . The condition $\bm{\gamma}=\gamma\bm{1}_{d}$ implies that conditional marginal distributions $\left[Z_{j}\mid Z_{j}>0\right]$ have equal shape parameters for $j=1,\dots,d$ . Therefore, homogeneous spatial regions (in terms of the shape parameter) have to be identified in practice. This is closely related to the regional frequency analysis problem treated in hydrology Carreau, Naveau and Neppel (2017). Finally, we note that the dependence structure of $\bm{X}$ is present in the term $\mathbb{P}\left[\bm{w}^{T}\bm{X}>\bm{w}^{T}\bm{u}\right]$ only.

Any linear projection in the factual and counterfactuals worlds, denoted $p_{1}(v;\bm{w})=\mathbb{P}[\bm{w}^{T}\bm{X}^{(1)}>v]$ and $p_{0}(v;\bm{w})=\mathbb{P}[\bm{w}^{T}\bm{X}^{(0)}>v]$ respectively, can now be used to compute a probability of necessary causation that depends on the weight $\bm{w}$ and the dependence structure of $\bm{X}^{(0)}$ and $\bm{X}^{(1)}$ ,

[TABLE]

where $\bm{X}^{(i)}$ satisfies approximation (4.1) for $i\in\{0,1\}$ . To understand how dependence affects the strength of necessary causation, we study the value of $\textnormal{PN}(v,\bm{w})$ in the bivariate case with $\bm{w}=(0.5,0.5)$ , $\bm{X}^{(0)}\overset{\mathrm{d}}{=}\bm{Z}^{(0)}\sim\textnormal{MGPD}\left(\bm{T}^{(0)},\bm{\sigma}^{(0)},\bm{0}\right)$ and $\bm{X}^{(1)}\overset{\mathrm{d}}{=}\bm{Z}^{(1)}\sim\textnormal{MGPD}(\bm{T}^{(1)},\bm{\sigma}^{(1)},\bm{0})$ . In the example displayed in Figure 4, two different dependence structures are investigated, summarized by the tail dependence coefficient $\chi$ . The dotted line corresponds to increasing dependence, from $\chi^{(0)}=0.3$ to $\chi^{(1)}=0.5$ and an increasing marginal scale, from $\bm{\sigma}^{(0)}=(1,1)$ to $\bm{\sigma}^{(1)}=(2,2)$ . The dashed line again represents increasing dependence, but decreasing marginal scale, from $\bm{\sigma}^{(0)}=(2,2)$ to $\bm{\sigma}^{(1)}=(1,1)$ . Finally, the solid line shows increasing marginal scale of the same order as the dotted line, and decreasing dependence, from $\chi^{(0)}=0.5$ to $\chi^{(1)}=0.3$ .

Figure 4 shows that the dependence structure can have an impact on the PN for any finite value of $v$ . In other words, EEA based on a hypothesis of independence (eg, in space) will lead to incomplete statements concerning the strength of PN whenever the multivariate extremes are dependent. Figure 4 also suggests that, as $v$ increases, the impact of an increasing dependence in the factual world becomes negligible. However, it is important to keep in mind that in applications, the marginal scale might be constant between the two worlds. In that case, we will see the impact of increasing dependence in Figure 6.

4.2 Maximizing necessary causation

In a multivariate Gaussian set-up, Hannart and Naveau (2018) proposed to maximize causation probabilities by using the linear projection that will contrast the factual and counterfactual worlds the most. Their solution was similar to linear discriminant analysis. This leads to the question of how to reduce the dimension of a multivariate GPD vector, while ensuring that the projected data contains the most information in terms of causality for extremes.

More specifically, the choice of $\bm{w}$ plays an essential role in the maximization of necessary causation for multivariate GPD random variables. To address this point in the bivariate case, we need the following result.

Proposition 4.1.

Let $\gamma\in\mathbb{R}$ and consider two positive bivariate scale parameters: $\bm{\sigma}^{(0)}=(\sigma_{1}^{(0)},\sigma_{2}^{(0)})^{T}$ and $\bm{\sigma}^{(1)}=(\sigma_{1}^{(1)},\sigma_{2}^{(1)})^{T}$ . Denote

[TABLE]

and if $R>0$ , define the weights

[TABLE]

If $R\geq 0$ and if one of the two weights $w_{\pm}(v)$ belongs to $(0,1)$ , then this weight, denoted $w_{\textnormal{opt}}$ , maximizes

[TABLE]

In all other cases, only zero or unit weights maximize this ratio.

When $\gamma=0$ , $w_{\pm}(v)$ is simpler because it does not depend on $v$ . Expression (4.3) is an approximation of the PN defined in (4.2); it is equal to the PN when $\bm{X}^{(0)},\bm{X}^{(0)}$ are multivariate GPDs and when $\mathbb{P}[\bm{w}^{T}\bm{Z}^{(0)}>0]=\mathbb{P}[\bm{w}^{T}\bm{Z}^{(1)}>0]$ , i.e., when the dependence structure remains constant between the two worlds. Proposition 4.1 allows us to study the gain in terms of PN with respect to the weight $w$ . When unit or zero weights are chosen as the optimal solution in Proposition 4.1, only one coordinate is considered and no linear projection is necessary. This happens when the contrast in one of the margins between the factual and counterfactual world is already sufficient to optimize PN. However, Proposition 4.1 shows that, to maximize necessary causality, one needs to consider only those components that (individually) give the largest PN. As an example, take $\bm{X}^{(0)}\overset{\mathrm{d}}{=}\bm{Z}^{(0)}\sim\textnormal{MGPD}(\bm{T}^{(0)},(1,2)^{T},\gamma\bm{1}_{d})$ and $\bm{X}^{(1)}\overset{\mathrm{d}}{=}\bm{Z}^{(1)}\sim\textnormal{MGPD}(\bm{T}^{(1)},(1.5,2)^{T},\gamma\bm{1}_{d})$ . The dependence structures of $\bm{T}^{(1)}$ and $\bm{T}^{(0)}$ are chosen such that $\chi^{(0)}=\chi^{(1)}=0.5$ . Hence, the difference between the two worlds is only due to the scale change in one of the components. Figure 5 shows the PN gain as a function of $v$ , i.e., the ratio between $\textnormal{PN}(v,(0.5,0.5)^{T})$ and $\textnormal{PN}(v,(w_{\textnormal{opt}},1-w_{\textnormal{opt}})^{T})$ where $w_{\textnormal{opt}}=1$ based on Proposition 4.1. Each curve corresponds to a different shape parameter (with constraint $\gamma^{(0)}=\gamma^{(1)}$ ), equal to $-0.4$ (dashed line), [math] (solid line), and $0.4$ (dotted line), respectively. We see that the optimal weight can lead to a large increase in necessary causality, particularly when the shape parameter is positive (dotted line).

Explicit optimal weights like in Proposition 4.1 can only be obtained in very specific cases. For example, for the bivariate Gaussian GPD with $\gamma=1$ , the probability $\mathbb{P}[\bm{w}^{T}\bm{Z}^{(i)}>0]$ does not depend on $\bm{w}$ (see Proposition A.1 in the supplementary material, Kiriliouk and Naveau (2020)). For most other cases, numerical optimization schemes have to be used, especially beyond the bivariate set-up. In order to move closer to practical applications, we need to couple this optimization procedure with inference in a multivariate context.

4.3 Inference

Let $\bm{X}^{(0)}_{1},\ldots,\bm{X}^{(0)}_{n}$ and $\bm{X}^{(1)}_{1},\ldots,\bm{X}^{(1)}_{n}$ denote two independent samples of size $n$ , representing climate model output in the counterfactual and the factual world respectively, and let $\bm{u}^{(0)}$ , $\bm{u}^{(1)}$ denote two high thresholds. For $i\in\{0,1\}$ , let $N_{i}$ denote the number of observations among $\bm{X}^{(i)}_{1},\ldots,\bm{X}^{(i)}_{n}$ that have at least one component exceeding $\bm{u}^{(i)}$ . Extracting these observations and subtracting $\bm{u}^{(i)}$ , we obtain the multivariate GPD samples $\bm{Z}_{1}^{(i)},\ldots,\bm{Z}^{(i)}_{N_{i}}$ . For $v>\bm{w}^{T}\bm{u}^{(i)}$ , an estimator of $p_{i}(v;\bm{w})=\mathbb{P}[\bm{w}^{T}\bm{X}^{(i)}>v]$ and hence of the PN follows from approximation (4.1). The first term, $\mathbb{P}\left[\bm{w}^{T}\bm{X}^{(i)}>\bm{w}^{T}\bm{u}^{(i)}\right]$ , can be estimated nonparametrically by

[TABLE]

To estimate the second term, $\overline{H}\left(v-\bm{w}^{T}\bm{u}^{(i)};\bm{w}^{T}\bm{\sigma}^{(i)},\gamma^{(i)}\right)$ , we first compute estimators $\widehat{\bm{\sigma}}^{(i)}$ and $\widehat{\bm{\gamma}}^{(i)}$ by applying the method of probability weighted moments to $(Z^{(i)}_{tj}\mid Z^{(i)}_{tj}>0)_{t=1,\ldots,N_{i}}$ for $j\in\{1,\ldots,d\}$ (see Section C in the supplementary material, Kiriliouk and Naveau (2020)). Next, we set $\widehat{\gamma}^{(i)}=d^{-1}\sum_{j=1}^{d}\widehat{\gamma}_{j}^{(i)}$ 333An alternative method to enforce equal shape parameters is described in Carreau, Naveau and Neppel (2017). Finally, we estimate $p_{i}(v;\bm{w})$ by

[TABLE]

Alternatively, we could directly estimate $\gamma^{(i)}$ and $\bm{w}^{T}\bm{\sigma}^{(i)}$ by applying the method of probability weighted moments to $(\bm{w}^{T}\bm{Z}_{t}^{(i)}\mid\bm{w}^{T}\bm{Z}^{(i)}_{t}>0)_{t=1,\ldots,N_{i}}$ , which reduces uncertainty and enforces the constraint of equal shape parameters, but requires the weights $\bm{w}$ to be chosen upfront. Section D in the supplementary material (Kiriliouk and Naveau, 2020) shows a small simulation experiment, confirming the good performance of $\widehat{\textnormal{PN}}=1-\widehat{p}_{0}/\widehat{p}_{1}$ .

In the previous sections, we studied the increase in PN for changing dependence structures and marginal parameters. Another important question is what happens when marginal parameters do not change ( $\bm{\sigma}^{(0)}=\bm{\sigma}^{(1)}$ and $\bm{\gamma}^{(0)}=\bm{\gamma}^{(1)}$ ), while dependence increases ( $\chi^{(1)}>\chi^{(0)}$ ). Under a hypothesis of independence in space, one would aggregate the observations from all grid points (i.e., calculate $\bm{w}^{T}\bm{X}^{(i)}$ ) and estimate the univariate PN, thus possibly underestimating the true PN. To see by how much, and how the result varies with the dimension, we conduct the following experiment. Consider $d=9$ points on a regular $3\times 3$ unit distance grid. For distances from $1$ to $\sqrt{8}$ , pairwise tail dependence coefficients ranging from $0.4$ to $0.3$ for the counterfactual world and from $0.55$ to $0.4$ for the factual world were obtained using a Whittle-Matérn correlation function444The covariance matrices $\Sigma^{(0)}$ and $\Sigma^{(1)}$ are generated using a Whittle-Matérn correlation function with fixed shape $\kappa^{(0)}=\kappa^{(1)}=1$ and varying scales $\phi^{(0)}=1$ , $\phi^{(1)}=2.5$ . The correlation matrices are then multiplied by 10 to obtain $\Sigma^{(0)}$ and $\Sigma^{(1)}$ . We evaluate the PN in the $99\%$ quantile of $\bm{w}^{T}\bm{Z}^{(0)}$ using equal weights, calculated based on a pre-simulation run of sample size $10^{6}$ and held fixed. Figure 6 shows boxplots of the multivariate estimates $\widehat{\textnormal{PN}}$ minus the univariate estimates, based on 1000 samples of size $n=2000$ . The black line corresponds to the true values, calculated using the formulas in Sections A and B in the supplementary material (Kiriliouk and Naveau, 2020). We see that as the dimension increases, taking dependence into account increases necessary causation.

5 Analysis of heavy precipitation from the CNRM model

Evidencing causality is more difficult for heavy rainfall than for extreme temperatures, because precipitation variability is greater in space and time and because extreme rainfall has heavier tails than temperatures (extreme rainfall often has $\gamma\approx 0.2$ , see, e.g. Katz, Parlange and Naveau (2002)). We work with simulated rainfall time series from the French global climate model of Météo-France (CNRM) that belongs to the latest Coupled Model Intercomparison Project (CMIP6). We consider the winter months between the 1st of January 1985 until the 31st of August 2014 over the region defined by $-10$ to $40$ in longitude and $35$ to $60$ in latitude (corresponding to central Europe). Our factual and counterfactual worlds correspond to two historical runs, the second one of which has only natural forcings. We take the weekly maxima of winter precipitation. As the number of years covers only three decades, the rainfall series can be considered stationary in time within each world. Concerning their spatial structure, we apply the partitioning around medoids (PAM) algorithm (Kaufman and Rousseeuw, 1990) to the counterfactual rainfall run. The difference with the original PAM version is that our “distance” between two locations $s$ and $t$ is tailored to threshold exceedances via

[TABLE]

where $\widehat{\chi}_{st}$ denotes the standard empirical estimator of the pairwise tail dependence coefficient (see, e.g. Kiriliouk, Segers and Warchoł, 2016, and references therein). Our approach is close to the one of Bernard et al. (2013), who focused on maxima instead of threshold exceedances. Figure 7 displays the spatial structure for $K=40$ clusters555Other values of $K$ were tested and provided similar patterns.. Although no spatial coordinates were given to the algorithm, the clusters appear to be spatially homogeneous and climatologically coherent. As the multivariate GPD is tailored for asymptotic dependence, identifying dependent regions helps to improve its fit. In addition, such a spatial clustering makes the assumption of a constant shape parameter within a region more reasonable.

We hence model each cluster independently, calculating the estimated multivariate PN based on $\widehat{p}_{0},\widehat{p}_{1}$ defined in (4.4).

Figures 8 and 9 show necessity causation probabilities for the five-year and fifty-year return level respectively. The return levels were calculated based on quantiles of $\bm{w}^{T}X^{(0)}$ for each cluster, with equal weights. Both figures show the PN per cluster, calculated using equal weights (top) and optimal weights (bottom). The diameters of the black circles around the PN estimates are proportional to the length of bootstrap-based 95% confidence intervals.

Higher PN does not necessarily comes with higher uncertainty; see for instance the cluster around northern Italy, whose confidence interval is narrow for the five-year return level and even more so for the fifty-year return level. Comparing the two panels of Figure 8 shows that the differences between the factual and the counterfactual world are higher when using optimal weights. This feature is even more striking for the fifty-year return levels, see Figure 9. Except for locations near the English channel, most points have a probability of necessary causation that is greater than $0.5$ . In particular, a few points like northern Italy shows a probability near one.

In Section E of the supplementary material (Kiriliouk and Naveau, 2020), the above approach is compared to several univariate approaches for the five-year return level. We found that, even though patterns are similar (i.e., high PN around northern Italy), a univariate analysis leads to lower PN on average and to wider confidence intervals when PN is relatively high ( $>0.5$ ). Hence, a multivariate GPD approach enhances the causality message of a univariate analysis and aids in decreasing the uncertainty of the estimates.

Finally, our analysis is not sufficient to conclude general climatological results about heavy rainfall. The patterns found here may be due to this specific climate model, internal climate variability or other sources of variability. An exhaustive analysis of all the CMIP6 models, in terms of computer resources and climatological expertise, is beyond the scope of this work.

6 Discussion

This paper illustrates that methods combining multivariate extreme-value theory and counterfactual theory could help climatologists working on causality and multivariate extremes (see, e.g. Kim et al., 2016; Zscheischler et al., 2018). An advantage of our approach is its simplicity: we consider an event to be extreme if the weighted average of a climatological random vector exceeds a high threshold, and EVT naturally suggests the multivariate GPD to model such multivariate threshold exceedances. While multivariate EVT models can take on complex parametric forms, the model we propose is easily estimated since linear projections of multivariate GPD vectors behave like a univariate GPD. When spatial dependence changes between the factual and the counterfactual world, a univariate analysis might under- or overestimate the causation probabilities, while the proposed approach will takes these changes into account. In addition, the application on heavy precipitation suggests that the multivariate approach can help in reducing the uncertainty of the estimates. Finally, the multivariate approach can highlight those grid points that maximise the probability of necessary causation through an adequate choice of the weights $\bm{w}$ .

Some care is needed when applying the multivariate methodology. First of all, the model is restricted to homogeneous regions, since it assumes an equal shape parameter. This is a common assumption in multivariate EVT models. Moreover, the multivariate GPD is tailored for data that exhibit asymptotic dependence, i.e., the extremes in each grid point are expected to occur together. Finally, the dataset under study needs to be reasonably stationary in time. In future work, a non-stationary extension of the multivariate GPD could be proposed that would be highly relevant for longer periods of data from the factual world.

Another interesting research direction will be to extend the coupling between EVT and counterfactual theory to other types of extremes modeling in geosciences; see, e.g. Hammerling, Katzfuss and Smith (2017) or Reich, Shaby and Cooley (2014) for a Bayesian hierarchical point of view, Shooter et al. (2019) for asymptotic independence models, and Ragone, Wouters and Bouchet (2018) for rare event algorithms.

{supplement}

[id=suppA] \stitleSupplement to “Climate extreme event attribution using multivariate peaks-over-thresholds modeling and counterfactual theory” \slink[doi]DOI \sdatatype.pdf \sdescriptionThe supplement includes results on the Gaussian MGPD model, probability weighted moment inference, a simulation study and further analysis for the precipitation data.

Bibliography47

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Angélil, Stone and Wehner (2017) {barticle} [author] \bauthor \bsnm Angélil, \bfnm O. \binits O., \bauthor \bsnm Stone, \bfnm D. \binits D. and \bauthor \bsnm Wehner, \bfnm M. \binits M. ( \byear 2017). \btitle An Independent Assessment of Anthropogenic Attribution Statements for Recent Extreme Temperature and Rainfall Events. \bjournal Journal of Climate \bvolume 30 \bpages 5–16. \endbibitem
2Balkema and de Haan (1974) {barticle} [author] \bauthor \bsnm Balkema, \bfnm August A \binits A. A. and \bauthor \bparticle de \bsnm Haan, \bfnm Laurens \binits L. ( \byear 1974). \btitle Residual life time at great age. \bjournal The Annals of Probability \bvolume 2 \bpages 792–804. \endbibitem
3Beirlant et al. (2004) {bbook} [author] \bauthor \bsnm Beirlant, \bfnm J. \binits J., \bauthor \bsnm Goegebeur, \bfnm Y. \binits Y., \bauthor \bsnm Segers, \bfnm J. \binits J. and \bauthor \bsnm Teugels, \bfnm J. \binits J. ( \byear 2004). \btitle Statistics of Extremes: Theory and Applications. \bpublisher Wiley. \endbibitem
4Bernard et al. (2013) {barticle} [author] \bauthor \bsnm Bernard, \bfnm Elsa \binits E., \bauthor \bsnm Naveau, \bfnm Philippe \binits P., \bauthor \bsnm Vrac, \bfnm Mathieu \binits M. and \bauthor \bsnm Mestre, \bfnm Olivier \binits O. ( \byear 2013). \btitle Clustering of maxima: Spatial dependencies among heavy rainfall in France. \bjournal Journal of Climate \bvolume 26 \bpages 7929–7937. \endbibitem
5Carreau, Naveau and Neppel (2017) {barticle} [author] \bauthor \bsnm Carreau, \bfnm Julie \binits J., \bauthor \bsnm Naveau, \bfnm P \binits P. and \bauthor \bsnm Neppel, \bfnm L \binits L. ( \byear 2017). \btitle Partitioning into hazard subregions for regional peaks-over-threshold modeling of heavy precipitation. \bjournal Water Resources Research \bvolume 53 \bpages 4407–4426. \endbibitem
6Chen et al. (2018) {barticle} [author] \bauthor \bsnm Chen, \bfnm Yang \binits Y., \bauthor \bsnm Moufouma-Okia, \bfnm Wilfran \binits W., \bauthor \bsnm Masson-Delmotte, \bfnm Valérie \binits V., \bauthor \bsnm Zhai, \bfnm Panmao \binits P. and \bauthor \bsnm Pirani, \bfnm Anna \binits A. ( \byear 2018). \btitle Recent Progress and Emerging Topics on Weather and Climate Extremes Since the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. \bjournal Annual
7Coles (2001) {bbook} [author] \bauthor \bsnm Coles, \bfnm Stuart G. \binits S. G. ( \byear 2001). \btitle An Introduction to Statistical Modeling of Extreme Values. \bpublisher Springer-Verlag Inc. \endbibitem
8Cooley, Hunter and Smith (2017) {binbook} [author] \bauthor \bsnm Cooley, \bfnm D. \binits D., \bauthor \bsnm Hunter, \bfnm B. D. \binits B. D. and \bauthor \bsnm Smith, \bfnm R. L. \binits R. L. ( \byear 2017). \btitle Handbook of Environmental and Ecological Statistics \bchapter Univariate and Multivariate Extremes for the Environmental Sciences. \bpublisher CRC Press. \endbibitem

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Climate extreme event attribution using

Abstract

keywords:

1 Introduction

2 Background

2.1 Climate event attribution and counterfactual theory

2.2 Impact of multivariate extremal modeling on return periods

2.3 The multivariate generalized Pareto distribution

Proposition 2.1** **(Linear-projection, Rootzén, Segers and

3 Causation probabilities for univariate extremes

4 Necessary causation in a multivariate set-up

4.1 The multivariate GPD and necessary causation

4.2 Maximizing necessary causation

Proposition 4.1**.**

4.3 Inference

5 Analysis of heavy precipitation from the CNRM model

6 Discussion

Proposition 2.1 (Linear-projection, Rootzén, Segers and

Proposition 4.1.