Sensitivity Analysis and Generalized Chaos Expansions. Lower Bounds for Sobol indices

O. Roustant (LIMOS; GdR MASCOT-NUM; Fayol-ENSMSE); F. Gamboa (IMT); B. Iooss (EDF R&D MRI; IMT; GdR MASCOT-NUM)

arXiv:1906.09883·math.ST·June 25, 2019


TL;DR

This paper introduces generalized chaos expansions based on tensor Hilbert bases to estimate Sobol' sensitivity indices, providing new lower bounds that enhance variable screening in complex models.

Contribution

It develops a generalized framework for chaos expansions and derives lower bounds for Sobol' indices, improving sensitivity analysis methods.

Findings

01. Lower bounds for Sobol' indices are established.

02. Bounds are effective for variable screening.

03. Demonstrated accuracy on toy and real models.

Abstract

The so-called polynomial chaos expansion is widely used in computer experiments. For example, it is a powerful tool to estimate Sobol' sensitivity indices. In this paper, we consider generalized chaos expansions built on general tensor Hilbert bases. In this framework, we revisit the computation of the Sobol' indices and give general lower bounds for these indices. The case of the eigenfunction system associated with a Poincaré differential operator leads to lower bounds involving the derivatives of the analyzed function and provides an efficient tool for variable screening. These lower bounds are put into action on both toy and real-life models, demonstrating their accuracy.

Tables

Table 1: Useful quantities for derivative-based lower bounds. For readability, we have removed the subscript $j$ for $p,Z,I$. The parameter $s$ is a scale parameter, and can be different from the standard deviation.

Dist. name	Support	$p$	$Z$	$I$
Normal	$\mathbb{R}$	$\frac{1}{s\sqrt{2\pi}}\exp\left(-\frac{1}{2}\frac{(x-m)^{2}}{s^{2}}\right)$	$-(x-m)/s^{2}$	$1/s^{2}$
Laplace	$\mathbb{R}$	$\frac{1}{2s}\exp\left(-\frac{|x-m|}{s}\right)$	$-\mathrm{sgn}(x-m)/s$	$1/s^{2}$
Cauchy	$\mathbb{R}$	$\frac{1}{\pi}\frac{s}{(x-x_{0})^{2}+s^{2}}$	$\frac{-2(x-x_{0})}{(x-x_{0})^{2}+s^{2}}$	$1/(2s^{2})$

Equations

$$h(X)=h_{0}+\sum_{i=1}^{d}h_{i}(X_{i})+\sum_{1\leq i<j\leq d}h_{i,j}(X_{i},X_{j})+\dots+h_{1,\dots,d}(X_{1},\dots,X_{d})$$

$$h_{I}(X_{I})=\mathbb{E}[h(X)\mid X_{I}]-\sum_{J\subsetneq I}h_{J}(X_{J})=\sum_{J\subseteq I}(-1)^{|I|-|J|}\,\mathbb{E}[h(X)\mid X_{J}].$$

$$\mathbb{E}[h_{I}(X_{I})\mid X_{J}]=0\quad\text{for all }J\subsetneq I,$$

$$D:=\mathrm{var}(h(X))=\sum_{I\subseteq\{1,\dots,d\}}\mathrm{var}(h_{I}(X_{I})).$$

$$D=\sum_{I}D_{I},\qquad 1=\sum_{I}S_{I}.$$

$$D_{I}^{\text{tot}}:=\sum_{J\supseteq I}D_{J}.$$

$$\nu_{I}=\int\left(\frac{\partial^{|I|}h(x)}{\partial x_{I}}\right)^{2}\mu(dx).$$

$$\mathcal{H}=\bigoplus_{I\subseteq\{1,\dots,d\}}^{\perp}\mathcal{H}_{I}$$

$$\langle\Pi_{I}g,h\rangle=\mathbb{E}(g_{I}(X_{I})h(X))=\sum_{J\subseteq\{1,\dots,d\}}\mathbb{E}(g_{I}(X_{I})h_{J}(X_{J}))$$

$$\langle\Pi_{I}g,h\rangle=\mathbb{E}(g_{I}(X_{I})h_{I}(X_{I}))=\langle g,\Pi_{I}h\rangle,$$

$$\mathcal{H}_{I}^{\text{tot}}=\bigoplus_{J\supseteq I}^{\perp}\mathcal{H}_{J}.$$

$$e_{\underline{\ell}}(x):=\Bigl(\underset{i=1,\dots,d}{\otimes}e_{i,\ell_{i}}\Bigr)(x)=e_{1,\ell_{1}}(x_{1})\times\dots\times e_{d,\ell_{d}}(x_{d}).$$

$$\mathbb{E}\Bigl[\prod_{i\in I}e_{i,\ell_{i}}(X_{i})\,\Big|\,X_{J}\Bigr]=\prod_{j\in J}e_{j,\ell_{j}}(X_{j})\prod_{i\in I\setminus J}\mathbb{E}\left[e_{i,\ell_{i}}(X_{i})\right]$$

$$h=\sum_{\underline{\ell}\in\mathbb{N}^{d}}c_{\underline{\ell}}e_{\underline{\ell}}=\sum_{e_{\underline{\ell}}\in\mathcal{T}_{I}}c_{\underline{\ell}}e_{\underline{\ell}}+\sum_{e_{\underline{\ell}}\notin\mathcal{T}_{I}}c_{\underline{\ell}}e_{\underline{\ell}}$$

$$D_{1}^{\text{tot}}(h)\geq\sum_{n=1}^{N}\left(\int h(x)\phi_{n}(x)\mu(dx)\right)^{2}$$

$$\mathrm{var}_{\mu_{1}}(h)\leq C\int_{\mathbb{R}}h^{\prime}(x)^{2}\mu_{1}(dx),$$

$$H^{\ell}(\mu_{1}):=\{h\in L^{2}(\mu_{1})\text{ such that for all }k\leq\ell,\ h^{(k)}\in L^{2}(\mu_{1})\}$$

$$Lh=h^{\prime\prime}-V^{\prime}h^{\prime}$$

$$\langle h^{\prime},e_{n}^{\prime}\rangle=\lambda_{n}\langle h,e_{n}\rangle,$$

$$\|h\|^{2}=\sum_{n=1}^{\infty}\langle h,e_{n}\rangle^{2}=\sum_{n=1}^{\infty}\frac{1}{\lambda_{n}^{2}}\langle h^{\prime},e_{n}^{\prime}\rangle^{2}.$$

$$\langle h,e_{1,\ell_{1}}\dots e_{d,\ell_{d}}\rangle=\frac{1}{\lambda_{1,\ell_{1}}}\left\langle\frac{\partial h}{\partial x_{1}},e_{1,\ell_{1}}^{\prime}e_{2,\ell_{2}}\dots e_{d,\ell_{d}}\right\rangle$$

$$e_{\ell}(x_{1})=\sqrt{2}\cos(\pi\ell(x_{1}+1/2))$$

$$D_{1}^{\text{tot}}(h)=\sum_{\ell_{1}\geq 1,\,\ell_{2},\dots,\ell_{d}}2^{|\underline{\ell}|_{0}}\Bigl\langle h,\prod_{i=1}^{d}\cos(\pi\ell_{i}(x_{i}+1/2))\Bigr\rangle^{2}$$

$$\mathrm{var}_{\mu_{1}}(h)\leq C\int_{\mathbb{R}}h^{\prime}(x)^{2}w(x)\mu_{1}(dx),$$


Taxonomy

Topics: Probabilistic and Robust Engineering Design · Structural Response to Dynamic Loads · Elasticity and Material Modeling

Full text

Sensitivity Analysis and Generalized Chaos Expansions. Lower Bounds for Sobol indices.

O. Roustant

Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS, UMR 6158 LIMOS, F–42023 Saint-Étienne, France

F. Gamboa

Institut de Mathématiques de Toulouse, Université Paul Sabatier, 31062 Toulouse Cedex 9, France

B. Iooss

Electricité de France R&D, 6 quai Watier, Chatou, F-78401, France

Institut de Mathématiques de Toulouse, Université Paul Sabatier, 31062 Toulouse Cedex 9, France

Abstract

The so-called polynomial chaos expansion is widely used in computer experiments. For example, it is a powerful tool to estimate Sobol’ sensitivity indices. In this paper, we consider generalized chaos expansions built on general tensor Hilbert bases. In this framework, we revisit the computation of the Sobol’ indices and give general lower bounds for these indices. The case of the eigenfunction system associated with a Poincaré differential operator leads to lower bounds involving the derivatives of the analyzed function and provides an efficient tool for variable screening. These lower bounds are put into action on both toy and real-life models, demonstrating their accuracy.

1 Introduction
2 Background on sensitivity analysis
3 Generalized chaos expansions
4 Poincaré differential operator expansions
5 Weight-free derivative global sensitivity measures
6 Examples on analytical functions
6.1 A polynomial function with interaction
6.2 A separable function
7 Applications
7.1 A simplified flood model
7.2 An aquatic prey-predator chain
7.3 Conclusion on the applications
8 Further works

1 Introduction

Computer models simulating physical phenomena and industrial systems are commonly used in engineering and safety studies. They often take a large number of numerical and physical variables as inputs. For the development and analysis of such computer models, global sensitivity analysis is an invaluable tool that makes it possible to rank the relative importance of each input of the system [20], [18]. Relying on a probabilistic model of the input variables, it accounts for the whole input range of variation, and tries to explain output uncertainties on the basis of input uncertainties. Thanks to the so-called functional ANOVA (analysis of variance) decomposition [2], the Sobol’ indices give, for a square integrable non-linear model and stochastically independent input variables, the parts of the output variance due to each input and to each interaction between inputs [32], [15]. In addition, the total Sobol’ index provides the overall contribution of each input [16], including interactions with other inputs. More generally, we recall that the Sobol’ index associated with a subset of variables $I$ is the ratio of the ANOVA index (that is, the squared $L^{2}$ norm of the contribution associated with $I$ in the ANOVA decomposition) to the variance of the output (see Section 2 for the precise definition).

Many methods exist to accurately compute or statistically estimate the first-order Sobol’ indices. For a general overview of these methods, we refer to [18] and references therein. One of the most popular and powerful methods is polynomial chaos (PC) expansion [13], [36]. It consists in projecting the response onto the specific basis formed by the orthonormal polynomials built on the input distributions. Its strength lies in the fact that, once the expansion is computed, the Parseval formula directly gives all the ANOVA indices (in particular the total Sobol’ indices) [36, 7]. Of course, in practice the PC expansion is truncated. An obvious but important fact is that this truncated PC expansion provides a lower bound for the true ANOVA index. In this paper, we consider general tensor Hilbert bases, called generalized chaos (GC). We then use the same truncation argument to produce general lower bounds (see Section 3). A suitable choice of the GC then produces new lower bounds involving the derivatives of the function of interest (see Section 4). More precisely, this special Hilbert basis is obtained by diagonalizing the Poincaré differential operators (PDO) associated with the input distributions (this operator is related to the Poincaré inequality, see [3] or [5]). Notice that other special GC expansions, based on the diagonalization of reproducing kernels, have recently been studied and used for global sensitivity purposes in [29].

In general, the estimation of the total Sobol’ indices (and other ANOVA indices) suffers from the curse of dimensionality (number of inputs) and can be too costly in terms of the number of model evaluations [28]. Low-cost computations of upper and lower bounds for total Sobol’ indices are then very useful. DGSM (Derivative-based Global Sensitivity Measures, see [34]), computed from integrals of the squared derivatives of the model output, may give such economical upper and lower bounds [22, 21]. Indeed, in many physical models the so-called adjoint method allows the derivatives of the model to be evaluated at low extra cost (see for example the recent review [1]). Concerning the upper bounds, optimal and general (for any distribution type of the input) results are obtained in [30]. For lower bounds, only special cases (uniform, Normal and Gamma) have been investigated in [37, 23] (see [21] for a review). The bounds given in [23] are quite rough, as they are smaller than the first-order Sobol’ indices. In our work, we follow the path opened by [37] using PC expansions, but for much more general distributions and expansions. Indeed, for a wide class of input distributions the PDO generalized chaos expansion leads naturally to quantities built on the derivatives.

Notice that the diagonalization of the PDO used here leads to orthogonal polynomials only for the Gaussian distribution (see [3] and [4]). Indeed, the PDO considered in this paper only involves the integration of the squared derivatives with respect to the input distribution (and not a reweighted input distribution). Apart from this particular probability distribution, orthogonal polynomials cannot be interpreted, in general, as eigenfunctions of a PDO. Consequently, the Hilbert basis built by diagonalizing a PDO is in general not a polynomial basis. For example, for the uniform distribution, it is the Fourier basis.

The paper is structured as follows. Section 2 recalls the required mathematical tools for global sensitivity analysis (ANOVA decomposition and DGSM). Section 3 rephrases the ANOVA decomposition with Hilbert spaces, and introduces the generalized chaos expansion. Section 4 then focuses on PDO expansions, and their link to PC expansions. Section 5 proposes alternative orthonormal functions which lead to weight-free DGSM. Section 6 gives analytical examples. Section 7 illustrates the bounds on real-life applications. Section 8 gives some perspectives for future work.

2 Background on sensitivity analysis

To begin with, let $X=(X_{1},\dots,X_{d})$ denote the vector of independent input variables with distribution $\mu=\mu_{1}\otimes\dots\otimes\mu_{d}$ . Here the $\mu_{i}$ ’s are continuous probability measures on $\mathbb{R}$ . Let further $h$ be a multivariate function of interest $h:\Delta\subseteq\mathbb{R}^{d}\rightarrow\mathbb{R}$ . We assume that $h(X)\in\mathcal{H}:=L^{2}(\mu)$ .

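One of the main tools in global sensitivity analysis is the Sobol’-Hoeffding decomposition of $h$ (see [15, 11, 2, 32]). It provides a unique expansion of $h$ as

$$h(X)=h_{0}+\sum_{i=1}^{d}h_{i}(X_{i})+\sum_{1\leq i<j\leq d}h_{i,j}(X_{i},X_{j})+\dots+h_{1,\dots,d}(X_{1},\dots,X_{d})$$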

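with $\mathbb{E}[h_{I}(X_{I})|X_{J}]=0$ for all $I\subseteq\{1,\dots,d\}$ and all $J\subsetneq I$ (with the notation $X_{I}:=(X_{i},\,i\in I)$ ). Furthermore, $h_{0}=\mathbb{E}[h(X)]$ and

$$h_{I}(X_{I})=\mathbb{E}[h(X)\mid X_{I}]-\sum_{J\subsetneq I}h_{J}(X_{J})=\sum_{J\subseteq I}(-1)^{|I|-|J|}\,\mathbb{E}[h(X)\mid X_{J}].$$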

Notice that the condition

$$\mathbb{E}[h_{I}(X_{I})\mid X_{J}]=0\quad\text{for all }J\subsetneq I,$$

guarantees both the uniqueness of the decomposition and the orthogonality of $h_{I}(X_{I})$ to any square integrable random variable depending only on $X_{J}$ with $J\cap I\subsetneq I$ .

This last property leads to the so-called ANOVA decomposition for the variance of $h(X)$

$$D:=\mathrm{var}(h(X))=\sum_{I\subseteq\{1,\dots,d\}}\mathrm{var}(h_{I}(X_{I})).\tag{1}$$

Notice further that the Sobol’-Hoeffding decomposition is a particular case of the multivariate decomposition built on a finite family of commuting projectors $P_{1},\dots,P_{d}$ and obtained by expanding the following product (see [24]),

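$$I_{d}=\prod_{j=1}^{d}\bigl(P_{j}+(I_{d}-P_{j})\bigr)=\sum_{I\subseteq\{1,\dots,d\}}\Pi_{I},\qquad\Pi_{I}:=\prod_{j\in I}(I_{d}-P_{j})\prod_{j\notin I}P_{j},$$

where $I_{d}$ denotes the identity operator.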

Obviously, $\Pi_{I}$ is also a projector. In the Sobol’-Hoeffding decomposition the projection $P_{j}h$ is $\int h(x)d\mu_{j}(x_{j})$ .

In sensitivity analysis, one classically considers the Sobol’ indices. These indices are defined, for $I\subseteq\{1,\dots,d\}$ , as $S_{I}=D_{I}/D$ where $D_{I}:=\mathrm{var}(h_{I}(X_{I}))$ . From (1) one directly obtains

$$D=\sum_{I}D_{I},\qquad 1=\sum_{I}S_{I}.$$

Another interesting index is the total Sobol’ one that includes all the contributions on the total variance of a variable group. In this paper, the total index associated to one variable is the object under study. For $I\subseteq\{1,\dots,d\}$ , the total Sobol’ index associated to $I$ is defined as $S_{I}^{\text{tot}}:=\frac{D_{I}^{\text{tot}}}{D}$ with

$$D_{I}^{\text{tot}}:=\sum_{J\supseteq I}D_{J}.$$

To end this section, we recall the other popular global sensitivity index that will appear in our bounds. This is the so-called Derivative Global Sensitivity Measure (DGSM) introduced and studied in [33] and [22]. It is defined, for $I\subseteq\{1,\dots,d\}$ , under smoothness and integrability assumptions on $h$ as

$$\nu_{I}=\int\left(\frac{\partial^{|I|}h(x)}{\partial x_{I}}\right)^{2}\mu(dx).$$
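As an illustration of the two families of indices recalled in this section, the following minimal sketch estimates $D_{i}^{\text{tot}}$ and the first-order DGSM $\nu_{i}$ by Monte Carlo. The toy function $h(x)=x_{1}+x_{1}x_{2}$, the uniform inputs on $[-1/2,1/2]$, the sample size and the Jansen-type pick-freeze estimator are all assumptions chosen for the example; they are not taken from the paper. For this function the closed-form values are $D_{1}^{\text{tot}}=13/144$, $D_{2}^{\text{tot}}=1/144$, $\nu_{1}=13/12$ and $\nu_{2}=1/12$.

```python
import numpy as np

def h(x):
    """Toy model h(x) = x1 + x1 * x2 (hypothetical example)."""
    return x[:, 0] + x[:, 0] * x[:, 1]

def grad_h(x):
    """Analytical gradient of h, one column per input."""
    return np.stack([1.0 + x[:, 1], x[:, 0]], axis=1)

def total_variance_jansen(f, a, b, i):
    """Jansen-type estimate of D_i^tot: resample only input i and
    average half the squared change of the output."""
    ab = a.copy()
    ab[:, i] = b[:, i]
    return 0.5 * np.mean((f(a) - f(ab)) ** 2)

rng = np.random.default_rng(0)
n, d = 200_000, 2
a = rng.uniform(-0.5, 0.5, size=(n, d))   # first independent sample
b = rng.uniform(-0.5, 0.5, size=(n, d))   # second independent sample

d_tot = np.array([total_variance_jansen(h, a, b, i) for i in range(d)])
nu = np.mean(grad_h(a) ** 2, axis=0)      # DGSM nu_1, nu_2
s_tot = d_tot / np.var(h(a))              # total Sobol' indices
print("D_tot:", d_tot, "nu:", nu, "S_tot:", s_tot)
```

Here the derivative-based quantities only need the gradient at the sample points, whereas the pick-freeze estimate needs one extra model run per input and per sample point.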

3 Generalized chaos expansions

In order to present the generalized chaos expansions, it is convenient to first rephrase the classical functional ANOVA decomposition presented in the previous section as a Hilbert space decomposition. The next proposition is devoted to this task. In particular, we emphasize that the operator giving one ANOVA term is an orthogonal projection. Then, we discuss the construction of Hilbert basis tailored to ANOVA decomposition. Part of the material is inspired from [38] and [2].

Proposition 1 (Hilbert space decomposition for ANOVA).

For every subset $I$ of $\{1,\dots,d\}$ , the map $\Pi_{I}:h\in\mathcal{H}\mapsto h_{I}$ is an orthogonal projection. The image spaces ${\mathcal{H}_{I}=\Pi_{I}(\mathcal{H})=\{h\in\mathcal{H},\,h=h_{I}\}}$ , called ANOVA spaces, are Hilbert spaces that form an orthogonal decomposition of $\mathcal{H}$ :

$$\mathcal{H}=\bigoplus_{I\subseteq\{1,\dots,d\}}^{\perp}\mathcal{H}_{I}\tag{2}$$

Proof.

First, $\Pi_{I}$ is a projector since applying twice the ANOVA decomposition leaves it unchanged. Now, let $g,h\in\mathcal{H}$ . We have:

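$$\langle\Pi_{I}g,h\rangle=\mathbb{E}(g_{I}(X_{I})h(X))=\sum_{J\subseteq\{1,\dots,d\}}\mathbb{E}(g_{I}(X_{I})h_{J}(X_{J}))$$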

where we wrote the ANOVA decomposition of $h$ . Now, if $J\neq I$ , then $I\cap J\subsetneq I$ or $I\cap J\subsetneq J$ , thus $\mathbb{E}(g_{I}(X_{I})h_{J}(X_{J}))=0$ by the uniqueness property of ANOVA decomposition. Hence,

$$\langle\Pi_{I}g,h\rangle=\mathbb{E}(g_{I}(X_{I})h_{I}(X_{I}))=\langle g,\Pi_{I}h\rangle,$$

which proves that the projector $\Pi_{I}$ is self-adjoint, and thus orthogonal.

Consequently, $\Pi_{I}$ is continuous and $\mathcal{H}_{I}$ is a Hilbert space as a closed subspace of $\mathcal{H}$ . The direct sum (2) results from the existence and uniqueness of ANOVA decomposition. As shown above, the uniqueness property implies that $\mathcal{H}_{I}\perp\mathcal{H}_{J}$ if $I\neq J$ . ∎

Corollary 1 (Hilbert space decomposition for total effects).

Let $I$ be a subset of $\{1,\dots,d\}$ . Then the map $\Pi_{I}^{\text{tot}}:h\in\mathcal{H}\mapsto h_{I}^{\text{tot}}=\sum_{J\supseteq I}h_{J}$ is an orthogonal projection. The image space ${\mathcal{H}_{I}^{\text{tot}}=\Pi_{I}^{\text{tot}}(\mathcal{H})=\{h\in\mathcal{H},\,h=h_{I}^{\text{tot}}\}}$ is the Hilbert space

$$\mathcal{H}_{I}^{\text{tot}}=\bigoplus_{J\supseteq I}^{\perp}\mathcal{H}_{J}.$$

Proof.

Observe that $\Pi_{I}^{\text{tot}}=\sum_{J\supseteq I}\Pi_{J}$ . As the $\Pi_{J}$ are commuting orthogonal projections, $\Pi_{I}^{\text{tot}}$ is an orthogonal projection. The remainder is straightforward. ∎

We now exhibit Hilbert bases of $\mathcal{H}$ that are adapted to the ANOVA decomposition, in the sense that each element belongs to one ANOVA space $\mathcal{H}_{I}$ . This provides Hilbert bases for all $\mathcal{H}_{I}$ and $\mathcal{H}_{I}^{\text{tot}}$ .

Definition 1 (Generalized chaos).

For $i=1,\dots,d$ , let $(e_{i,n})_{n\in\mathbb{N}}$ be a Hilbert basis of $L^{2}(\mu_{i})$ , with $e_{i,0}=1$ . For a multi-index ${\underline{\ell}}=(\ell_{1},\dots,\ell_{d})\in\mathbb{N}^{d}$ , the generalized chaos of order ${\underline{\ell}}$ is defined as the following $L^{2}(\mu)$ function:

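$$e_{\underline{\ell}}(x):=\Bigl(\underset{i=1,\dots,d}{\otimes}e_{i,\ell_{i}}\Bigr)(x)=e_{1,\ell_{1}}(x_{1})\times\dots\times e_{d,\ell_{d}}(x_{d}).$$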

The so-called polynomial chaos introduced by [39], built with the orthogonal polynomials associated to the Gaussian distribution (Hermite polynomials $(H_{n})$ ), is a special case of the previous definition (with $e_{i,n}=H_{n}$ ). Similarly, this is also the case for the generalized polynomial chaos corresponding to orthogonal polynomials associated to other probability distributions. For history on polynomial chaos and generalized polynomial chaos, we refer to the introduction of [12]. Other examples of generalized chaos in the context of sensitivity analysis are the Fourier bases, investigated in [8], and the Haar systems originally used by Sobol’ [31].

Proposition 2.

1. The whole set of generalized chaos $\mathcal{T}:=(e_{\underline{\ell}})_{{\underline{\ell}}\in\mathbb{N}^{d}}$ is a Hilbert basis of $\mathcal{H}$ , and each $e_{\underline{\ell}}$ belongs to (exactly) one $\mathcal{H}_{I}$ , where $I$ is the set containing the indices of active variables: $I=\{i\in\{1,\dots,d\}:\,\ell_{i}\geq 1\}$ .

2. For all $I\subseteq\{1,\dots,d\}$ ,

• The subset of basis functions that involve exactly the variables in $I$ , $\mathcal{T}_{I}:=\{e_{\underline{\ell}},\,\mbox{ with }\ell_{i}\geq 1\mbox{ if }i\in I\mbox{ and }\ell_{i}=0\mbox{ if }i\notin I\}$ is a Hilbert basis of $\mathcal{H}_{I}$ .

• The subset of basis functions that involve at least the variables in $I$ , $\mathcal{T}_{I}^{\text{tot}}:=\{e_{\underline{\ell}},\,\mbox{ with }\ell_{i}\geq 1\mbox{ if }i\in I\}$ is a Hilbert basis of $\mathcal{H}_{I}^{\text{tot}}$ .

Notice that in the definition of $\mathcal{T}_{I}$ and $\mathcal{T}_{I}^{\text{tot}}$ , the index $\ell_{i}$ is nonzero for every $i\in I$ , which means that $x_{i}$ is active.

Proof.

The fact that $\mathcal{T}$ is a Hilbert basis of $\mathcal{H}$ is well known. Let us see that $e_{\underline{\ell}}$ belongs to $\mathcal{H}_{I}$ , with $I=\{i\in\{1,\dots,d\}:\,\ell_{i}\geq 1\}$ . For that, we need to check that the ANOVA decomposition of $e_{\underline{\ell}}$ consists of only one non-zero term corresponding to the subset $I$ and equal to $e_{\underline{\ell}}$ . As $e_{\underline{\ell}}$ is a function of $x_{I}$ , it remains to check the non-overlapping condition. Let $J$ be a strict subset of $I$ (possibly empty). Then,

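$$\mathbb{E}\Bigl[\prod_{i\in I}e_{i,\ell_{i}}(X_{i})\,\Big|\,X_{J}\Bigr]=\prod_{j\in J}e_{j,\ell_{j}}(X_{j})\prod_{i\in I\setminus J}\mathbb{E}\left[e_{i,\ell_{i}}(X_{i})\right]$$

Let us choose $i\in I\setminus J$ . Then, $\ell_{i}\geq 1$ , implying that $\mathbb{E}\left[e_{i,\ell_{i}}(X_{i})\right]=0$ (as $e_{i,\ell_{i}}$ is orthogonal to $e_{i,0}=1$ ). Hence the conditional expectation above vanishes, and $e_{\underline{\ell}}$ belongs to $\mathcal{H}_{I}$ . Now let us fix a subset $I$ of $\{1,\dots,d\}$ , and consider for instance $\mathcal{T}_{I}$ (the proof is similar for $\mathcal{T}_{I}^{\text{tot}}$ ). Clearly, as a subset of $\mathcal{T}$ , the set $\mathcal{T}_{I}$ is a collection of orthonormal functions. Furthermore, by the proof above, each $e_{\underline{\ell}}$ of $\mathcal{T}_{I}$ belongs to $\mathcal{H}_{I}$ . To see that the span of $\mathcal{T}_{I}$ is dense in $\mathcal{H}_{I}$ , let us choose $h\in\mathcal{H}_{I}$ . Since $\mathcal{T}$ is a Hilbert basis of $\mathcal{H}$ , $h$ can be written as

$$h=\sum_{\underline{\ell}\in\mathbb{N}^{d}}c_{\underline{\ell}}e_{\underline{\ell}}=\sum_{e_{\underline{\ell}}\in\mathcal{T}_{I}}c_{\underline{\ell}}e_{\underline{\ell}}+\sum_{e_{\underline{\ell}}\notin\mathcal{T}_{I}}c_{\underline{\ell}}e_{\underline{\ell}}$$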

where $(c_{\underline{\ell}})_{{\underline{\ell}}\in\mathbb{N}^{d}}$ is a square-summable sequence of real numbers. Recall that each $e_{\underline{\ell}}$ belongs to $\mathcal{H}_{J}$ , with $J=\{i\in\{1,\dots,d\}\,s.t.\,\ell_{i}\geq 1\}$ . Thus, if $e_{\underline{\ell}}\notin\mathcal{T}_{I}$ , then $J\neq I$ . Hence, $e_{\underline{\ell}}\in\mathcal{H}_{I}^{\perp}$ (as $\mathcal{H}_{J}\perp\mathcal{H}_{I}$ ). Since $h\in\mathcal{H}_{I}$ , it implies that $\sum_{e_{\underline{\ell}}\notin\mathcal{T}_{I}}c_{\underline{\ell}}e_{\underline{\ell}}=0$ . ∎

The previous results imply that the variance $D_{I}$ (resp. $D_{I}^{\text{tot}}$ ) of the output explained by a set $I$ (resp. supersets of $I$ ) of input variables, is equal to the squared norm of the orthogonal projection onto $\mathcal{H}_{I}$ (resp. $\mathcal{H}_{I}^{\text{tot}}$ ). Hence, lower bounds can be obtained by projecting onto smaller subspaces.

Corollary 2.

Let $I$ be a subset of $\{1,\dots,d\}$ and let $h\in\mathcal{H}$ . Then:

• For every closed subspace $G$ of $\mathcal{H}_{I}$ , $D_{I}=\|\Pi_{I}(h)\|^{2}\geq\|\Pi_{G}(h)\|^{2}$ , where $\Pi_{G}$ denotes the orthogonal projection onto $G$ , with equality iff $h$ has the form $h=g+f$ with $g\in G$ and $f\in\mathcal{H}_{I}^{\perp}$ .

• For every closed subspace $G$ of $\mathcal{H}_{I}^{\text{tot}}$ , $D_{I}^{\text{tot}}=\|\Pi_{I}^{\text{tot}}(h)\|^{2}\geq\|\Pi_{G}(h)\|^{2}$ , with equality iff $h$ has the form $h=g+f$ with $g\in G$ and $f\in(\mathcal{H}_{I}^{\text{tot}})^{\perp}$ .

In practice, the subspace $G$ on which to project may be finite dimensional. For instance, it can be chosen by picking a finite number of orthonormal functions from the Hilbert basis obtained in Proposition 2. We illustrate this on the common case where $I$ corresponds to a single variable. Without loss of generality, we assume that $I=\{1\}$ .

Corollary 3.

Let $\phi_{1},\dots,\phi_{N}$ be orthonormal functions in $\mathcal{H}_{1}^{\text{tot}}$ . Then:

$$D_{1}^{\text{tot}}(h)\geq\sum_{n=1}^{N}\left(\int h(x)\phi_{n}(x)\mu(dx)\right)^{2}$$

with equality iff $h$ has the form $h(x)=\sum_{n=1}^{N}\alpha_{n}\phi_{n}(x)+g(x_{2},\dots,x_{d})$ , where $g\in L^{2}(\underset{i=2,\dots,d}{\otimes}\mu_{i})$ . Furthermore, if all the $\phi_{n}$ ’s belong to $\mathcal{H}_{1}$ , then the lower bound holds for $D_{1}$ .

Proof.

This is a direct application of Corollary 2 with $G=\textrm{span}\{\phi_{1},\dots,\phi_{N}\}$ . The equality case is obtained by remarking that $(\mathcal{H}_{1}^{\text{tot}})^{\perp}$ is formed by functions of $\mathcal{H}$ that do not involve $x_{1}$ : $(\mathcal{H}_{1}^{\text{tot}})^{\perp}=\underset{J\subseteq\{2,\dots,d\}}{\oplus}\mathcal{H}_{J}$ . ∎
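The projection argument of Corollary 3 can be checked numerically. The sketch below is only an illustration under assumptions that are not in the paper: two uniform inputs on $[-1/2,1/2]$, the toy function $h(x)=e^{x_{1}}(1+x_{2})$, and the two orthonormal functions $\phi_{1}=e_{1,1}(x_{1})$ and $\phi_{2}=e_{1,1}(x_{1})e_{2,1}(x_{2})$ with $e_{i,1}(t)=\sqrt{12}\,t$ (the normalized degree-one polynomial for this distribution). Integrals are computed by tensorized Gauss-Legendre quadrature, and the reference value uses the standard identity $D_{1}^{\text{tot}}=\mathbb{E}[\mathrm{var}(h(X)\mid X_{2})]$.

```python
import numpy as np

# Gauss-Legendre rule mapped from [-1, 1] to [-1/2, 1/2] (weights sum to 1).
nodes, weights = np.polynomial.legendre.leggauss(20)
x, w = 0.5 * nodes, 0.5 * weights
X1, X2 = np.meshgrid(x, x, indexing="ij")
W = np.outer(w, w)

def inner(f_vals, g_vals):
    """L2(mu) inner product on the tensor quadrature grid."""
    return np.sum(W * f_vals * g_vals)

def e1(t):
    """Normalized degree-one basis function for U(-1/2, 1/2)."""
    return np.sqrt(12.0) * t

H = np.exp(X1) * (1.0 + X2)            # hypothetical toy function
phis = [e1(X1), e1(X1) * e1(X2)]       # orthonormal functions in H_1^tot

# Lower bound of Corollary 3: sum of squared projection coefficients.
lower_bound = sum(inner(H, phi) ** 2 for phi in phis)

# Reference: D_1^tot = E[ var(h | X_2) ], computed on the same grid.
mean_over_x1 = w @ H                   # conditional mean for each x2 node
var_over_x1 = w @ (H - mean_over_x1) ** 2
d1_tot = w @ var_over_x1

print(lower_bound, d1_tot)
```

With only two basis functions the bound already captures most of $D_{1}^{\text{tot}}$ for this smooth function; the gap corresponds to the components of $h$ in $\mathcal{H}_{1}^{\text{tot}}$ that are orthogonal to $\phi_{1}$ and $\phi_{2}$.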

4 Poincaré differential operator expansions

Generalized chaos expansions are defined from $d$ Hilbert bases associated with the probability measures $\mu_{i}$ ( $i=1,\dots,d$ ) on the real line. Here, each $\mu_{i}$ is assumed to be absolutely continuous with respect to the Lebesgue measure. In this section, we exhibit a class of Hilbert bases which is well tailored to sensitivity analysis based on derivatives. They consist of eigenfunctions of an elliptic differential operator (DO). More precisely, we choose the DO associated with a 1-dimensional Poincaré inequality (assuming it holds)

$$\mathrm{var}_{\mu_{1}}(h)\leq C\int_{\mathbb{R}}h^{\prime}(x)^{2}\mu_{1}(dx),\tag{4}$$

as it was successfully used to obtain accurate bounds for DGSM [30].

Before defining the so-called PDO expansions, we first recall the spectral theorem related to Poincaré inequalities. In what follows, for any positive integer $\ell$ , we denote by $H^{\ell}(\mu_{1})$ the Sobolev space of order $\ell$ :

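$$H^{\ell}(\mu_{1}):=\{h\in L^{2}(\mu_{1})\text{ such that for all }k\leq\ell,\ h^{(k)}\in L^{2}(\mu_{1})\}\tag{5}$$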

Proposition 3 (Spectral theorem for Poincaré inequalities, [3, 30]).

Let $\mu_{1}(dt)=\rho(t)dt$ be a continuous measure on a bounded interval $I=(a,b)$ of $\mathbb{R}$ , where $\rho(t)=e^{-V(t)}$ . Assume that $V$ is continuous and piecewise $C^{1}$ on $\bar{I}=[a,b]$ . Then consider the differential operator

$$Lh=h^{\prime\prime}-V^{\prime}h^{\prime}$$

defined on $\mathcal{H}^{\prime}=\{h\in H^{2}(\mu_{1})\text{ s.t. }h^{\prime}(a)=h^{\prime}(b)=0\}$ . Then $L$ admits a spectral decomposition. That is, there exists an increasing sequence $(\lambda_{n})_{n\geq 0}$ of non-negative values that tends to infinity, and a set of orthonormal functions $e_{n}$ which form a Hilbert basis of $L^{2}(\mu_{1})$ such that $Le_{n}=-\lambda_{n}e_{n}$ . Furthermore, all the eigenvalues $\lambda_{n}$ are simple. The first eigenvalue is $\lambda_{0}=0$ , and the corresponding eigenspace consists of constant functions (we can choose $e_{0}=1$ ). The first positive eigenvalue $\lambda_{1}$ is called the spectral gap, and is equal to the inverse of the Poincaré constant $C_{\textrm{P}}(\mu_{1})$ , i.e. the smallest constant satisfying Inequality (4).

Remark 1.

The assumptions of Proposition 3 guarantee that $L$ admits a spectral decomposition, and correspond to a continuous probability distribution defined on a compact support, whose density is continuous and does not vanish. However, the spectral decomposition can exist in more general cases. For instance, it exists for the Normal distribution on $\mathbb{R}$ : the corresponding eigenfunctions are the Hermite polynomials and the eigenvalues are the non-negative integers. On the other hand, the spectral decomposition does not exist for the Laplace (double-exponential) distribution on the whole of $\mathbb{R}$ .

The key property in our context is given by the equation

$$\langle h^{\prime},e_{n}^{\prime}\rangle=\lambda_{n}\langle h,e_{n}\rangle,\tag{7}$$

corresponding to the weak formulation of the spectral problem $Le_{n}=-\lambda_{n}e_{n}$ associated to the Poincaré inequality, and holding for all $n\geq 0$ , and all $h\in H^{1}(\mu_{1})$ . It implies that geometric quantities involved in PDO expansions can be rewritten with derivatives. In particular, for a centered function $h$ , we have:

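$$\|h\|^{2}=\sum_{n=1}^{\infty}\langle h,e_{n}\rangle^{2}=\sum_{n=1}^{\infty}\frac{1}{\lambda_{n}^{2}}\langle h^{\prime},e_{n}^{\prime}\rangle^{2}.$$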
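The weak formulation $\langle h^{\prime},e_{n}^{\prime}\rangle=\lambda_{n}\langle h,e_{n}\rangle$ and the resulting derivative-based Parseval identity can be verified numerically in the Normal case mentioned in Remark 1, where the eigenfunctions are Hermite polynomials and $\lambda_{n}=n$. The sketch below is an illustration under assumptions not taken from the paper: a standard Normal $\mu_{1}$, the centered test function $h(t)=\sin t+t^{3}$, normalized probabilists' Hermite polynomials $e_{n}=He_{n}/\sqrt{n!}$, and a 60-point Gauss-Hermite rule.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

# Gauss-Hermite rule for the weight exp(-t^2 / 2), renormalized so that
# the weights sum to 1 (expectation under the standard Normal).
t, w = He.hermegauss(60)
w = w / math.sqrt(2.0 * math.pi)

def unit(n):
    """Coefficient vector selecting the Hermite polynomial He_n."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return c

def e(n, t):
    """Normalized eigenfunction e_n = He_n / sqrt(n!)."""
    return He.hermeval(t, unit(n)) / math.sqrt(math.factorial(n))

def de(n, t):
    """Derivative of e_n (valid for n >= 1)."""
    return He.hermeval(t, He.hermeder(unit(n))) / math.sqrt(math.factorial(n))

h = np.sin(t) + t ** 3            # hypothetical centered test function
dh = np.cos(t) + 3.0 * t ** 2     # its derivative

results = []
for n in range(1, 10):
    lhs = w @ (dh * de(n, t))       # <h', e_n'>
    rhs = n * (w @ (h * e(n, t)))   # lambda_n <h, e_n>, with lambda_n = n
    results.append((n, lhs, rhs))

var_h = w @ h ** 2 - (w @ h) ** 2
# Truncated Parseval sum written with derivatives only.
partial_sum = sum((lhs / n) ** 2 for n, lhs, _ in results)
print(var_h, partial_sum)
```

Since every term of the Parseval sum is non-negative, any truncation gives a lower bound on the variance, which is the mechanism exploited in Proposition 4.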

Let us come back to the $d$ -dimensional situation, where $\mu=\underset{i=1,\dots,d}{\otimes}\mu_{i}$ . For each measure $\mu_{i}$ , we make the assumptions of Proposition 3 (see also Remark 1 for alternative conditions). We denote by $L_{i}$ the corresponding operator and $\lambda_{i,n},e_{i,n}$ ( $n\geq 0$ ) its eigenvalues and eigenfunctions. We define $H^{1}(\mu)$ similarly to $H^{1}(\mu_{1})$ (Equation 5). We can now define the PDO expansion and then state the main result.

Definition 2 (PDO expansions).

We call Poincaré differential operator (PDO) expansion the generalized chaos expansion corresponding to the Hilbert bases formed by the eigenfunctions of $L_{1},\dots,L_{d}$ .

Proposition 4 (Poincaré-based lower bounds).

For all $h$ in $H^{1}(\mu)$ , we have

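$$D_{1}^{\text{tot}}(h)=\sum_{\ell_{1}\geq 1,\,\ell_{2},\dots,\ell_{d}\geq 0}\langle h,e_{1,\ell_{1}}\dots e_{d,\ell_{d}}\rangle^{2}\tag{8}$$

$$D_{1}^{\text{tot}}(h)=\sum_{\ell_{1}\geq 1,\,\ell_{2},\dots,\ell_{d}\geq 0}\frac{1}{\lambda_{1,\ell_{1}}^{2}}\left\langle\frac{\partial h}{\partial x_{1}},e_{1,\ell_{1}}^{\prime}e_{2,\ell_{2}}\dots e_{d,\ell_{d}}\right\rangle^{2}\tag{9}$$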

In particular, limiting ourselves to the first eigenfunction in all dimensions, and to first and second order tensors involving $x_{1}$ , we obtain the lower bound

[TABLE]

Proof.

By Proposition 2, the subset of ( $e_{\underline{\ell}}$ ) corresponding to $\ell_{1}\geq 1$ is a Hilbert basis of $\mathcal{H}_{1}^{\text{tot}}$ . This gives (8). Now, for $\ell_{1}\geq 1$ :

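$$\langle h,e_{1,\ell_{1}}\dots e_{d,\ell_{d}}\rangle=\frac{1}{\lambda_{1,\ell_{1}}}\left\langle\frac{\partial h}{\partial x_{1}},e_{1,\ell_{1}}^{\prime}e_{2,\ell_{2}}\dots e_{d,\ell_{d}}\right\rangle$$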

This is obtained by applying Eq. (7) to $x_{1}\mapsto h(x)$ and integrating with respect to $x_{2},\dots,x_{d}$ :

[TABLE]

This gives (9). The remainder is straightforward, knowing that ${C_{\textrm{P}}(\mu_{1})=1/\lambda_{1,1}}$ . ∎

Case of uniform distributions: Fourier expansion.

Let us assume that $\mu_{1}$ is uniform on $[-1/2,1/2]$ . Then, the differential operator $L$ is the usual Laplacian, and its eigenfunctions correspond to the Fourier basis. More precisely, using the Neumann boundary conditions $h^{\prime}(a)=h^{\prime}(b)=0$ with $a=-1/2$ and $b=1/2$ , one can check that the eigenvalues are $\lambda_{\ell}=\ell^{2}\pi^{2}$ , $(\ell=0,1,\dots)$ , and a set of orthonormal eigenfunctions is given by $e_{0}=1$ and

$$e_{\ell}(x_{1})=\sqrt{2}\cos(\pi\ell(x_{1}+1/2))$$

for $\ell>0$ . Denote by $|\underline{\ell}|_{0}$ the number of non-zero coefficients of the multi-index $\underline{\ell}=(\ell_{1},\dots,\ell_{d})$ . When the other $\mu_{i}$ ’s are also uniform on $[-1/2,1/2]$ , we obtain a multivariate Parseval formula for $D_{1}^{\text{tot}}$ :

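$$D_{1}^{\text{tot}}(h)=\sum_{\ell_{1}\geq 1,\,\ell_{2},\dots,\ell_{d}}2^{|\underline{\ell}|_{0}}\Bigl\langle h,\prod_{i=1}^{d}\cos(\pi\ell_{i}(x_{i}+1/2))\Bigr\rangle^{2}$$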

Limiting for instance the sum to first terms, we obtain the lower bounds

[TABLE]
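To make the uniform case concrete, the following sketch truncates the multivariate Parseval sum for $D_{1}^{\text{tot}}$ with the cosine eigenfunctions $e_{\ell}(t)=\sqrt{2}\cos(\pi\ell(t+1/2))$ and eigenvalues $\lambda_{\ell}=\ell^{2}\pi^{2}$, and checks that each coefficient can equivalently be computed from $\partial h/\partial x_{1}$. The toy function $h(x)=e^{x_{1}}(1+x_{2})$, the dimension $d=2$, the truncation levels and the quadrature rule are assumptions made for the example only.

```python
import numpy as np

# Gauss-Legendre rule mapped to [-1/2, 1/2]; weights sum to 1.
nodes, weights = np.polynomial.legendre.leggauss(64)
x, w = 0.5 * nodes, 0.5 * weights
X1, X2 = np.meshgrid(x, x, indexing="ij")
W = np.outer(w, w)

def e(l, t):
    """Neumann eigenfunction of the Laplacian on [-1/2, 1/2]."""
    if l == 0:
        return np.ones_like(t)
    return np.sqrt(2.0) * np.cos(np.pi * l * (t + 0.5))

def de(l, t):
    """Derivative of e(l, .)."""
    return -np.sqrt(2.0) * np.pi * l * np.sin(np.pi * l * (t + 0.5))

H = np.exp(X1) * (1.0 + X2)        # hypothetical toy function
dH_dx1 = np.exp(X1) * (1.0 + X2)   # its partial derivative in x1

def coef(l1, l2):
    """<h, e_{1,l1} e_{2,l2}> computed from h itself."""
    return np.sum(W * H * e(l1, X1) * e(l2, X2))

def coef_deriv(l1, l2):
    """Same coefficient from dh/dx1, divided by lambda_{l1} (l1 >= 1)."""
    lam = (np.pi * l1) ** 2
    return np.sum(W * dH_dx1 * de(l1, X1) * e(l2, X2)) / lam

def lower_bound(L):
    """Truncated Parseval sum: l1 = 1..L (x1 active), l2 = 0..L."""
    return sum(coef(l1, l2) ** 2
               for l1 in range(1, L + 1) for l2 in range(0, L + 1))

# Reference value D_1^tot = E[ var(h | X_2) ] on the same grid.
mean_over_x1 = w @ H
d1_tot = w @ (w @ (H - mean_over_x1) ** 2)

bounds = [lower_bound(L) for L in (1, 2, 4)]
print(bounds, d1_tot)
```

The bounds increase with the truncation level and approach $D_{1}^{\text{tot}}$ from below, as expected from a Parseval sum with non-negative terms.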

Extension of PDO expansions to weighted Poincaré inequalities.

PDO expansions correspond to diffusion operators associated to Poincaré inequalities. They can be extended to weighted Poincaré inequalities

$$\mathrm{var}_{\mu_{1}}(h)\leq C\int_{\mathbb{R}}h^{\prime}(x)^{2}w(x)\mu_{1}(dx),\tag{14}$$

defined for some suitable positive weight $w$ . Such inequalities have recently been used in sensitivity analysis [35]. They are also useful when a probability distribution does not admit a Poincaré inequality such as the Cauchy distribution [5]. The weighted Poincaré inequality (14) corresponds to the differential operator

[TABLE]

Similarly to (7), rewriting geometrical quantities with derivatives can be done with the formula:

[TABLE]

where $\langle.,.\rangle_{w}$ is the weighted dot product $\langle f,g\rangle_{w}:=\int f(x)g(x)w(x)\mu(dx)$ . Proposition 4 can be adapted accordingly.

When PDO expansions coincide with PC expansions.

There are exactly three cases where PDO expansions coincide with PC expansions, even when considering their extension to weighted Poincaré inequalities. Indeed, it can be shown that orthogonal polynomials are eigenfunctions of diffusion operators only for the Normal, Gamma and Beta distributions, corresponding respectively to Hermite, Laguerre and Jacobi orthogonal polynomials ([3], § 2.7). These differential operators correspond to weighted Poincaré inequalities with weight $w(x)=x$ for the Gamma distribution $d\mu_{1}(x)\propto x^{\alpha-1}e^{-\alpha x}$ on $\mathbb{R}^{+}$ , and weight $w(x)=1-x^{2}$ for the Beta distribution $d\mu_{1}(x)\propto(1-x)^{\alpha-1}(1+x)^{\beta-1}$ on $[-1,1]$ . Notice that in [35], $w$ is chosen such that the eigenfunction associated to $\lambda_{1}$ is a first-order polynomial. Except for the three cases mentioned above, the other eigenfunctions cannot be all polynomials.

5 Weight-free derivative global sensitivity measures

The lower bounds of total indices obtained with generalized chaos expansions may involve weighted DGSM. For instance, in PDO expansions, weights involve the eigenfunction derivatives (Equation (11)). The presence of a weight can be a drawback when the integral has to be estimated with a small sample size, as it can increase the variance of the Monte Carlo estimator. In this section, we show how to choose the first two orthonormal functions of GC expansions in order to obtain weight-free DGSM. Interestingly, this is related to Fisher information and Cramér-Rao bounds.

Proposition 5 (Lower bounds with weight-free DGSM, for pdf vanishing at the boundaries).

Assume that $\frac{\partial h(x)}{\partial x_{1}}$ is in $L^{2}(\mu)$, and that the probability distributions $\mu_{i}$ are absolutely continuous on their support $(a_{i},b_{i})$ with $-\infty\leq a_{i}<b_{i}\leq+\infty$. For each $i$, denote by $p_{i}$ the corresponding probability density function. Assume that $p_{i}$ belongs to $H^{1}(\mu_{i})$, does not vanish on $(a_{i},b_{i})$ but vanishes at the boundaries: $p_{i}(a_{i})=p_{i}(b_{i})=0$. Finally, assume that $p_{i}^{\prime}$ is not identically zero, and that $p_{i}^{\prime}/p_{i}$ is in $L^{2}(\mu_{i})$. Define $Z_{i}(x_{i})=(\ln p_{i})^{\prime}(x_{i})$ and $I_{i}=\mathrm{var}(Z_{i}(X_{i}))$. Then, we have the inequality:

[TABLE]

with

[TABLE]

Furthermore, if all the cross derivatives $\frac{\partial^{2}h(x)}{\partial x_{1}\partial x_{j}}$ are in $L^{2}(\mu)$ , then

[TABLE]

The cases of equality correspond to functions $h$ of the form

[TABLE]

Proof.

For $i=1,\dots,d$ , let $e_{i,1}(x_{i}):=I_{i}^{-1/2}Z_{i}(x_{i})$ . Then define

[TABLE]

By definition, the norm of each $e_{i,1}$ is equal to $1$ . Furthermore, $Z_{i}$ is centered, since

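$\int Z_{i}(x_{i})\,\mu_{i}(dx_{i})=\int_{a_{i}}^{b_{i}}\frac{p_{i}^{\prime}(x_{i})}{p_{i}(x_{i})}\,p_{i}(x_{i})\,dx_{i}=p_{i}(b_{i})-p_{i}(a_{i})=0.$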

This implies that $e_{i,1}$ is orthogonal to $e_{i,0}=1$. By Proposition 2, the $\phi_{i}$’s are then orthonormal functions of $\mathcal{H}_{1}^{\text{tot}}$. The inequality is then given by Corollary 3, with the first expressions of $c_{1}$ and $c_{1,j}$. The other ones are obtained by integrating by parts, using that the values of the $p_{j}$’s at the boundaries are zero. ∎

The proposition can be adapted when the probability density functions do not vanish at the boundaries of their support, by modifying the definition of the $Z_{j}$ ’s. Notice that the expressions of $c_{1}$ and $c_{1,j}$ that involve derivatives then contain corrective terms, and are of limited practical interest. For instance, denoting $\left[h\right]_{a_{1}}^{b_{1}}=h(b_{1})-h(a_{1})$ and $h_{0}=\int h(x)\mu(dx)$ , we have:

[TABLE]

Nevertheless, the first expressions of $c_{1}$ and $c_{1,j}$ remain valid and, by analogy to Proposition 5, have a close connection to derivative-based lower bounds.

Proposition 6 (Lower bounds with weight-free DGSM, general case).

Assume that $\frac{\partial h(x)}{\partial x_{1}}$ is in $L^{2}(\mu)$, and that the probability distributions $\mu_{i}$ are absolutely continuous on their support $(a_{i},b_{i})$ with $-\infty\leq a_{i}<b_{i}\leq+\infty$. For each $i$, denote by $p_{i}$ the corresponding probability density function. Assume that $p_{i}$ belongs to $H^{1}(\mu_{i})$ and does not vanish on $(a_{i},b_{i})$. Finally, assume that $p_{i}^{\prime}$ is not identically zero, and that $p_{i}^{\prime}/p_{i}$ is in $L^{2}(\mu_{i})$. Define $Z_{i}(x_{i})=(\ln p_{i})^{\prime}(x_{i})-\left[p_{i}(x_{i})\right]_{a_{i}}^{b_{i}}$ and $I_{i}=\mathrm{var}(Z_{i}(X_{i}))$. Then Inequality (17) holds with $c_{1}=\int h(x)Z_{1}(x_{1})\mu(dx)$ and $c_{1,j}=\int h(x)Z_{1}(x_{1})Z_{j}(x_{j})\mu(dx)$. The equality case is the same as in Proposition 5, and given by (18).

Remark 1.

The expressions of $Z_{i}$ and $I_{i}$ in Proposition 6 correspond respectively to the score and to the Fisher information at $\bm{\theta}=0$ of a parametric family of probability distributions obtained by translation $p_{i,\theta_{i}}(x_{i})=p_{i}(x_{i}+\theta_{i})$ . In this framework, the lower bound (17) corresponds to the Cramér-Rao lower bound.
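As a numerical illustration of the score and Fisher information, the sketch below approximates $I=\mathrm{var}(Z(X))$ by a midpoint rule for two translation families with scale parameter $s$. The Normal value $1/s^{2}$ is the one listed in Table 1; the Cauchy value $1/(2s^{2})$ is the standard Fisher information of a Cauchy location family and is quoted here as an assumption about what the table contains. The helper `fisher_information`, the grid sizes and the truncation bounds are illustrative choices.

```python
import numpy as np

def fisher_information(pdf, score, lo, hi, n=1_000_000):
    """Midpoint-rule approximation of var(Z(X)) on the truncated range [lo, hi]."""
    edges = np.linspace(lo, hi, n + 1)
    x = 0.5 * (edges[:-1] + edges[1:])   # cell midpoints
    dx = (hi - lo) / n
    p = pdf(x)
    z = score(x)
    mean_z = np.sum(z * p) * dx          # should be ~0: the score is centered
    return np.sum((z - mean_z) ** 2 * p) * dx

s = 1.5  # scale parameter (not necessarily a standard deviation)

# Normal with mean 0 and scale s: Z(x) = -x / s^2
normal_pdf = lambda x: np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2 * np.pi))
normal_score = lambda x: -x / s**2

# Cauchy with location 0 and scale s: Z(x) = -2x / (s^2 + x^2)
cauchy_pdf = lambda x: 1.0 / (np.pi * s * (1 + (x / s) ** 2))
cauchy_score = lambda x: -2 * x / (s**2 + x**2)

I_normal = fisher_information(normal_pdf, normal_score, -12 * s, 12 * s)
I_cauchy = fisher_information(cauchy_pdf, cauchy_score, -2000 * s, 2000 * s)
print(I_normal, 1 / s**2)
print(I_cauchy, 1 / (2 * s**2))
```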

Examples.

First consider the case of normal distributions $\mu_{i}\sim\mathcal{N}(m_{i},v_{i})$ ( $i=1,\dots,d$ ). Applying Inequality (17) gives

[TABLE]

Here, the inequality is equivalent to Inequality (11) obtained with the Poincaré differential operator of Section 4, since $Z_{i}$ is a first-order polynomial, and thus equal to the first eigenvector of $L$ (Hermite polynomial). The case of equality corresponds to functions of the form

[TABLE]

Other inequalities can be established for standard probability distributions. Table 1 summarizes the results for some of them. Notice that the equality case does not always correspond to polynomials (see the form of $Z$ ). Interestingly, an inequality is obtained for the Cauchy distribution, whereas the theory of Section 4 does not apply as this distribution does not admit a Poincaré constant. On the other hand, some probability distributions for which Section 4 is applicable, do not satisfy the assumptions of Proposition 6, such as the uniform ( $p^{\prime}_{i}$ is identically zero) or the triangular distributions ( $p_{i}^{\prime}/p_{i}$ does not belong to $L^{2}(\mu_{i})$ ).
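For normal inputs, the weight-free bound can be estimated by crude Monte Carlo without any derivative. The sketch below assumes that Inequality (17) has the form implied by Corollary 3 with the orthonormal functions $e_{1,1}$ and $e_{1,1}e_{j,1}$, namely $D_{1}^{\text{tot}}\geq c_{1}^{2}/I_{1}+\sum_{j\geq 2}c_{1,j}^{2}/(I_{1}I_{j})$, with $c_{1}$ and $c_{1,j}$ as in Proposition 6. The function `weight_free_lower_bound`, the test function and the sample size are hypothetical choices made for this illustration.

```python
import numpy as np

def weight_free_lower_bound(h, m, v, n=200_000, seed=0):
    """Monte Carlo estimate of c_1^2/I_1 + sum_j c_{1,j}^2/(I_1 I_j)
    for independent N(m_i, v_i) inputs (assumed form of Inequality (17))."""
    rng = np.random.default_rng(seed)
    m = np.asarray(m, dtype=float)
    v = np.asarray(v, dtype=float)
    x = m + np.sqrt(v) * rng.standard_normal((n, m.size))
    z = -(x - m) / v                 # score Z_i of the normal density
    fisher = 1.0 / v                 # Fisher information I_i
    y = h(x)
    y = y - y.mean()                 # centering does not change c_1, c_{1,j}
    c1 = np.mean(y * z[:, 0])
    bound = c1**2 / fisher[0]
    for j in range(1, m.size):
        c1j = np.mean(y * z[:, 0] * z[:, j])
        bound += c1j**2 / (fisher[0] * fisher[j])
    return bound

a = 2.0
h = lambda x: x[:, 0] + a * x[:, 0] * x[:, 1]   # of the equality-case form
lb = weight_free_lower_bound(h, m=[0.0, 0.0], v=[1.0, 1.0])
d1_tot = 1.0 + a**2                              # exact total index for this h
print(lb, d1_tot)
```

With standard normal inputs this test function is a linear combination of $Z_{1}$ and $Z_{1}Z_{2}$, so the estimate should be close to the exact value $1+a^{2}$, up to Monte Carlo error.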

Link to other works.

Here, we briefly compare our lower bounds to those presented in the recent review [21].

For the uniform distribution on $[0,1]$, we can obtain both a better lower bound and a description of the equality case. For that, we apply Corollary 3 to the orthonormal function obtained from $x_{1}^{m}$, i.e. $\phi(x_{1})=(x_{1}^{m}-m_{1})/s_{1}$ with $m_{1}=1/(m+1)$ and $s_{1}^{2}=\left(\frac{m}{m+1}\right)^{2}\frac{1}{2m+1}$. Then after some algebra and an integration by parts, we obtain

[TABLE]

where $w_{1}^{(m+1)}=\int\frac{\partial h(x)}{\partial x_{1}}x_{1}^{m+1}dx$. This improves on the lower bound found in [21], Theorem 2, which has the same form, but with the smaller multiplicative constant $\frac{2m+1}{(m+1)^{2}}$. Furthermore, the lower bound above is attained when $h$ has the form $h(x)=\alpha_{1}x_{1}^{m}+g(x_{2},\dots,x_{d})$. However, notice that these two lower bounds are only lower bounds for $D_{1}\leq D_{1}^{\textrm{tot}}$, and can be improved by considering additional orthonormal functions belonging to $\mathcal{H}_{1}^{\text{tot}}\setminus\mathcal{H}_{1}$.
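For completeness, the constants $m_{1}$ and $s_{1}^{2}$ follow from the moments of the uniform distribution on $[0,1]$. The short derivation below is a sketch, under the assumption that the bound is obtained as $\langle h,\phi\rangle^{2}$ from Corollary 3; the notation $w_{1}^{(1)}=\int\frac{\partial h(x)}{\partial x_{1}}x_{1}dx$ is introduced here by analogy with $w_{1}^{(m+1)}$.

$$m_{1}=\int_{0}^{1}x_{1}^{m}\,dx_{1}=\frac{1}{m+1},\qquad s_{1}^{2}=\int_{0}^{1}x_{1}^{2m}\,dx_{1}-m_{1}^{2}=\frac{1}{2m+1}-\frac{1}{(m+1)^{2}}=\frac{m^{2}}{(m+1)^{2}(2m+1)}.$$

Since $x_{1}^{m}-m_{1}$ is the derivative of $(x_{1}^{m+1}-x_{1})/(m+1)$, which vanishes at $0$ and $1$, an integration by parts gives

$$\langle h,\phi\rangle=\frac{1}{s_{1}}\int h(x)\,(x_{1}^{m}-m_{1})\,dx=\frac{1}{s_{1}(m+1)}\left(w_{1}^{(1)}-w_{1}^{(m+1)}\right),\qquad \langle h,\phi\rangle^{2}=\frac{2m+1}{m^{2}}\left(w_{1}^{(1)}-w_{1}^{(m+1)}\right)^{2}.$$

Under this reading, the multiplicative constant is $\frac{2m+1}{m^{2}}$, which is indeed larger than $\frac{2m+1}{(m+1)^{2}}$.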

For normal distributions, Inequality (19) improves the lower bound given by [23], i.e.

[TABLE]

Here also, this latter lower bound is only a lower bound of $D_{1}\leq D_{1}^{\textrm{tot}}$ since it corresponds to the case in Corollary 3 where the $\phi_{j}$ ’s (here $\phi_{1}(x)=Z_{1}(x_{1})/I_{1}^{1/2}$ ) only depend on $x_{1}$ .

6 Examples on analytical functions

This section briefly illustrates PDO expansions for the uniform distribution on benchmark functions from sensitivity analysis. We assess the accuracy of the lower bounds of total indices when only the first two eigenvalues are used.

6.1 A polynomial function with interaction

Example 1.

Let us consider $g(x_{1},x_{2})=x_{1}+ax_{1}x_{2}$ , and let $\mu$ be the uniform distribution on $[-1/2,1/2]^{2}$ . The inequalities obtained by truncating the PDO expansions to the first eigenvalue are:

[TABLE]

We can see that for a polynomial function of degree $1$ with respect to $x_{1}$, the lower bound obtained by restricting the PDO expansion to the first eigenvalue is very accurate. Hence, we do not lose much information by ignoring that the function is a polynomial, even though this is an ideal situation for polynomial chaos.

Let us give some computing details on the previous inequalities. It is easy to check that the two terms $x_{1}$ , $ax_{1}x_{2}$ correspond to the main effect and second order interaction respectively. The partial variances are given by $D_{1}=1/12$ and $D_{1,2}=a^{2}/144$ . Hence, $D_{1}^{\text{tot}}=1/12+a^{2}/144$ . Restricting the PDO expansion to the first term, a lower bound is given by Inequality (13):

[TABLE]

The two terms of the lower bound above correspond to a lower bound of $D_{1}$ and $D_{1,2}$ respectively. A direct computation gives:

[TABLE]

The result follows.
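The accuracy claimed in Example 1 can be checked by deterministic quadrature. The sketch below assumes that the first non-constant eigenfunction for the uniform distribution on $[-1/2,1/2]$ is $e_{1}(x)=\sqrt{2}\sin(\pi x)$ (consistent with the Fourier basis of Section 4), and computes the two squared coefficients $\langle g,e_{1,1}\rangle^{2}$ and $\langle g,e_{1,1}e_{2,1}\rangle^{2}$. The function name `first_term_pdo_bound` and the number of quadrature nodes are illustrative.

```python
import numpy as np

def first_term_pdo_bound(a, n_nodes=40):
    """Squared projections of g = x1 + a*x1*x2 on e1(x1) and e1(x1)*e1(x2),
    computed by tensorized Gauss-Legendre quadrature on [-1/2, 1/2]^2."""
    t, w = np.polynomial.legendre.leggauss(n_nodes)   # nodes/weights on [-1, 1]
    x, w = t / 2.0, w / 2.0                            # rescale to [-1/2, 1/2]
    x1, x2 = np.meshgrid(x, x, indexing="ij")
    w2 = np.outer(w, w)
    g = x1 + a * x1 * x2
    e1 = lambda u: np.sqrt(2.0) * np.sin(np.pi * u)    # assumed first eigenfunction
    c1 = np.sum(w2 * g * e1(x1))
    c12 = np.sum(w2 * g * e1(x1) * e1(x2))
    return c1**2, c12**2

a = 3.0
lb_main, lb_inter = first_term_pdo_bound(a)
print(lb_main, 1 / 12)          # lower bound of D_1 versus D_1
print(lb_inter, a**2 / 144)     # lower bound of D_{1,2} versus D_{1,2}
```

In closed form the two quantities are $8/\pi^{4}$ and $64a^{2}/\pi^{8}$, i.e. about $98.6\%$ of $D_{1}$ and $97.1\%$ of $D_{1,2}$.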

6.2 A separable function

Example 2.

Consider the g-Sobol’ function on $[-1/2,1/2]$ defined by

[TABLE]

with $h_{i}(x_{i})=(4|x_{i}|-1)/(1+a_{i})$ ( $i=1,\dots,d$ ), and let $\mu$ be the uniform distribution on $[-1/2,1/2]^{d}$ . The inequalities obtained by truncating the PDO expansions to the first two eigenvalues are:

[TABLE]

Notice that $32/\pi^{4}\approx 0.328$ is very close to $1/3$. Hence, the lower bound for $D_{i}$ is very accurate. Obviously, a very sharp inequality $D_{i}^{\text{tot}}\geq\textrm{LB}_{i}\prod_{j\neq i}^{d}(1+\textrm{LB}_{j})$ could have been deduced, but this is unrealistic in practice, since the separable form of the function is unknown. The lower bound (21) for $D_{i}^{\text{tot}}$ is actually a very good approximation of the variance explained by second-order interactions involving $x_{i}$, equal to $D_{i}\sum_{j\neq i}^{d}D_{j}$. Hence, Inequality (21) will be less sharp in the presence of higher order interactions (tuned by the values of the $a_{j}$’s). In that case, more than two eigenvalues must be considered in PDO expansions.

Let us give some computing details on the previous inequalities. Without loss of generality, we write the proof for $i=1$. Let us first recall the computation of Sobol’ indices for the g-Sobol’ function. As all the $h_{i}$ are centered, the Sobol’-Hoeffding decomposition is given by $g_{I}(x_{I})=\prod_{i\in I}h_{i}(x_{i})$. In particular $D_{1}=\int h_{1}^{2}d\mu_{1}=\frac{1}{3}\frac{1}{(1+a_{1})^{2}}$. Furthermore, the variance of a second order interaction is, for $i\neq 1$:

[TABLE]

and variance explained by second-order interactions containing $x_{1}$ is equal to

[TABLE]

Finally the total effect is the variance of $\sum_{I\supseteq\{1\}}\prod_{i\in I}h_{i}$ , equal to

[TABLE]

Let us now consider lower bounds. To obtain accurate lower bounds, we need to consider the first two non-zero eigenvalues. Indeed, the eigenfunction associated with the first non-zero eigenvalue is odd whereas the $h_{i}$ are even, so all the corresponding dot products are 0. By using Equation (9) and the results about uniform distributions presented in Section 4, we obtain:

[TABLE]

with $e_{i,2}=\sqrt{2}\cos(2\pi x_{i})$ (we omit the ’-’ sign) and $\lambda_{2}=4\pi^{2}$ . We could have also used (8), but using derivatives simplifies the computations here.

The first term gives a lower bound for $D_{1}$ . We have:

[TABLE]

Due to the tensor form of the g-Sobol’ function partial derivative, the dot product is expressed as a product of one-dimensional dot products. Furthermore, as all the $h_{i}$’s are centered, the dot products in dimensions $2,\dots,d$ are equal to 1. Finally,

[TABLE]

This gives the announced lower bound for the main effect (Equation (21)):

[TABLE]

Now, let us compute the second term in (23). Notice that it is a lower bound for the variance explained by second-order interactions involving $x_{1}$ , as computed in (22). As above, exploiting the tensor form, we have:

[TABLE]

The first term has already been computed above. For the second one, we use the property of eigenvectors (7):

[TABLE]

and we recognize the quantity computed above where we replace $1$ by $i$ , equal to ${\displaystyle\frac{1}{\lambda_{2}}\frac{16\sqrt{2}}{1+a_{i}}}=\sqrt{\textrm{LB}_{i}}$ . Finally, plugging this result in (23) together with (24) gives the announced lower bound (21).
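The quantities of Example 2 can be reproduced numerically. The sketch below computes $\textrm{LB}_{i}=\langle h_{i},e_{i,2}\rangle^{2}$ by a midpoint rule with $e_{i,2}=\sqrt{2}\cos(2\pi x_{i})$, and compares it with $D_{i}=\frac{1}{3}(1+a_{i})^{-2}$. It further assumes that the truncated bound (21) for the total index has the form $\textrm{LB}_{i}\,(1+\sum_{j\neq i}\textrm{LB}_{j})$, and uses the exact total index $D_{i}\prod_{j\neq i}(1+D_{j})$ of a separable function with centered factors. The coefficients `a` and the helper `gsobol_bounds` are illustrative choices.

```python
import numpy as np

def gsobol_bounds(a, n=200_000):
    """Return (LB_i, D_i) for h_i(x) = (4|x| - 1)/(1 + a_i), uniform on [-1/2, 1/2]."""
    a = np.asarray(a, dtype=float)
    edges = np.linspace(-0.5, 0.5, n + 1)
    x = 0.5 * (edges[:-1] + edges[1:])                 # midpoints; the kink at 0 is a cell edge
    dx = 1.0 / n
    e2 = np.sqrt(2.0) * np.cos(2 * np.pi * x)          # second eigenfunction (sign omitted)
    base = np.sum((4 * np.abs(x) - 1) * e2) * dx       # <4|x| - 1, e2>
    lb = (base / (1 + a)) ** 2
    d = 1.0 / (3.0 * (1 + a) ** 2)
    return lb, d

a = np.array([0.0, 1.0, 4.5, 9.0])
lb, d = gsobol_bounds(a)
i = 0
others = np.arange(a.size) != i
lb_tot = lb[i] * (1 + lb[others].sum())     # assumed form of the truncated bound
d_tot = d[i] * np.prod(1 + d[others])       # exact total index
print(lb[i], d[i])
print(lb_tot, d_tot)
```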

7 Applications

In this section, two numerical models representing real physical phenomena are used in order to illustrate the usefulness of the lower bounds of total Sobol’ indices provided by PDO expansions. More precisely, we restrict ourselves to the simplest lower bound provided by considering only the first eigenfunctions in all dimensions, given by the two equivalent Equations (10) and (11). The first equation gives a derivative-free lower bound of the total index, here called PDO lower bound. The second one gives a derivative-based version, here called PDO-der lower bound.

Whereas the PDO and PDO-der lower bounds are theoretically equal, their estimated values will differ. Estimations of integrals and square products have been performed via crude Monte Carlo samples. We have centered the function $f$. This does not change the value of the sensitivity indices but reduces the estimation error. The use of Monte Carlo samples makes it possible to provide confidence intervals on the estimates by means of a bootstrap resampling technique. Boxplots will be used to graphically represent these estimation uncertainties. Finally, the computation of eigenvalues, eigenfunctions and eigenfunction derivatives has been done with the numerical method presented in [30].
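The estimation procedure described above can be sketched on a toy case. The code below is a minimal illustration, not the implementation used for the applications: it keeps only the main-effect term of the bound, takes a uniform input on $[-1/2,1/2]$ with $e_{1}(x)=\sqrt{2}\sin(\pi x)$ and $\lambda_{1}=\pi^{2}$ (instead of eigenpairs computed numerically as in [30]), and uses the relation $\langle f,e_{1}\rangle=\lambda_{1}^{-1}\langle\partial f/\partial x_{1},e_{1}^{\prime}\rangle$ for the derivative-based version. All function names are hypothetical.

```python
import numpy as np

LAMBDA1 = np.pi**2                       # first non-zero eigenvalue (uniform case)

def e1(u):
    return np.sqrt(2.0) * np.sin(np.pi * u)

def de1(u):
    return np.sqrt(2.0) * np.pi * np.cos(np.pi * u)

def pdo_estimates(y, dy1, x1):
    """Plug-in estimates of the main-effect term: derivative-free and derivative-based."""
    y = y - y.mean()                     # centering reduces the estimation error
    pdo = np.mean(y * e1(x1)) ** 2
    pdo_der = (np.mean(dy1 * de1(x1)) / LAMBDA1) ** 2
    return pdo, pdo_der

def bootstrap_ci(y, dy1, x1, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap intervals; rows = (lower, upper), columns = (PDO, PDO-der)."""
    rng = np.random.default_rng(seed)
    n = y.size
    reps = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)             # resample with replacement
        reps[b] = pdo_estimates(y[idx], dy1[idx], x1[idx])
    alpha = 100 * (1 - level) / 2
    return np.percentile(reps, [alpha, 100 - alpha], axis=0)

rng = np.random.default_rng(1)
n, a = 5000, 1.0
x = rng.uniform(-0.5, 0.5, size=(n, 2))
y = x[:, 0] + a * x[:, 0] * x[:, 1]      # toy model of Example 1
dy1 = 1.0 + a * x[:, 1]                  # its derivative with respect to x1
est = pdo_estimates(y, dy1, x[:, 0])
ci = bootstrap_ci(y, dy1, x[:, 0])
print(est)
print(ci)
```

Both estimators target the same value ($8/\pi^{4}$ for this toy model), but their sampling variability differs, which is the effect visualized by the boxplots.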

7.1 A simplified flood model

Our first model simulates flooding events by comparing the height of a river to the height of a dyke. It involves the characteristics of the river stretch, as already studied in [25, 30]. The model has $8$ input random variables (r.v.), each of which follows a specific probability distribution (truncated Gumbel, truncated normal, triangular or uniform). When the height of the river exceeds the height of the dyke, flooding occurs. The model output is the cost (in million euros) of the damage on the dyke, which writes:

[TABLE]

where $\mathbb{1}_{A}(x)$ is the indicator function which is equal to 1 for $x\in A$ and 0 otherwise, $H_{d}$ is the height of the dyke (uniform r.v.) and $S$ is the maximal annual overflow (in meters) based on a crude simplification of the 1D hydro-dynamical equations of Saint-Venant under the assumptions of uniform and constant flowrate and large rectangular section. $S$ is calculated as

[TABLE]

with $Q$ the maximal annual flowrate (truncated Gumbel r.v.), $K_{s}$ the Strickler coefficient (truncated Gaussian r.v.), $Z_{m}$ and $Z_{v}$ the upstream and downstream riverbed levels (triangular r.v.), $L$ and $B$ the length and width of the water section (triangular r.v.) and $C_{b}$ the bank level (triangular r.v.). For this model, first-order and total Sobol’ indices have been estimated in [25] with high precision (large sample size) via Monte Carlo based algorithms.

Fig. 1 shows the PDO lower bounds. By looking at the values of first-order and total Sobol’ indices (horizontal straight lines), we notice that rather large interaction effects are present between four inputs of the model ($Q$, $K_{s}$, $Z_{v}$ and $H_{d}$). First, the bounds estimated with the sample size $n=100$ have large uncertainties. This shows that this sample size is too small for this complex model (it includes non-linear and interaction effects). Secondly, concerning the estimation of the bounds, convergence is reached, with very small uncertainties on the estimates, from $n=10\,000$. From this sample size, we can visually check (e.g. looking at the third quartile) that the estimated lower bounds are smaller than the corresponding true Sobol’ indices. Moreover, for smaller sample sizes such as $n=1\,000$, results for all the inputs show sufficient accuracy (easy discrimination between the bounds). Finally, except for $K_{s}$ and $H_{d}$, the bounds are informative because:

•

The PDO bounds are very close to the theoretical values of total Sobol’ indices, which is remarkable as only the first eigenvalue was used.

•

The PDO lower bounds for total indices are larger than their respective first-order Sobol’ indices.

Fig. 2 shows that the PDO-der lower bounds give significantly better results than the PDO bounds, especially for small sample sizes. In particular, when the Sobol’ indices are close to zero, the bounds perfectly match their respective Sobol’ indices from $n=100$ . This result clearly favors the use of derivative-based lower bounds for the screening step when model derivatives can be computed.

7.2 An aquatic prey-predator chain

This application is related to the modeling of an aquatic ecosystem called MELODY (MESocosm structure and functioning for representing LOtic DYnamic ecosystems). This model simulates the functioning of aquatic mesocosms as well as the impact of toxic substances on the dynamics of their populations. Inside this model, the Periphyton-Grazers sub-model is representative of processes involved in dynamics of primary producers and primary consumers, i.e. photosynthesis, excretion, respiration, egestion, mortality, sloughing and predation [6]. It contains a total number of $d=20$ uncertain input variables. In order to conduct sensitivity analysis, [6] has modeled each of these input variables as a random variable following a uniform distribution, defined by its minimal and maximal values.

The PDO-der upper bound of total Sobol’ indices [34] was then applied in [19] on one model output (the periphyton biomass) at only one reference time, day $60$ of simulations, which corresponds to the period of maximum periphyton biomass and a growth phase for grazers, according to experimental data. A design of experiments of size $n=100$ was then provided, and simulated with MELODY. A model output vector of size $100$ is obtained, as well as the derivatives of the output with respect to each input at each point of the design (matrix of size $100\times 20$ ). In this section, we analyze the same data that has been studied in [19].

Fig. 3 shows the PDO lower bounds, as well as the first-order Sobol’ indices estimates (via the local polynomials sample based technique [10]). Good results are obtained on the first-order lower bounds which have reduced estimation uncertainties and are always smaller than the estimated first-order Sobol’ indices. Less accurate estimates are obtained for the lower bounds of total indices. They remain informative because they are clearly larger than the first-order Sobol’ indices. This last result proves that large interactions between inputs dominate in this prey-predator model, which confirms the first analysis of [19] (the sum of all the first-order Sobol’ indices is much smaller than one). The new results of Fig. 3 prove the strong influence of some inputs which have large total lower bounds. For example, $5$ inputs have total lower bound median values larger than $20\%$: Maximum photosynthesis rate (n°1), Maximum consumption rate (n°2), Rate of change per $10^{\circ}$C (n°9), Grazers preference for periphyton (n°11) and Intrinsic mortality rate (n°16). This result cannot be found from the first-order Sobol’ indices which are rather small (except for the Maximum photosynthesis rate).

Fig. 4 shows the PDO-der lower bounds, as well as the PDO-der upper bounds of the total Sobol’ indices (see [19]) whose confidence intervals are also obtained by bootstrap. In this figure lower and upper bounds of total Sobol’ indices have been truncated to one in order to only consider realistic values. Indeed, values larger than one are theoretically impossible but can sometimes be found due to numerical estimation errors. First, some partial checks can be done by looking at the median of the estimated values, e.g. by observing that the lower bounds are smaller than the upper bounds for each input. Second, several PDO-der lower bounds estimates are much less accurate than the (derivative-free) PDO lower bounds, especially when their values are large, for example the inputs n°1 and n°11. Even in this case of large values, informative results can be deduced by taking their median values: the total Sobol’ indices of the input n°1 (resp. n°11) approximately lie in $[0.85,1]$ (resp. $[0.4,1]$). From these total lower bounds, a coarse importance hierarchy can then be proposed between the most influential inputs.

Finally, we observe the excellent results for non influential inputs which have all their PDO-der lower and upper bounds close to zero (inputs 2, 5, 7, 8, 10, 13, 15, 18, 19, 20). This is not the case with the PDO lower bounds (see Fig. 3) which are more difficult to exploit. A convenient usage would be to estimate both derivative-free and derivative-based lower bounds, and to keep the smallest value. Indeed, the PDO bound is more accurate when the Sobol’ index is much larger than zero, whereas the PDO-der bound is much smaller when the Sobol’ index is close to zero.

7.3 Conclusion on the applications

On the two previous applications, we have tested the simplest PDO and PDO-der lower bounds, obtained by keeping only the first eigenvalue in all dimensions, for real-world models involving non-linear and interaction effects. Several conclusions can be drawn:

•

Lower bounds can be easily computed for any probability distribution of the inputs;

•

The estimation error can be large for small sample sizes. Estimating bootstrap confidence intervals is essential to evaluate the quality of the estimates;

•

The lower bounds of the total Sobol’ indices are most of the time informative, i.e. larger than the (estimated) first-order Sobol’ indices;

•

Using derivatives (then DGSM) is sometimes preferable to obtain lower bounds, especially for the screening step (identification of non influential inputs with negligible total Sobol’ indices). With DGSM, excellent results are obtained for screening, even for small sample size cases.

8 Further works

In this paper, we revisit the so-called chaos expansion method for the evaluation of Sobol’ indices. We summarize in a compact way the role played by the functional basis and the associated projection operators for bounding these indices from below through a truncated Parseval formula. Generalized chaos bases built on the Poincaré differential operator associated to the input distribution lead to very interesting new lower bounds for the total Sobol’ index in terms of DGSM. This bound appears to be sharp both on toy and real life models, allowing a fast screening of the model inputs based on the energy of the function derivatives. This opens some challenging problems in mathematical statistics. First, the bounds obtained by the brute force truncation method could certainly be improved by considering accurate model selection methods such as adaptive thresholding or $l^{1}$ regularization. Second, the statistical estimation of the lower bound is a non linear semi-parametric problem. By non linear, we mean that the quantity to be estimated depends in a non linear way (here quadratic) on the infinite dimensional parameter (the function of interest). The estimation of a quadratic functional has been addressed in [26, 27, 14, 9]. It involves $U$-statistics theory, and offers an excellent source of inspiration for further works in mathematical statistics having concrete computational applications. For example, the unbiased estimation of such a quantity for small samples appears to be an interesting and challenging issue. As a closing remark, notice that the use of PDO also opens challenging questions concerning the construction of such operators (and eigenbases). First, one may be interested in building a PDO that provides a lower bound involving weighted DGSM. Secondly, one may wish to consider the case of heavy tail input distributions (such as the Cauchy one).
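To make the bias issue concrete: each term of the lower bound is a squared expectation $c^{2}=(\mathbb{E}[Y])^{2}$, with for instance $Y=h(X)\phi(X)$, and the plug-in estimator $\bar{Y}^{2}$ overestimates it by $\mathrm{var}(Y)/n$ on average. The sketch below compares it with the order-2 $U$-statistic, which averages $Y_{i}Y_{k}$ over pairs $i\neq k$ and is unbiased. This is a generic textbook construction given as an illustration, not the estimators studied in [26, 27, 14, 9]; the simulated distribution and sample sizes are arbitrary.

```python
import numpy as np

def squared_mean_plugin(y):
    """Plug-in estimate of (E[Y])^2 along the last axis (biased upward by var(Y)/n)."""
    return np.mean(y, axis=-1) ** 2

def squared_mean_ustat(y):
    """Order-2 U-statistic: average of y_i * y_k over pairs i != k (unbiased)."""
    n = y.shape[-1]
    s = np.sum(y, axis=-1)
    return (s**2 - np.sum(y**2, axis=-1)) / (n * (n - 1))

rng = np.random.default_rng(0)
mu, sigma, n, n_rep = 0.1, 1.0, 20, 20_000
samples = rng.normal(mu, sigma, size=(n_rep, n))    # n_rep small samples of size n
mean_plugin = squared_mean_plugin(samples).mean()
mean_ustat = squared_mean_ustat(samples).mean()
print(mean_plugin, mu**2 + sigma**2 / n)            # biased target
print(mean_ustat, mu**2)                            # unbiased target
```

Note that the $U$-statistic can take negative values on a given sample, which is the price paid for unbiasedness when the true coefficient is close to zero.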

Software and acknowledgement

The implementations are partially based on the R package sensitivity [17]. The whole code should be included in a future version of that package.

Part of this research was conducted within the frame of the Chair in Applied Mathematics OQUAIDO, gathering partners in technological research (BRGM, CEA, IFPEN, IRSN, Safran, Storengy) and academia (CNRS, Ecole Centrale de Lyon, Mines Saint-Etienne, University of Grenoble, University of Nice, University of Toulouse) around advanced methods for Computer Experiments. The authors thank the participants for fruitful discussions. In particular we are grateful to A. Joulin for the insightful idea of using the Poincaré differential operator for computing lower bounds. Support from the ANR-3IA Artificial and Natural Intelligence Toulouse Institute is gratefully acknowledged.

Bibliography

[1] G. Allaire. A review of adjoint methods for sensitivity analysis, uncertainty quantification and optimization in numerical codes. Ingénieurs de l’Automobile, 836:33–36, 2015.
[2] A. Antoniadis. Analysis of variance on function spaces. Statistics: A Journal of Theoretical and Applied Statistics, 15(1):59–71, 1984.
[3] D. Bakry, I. Gentil, and M. Ledoux. Analysis and geometry of Markov diffusion operators, volume 348 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Cham, 2014.
[4] D. Bakry and O. Mazet. Characterization of Markov semigroups on ℝ associated to some families of orthogonal polynomials. In Séminaire de Probabilités XXXVII, pages 60–80. Springer, 2003.
[5] M. Bonnefont, A. Joulin, and Y. Ma. A note on spectral gap and weighted Poincaré inequalities for some one-dimensional diffusions. ESAIM: Probability and Statistics, 20:18–29, 2016.
[6] C. Ciric, P. Ciffroy, and S. Charles. Use of sensitivity analysis to identify influential and non-influential parameters within an aquatic ecosystem model. Ecological Modelling, 246:119–130, 2012.
[7] T. Crestaux, O. L. Maître, and J.-M. Martinez. Polynomial chaos expansions for uncertainties quantification and sensitivity analysis. Reliability Engineering and System Safety, 94:1161–1172, 2009.
[8] R. Cukier, H. Levine, and K. Shuler. Nonlinear sensitivity analysis of multiparameter model systems. Journal of Computational Physics, 26(1):1–42, 1978.
