Efficient design of experiments for sensitivity analysis based on polynomial chaos expansions

E. Burnaev; I. Panin; B. Sudret

arXiv:1705.03944·stat.CO·May 12, 2017·Ann. Math. Artif. Intell.


TL;DR

This paper introduces an adaptive experimental design method based on polynomial chaos expansions and D-optimality for efficient estimation of Sobol' sensitivity indices in global sensitivity analysis.

Contribution

It proposes a novel adaptive design approach for selecting experimental points to accurately estimate Sobol' indices using polynomial chaos expansions.

Findings

01. The method improves efficiency in sensitivity analysis.

02. Applications demonstrate the effectiveness of the proposed approach.

03. The approach reduces computational costs for Sobol' indices estimation.

Abstract

Global sensitivity analysis aims to quantify the respective effects of input random variables (or combinations thereof) on the variance of the response of a physical or mathematical model. Among the abundant literature on sensitivity measures, Sobol' indices have received much attention since they provide accurate information for most models. We consider the problem of selecting experimental design points for Sobol' indices estimation. Based on the concept of $D$-optimality, we propose a method for constructing an adaptive design of experiments that is effective for calculating Sobol' indices based on Polynomial Chaos Expansions. We provide a set of applications that demonstrate the efficiency of the proposed approach.


Tables

Table 1: Benchmark settings for analytical functions

Characteristic	Sobol	Ishigami	Environmental	Borehole	WingWeight
Input dimension	$3$	$3$	$4$	$8$	$10$
Input distributions	Unif	Unif	Unif	Unif	Unif
PCE degree	$9$	$9$	$5$	$4$	$4$
$q$ -norm	$0.75$	$0.75$	$1$	$0.75$	$0.75$
Regressors number	$111$	$111$	$126$	$117$	$176$
Initial design size	$150$	$120$	$126$	$117$	$186$
Added noise std	$(0, 0.2, 1.4)$	—	$0.5$	$(0, 5.0)$	$(0, 5.0)$

Table 2: Benchmark settings for Finite Element models

Characteristic	Truss	Heat transfer
Input dimension	$10$	$53$
Input distributions	Unif	Norm
PCE degree	$4$	$2$
$q$ -norm	$0.75$	$0.75$
Regressors number	$176$	$107$
Initial design size	$176$	$108$
Added noise std	—	—

Equations

$$L=\{\mathbf{x}_{i},\,y_{i}=f(\mathbf{x}_{i})\}_{i=1}^{n}\triangleq\{X\in\mathbb{R}^{n\times d},\;Y=f(X)\in\mathbb{R}^{n}\},$$

$$f(\vec{X})=f_{0}+\sum_{i=1}^{d}f_{i}(X_{i})+\sum_{1\leq i<j\leq d}f_{ij}(X_{i},X_{j})+\ldots+f_{1\ldots d}(X_{1},\ldots,X_{d}),$$

$$\mathbb{E}[f_{\mathbf{u}}(\vec{X}_{\mathbf{u}})f_{\mathbf{v}}(\vec{X}_{\mathbf{v}})]=0,\quad\text{if }\mathbf{u}\neq\mathbf{v},$$

$$D=\mathbb{V}[f(\vec{X})]=\sum_{\mathbf{u}\subset\{1,\ldots,d\},\,\mathbf{u}\neq\emptyset}\mathbb{V}[f_{\mathbf{u}}(\vec{X}_{\mathbf{u}})]=\sum_{\mathbf{u}\subset\{1,\ldots,d\},\,\mathbf{u}\neq\emptyset}D_{\mathbf{u}},$$

$$S_{\mathbf{u}}=\frac{D_{\mathbf{u}}}{D}.$$

$$\Psi_{\boldsymbol{\alpha}}(\vec{X})=\prod_{i=1}^{d}\psi_{\alpha_{i}}^{(i)}(X_{i}),\quad\boldsymbol{\alpha}=\{\alpha_{i}\in\mathbb{N},\;i=1,\ldots,d\}\in\mathscr{L},$$

$$\mathbb{E}[\Psi_{\boldsymbol{\alpha}}(\vec{X})\Psi_{\boldsymbol{\beta}}(\vec{X})]=0\quad\text{if }\boldsymbol{\alpha}\neq\boldsymbol{\beta}.$$

$$f(\vec{X})=\sum_{\boldsymbol{\alpha}\in\mathbb{N}^{d}}c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\vec{X}),$$

$$\hat{Y}=f_{PC}(\vec{X})=\sum_{\boldsymbol{\alpha}\in\mathscr{L}}c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\vec{X}).$$

$$\hat{Y}=f_{PC}(\vec{X})=\sum_{\boldsymbol{\alpha}\in\mathscr{L}}c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\vec{X})\triangleq\sum_{j=0}^{P-1}c_{j}\Psi_{j}(\vec{X})=\mathbf{c}^{T}\boldsymbol{\Psi}(\vec{X}),\quad P\triangleq|\mathscr{L}|,$$

$$c_{j=0}\triangleq c_{\boldsymbol{\alpha}=\mathbf{0}},\quad\Psi_{j=0}\triangleq\Psi_{\boldsymbol{\alpha}=\mathbf{0}}=\mathrm{const}.$$

$$\mathscr{L}=\{\boldsymbol{\alpha}\in\mathbb{N}^{d}:\|\boldsymbol{\alpha}\|_{q}\leq p\},\quad\|\boldsymbol{\alpha}\|_{q}\triangleq\Big(\sum_{i=1}^{d}\alpha_{i}^{q}\Big)^{1/q},$$

$$f(\vec{X})=f_{PC}(\vec{X})+\varepsilon=\sum_{j=0}^{P-1}c_{j}\Psi_{j}(\vec{X})+\varepsilon=\mathbf{c}^{T}\boldsymbol{\Psi}(\vec{X})+\varepsilon,$$

$$\mathbf{c}=\arg\min_{\mathbf{c}\in\mathbb{R}^{P}}\mathbb{E}\big[(f(\vec{X})-\mathbf{c}^{T}\boldsymbol{\Psi}(\vec{X}))^{2}\big],$$

$$\hat{\mathbf{c}}_{LS}=\arg\min_{\mathbf{c}\in\mathbb{R}^{P}}\frac{1}{n}\sum_{i=1}^{n}\big[y_{i}-\mathbf{c}^{T}\boldsymbol{\Psi}(\mathbf{x}_{i})\big]^{2}.$$

$$S_{i}(\mathbf{c})=\frac{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}\,\mathbb{E}[\Psi_{\boldsymbol{\alpha}}^{2}(\vec{X})]}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}\,\mathbb{E}[\Psi_{\boldsymbol{\alpha}}^{2}(\vec{X})]},\quad i=1,\ldots,d,$$

$$\mathbb{E}[\Psi_{\boldsymbol{\alpha}}(\vec{X})\Psi_{\boldsymbol{\beta}}(\vec{X})]=\delta_{\boldsymbol{\alpha}\boldsymbol{\beta}},$$

$$S_{i}(\mathbf{c})=\frac{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}},\quad i=1,\ldots,d.$$

$$\hat{S}_{i}=S_{i}(\hat{\mathbf{c}})=\frac{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}\hat{c}_{\boldsymbol{\alpha}}^{2}}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}\hat{c}_{\boldsymbol{\alpha}}^{2}},\quad i=1,\ldots,d,$$

$$A_{n}=\sum_{i=1}^{n}\boldsymbol{\Psi}(\mathbf{x}_{i})\boldsymbol{\Psi}^{T}(\mathbf{x}_{i}).$$

$$\frac{1}{n}A_{n}=\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{\Psi}(\mathbf{x}_{i})\boldsymbol{\Psi}^{T}(\mathbf{x}_{i})\xrightarrow[n\to+\infty]{}\Sigma,$$

$$\mathbf{S}(\boldsymbol{\nu})=(S_{1}(\boldsymbol{\nu}),\ldots,S_{d}(\boldsymbol{\nu}))^{T}$$

$$\det(B\Sigma^{-2}\Gamma B^{T})\neq 0,$$

$$B\triangleq B(\mathbf{c})=\left.\frac{\partial\mathbf{S}(\boldsymbol{\nu})}{\partial\boldsymbol{\nu}}\right|_{\boldsymbol{\nu}=\mathbf{c}}\in\mathbb{R}^{d\times P},$$

$$\sqrt{n}\,\big(\mathbf{S}(\hat{\mathbf{c}}_{n})-\mathbf{S}(\mathbf{c})\big)\xrightarrow[n\to+\infty]{\mathscr{D}}\mathscr{N}(0,\,B\Sigma^{-2}\Gamma B^{T}).$$

$$\hat{\mathbf{c}}_{n}=A_{n}^{-1}\boldsymbol{\Psi}_{n}Y_{n}=\mathbf{c}+\Big(\frac{1}{n}A_{n}\Big)^{-1}\Big[\frac{1}{n}\boldsymbol{\Psi}_{n}\boldsymbol{\varepsilon}_{n}\Big].$$

$$\sqrt{n}\,(\hat{\mathbf{c}}_{n}-\mathbf{c})=\Big(\frac{1}{n}A_{n}\Big)^{-1}\Big[\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\boldsymbol{\xi}_{i}\Big]\xrightarrow[n\to+\infty]{\mathscr{D}}\mathscr{N}(0,\,\Sigma^{-2}\Gamma).$$

$$b_{i\boldsymbol{\beta}}\triangleq\frac{\partial S_{i}}{\partial c_{\boldsymbol{\beta}}}=\begin{cases}\dfrac{2c_{\boldsymbol{\beta}}\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}-2c_{\boldsymbol{\beta}}\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}}{\big(\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}\big)^{2}}, & \text{if }\boldsymbol{\beta}\in\mathscr{L}_{i},\\[2ex] 0, & \text{if }\boldsymbol{\beta}=\mathbf{0}\triangleq\{0,\ldots,0\},\\[1ex] \dfrac{-2c_{\boldsymbol{\beta}}\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}}{\big(\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}\big)^{2}}, & \text{if }\boldsymbol{\beta}\notin\mathscr{L}_{i}\cup\mathbf{0},\end{cases}$$

$$b_{i\boldsymbol{\beta}}\triangleq\frac{\partial S_{i}}{\partial c_{\boldsymbol{\beta}}}=\frac{-2c_{\boldsymbol{\beta}}}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}}\times\begin{cases}S_{i}-1, & \text{if }\boldsymbol{\beta}\in\mathscr{L}_{i},\\ 0, & \text{if }\boldsymbol{\beta}=\mathbf{0}\triangleq\{0,\ldots,0\},\\ S_{i}, & \text{if }\boldsymbol{\beta}\notin\mathscr{L}_{i}\cup\mathbf{0},\end{cases}$$

$$\sqrt{n}\,\big(\mathbf{S}(\hat{\mathbf{c}}_{n})-\mathbf{S}(\mathbf{c})\big)\xrightarrow[n\to+\infty]{\mathscr{D}}\mathscr{N}(0,\,\sigma^{2}B\Sigma^{-1}B^{T}).$$


Full text

Efficient design of experiments for sensitivity analysis based on polynomial chaos expansions

E. Burnaev

Skolkovo Institute of Science and Technology, Building 3, Nobel st., Moscow 143026, Russia

Kharkevich Institute for Information Transmission Problems, Bolshoy Karetny per. 19, Moscow 127994, Russia

National Research University Higher School of Economics

Myasnitskaya st. 20, Moscow 109028, Russia

I. Panin

Kharkevich Institute for Information Transmission Problems, Bolshoy Karetny per. 19, Moscow 127994, Russia

B. Sudret

Chair of Risk, Safety and Uncertainty Quantification, ETH Zurich, Stefano-Franscini-Platz 5, 8093 Zurich, Switzerland


Keywords: Design of Experiment – Sensitivity Analysis – Sobol Indices – Polynomial Chaos Expansions – Active Learning

1 Introduction

Computational models play an important role in different areas of human activity (see [1, 2, 3]). Over the past decades, computational models have become more complex, and there is an increasing need for special methods for their analysis. Sensitivity analysis is an important tool for the investigation of computational models.

Sensitivity analysis seeks to determine how different model input parameters influence the model output, which parameters are the most influential, and how to evaluate such effects quantitatively (see [4]). It allows us to better understand the behavior of computational models. In particular, it allows us to separate the input parameters into important (significant), relatively important and unimportant (nonsignificant) ones. Important parameters, i.e. parameters whose variability has a strong effect on the model output, need to be controlled more accurately. Complex computational models often suffer from over-parameterization. By excluding unimportant parameters, we can potentially improve model quality and reduce both the parametrization (which is of great interest in the field of meta-modeling) and the computational costs [29].

Sensitivity analysis includes a wide range of metrics and techniques: e.g. the Morris method [5], linear regression-based methods [6], variance-based methods [7]. Among others, Sobol’ (sensitivity) indices are a common metric to evaluate the influence of model parameters [11]. Sobol’ indices quantify which portions of the output variance are explained by different input parameters and combinations thereof. This method is especially useful for the case of nonlinear computational models [12].

There are two main approaches to evaluating Sobol’ indices. The Monte Carlo approach (Monte Carlo simulations, FAST [13], SPF scheme [14] and others) is relatively robust (see [8]), but requires a large number of model runs, typically on the order of $10^{4}$ for an accurate estimation of each index. Thus, it is impractical for a number of industrial applications, where each model evaluation is computationally costly.

Metamodeling approaches for Sobol’ indices estimation allow one to reduce the required number of model runs [6, 10]. Following this approach, we replace the original computational model by an approximating metamodel (also known as a surrogate model or response surface) which is computationally efficient and has some clear internal structure [9]. The approach consists of the following general steps: selection of the design of experiments (DoE) and generation of the training sample; construction of the metamodel based on the training sample, including its accuracy assessment; and evaluation of Sobol’ indices (or any other measure) using the constructed metamodel. Note that the evaluation of indices may be based either on a known internal structure of the metamodel or on Monte Carlo simulations using the metamodel itself.

In general, the metamodeling approach is more computationally efficient than the original Monte Carlo approach, since the cost (in terms of the number of runs of the costly computational model) reduces to that of the training set (usually containing results from a few dozen to a few hundred model runs). However, this approach can be nonrobust and its accuracy is more difficult to analyze. Indeed, although procedures like cross-validation [29, 15] allow one to estimate the quality of metamodels, the accuracy of complex statistics (e.g. Sobol’ indices) derived from metamodels has a complicated dependency on the structure and quality of the metamodels (see e.g. confidence intervals for Sobol’ indices estimates [16] in the case of a Gaussian Process metamodel [17, 18, 19, 20, 21] and bootstrap-based confidence intervals in the case of polynomial chaos expansions [22]).

In this paper, we consider the problem of DoE construction for a particular metamodeling approach: how should one select the experimental design for building a polynomial chaos expansion, for subsequent evaluation of Sobol’ indices, so that it is effective in terms of the number of computational model runs?

Space-filling designs are commonly used for sensitivity analysis. Methods like Monte Carlo sampling, Latin Hypercube Sampling (LHS) [23] or the sampling used in the FAST method [13] try to fill the input parameter space “uniformly” with design points (points are realizations of the parameter values). These sampling methods are model-free, as they make no assumptions on the computational model.

In order to speed up the convergence of indices estimates, we assume that the computational model is close to its approximating metamodel and exploit knowledge of the metamodel structure. In this paper, we consider Polynomial Chaos Expansions (PCE), which are commonly used in engineering and other applications [24]. A PCE approximation is based on a series of polynomials (Hermite, Legendre, Laguerre etc.) that are orthogonal w.r.t. the probability distributions of the corresponding input parameters of the computational model. This allows Sobol’ indices to be calculated analytically from the expansion coefficients [25, 26].

In this paper, we address the problem of design of experiments construction for evaluating Sobol’ indices from a PCE metamodel. Based on asymptotic considerations, we propose an adaptive algorithm for design construction and test it on a set of applied problems. Note that in [36], we investigated the adaptive design algorithm for the case of a quadratic metamodel (see also [37]). In this paper, we extend these results to the case of a generalized PCE metamodel and provide more examples, including real industrial applications.

The paper is organized as follows: in Section 2, we review the definition of sensitivity indices and describe their estimation based on a PCE metamodel. In Section 3, an asymptotic analysis of indices estimates is provided. In Section 4, we introduce an optimality criterion and propose a procedure for constructing the experimental design. In Section 5, we provide experimental results, applications and benchmarks against other methods of design construction.

2 Sensitivity Indices and PCE Metamodel

2.1 Sensitivity Indices

Consider a computational model $y=f(\mathbf{x})$ , where $\mathbf{x}=(x_{1},\ldots,x_{d})\in\mathscr{X}\subset\mathbb{R}^{d}$ is a vector of input variables (aka parameters or features), $y\in\mathbb{R}^{1}$ is an output variable and $\mathscr{X}$ is a design space. The model $f(\mathbf{x})$ describes behavior of some physical system of interest.

We consider the model $f(\mathbf{x})$ as a black-box: no additional knowledge on its inner structure is assumed. For some design of experiments $X=\{\mathbf{x}_{i}\in\mathscr{X}\}_{i=1}^{n}\in\mathbb{R}^{n\times d}$ we can obtain a set of model responses and form a training sample

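$$L=\{\mathbf{x}_{i},\,y_{i}=f(\mathbf{x}_{i})\}_{i=1}^{n}\triangleq\{X\in\mathbb{R}^{n\times d},\;Y=f(X)\in\mathbb{R}^{n}\}, \tag{1}$$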

which allows us to investigate properties of the computational model.

Let us assume that there is a prescribed probability distribution $\mathscr{H}$ with independent marginal distributions on the design space $\mathscr{X}$ ( $\mathscr{H}=\mathscr{H}_{1}\times\ldots\times\mathscr{H}_{d}$ ). This distribution represents the uncertainty and/or variability of the input variables, modelled as a random vector $\vec{X}=\{X_{1},\ldots,X_{d}\}$ with independent components. In these settings, the model output $\vec{Y}=f(\vec{X})$ becomes a stochastic variable.

Assuming that the function $f(\vec{X})$ is square-integrable with respect to the distribution $\mathscr{H}$ (i.e. $\mathbb{E}[f^{2}(\vec{X})]<+\infty$ ), we have the following unique Sobol’ decomposition of $\vec{Y}=f(\vec{X})$ (see [11]) given by

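$$f(\vec{X})=f_{0}+\sum_{i=1}^{d}f_{i}(X_{i})+\sum_{1\leq i<j\leq d}f_{ij}(X_{i},X_{j})+\ldots+f_{1\ldots d}(X_{1},\ldots,X_{d}),$$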

which satisfies

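$$\mathbb{E}[f_{\mathbf{u}}(\vec{X}_{\mathbf{u}})f_{\mathbf{v}}(\vec{X}_{\mathbf{v}})]=0,\quad\text{if }\mathbf{u}\neq\mathbf{v},$$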

where $\mathbf{u}$ and $\mathbf{v}$ are index sets: $\mathbf{u},\mathbf{v}\subset\{1,2,\ldots,d\}$ .

Due to orthogonality of the summands, we can decompose variance of the model output:

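$$D=\mathbb{V}[f(\vec{X})]=\sum_{\mathbf{u}\subset\{1,\ldots,d\},\,\mathbf{u}\neq\emptyset}\mathbb{V}[f_{\mathbf{u}}(\vec{X}_{\mathbf{u}})]=\sum_{\mathbf{u}\subset\{1,\ldots,d\},\,\mathbf{u}\neq\emptyset}D_{\mathbf{u}}.$$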

In this expansion $D_{\mathbf{u}}\triangleq\mathbb{V}[f_{\mathbf{u}}(\vec{X}_{\mathbf{u}})]$ is the contribution of the summand $f_{\mathbf{u}}(\vec{X}_{\mathbf{u}})$ to the output variance, also known as the partial variance.

Definition 1.

The sensitivity index (Sobol’ index) of the subset $\vec{X}_{\mathbf{u}},\;\mathbf{u}\subset\{1,\ldots,d\}$ of model input variables is defined as

$$S_{\mathbf{u}}=\frac{D_{\mathbf{u}}}{D}.$$

The sensitivity index describes the amount of the total variance explained by uncertainties in the subset $\vec{X}_{\mathbf{u}}$ of model input variables.

Remark 1.

In this paper, we consider only sensitivity indices of type $S_{i}\triangleq S_{\{i\}},i=1,\ldots,d$ , called first-order or main effect sensitivity indices.
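To make Definition 1 concrete, the following minimal sketch estimates first-order indices by plain Monte Carlo. It uses the standard pick-freeze covariance estimator, which is not the method proposed in this paper, and it assumes independent $U(0,1)$ inputs. The function names `first_order_sobol_mc` and `linear_model` are hypothetical and chosen only for illustration.

```python
import random


def first_order_sobol_mc(f, d, n=50_000, seed=0):
    """Pick-freeze Monte Carlo estimate of first-order Sobol' indices.

    Assumption of this sketch: inputs are independent U(0, 1) variables.
    """
    rng = random.Random(seed)
    a_rows = [[rng.random() for _ in range(d)] for _ in range(n)]
    b_rows = [[rng.random() for _ in range(d)] for _ in range(n)]
    y_a = [f(x) for x in a_rows]
    mean_a = sum(y_a) / n
    total_var = sum((y - mean_a) ** 2 for y in y_a) / n  # estimate of D

    indices = []
    for i in range(d):
        # Keep coordinate i from the first sample, redraw all the others.
        y_c = [f(b[:i] + [a[i]] + b[i + 1:]) for a, b in zip(a_rows, b_rows)]
        mean_c = sum(y_c) / n
        # Cov(f(A), f(C)) estimates the partial variance D_i.
        d_i = sum(ya * yc for ya, yc in zip(y_a, y_c)) / n - mean_a * mean_c
        indices.append(d_i / total_var)
    return indices


def linear_model(x):
    """Toy model f(x) = x_1 + 2 x_2 with exact indices S_1 = 0.2, S_2 = 0.8."""
    return x[0] + 2.0 * x[1]


estimates = first_order_sobol_mc(linear_model, d=2)
print(estimates)
```

For this toy model the partial variances are $D_{1}=1/12$ and $D_{2}=4/12$, so the estimates should be close to $0.2$ and $0.8$. The sketch needs $3n$ model runs for two inputs, which illustrates why Monte Carlo estimation becomes expensive when each run is costly.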

2.2 Polynomial Chaos Expansions

Consider a set of multivariate polynomials $\{\Psi_{\boldsymbol{\alpha}}(\vec{X}),\;\boldsymbol{\alpha}\in\mathscr{L}\}$ that consists of polynomials $\Psi_{\boldsymbol{\alpha}}$ having the form of tensor product

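$$\Psi_{\boldsymbol{\alpha}}(\vec{X})=\prod_{i=1}^{d}\psi_{\alpha_{i}}^{(i)}(X_{i}),\quad\boldsymbol{\alpha}=\{\alpha_{i}\in\mathbb{N},\;i=1,\ldots,d\}\in\mathscr{L},$$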

where $\psi_{\alpha_{i}}^{(i)}$ is a univariate polynomial of degree $\alpha_{i}$ belonging to the $i$ -th family (e.g. Legendre polynomials, Jacobi polynomials, etc.), $\mathbb{N}=\{0,1,2,\ldots\}$ is the set of nonnegative integers, $\mathscr{L}$ is some fixed set of multi-indices $\boldsymbol{\alpha}$ .

Suppose that the univariate polynomials $\{\psi_{\alpha}^{(i)}\}$ are orthogonal w.r.t. the $i$-th marginal of the probability distribution $\mathscr{H}$, i.e. $\mathbb{E}[\psi_{\alpha}^{(i)}(X_{i})\psi_{\beta}^{(i)}(X_{i})]=0$ if $\alpha\neq\beta$ for $i=1,\ldots,d$. In particular, Legendre polynomials are orthogonal w.r.t. the uniform distribution on $[-1,1]$, and Hermite polynomials are orthogonal w.r.t. the Gaussian distribution. Due to the independence of the components of $\vec{X}$, the multivariate polynomials $\{\Psi_{\boldsymbol{\alpha}}\}$ are orthogonal w.r.t. the probability distribution $\mathscr{H}$, i.e.

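$$\mathbb{E}[\Psi_{\boldsymbol{\alpha}}(\vec{X})\Psi_{\boldsymbol{\beta}}(\vec{X})]=0\quad\text{if }\boldsymbol{\alpha}\neq\boldsymbol{\beta}. \tag{2}$$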

Provided $\mathbb{E}[f^{2}(\vec{X})]<+\infty$ , the spectral polynomial chaos expansion of $f$ takes the form

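$$f(\vec{X})=\sum_{\boldsymbol{\alpha}\in\mathbb{N}^{d}}c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\vec{X}), \tag{3}$$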

where $\{c_{\boldsymbol{\alpha}}\}_{\boldsymbol{\alpha}\in\mathbb{N}^{d}}$ are expansion coefficients.

In the sequel we consider a PCE approximation $f_{PC}(\vec{X})$ of the model $f(\vec{X})$ obtained by truncating the infinite series to a finite number of terms:

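$$\hat{Y}=f_{PC}(\vec{X})=\sum_{\boldsymbol{\alpha}\in\mathscr{L}}c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\vec{X}). \tag{4}$$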

By enumerating the elements of $\mathscr{L}$ we also use an alternative form of (4):

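$$\hat{Y}=f_{PC}(\vec{X})=\sum_{\boldsymbol{\alpha}\in\mathscr{L}}c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\vec{X})\triangleq\sum_{j=0}^{P-1}c_{j}\Psi_{j}(\vec{X})=\mathbf{c}^{T}\boldsymbol{\Psi}(\vec{X}),\quad P\triangleq|\mathscr{L}|,$$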

where $\mathbf{c}=(c_{0},\ldots,c_{P-1})^{T}$ is a column vector of coefficients and $\boldsymbol{\Psi}(\mathbf{x})\colon\mathbb{R}^{d}\to\mathbb{R}^{P}$ is a mapping from the design space to the extended design space defined as a column vector function $\boldsymbol{\Psi}(\mathbf{x})=\left(\Psi_{0}(\mathbf{x}),\ldots,\Psi_{P-1}(\mathbf{x})\right)^{T}$ . Note that index $j=0$ corresponds to multi-index $\boldsymbol{\alpha}=\mathbf{0}=\{0,\ldots,0\}$ , i.e.

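$$c_{j=0}\triangleq c_{\boldsymbol{\alpha}=\mathbf{0}},\quad\Psi_{j=0}\triangleq\Psi_{\boldsymbol{\alpha}=\mathbf{0}}=\mathrm{const}.$$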

The set of multi-indices $\mathscr{L}$ is determined by some truncation scheme. In this work, we use the hyperbolic truncation scheme [27], which corresponds to

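$$\mathscr{L}=\{\boldsymbol{\alpha}\in\mathbb{N}^{d}:\|\boldsymbol{\alpha}\|_{q}\leq p\},\quad\|\boldsymbol{\alpha}\|_{q}\triangleq\Big(\sum_{i=1}^{d}\alpha_{i}^{q}\Big)^{1/q},$$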

where $q\in(0,1]$ is a fixed parameter and $p\in\mathbb{N}\backslash\{0\}=\{1,2,3,\ldots\}$ is a fixed maximal total degree of polynomials. Note that in case of $q=1$ , we have $P=\frac{(d+p)!}{d!p!}$ polynomials in $\mathscr{L}$ and a smaller $q$ leads to a smaller number of polynomials.
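As an illustration of the hyperbolic truncation scheme, the sketch below enumerates the multi-index set $\mathscr{L}$ for given $d$, $p$ and $q$ and counts its elements. The function name `hyperbolic_index_set` and the rounding tolerance `tol` are assumptions of this sketch. The condition $\|\boldsymbol{\alpha}\|_{q}\leq p$ is checked in the equivalent form $\sum_{i}\alpha_{i}^{q}\leq p^{q}$.

```python
def hyperbolic_index_set(d, p, q=1.0, tol=1e-9):
    """Return all multi-indices alpha in N^d with ||alpha||_q <= p."""
    # Compare sum(alpha_i^q) with p^q; tol guards against floating-point rounding.
    budget = p ** q + tol
    indices = []

    def extend(prefix, used):
        # prefix holds the first len(prefix) entries, used is their sum of q-th powers.
        if len(prefix) == d:
            indices.append(tuple(prefix))
            return
        a = 0
        while used + a ** q <= budget:
            extend(prefix + [a], used + a ** q)
            a += 1

    extend([], 0.0)
    return indices


for d, p, q in [(3, 9, 0.75), (4, 5, 1.0), (10, 4, 0.75)]:
    print(d, p, q, len(hyperbolic_index_set(d, p, q)))
```

The three settings give $P=111$, $126$ and $176$ basis polynomials, which agrees with the regressor counts reported in Table 1 for the Sobol and Ishigami functions, the Environmental function and the WingWeight function, respectively. For $q=1$ the count equals $\frac{(d+p)!}{d!\,p!}$.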

There are a number of strategies for estimating the expansion coefficients $c_{\boldsymbol{\alpha}}$ in (4). In this paper, the least-squares (LS) minimization method is used [28]. Unlike (3), the key idea consists in considering the original model $f(\vec{X})$ as the sum of a truncated PC expansion $f_{PC}(\vec{X})$ and a residual $\varepsilon$, i.e.

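$$f(\vec{X})=f_{PC}(\vec{X})+\varepsilon=\sum_{j=0}^{P-1}c_{j}\Psi_{j}(\vec{X})+\varepsilon=\mathbf{c}^{T}\boldsymbol{\Psi}(\vec{X})+\varepsilon, \tag{5}$$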

where thanks to orthogonality property (2) the residual process $\varepsilon$ can be considered as an i.i.d. noise process with $\mathbb{E}\varepsilon=0$ and $\mathbb{V}[\varepsilon]=\sigma^{2}$ , such that $\varepsilon=\varepsilon(\vec{X})$ and $\{\Psi_{j}(\vec{X})\}_{j=0}^{P-1}$ are orthogonal w.r.t. the distribution $\mathscr{H}$ .

The coefficients $\mathbf{c}$ are obtained by minimizing the mean square residual:

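$$\mathbf{c}=\arg\min_{\mathbf{c}\in\mathbb{R}^{P}}\mathbb{E}\big[(f(\vec{X})-\mathbf{c}^{T}\boldsymbol{\Psi}(\vec{X}))^{2}\big],$$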

which is approximated by using the training sample $L=\{\mathbf{x}_{i},y_{i}=f(\mathbf{x}_{i})\}_{i=1}^{n}$ :

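$$\hat{\mathbf{c}}_{LS}=\arg\min_{\mathbf{c}\in\mathbb{R}^{P}}\frac{1}{n}\sum_{i=1}^{n}\big[y_{i}-\mathbf{c}^{T}\boldsymbol{\Psi}(\mathbf{x}_{i})\big]^{2}. \tag{6}$$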
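The least-squares estimate can be sketched on a one-dimensional toy problem. The example below assumes $X\sim U(-1,1)$, an orthonormal Legendre basis of degree at most 2, and a solution via the normal equations with a small hand-written Gaussian elimination. The names `basis`, `solve` and `fit_pce` and the design points are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def basis(x):
    """Orthonormal Legendre polynomials of degree 0..2 for X ~ U(-1, 1)."""
    return [1.0,
            math.sqrt(3.0) * x,
            math.sqrt(5.0) / 2.0 * (3.0 * x * x - 1.0)]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def fit_pce(xs, ys):
    """Least-squares coefficients from the normal equations."""
    rows = [basis(x) for x in xs]
    p = len(rows[0])
    # a = sum_i Psi(x_i) Psi(x_i)^T and rhs = sum_i Psi(x_i) y_i
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    rhs = [sum(r[j] * y for r, y in zip(rows, ys)) for j in range(p)]
    return solve(a, rhs)


xs = [-1.0, -0.5, 0.0, 0.3, 0.8, 1.0]
ys = [x * x for x in xs]  # toy model f(x) = x^2
coeffs = fit_pce(xs, ys)
print(coeffs)
```

Since $x^{2}=\tfrac{1}{3}\Psi_{0}(x)+\tfrac{2}{3\sqrt{5}}\Psi_{2}(x)$ lies exactly in the span of this basis, the residual is zero and any non-degenerate design recovers these coefficients up to rounding.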

2.3 PCE post-processing for sensitivity analysis

Consider some PCE model $f_{PC}(\vec{X})=\sum_{\boldsymbol{\alpha}\in\mathscr{L}}c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\vec{X})=\sum_{j=0}^{P-1}c_{j}\Psi_{j}(\vec{X})$ . According to [25], we have an explicit form of Sobol’ indices (main effects) for model $f_{PC}(\vec{X})$ :

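$$S_{i}(\mathbf{c})=\frac{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}\,\mathbb{E}[\Psi_{\boldsymbol{\alpha}}^{2}(\vec{X})]}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}\,\mathbb{E}[\Psi_{\boldsymbol{\alpha}}^{2}(\vec{X})]},\quad i=1,\ldots,d, \tag{7}$$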

where $\mathscr{L}_{*}\triangleq\mathscr{L}\backslash\{\mathbf{0}\}$ and $\mathscr{L}_{i}\subset\mathscr{L}$ is the set of multi-indices $\boldsymbol{\alpha}$ such that only the index in the $i$-th position is nonzero: $\boldsymbol{\alpha}=\{0,\ldots,\alpha_{i},\ldots,0\}$, $\alpha_{i}\in\mathbb{N}$, $\alpha_{i}>0$.

Suppose for simplicity that the multivariate polynomials $\{\Psi_{\boldsymbol{\alpha}}(\vec{X}),\;\boldsymbol{\alpha}\in\mathscr{L}\}$ are not only orthogonal but also normalized w.r.t. the distribution $\mathscr{H}$ :

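$$\mathbb{E}[\Psi_{\boldsymbol{\alpha}}(\vec{X})\Psi_{\boldsymbol{\beta}}(\vec{X})]=\delta_{\boldsymbol{\alpha}\boldsymbol{\beta}},$$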

where $\delta_{\boldsymbol{\alpha}\boldsymbol{\beta}}$ is the Kronecker symbol, i.e. $\delta_{\boldsymbol{\alpha}\boldsymbol{\beta}}=1$ if $\boldsymbol{\alpha}=\boldsymbol{\beta}$, otherwise $\delta_{\boldsymbol{\alpha}\boldsymbol{\beta}}=0$. Then (7) takes the form

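$$S_{i}(\mathbf{c})=\frac{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}},\quad i=1,\ldots,d. \tag{8}$$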

Thus, (8) provides a simple expression for calculation of Sobol’ indices in case of the PCE metamodel. If the original model of interest $f(\vec{X})$ is close to its PCE approximation $f_{PC}(\vec{X})$ , then we can use expression (8) for indices with estimated coefficients (6) to approximate Sobol’ indices of the original model:

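$$\hat{S}_{i}=S_{i}(\hat{\mathbf{c}})=\frac{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}\hat{c}_{\boldsymbol{\alpha}}^{2}}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}\hat{c}_{\boldsymbol{\alpha}}^{2}},\quad i=1,\ldots,d, \tag{9}$$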

where $\hat{\mathbf{c}}\triangleq\hat{\mathbf{c}}_{LS}$ .
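The post-processing step for an orthonormal basis reduces to sums of squared coefficients. A minimal sketch follows, assuming the coefficients are stored in a dictionary keyed by multi-index tuples. The name `first_order_indices` and the coefficient values are hypothetical.

```python
def first_order_indices(coeffs):
    """First-order Sobol' indices from PCE coefficients of an orthonormal basis.

    coeffs maps each multi-index alpha (a tuple) to its coefficient c_alpha.
    """
    d = len(next(iter(coeffs)))
    # Denominator: sum over all non-constant terms (the set L_*).
    total = sum(c * c for alpha, c in coeffs.items() if any(alpha))
    indices = []
    for i in range(d):
        # Numerator: terms where only the i-th entry of alpha is nonzero (the set L_i).
        partial = sum(
            c * c
            for alpha, c in coeffs.items()
            if alpha[i] > 0 and all(a == 0 for k, a in enumerate(alpha) if k != i)
        )
        indices.append(partial / total)
    return indices


coeffs = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0, (1, 1): 1.0}
sobol_indices = first_order_indices(coeffs)
print(sobol_indices)
```

Here the total variance is $2^{2}+1^{2}+1^{2}=6$, so $S_{1}=4/6$ and $S_{2}=1/6$. The remaining $1/6$ belongs to the interaction term $(1,1)$, which contributes to the denominator only.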

3 Asymptotic Properties

In this section, we consider asymptotic properties of indices estimates in Eq. (9) if the coefficients $\mathbf{c}$ are estimated by LS approach (6). Let $\hat{\mathbf{c}}_{n}$ be LS estimate (6) of the true coefficients vector $\mathbf{c}$ based on the training sample $L=\{\mathbf{x}_{i},y_{i}=f(\mathbf{x}_{i})\}_{i=1}^{n}$ . In this section and further, if some variable has index $n$ , then this variable depends on training sample (1) of size $n$ .

Define the information matrix $A_{n}\in\mathbb{R}^{P\times P}$ as

$$A_{n}=\sum_{i=1}^{n}\boldsymbol{\Psi}(\mathbf{x}_{i})\boldsymbol{\Psi}^{T}(\mathbf{x}_{i}). \tag{10}$$

Then, we can obtain asymptotic properties of the indices estimates (9) based on model (5) while new data points $\{\mathbf{x}_{n},y_{n}=f(\mathbf{x}_{n})\}$ are added to the training sample sequentially. In order to prove these asymptotic properties we require only that $\varepsilon=\varepsilon(\vec{X})$ and $\{\Psi_{j}(\vec{X})\}_{j=0}^{P-1}$ are orthogonal w.r.t. the distribution $\mathscr{H}$ , and we do not need to require that multivariate polynomials $\{\Psi_{\boldsymbol{\alpha}}(\vec{X}),\;\boldsymbol{\alpha}\in\mathscr{L}\}$ are orthonormal.

Theorem 1.

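Let the following assumptions hold true:

1. There is an infinite sequence of points in the design space $\{\mathbf{x}_{i}\in\mathscr{X}\}_{i=1}^{\infty}$, generated by the corresponding sequence of i.i.d. random vectors, such that a.s.

$$\frac{1}{n}A_{n}=\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{\Psi}(\mathbf{x}_{i})\boldsymbol{\Psi}^{T}(\mathbf{x}_{i})\xrightarrow[n\to+\infty]{}\Sigma, \tag{11}$$

where $\Sigma\in\mathbb{R}^{P\times P}$ is a symmetric and non-degenerate matrix ($\Sigma=\Sigma^{T}$ and $\det\Sigma>0$), and new design points are added successively from this sequence to the design of experiments $X_{n}=\{\mathbf{x}_{i}\}_{i=1}^{n}$.

2. The vector-function $\mathbf{S}(\boldsymbol{\nu})$ is defined by its components according to (8):

$$\mathbf{S}(\boldsymbol{\nu})=(S_{1}(\boldsymbol{\nu}),\ldots,S_{d}(\boldsymbol{\nu}))^{T}$$

and $\hat{\mathbf{S}}_{n}\triangleq\mathbf{S}(\hat{\mathbf{c}}_{n})$, where $\hat{\mathbf{c}}_{n}$ is defined by (6).

3. For the true coefficients $\mathbf{c}$ of model (5),

$$\det(B\Sigma^{-2}\Gamma B^{T})\neq 0,$$

where $B$ is the matrix of partial derivatives defined as

$$B\triangleq B(\mathbf{c})=\left.\frac{\partial\mathbf{S}(\boldsymbol{\nu})}{\partial\boldsymbol{\nu}}\right|_{\boldsymbol{\nu}=\mathbf{c}}\in\mathbb{R}^{d\times P},$$

and $\Gamma=(\gamma_{r,s})_{r,s=0}^{P-1}\in\mathbb{R}^{P\times P}$ with $\gamma_{r,s}=\mathbb{E}\left(\varepsilon^{2}\Psi_{r}(\vec{X})\Psi_{s}(\vec{X})\right)$.

Then

$$\sqrt{n}\,\big(\mathbf{S}(\hat{\mathbf{c}}_{n})-\mathbf{S}(\mathbf{c})\big)\xrightarrow[n\to+\infty]{\mathscr{D}}\mathscr{N}(0,\,B\Sigma^{-2}\Gamma B^{T}). \tag{14}$$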

Proof.

Let us denote by $\boldsymbol{\varepsilon}_{n}=(\varepsilon_{1},\ldots,\varepsilon_{n})^{T}\in\mathbb{R}^{n}$ the column vector, generated by the i.i.d. residual process values (see (5)), and by $\boldsymbol{\Psi}_{n}=(\boldsymbol{\Psi}(\mathbf{x}_{1}),\ldots,\boldsymbol{\Psi}(\mathbf{x}_{n}))\in\mathbb{R}^{P\times n}$ the design matrix. We can easily get that

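$$\hat{\mathbf{c}}_{n}=A_{n}^{-1}\boldsymbol{\Psi}_{n}Y_{n}=\mathbf{c}+\Big(\frac{1}{n}A_{n}\Big)^{-1}\Big[\frac{1}{n}\boldsymbol{\Psi}_{n}\boldsymbol{\varepsilon}_{n}\Big].$$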

We can represent $\frac{1}{n}\boldsymbol{\Psi}_{n}\boldsymbol{\varepsilon}_{n}$ as $\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{\xi}_{i}$ , where $(\boldsymbol{\xi}_{i})_{i=1}^{n}$ is a sequence, generated by i.i.d. random vectors $\boldsymbol{\xi}_{i}=\varepsilon(\vec{X}_{i})\boldsymbol{\Psi}(\vec{X}_{i})\in\mathbb{R}^{P}$ , $i=1,\ldots,n$ , such that $\mathbb{E}\boldsymbol{\xi}_{i}=0$ thanks to the fact that $\varepsilon$ and $\Psi_{k}(\vec{X})$ are orthogonal for $k<P$ , and $\mathbb{V}[\boldsymbol{\xi}_{i}]=\Gamma$ .

Thus from (11) and the central limit theorem we get that

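$$\sqrt{n}\,(\hat{\mathbf{c}}_{n}-\mathbf{c})=\Big(\frac{1}{n}A_{n}\Big)^{-1}\Big[\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\boldsymbol{\xi}_{i}\Big]\xrightarrow[n\to+\infty]{\mathscr{D}}\mathscr{N}(0,\,\Sigma^{-2}\Gamma).$$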

Applying $\delta$ -method (see [30]) to the vector-function $\mathbf{S}(\boldsymbol{\nu})$ at the point $\boldsymbol{\nu}=\mathbf{c}$ , we obtain required asymptotics (14). ∎

Remark 2.

Note that the elements of $B$ have the following form

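$$b_{i\boldsymbol{\beta}}\triangleq\frac{\partial S_{i}}{\partial c_{\boldsymbol{\beta}}}=\begin{cases}\dfrac{2c_{\boldsymbol{\beta}}\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}-2c_{\boldsymbol{\beta}}\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}}{\big(\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}\big)^{2}}, & \text{if }\boldsymbol{\beta}\in\mathscr{L}_{i},\\[2ex] 0, & \text{if }\boldsymbol{\beta}=\mathbf{0}\triangleq\{0,\ldots,0\},\\[1ex] \dfrac{-2c_{\boldsymbol{\beta}}\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{i}}c_{\boldsymbol{\alpha}}^{2}}{\big(\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}\big)^{2}}, & \text{if }\boldsymbol{\beta}\notin\mathscr{L}_{i}\cup\mathbf{0},\end{cases}$$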

where $i=1,\ldots,d$ and multi-index $\boldsymbol{\beta}\in\mathscr{L}$ . The elements of $B$ can be also represented as

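$$b_{i\boldsymbol{\beta}}\triangleq\frac{\partial S_{i}}{\partial c_{\boldsymbol{\beta}}}=\frac{-2c_{\boldsymbol{\beta}}}{\sum_{\boldsymbol{\alpha}\in\mathscr{L}_{*}}c_{\boldsymbol{\alpha}}^{2}}\times\begin{cases}S_{i}-1, & \text{if }\boldsymbol{\beta}\in\mathscr{L}_{i},\\ 0, & \text{if }\boldsymbol{\beta}=\mathbf{0}\triangleq\{0,\ldots,0\},\\ S_{i}, & \text{if }\boldsymbol{\beta}\notin\mathscr{L}_{i}\cup\mathbf{0},\end{cases}$$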
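The compact form of the elements of $B$ in Remark 2 can be checked numerically against central finite differences. The sketch below assumes an orthonormal basis, stores the coefficients as a flat list with position 0 for the constant term, and describes each set $\mathscr{L}_{i}$ by a list of positions. The names `sobol`, `jacobian` and `numeric_jacobian` and the coefficient values are hypothetical.

```python
def sobol(c, groups):
    """First-order indices S_i(c); groups[i] lists the positions belonging to L_i."""
    total = sum(v * v for v in c[1:])  # position 0 is the constant term
    return [sum(c[j] ** 2 for j in g) / total for g in groups]


def jacobian(c, groups):
    """Matrix B with entries dS_i/dc_beta in the compact form of Remark 2."""
    total = sum(v * v for v in c[1:])
    s = sobol(c, groups)
    b = []
    for i, g in enumerate(groups):
        row = []
        for j, cj in enumerate(c):
            if j == 0:
                row.append(0.0)                               # beta = 0
            elif j in g:
                row.append(-2.0 * cj / total * (s[i] - 1.0))  # beta in L_i
            else:
                row.append(-2.0 * cj / total * s[i])          # all other beta
        b.append(row)
    return b


def numeric_jacobian(c, groups, h=1e-6):
    """Central finite-difference approximation of the same matrix."""
    b = [[0.0] * len(c) for _ in groups]
    for j in range(len(c)):
        up, down = c[:], c[:]
        up[j] += h
        down[j] -= h
        s_up, s_down = sobol(up, groups), sobol(down, groups)
        for i in range(len(groups)):
            b[i][j] = (s_up[i] - s_down[i]) / (2.0 * h)
    return b


# Positions: 0 -> (0,0), 1 -> (1,0), 2 -> (0,1), 3 -> (1,1)
c = [1.0, 2.0, 1.0, 1.0]
groups = [[1], [2]]
analytic = jacobian(c, groups)
numeric = numeric_jacobian(c, groups)
max_diff = max(abs(a - n) for ra, rn in zip(analytic, numeric) for a, n in zip(ra, rn))
print(max_diff)
```

For these coefficients the first row of $B$ is $(0,\,2/9,\,-2/9,\,-2/9)$, and the finite-difference matrix should agree with it up to discretization and rounding error.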

Remark 3.

We can see that conditions of the theorem do not depend on the type of orthonormal polynomials.

Remark 4.

In case $\{\Psi_{\boldsymbol{\alpha}}(\vec{X}),\;\boldsymbol{\alpha}\in\mathscr{L}\}$ are multivariate polynomials, orthonormal w.r.t. the distribution $\mathscr{H}$ , we get that $\Sigma=I\in\mathbb{R}^{P\times P}$ is the identity matrix.

Remark 5.

In the proof of Theorem 1 we make as few assumptions as possible in order to depart from the original polynomial chaos model (3) as little as possible. That is why the only important assumption is that $\varepsilon=\varepsilon(\vec{X})$ and $\{\Psi_{j}(\vec{X})\}_{j=0}^{P-1}$ are orthogonal w.r.t. the distribution $\mathscr{H}$. However, we can also consider model (5) as a regression model, in which the error term $\varepsilon$ is modelled by white noise independent of $\{\Psi_{j}(\vec{X})\}_{j=0}^{P-1}$; see the discussion of the polynomial chaos approach from a statistician’s perspective in [31]. Even under this interpretation of model (5), we still get the same asymptotic behavior (14).

Remark 6.

In case $\varepsilon$ and $\Psi_{k}(\vec{X})$ are not only orthogonal for $k<P$ , but also are independent, we get that $\Gamma=\sigma^{2}\Sigma$ . Then asymptotics (14) takes the form

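$$\sqrt{n}\,\big(\mathbf{S}(\hat{\mathbf{c}}_{n})-\mathbf{S}(\mathbf{c})\big)\xrightarrow[n\to+\infty]{\mathscr{D}}\mathscr{N}(0,\,\sigma^{2}B\Sigma^{-1}B^{T}). \tag{17}$$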

In applications it seems reasonable to assume that $\varepsilon$ and $\Psi_{k}(\vec{X})$ are approximately independent for $k<P$ . Then for practical purposes we can use asymptotics (17), for which it is easier to calculate the asymptotic covariance matrix. Therefore in the sequel for applications we are going to use this simplified expression.

4 Design of Experiments Construction

4.1 Preliminary Considerations

Taking into account the results of Theorem 1, the limiting covariance matrix of the indices estimates depends on

1. Noise variance $\sigma^{2}$,

2. True values of PC coefficients $\mathbf{c}$, defining $B$,

3. Experimental design $X$, defining $\Sigma$.

If we have a sufficiently accurate approximation of the original model, then under the above assumptions the asymptotic covariance in (17) provides a theoretically motivated functional to characterize the quality of the experimental design. Indeed, generally speaking, the smaller the norm of the covariance matrix $\|\sigma^{2}B\Sigma^{-1}B^{T}\|$, the better the estimation of the sensitivity indices should be. Theoretically, we could use this formula for constructing an experimental design that is effective for calculating Sobol’ indices: we could select designs that minimize the norm of the covariance matrix. However, there are some problems when proceeding this way:

• The first one relates to selecting some specific functional for minimization. Informally speaking, we need to choose “the norm” associated with the limiting covariance matrix;

• The second one refers to the fact that we do not know the true values of the PC model coefficients, defining $B$; therefore, we will not be able to accurately evaluate the quality of the design.

The first problem can be solved in different ways. A number of statistical criteria for design optimality ($D$-, $I$-optimality and others, see [32]) are known. Similar to the work [36], we use the $D$-optimality criterion, as it provides a computationally efficient procedure for design construction. A $D$-optimal experimental design minimizes the determinant of the limiting covariance matrix. If the vector of the estimated parameters is normally distributed, then a $D$-optimal design minimizes the volume of the confidence region for this vector.

The second problem is more complex. The optimal design for estimating sensitivity indices that minimizes the norm of limiting covariance matrix depends on true values of the indices, so it can be constructed only if these true values are known. However, in this case design construction makes no sense.

The dependency of the optimal design for indices evaluation on the true model parameters is a consequence of the indices estimates nonlinearity w.r.t. the PC model coefficients. In order to underline this dependency, the term “locally $D$ -optimal design” is commonly used [33]. In this setting there are several approaches, which are usually associated with either some assumptions about the unknown parameters, or adaptive design construction (see [33]). We use the latter approach.

In the case of adaptive designs, new design points are generated sequentially based on current estimates of the unknown parameters. This avoids prior assumptions on these parameters. However, this approach raises a concern about the reliability of the solution found: if at some step of the design construction process the parameter estimates are significantly different from their true values, then the design constructed from these estimates may lead to new parameter estimates that are even further from the true values.

In practice, during the construction of adaptive design, the quality of the approximation model and assumptions on non-degeneracy of results can be checked at each iteration and one can control and adjust the adaptive strategy.

4.2 Adaptive DoE Algorithm

In this section, we introduce the adaptive algorithm for constructing a design of experiments that is effective for estimating sensitivity indices, based on the asymptotic $D$ -optimality criterion (see the description of Algorithm 1 and its scheme in Figure 1). As discussed above, the main idea of the algorithm is to minimize the confidence region for the index estimates. At each iteration, we replace the limiting covariance matrix by its approximation based on the current estimates of the PC coefficients.

As for initialization, we assume that there is some initial design, and we require that this initial design is non-degenerate, i.e. such that the initial information matrix $A_{0}$ is nonsingular ( $\det A_{0}\neq 0$ ). In addition, at each iteration the non-degeneracy of the matrix $B_{i}A_{i}^{-1}B_{i}^{T}$ , related to the criterion to be minimized, is checked.

4.3 Details of the Optimization procedure

The idea behind the proposed optimization procedure is analogous to that of Fedorov’s algorithm for constructing optimal designs [34]. In order to simplify optimization problem (18), we use two well-known identities:

•

Let $M$ be some nonsingular square matrix, $\mathbf{t}$ and $\mathbf{w}$ be vectors such that $1+\mathbf{w}^{T}M^{-1}\mathbf{t}\neq 0$ , then

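$$\left(M+\mathbf{t}\mathbf{w}^{T}\right)^{-1}=M^{-1}-\frac{M^{-1}\mathbf{t}\mathbf{w}^{T}M^{-1}}{1+\mathbf{w}^{T}M^{-1}\mathbf{t}}.\qquad(19)$$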

•

Let $M$ be some nonsingular square matrix, $\mathbf{t}$ and $\mathbf{w}$ be vectors of appropriate dimensions, then

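$$\det\left(M+\mathbf{t}\mathbf{w}^{T}\right)=\det(M)\left(1+\mathbf{w}^{T}M^{-1}\mathbf{t}\right).\qquad(20)$$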
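The two identities are the Sherman-Morrison formula for the inverse of a rank-one update and the matrix determinant lemma. A minimal numerical check is sketched below; the matrix size, the random test data and the scaling of the vectors are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # illustrative dimension (assumption)

# Random symmetric positive definite matrix, hence nonsingular.
G = rng.standard_normal((n, n))
M = G @ G.T + n * np.eye(n)

# Small vectors keep 1 + w^T M^{-1} t safely away from zero.
t = 0.3 * rng.standard_normal(n)
w = 0.3 * rng.standard_normal(n)

M_inv = np.linalg.inv(M)
denom = 1.0 + w @ M_inv @ t  # must be nonzero for the first identity

# First identity: inverse of the rank-one update M + t w^T.
lhs_inv = np.linalg.inv(M + np.outer(t, w))
rhs_inv = M_inv - (M_inv @ np.outer(t, w) @ M_inv) / denom

# Second identity: determinant of the rank-one update M + t w^T.
lhs_det = np.linalg.det(M + np.outer(t, w))
rhs_det = np.linalg.det(M) * denom

print(np.allclose(lhs_inv, rhs_inv), np.isclose(lhs_det, rhs_det))
```

Both identities replace a fresh matrix inversion or determinant by a few matrix-vector products, which is what makes evaluating many candidate points cheap.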

Let us define $D\triangleq B(A+\boldsymbol{\Psi}(\mathbf{x})\boldsymbol{\Psi}^{T}(\mathbf{x}))^{-1}B^{T}$ , then applying (19), we obtain

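$$D=B\left[A^{-1}-\frac{A^{-1}\boldsymbol{\Psi}(\mathbf{x})\boldsymbol{\Psi}^{T}(\mathbf{x})A^{-1}}{1+\boldsymbol{\Psi}^{T}(\mathbf{x})A^{-1}\boldsymbol{\Psi}(\mathbf{x})}\right]B^{T}=M-\mathbf{t}\mathbf{w}^{T},$$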

where $M\triangleq BA^{-1}B^{T}$ , $\mathbf{t}\triangleq\frac{BA^{-1}\boldsymbol{\Psi}(\mathbf{x})}{1+\boldsymbol{\Psi}^{T}(\mathbf{x})A^{-1}\boldsymbol{\Psi}(\mathbf{x})}$ , $\mathbf{w}\triangleq BA^{-1}\boldsymbol{\Psi}(\mathbf{x})$ . Assuming that matrix $M$ is nonsingular and applying (20), we obtain

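$$\det(D)=\det\left(M-\mathbf{t}\mathbf{w}^{T}\right)=\det(M)\left(1-\mathbf{w}^{T}M^{-1}\mathbf{t}\right).$$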

The resulting optimization problem is

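$$\det(D)\to\min_{\mathbf{x}\in\Xi}\quad\Longleftrightarrow\quad\mathbf{w}^{T}M^{-1}\mathbf{t}\to\max_{\mathbf{x}\in\Xi},$$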

or explicitly (18) is reduced to

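$$\frac{\boldsymbol{\Psi}^{T}(\mathbf{x})A^{-1}B^{T}\left(BA^{-1}B^{T}\right)^{-1}BA^{-1}\boldsymbol{\Psi}(\mathbf{x})}{1+\boldsymbol{\Psi}^{T}(\mathbf{x})A^{-1}\boldsymbol{\Psi}(\mathbf{x})}\to\max_{\mathbf{x}\in\Xi}.$$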
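The reduction can be illustrated with a short sketch that scores every candidate by the ratio $\mathbf{w}^{T}M^{-1}\mathbf{t}$ and compares the result with a direct evaluation of $\det\left(B(A+\boldsymbol{\Psi}\boldsymbol{\Psi}^{T})^{-1}B^{T}\right)$. Everything below is an assumption made for illustration: the function names are hypothetical, the basis evaluations are random numbers, and `B` is a random placeholder rather than the matrix built from the PC coefficient estimates.

```python
import numpy as np

def si_criterion(A, B, psi):
    """Reduced criterion w^T M^{-1} t for one candidate basis vector psi."""
    A_inv_psi = np.linalg.solve(A, psi)
    w = B @ A_inv_psi                    # w = B A^{-1} psi
    M = B @ np.linalg.solve(A, B.T)      # M = B A^{-1} B^T
    return (w @ np.linalg.solve(M, w)) / (1.0 + psi @ A_inv_psi)

def brute_force_det(A, B, psi):
    """det of B (A + psi psi^T)^{-1} B^T, computed without any identity."""
    A_new = A + np.outer(psi, psi)
    return np.linalg.det(B @ np.linalg.solve(A_new, B.T))

rng = np.random.default_rng(1)
P, k, n_init, n_cand = 6, 2, 12, 50          # illustrative sizes (assumption)
Psi_init = rng.standard_normal((n_init, P))  # stand-in for basis values at the design
A = Psi_init.T @ Psi_init                    # unnormalized information matrix
B = rng.standard_normal((k, P))              # placeholder for the matrix B
Psi_cand = rng.standard_normal((n_cand, P))  # stand-in for basis values on Xi

scores = np.array([si_criterion(A, B, p) for p in Psi_cand])
dets = np.array([brute_force_det(A, B, p) for p in Psi_cand])

best_fast = int(np.argmax(scores))  # maximizer of the reduced criterion
best_slow = int(np.argmin(dets))    # minimizer of the original determinant
print(best_fast == best_slow)
```

In this sketch the reduced criterion needs only solves with the current $A$ and $M$, which do not depend on the candidate, so in practice they could be factorized once per iteration.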

5 Benchmark

In this section, we validate the proposed algorithm on a set of computational models with different input dimensions. Several analytic problems and two industrial problems based on finite element models are considered. Input parameters (variables) of the considered models have either independent uniform or independent normal distributions. For some models, independent Gaussian noise is additionally added to their outputs.

At first, we form a non-degenerate random initial design, and then we use various techniques to add new design points iteratively. We compare our method for design construction (denoted as Adaptive for SI) with the following methods:

•

Random method iteratively adds new design points randomly from the set of candidate design points $\Xi$ ;

•

Adaptive D-opt iteratively adds new design points that maximize the determinant of the information matrix (10): $\det A_{n}\to\max_{\mathbf{x}_{n}\in\Xi}$ ([34]). The resulting design is optimal, in some sense, for estimation of the PCE model coefficients. We compare our method with this approach to show that it gives some advantage over the usual $D$ -optimality. Strictly speaking, a $D$ -optimal design is not iterative, but if we have an initial training sample, then the sequential approach seems a natural generalization of common $D$ -optimal designs.

•

LHS. Unlike the other considered designs, this method is not iterative, as a completely new design is generated at each step. This method uses Latin Hypercube Sampling, which is commonly used to compute PCE coefficients.
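As an illustration of the Adaptive D-opt baseline, the sketch below adds points greedily. It relies on the matrix determinant lemma, $\det(A+\boldsymbol{\Psi}\boldsymbol{\Psi}^{T})=\det(A)\,(1+\boldsymbol{\Psi}^{T}A^{-1}\boldsymbol{\Psi})$, so maximizing the determinant amounts to maximizing the quadratic form $\boldsymbol{\Psi}^{T}A^{-1}\boldsymbol{\Psi}$. The one-dimensional monomial basis, the candidate grid and the initial design are hypothetical choices for the example, not the settings used in the benchmark.

```python
import numpy as np

def next_d_optimal_index(A, Psi_cand):
    """Index of the candidate that maximizes det(A + psi psi^T)."""
    A_inv = np.linalg.inv(A)
    # Quadratic forms psi^T A^{-1} psi for all candidates at once.
    leverage = np.einsum("ij,jk,ik->i", Psi_cand, A_inv, Psi_cand)
    return int(np.argmax(leverage))

def basis(x):
    # Hypothetical 1-D basis: monomials 1, x, x^2.
    return np.column_stack([np.ones_like(x), x, x**2])

candidates = np.linspace(-1.0, 1.0, 21)  # candidate set (assumption)
Psi_cand = basis(candidates)

design = [2, 9, 15]                      # indices of a non-degenerate initial design
A0 = Psi_cand[design].T @ Psi_cand[design]
A = A0.copy()

for _ in range(4):                       # add four points sequentially
    j = next_d_optimal_index(A, Psi_cand)
    design.append(j)
    A += np.outer(Psi_cand[j], Psi_cand[j])

print(candidates[design])
```

The same loop structure would apply to the other iterative methods; only the rule that picks the index `j` changes.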

The metric of design quality is the mean error, defined as the distance between estimated and true indices, $\sqrt{\sum_{i=1}^{d}(S_{i}-\hat{S}_{i}^{\text{run}})^{2}}$ , averaged over runs with different random initial designs ( $200$ - $400$ runs). We consider not only the mean error but also its variance. In particular, we use Welch’s t-test (see [35]) to check that the differences in mean distance between the considered methods are statistically significant. Note that lower p-values correspond to greater confidence that the difference is significant.
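The error metric and the test statistic can be sketched as follows. The function names are hypothetical and the two error samples are made-up numbers used only to exercise the code. The sketch computes Welch’s t statistic and the Welch-Satterthwaite degrees of freedom by hand; a p-value would then come from the Student t distribution with that many degrees of freedom, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance

def index_error(S_true, S_hat):
    """Euclidean distance between true and estimated index vectors (one run)."""
    return math.sqrt(sum((s - sh) ** 2 for s, sh in zip(S_true, S_hat)))

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of the first mean
    vb = variance(b) / len(b)  # squared standard error of the second mean
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up per-run errors for two design methods (illustrative only).
errors_method_a = [0.021, 0.018, 0.025, 0.019, 0.022]
errors_method_b = [0.030, 0.027, 0.035, 0.029, 0.041]

t_stat, dof = welch_t(errors_method_a, errors_method_b)
print(t_stat, dof)
```

Unlike the pooled two-sample t-test, this statistic does not assume equal variances in the two groups, which matters here because different design strategies may produce errors with very different spread.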

In all cases, we assume that the truncation set (retained PCE terms) is selected before an experiment.

5.1 Analytic Functions

The Sobol’ function

is commonly used for benchmarking methods in global sensitivity analysis

$$f(\mathbf{x})=\prod_{i=1}^{d}\frac{|4x_{i}-2|+c_{i}}{1+c_{i}},$$

where $x_{i}\sim\mathcal{U}(0,1)$ . In our case parameters $d=3$ , $c=(0.0,1.0,1.5)$ are used. Independent Gaussian noise is added to the output of the function. The standard deviation of the noise is $0$ (without noise), $0.2$ and $1.4$ , which corresponds to $0\%$ , $28\%$ and $194\%$ of the function standard deviation caused by the input uncertainty. Analytical expressions for the corresponding sensitivity indices are available in [11].
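The quoted noise percentages can be reproduced with a small sketch. It assumes the standard product form of the Sobol’ function, $\prod_{i}(|4x_{i}-2|+c_{i})/(1+c_{i})$, for which each factor has unit mean and variance $1/(3(1+c_{i})^{2})$; the function names are hypothetical.

```python
import math

def sobol_g(x, c):
    """Sobol' function in its standard product form (assumed here)."""
    return math.prod((abs(4.0 * xi - 2.0) + ci) / (1.0 + ci)
                     for xi, ci in zip(x, c))

def sobol_g_first_order(c):
    """Analytical first-order indices and total variance for uniform inputs."""
    partial = [1.0 / (3.0 * (1.0 + ci) ** 2) for ci in c]  # variance of each factor
    total_var = math.prod(1.0 + v for v in partial) - 1.0
    return [v / total_var for v in partial], total_var

c = (0.0, 1.0, 1.5)
S, var = sobol_g_first_order(c)
std = math.sqrt(var)

# Noise standard deviations as a percentage of the function standard deviation.
noise_pct = [round(100 * s / std) for s in (0.0, 0.2, 1.4)]
print(S, std, noise_pct)
```

With these parameters the function standard deviation is about $0.72$, which is consistent with the $28\%$ and $194\%$ figures given for noise levels $0.2$ and $1.4$.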

Ishigami function

is also commonly used for benchmarking of global sensitivity analysis:

$$f(\mathbf{x})=\sin x_{1}+7\sin^{2}x_{2}+0.1\,x_{3}^{4}\sin x_{1},$$

where $x_{i}\sim\mathcal{U}(-\pi,\pi)$ . Theoretical values for its sensitivity indices are available in [16].
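For reference, the first-order indices of the Ishigami function have closed forms. The sketch below assumes the usual parametrization $\sin x_{1}+a\sin^{2}x_{2}+b\,x_{3}^{4}\sin x_{1}$ with the commonly used constants $a=7$ and $b=0.1$; these constants and the function names are assumptions of the example.

```python
import math

def ishigami(x1, x2, x3, a=7.0, b=0.1):
    """Ishigami function; a and b default to commonly used values (assumption)."""
    return math.sin(x1) + a * math.sin(x2) ** 2 + b * x3 ** 4 * math.sin(x1)

def ishigami_first_order(a=7.0, b=0.1):
    """Analytical first-order Sobol' indices for inputs uniform on (-pi, pi)."""
    v1 = 0.5 * (1.0 + b * math.pi ** 4 / 5.0) ** 2  # partial variance of x1
    v2 = a ** 2 / 8.0                               # partial variance of x2
    total = (a ** 2 / 8.0 + b * math.pi ** 4 / 5.0
             + b ** 2 * math.pi ** 8 / 18.0 + 0.5)  # total variance
    return v1 / total, v2 / total, 0.0              # x3 acts only through interaction

S1, S2, S3 = ishigami_first_order()
print(S1, S2, S3)
```

The third input has a zero first-order index but a nonzero total effect through its interaction with the first input, which is one reason this function is a popular test case.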

Environmental function

models a pollutant spill caused by a chemical accident [44]

[TABLE]

where $I$ is the indicator function; $4$ input variables and their distributions are defined as: $M$ $\sim$ $\mathcal{U}(7,13)$ , mass of pollutant spilled at each location; $D$ $\sim$ $\mathcal{U}(0.02,0.12)$ , diffusion rate in the channel; $L$ $\sim$ $\mathcal{U}(0.01,3)$ , location of the second spill; $\tau$ $\sim$ $\mathcal{U}(30.01,30.295)$ , time of the second spill. $C(\mathbf{x})$ is the concentration of the pollutant at the space-time vector $(s,\;t)$ , where $0\leq s\leq 3$ and $t>0$ .

We consider a cross-section corresponding to $t=40$ , $s=1.5$ and suppose that independent Gaussian noise $\mathcal{N}(0,0.5^{2})$ is added to the output of the function.

The Borehole function

models water flow through a borehole. It is commonly used for testing different methods in numerical experiments [42, 43]

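$$f(\mathbf{x})=\frac{2\pi T_{u}(H_{u}-H_{l})}{\ln(r/r_{w})\left[1+\frac{2LT_{u}}{\ln(r/r_{w})\,r_{w}^{2}K_{w}}+\frac{T_{u}}{T_{l}}\right]},$$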

where $8$ input variables and their distributions are defined as: $r_{w}$ $\sim$ $\mathcal{U}(0.05,0.15)$ , radius of borehole (m); $T_{u}$ $\sim$ $\mathcal{U}(63070,115600)$ , transmissivity of upper aquifer ( $m^{2}$ /yr); $r$ $\sim$ $\mathcal{U}(100,50000)$ , radius of influence (m); $H_{u}$ $\sim$ $\mathcal{U}(990,1110)$ , potentiometric head of upper aquifer (m); $T_{l}$ $\sim$ $\mathcal{U}(63.1,116)$ , transmissivity of lower aquifer ( $m^{2}$ /yr); $H_{l}$ $\sim$ $\mathcal{U}(700,820)$ , potentiometric head of lower aquifer (m); $L$ $\sim$ $\mathcal{U}(1120,1680)$ , length of borehole (m); $K_{w}$ $\sim$ $\mathcal{U}(9855,12045)$ , hydraulic conductivity of borehole (m/yr).

Besides the deterministic case, we also consider the stochastic case, in which independent Gaussian noise $\mathcal{N}(0,5.0^{2})$ is added to the output of the function.

The Wing Weight function

models weight of an aircraft wing [38]

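$$f(\mathbf{x})=0.036\,S_{w}^{0.758}W_{fw}^{0.0035}\left(\frac{A}{\cos^{2}\Lambda}\right)^{0.6}q^{0.006}\lambda^{0.04}\left(\frac{100\,t_{c}}{\cos\Lambda}\right)^{-0.3}(N_{z}W_{dg})^{0.49}+S_{w}W_{p},$$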

where $10$ input variables and their distributions are defined as: $S_{w}$ $\sim$ $\mathcal{U}(150,200)$ , wing area ( $ft^{2}$ ); $W_{fw}$ $\sim$ $\mathcal{U}(220,300)$ , weight of fuel in the wing (lb); $A$ $\sim$ $\mathcal{U}(6,10)$ , aspect ratio; $\Lambda$ $\sim$ $\mathcal{U}(-10,10)$ , quarter-chord sweep (degrees); $q$ $\sim$ $\mathcal{U}(16,45)$ , dynamic pressure at cruise (lb/ $ft^{2}$ ); $\lambda$ $\sim$ $\mathcal{U}(0.5,1)$ , taper ratio; $t_{c}$ $\sim$ $\mathcal{U}(0.08,0.18)$ , aerofoil thickness to chord ratio; $N_{z}$ $\sim$ $\mathcal{U}(2.5,6)$ , ultimate load factor; $W_{dg}$ $\sim$ $\mathcal{U}(1700,2500)$ , flight design gross weight (lb); $W_{p}$ $\sim$ $\mathcal{U}(0.025,0.08)$ , paint weight (lb/ $ft^{2}$ ).

Besides the deterministic case, we also consider the stochastic case, in which independent Gaussian noise $\mathcal{N}(0,5.0^{2})$ is added to the output of the function.

Experimental setup:

In the experiments, we assume that the set of candidate design points $\Xi$ is a uniform grid in the $d$ -dimensional hypercube. Note that $\Xi$ affects optimization quality. Experimental settings for analytical functions are summarized in Table 1.
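A uniform candidate grid of this kind can be built as sketched below. The function name, the bounds and the number of points per dimension are illustrative assumptions; the actual grid resolution used in the experiments is not stated here.

```python
import itertools
import numpy as np

def uniform_grid(lower, upper, points_per_dim):
    """All combinations of equally spaced levels in each input dimension."""
    axes = [np.linspace(lo, hi, points_per_dim) for lo, hi in zip(lower, upper)]
    return np.array(list(itertools.product(*axes)))

# Example: 5 levels per dimension in the 3-dimensional unit hypercube.
Xi = uniform_grid([0.0] * 3, [1.0] * 3, 5)
print(Xi.shape)
```

The number of grid points grows as the number of levels raised to the power $d$, so a full grid quickly becomes large as the input dimension increases, and a coarser grid limits how close the selected points can get to the true optimum.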

5.2 Finite Element Models

Case 1: Truss model.

The deterministic computational model, originating from [39], represents the displacement $V_{1}$ of a truss structure with $23$ members, as shown in Figure 2.

Ten random variables are considered:

•

$E_{1}$ , $E_{2}$ (Pa) $\sim$ $\mathcal{U}(1.68\times 10^{11},\;2.52\times 10^{11})$ ;

•

$A_{1}$ ( $m^{2}$ ) $\sim$ $\mathcal{U}(1.6\times 10^{-3},\;2.4\times 10^{-3})$ ;

•

$A_{2}$ ( $m^{2}$ ) $\sim$ $\mathcal{U}(0.8\times 10^{-3},\;1.2\times 10^{-3})$ ;

•

$P_{1}$ - $P_{6}$ (N) $\sim$ $\mathcal{U}(3.5\times 10^{4},\;6.5\times 10^{4})$ .

It is assumed that all the horizontal elements have perfectly correlated Young’s moduli and cross-sectional areas, and likewise for the diagonal members.

Case 2: Heat transfer model.

We consider the two-dimensional stationary heat diffusion problem described in [40]. The problem is defined on the square domain $D=(-0.5,0.5)\times(-0.5,0.5)$ shown in Figure 3(a), where the temperature field $T(z),z\in D$ is described by the partial differential equation:

[TABLE]

with boundary conditions $T=0$ on the top boundary and $\nabla T\cdot\mathbf{n}=0$ on the left, right and bottom boundaries, where $\mathbf{n}$ denotes the vector normal to the boundary; $A=(0.2,0.3)\times(0.2,0.3)$ is a square domain within $D$ and $I_{A}$ is the indicator function of $A$ . The diffusion coefficient, $\kappa(z)$ , is a lognormal random field defined by

[TABLE]

where $g(z)$ is a standard Gaussian random field and the parameters $a_{k}$ and $b_{k}$ are such that the mean and standard deviation of $\kappa$ are $\mu_{\kappa}=1$ and $\sigma_{\kappa}=0.3$ , respectively. The random field $g(z)$ is characterized by an autocorrelation function $\rho(z,z^{\prime})=\exp(-\|z-z^{\prime}\|^{2}/0.2^{2})$ . The quantity of interest, $Y$ , is the average temperature in the square domain $B=(-0.3,-0.2)\times(-0.3,-0.2)$ within $D$ (see Figure 3(a)).

To facilitate solution of the problem, the random field $g(z)$ is represented using the Expansion Optimal Linear Estimation (EOLE) method (see [41]). By truncating the EOLE series after the first $M$ terms, $g(z)$ is approximated by

$$\widehat{g}(z)=\sum_{i=1}^{M}\frac{\xi_{i}}{\sqrt{\ell_{i}}}\,\phi_{i}^{T}\mathbf{C}_{z\zeta}.$$

In the above equation, $\{\xi_{1},\ldots,\xi_{M}\}$ are independent standard normal variables; $\mathbf{C}_{z\zeta}$ is a vector with elements $\mathbf{C}_{z\zeta}^{(k)}=\rho(z,\zeta_{k})$ , where $\{\zeta_{1},\ldots,\zeta_{n}\}$ are the points of an appropriately defined mesh in $D$ ; and $(\ell_{i},\phi_{i})$ are the eigenvalues and eigenvectors of the correlation matrix $\mathbf{C}_{\zeta\zeta}$ with elements $\mathbf{C}_{\zeta\zeta}^{(k,\ell)}=\rho(\zeta_{k},\zeta_{\ell})$ , where $k,\ell=1,\ldots,n$ . We select $M=53$ in order to satisfy the inequality

[TABLE]

The underlying deterministic problem is solved with an in-house finite-element analysis code. The employed finite-element discretization with triangular $T3$ elements is shown in Figure 3(a). Figure 3(b) shows the temperature fields corresponding to two example realizations of the diffusion coefficient.

Experimental Setup.

For these finite element models, we assume that the set of candidate design points $\Xi$ is

•

a uniform grid in the $10$ -dimensional hypercube for the Truss model;

•

LHS design with normally distributed variables in $53$ -dimensional space for the Heat transfer model.

Experimental settings for all models are summarized in Table 2.
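A candidate set of the second kind can be generated as in the sketch below: a Latin hypercube sample on the unit cube is mapped to independent standard normal variables through the inverse normal CDF. The function names, the sample size and the random seed are assumptions for illustration and do not reflect the sizes used in the experiments.

```python
import numpy as np
from statistics import NormalDist

def lhs_uniform(n, d, rng):
    """Latin hypercube sample on [0, 1)^d with one point per stratum per axis."""
    # Row i starts in stratum i of every axis; columns are then shuffled independently.
    u = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    for j in range(d):
        u[:, j] = rng.permutation(u[:, j])
    return u

def lhs_standard_normal(n, d, rng):
    """Latin hypercube sample with independent standard normal margins."""
    u = np.clip(lhs_uniform(n, d, rng), 1e-12, 1.0 - 1e-12)  # keep inv_cdf finite
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    return inv_cdf(u)

rng = np.random.default_rng(0)
X = lhs_standard_normal(100, 53, rng)  # 100 candidates in 53 dimensions (assumed size)
print(X.shape)
```

The adaptive procedure would then select design points from the rows of `X` instead of from a full grid, which is not feasible in $53$ dimensions.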

5.3 Results

Figures 4(a), 4(b), 4(c), 5, 6, 7(a), 7(b), 8(a), 8(b) show results for analytic functions. Figures 9 and 10 present results for finite element models. We report mean errors, mean errors relative to those of the proposed method, and $p$ -values that indicate whether the differences in mean errors are statistically significant.

In the presented experiments, the proposed method performs better than the other considered methods in terms of the mean error of estimated indices. Note in particular its superiority over the standard LHS approach that is commonly used in practice. The difference in mean errors is statistically significant according to Welch’s t-test.

Comparison of figures 4(a), 4(b), 4(c) with different levels of additive noise shows that the proposed method is effective when the analyzed function is deterministic or when the noise level is small.

Because of the robust problem statement and the limited accuracy of the optimization, the algorithm may produce duplicate design points. This is a common situation for locally $D$ -optimal designs [33]. If the computational model is deterministic, one may modify the algorithm, e.g. to exclude repeated design points.

Although high dimensional optimization problems may be computationally prohibitive, the proposed approach is still useful in high dimensional settings. We propose to generate a uniform candidate set (e.g. LHS design of large size) and then choose its subset for the effective calculation of Sobol’ indices using our adaptive method, see results for Heat transfer model in Figure 10 (note that due to computational complexity we provide for this model results only for $2$ iterations of the LHS method).

It should be noted that in all presented cases a sufficiently accurate specification of the PCE model (reasonable values for the degree $p$ and the $q$ -norm defining the truncation set) is assumed to be known a priori, and the size of the initial training sample is assumed to be sufficiently large. If we use an inadequate specification of the PCE model (e.g. a quadratic PCE when the analyzed function is cubic), the method will perform worse than methods which do not depend on the PCE model structure. In any case, the use of inadequate PCE models may lead to inaccurate results. That is why it is very important to control the PCE model error during the design construction. For example, one may use cross-validation for this purpose [29]. Thus, if the PCE model error increases during design construction, this may indicate that the model specification is inadequate and should be changed.
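One possible way to monitor the model error is the leave-one-out cross-validation error of the least-squares fit, which for a linear-in-parameters model can be obtained from a single fit through the leverages $h_{ii}$ of the hat matrix. The sketch below is an illustration under assumptions: the function name is hypothetical, a one-dimensional monomial basis stands in for a PCE basis, and the data are synthetic. It is not claimed to be the procedure of [29].

```python
import numpy as np

def loo_error(Psi, y):
    """Relative leave-one-out error of the least-squares fit y ~ Psi c."""
    coef, *_ = np.linalg.lstsq(Psi, y, rcond=None)
    resid = y - Psi @ coef
    # Leverages h_ii: diagonal of the hat matrix Psi (Psi^T Psi)^{-1} Psi^T.
    Q, _ = np.linalg.qr(Psi)
    h = np.sum(Q ** 2, axis=1)
    loo_resid = resid / (1.0 - h)  # exact leave-one-out residuals
    return np.mean(loo_resid ** 2) / np.var(y)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 40)
y = x ** 3 - 0.5 * x + 0.01 * rng.standard_normal(40)  # synthetic cubic response

Psi_quad = np.vander(x, 3, increasing=True)   # 1, x, x^2 (misspecified model)
Psi_cubic = np.vander(x, 4, increasing=True)  # 1, x, x^2, x^3 (adequate model)

err_quad = loo_error(Psi_quad, y)
err_cubic = loo_error(Psi_cubic, y)
print(err_quad, err_cubic)
```

In this toy setting the quadratic basis gives a much larger error than the cubic one, mirroring the quadratic-versus-cubic example in the text; tracking such an error across iterations is one way to notice an inadequate truncation set.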

6 Conclusions

We proposed a design of experiments algorithm for the evaluation of Sobol’ indices from a PCE metamodel. The method does not depend on a particular form of orthonormal polynomials in the PCE. It can be used with different distributions of the input parameters that define the analyzed computational models.

The main idea of the method comes from the metamodeling approach. We assume that the computational model is close to its approximating PCE metamodel and exploit knowledge of the metamodel structure. This allows us to improve the evaluation accuracy. This comes at a price: if the additional assumptions on the computational model that are needed for good performance are not satisfied, one may expect accuracy degradation. Fortunately, in practice, we can control approximation quality during design construction and detect that we have selected an inappropriate model. Note that from a theoretical point of view, our asymptotic considerations (w.r.t. the training sample size) simplify the problem of accuracy evaluation for the estimated indices.

Our experiments demonstrate that if the PCE specification defined by the truncation scheme is appropriate for the given computational model and the size of the training sample is sufficiently large, then the proposed method performs better than standard approaches for design construction.

Bibliography

[1] K.J. Beven. Rainfall-Runoff Modelling. The Primer, Wiley, 360 pp, Chichester, 2000.
[2] P. Dayan and L.F. Abbott (2001). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems (MIT Press, Cambridge, MA).
[3] S. Grihon, E.V. Burnaev, M.G. Belyaev, and P.V. Prikhodko. Surrogate Modeling of Stability Constraints for Optimization of Composite Structures. Surrogate-Based Modeling and Optimization. Engineering applications. Eds. by S. Koziel, L. Leifsson. Springer, 2013. P. 359-391.
[4] A. Saltelli, K. Chan, M. Scott. Sensitivity analysis. Probability and statistics series. West Sussex: Wiley, (2000)
[5] M.D. Morris. Factorial sampling plans for preliminary computational experiments. Technometrics, 33:161-174, 1991.
[6] B. Iooss, P. Lemaitre. (2015) A review on global sensitivity analysis methods. In: Meloni C, Dellino G (eds) Uncertainty management in Simulation-Optimization of Complex Systems: Algorithms and Applications, Springer
[7] A. Saltelli, M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, et al. Global Sensitivity Analysis - The Primer. Wiley (2008)
[8] J. Yang. (2011). Convergence and uncertainty analyses in Monte-Carlo based sensitivity analysis. Environ. Modell. Softw. 26, 444-457.
