A distributed active subspace method for scalable surrogate modeling of   function valued outputs

Hayley Guy; Alen Alexanderian; Meilin Yu

arXiv:1908.02694·physics.comp-ph·August 9, 2019·J. Sci. Comput.

A distributed active subspace method for scalable surrogate modeling of function valued outputs

Hayley Guy, Alen Alexanderian, Meilin Yu

PDF

TL;DR

This paper introduces a scalable surrogate modeling approach combining active subspaces and Karhunen-Loève expansions for high-dimensional, function-valued outputs, enabling efficient approximation of complex physical processes.

Contribution

It presents a novel distributed active subspace method integrated with KL expansions, along with an error analysis and an application example demonstrating its effectiveness.

Findings

01

Effective dimension reduction for high-dimensional inputs.

02

Accurate surrogate models for function-valued outputs.

03

Scalable gradient computation via adjoint methods.

Abstract

We present a distributed active subspace method for training surrogate models of complex physical processes with high-dimensional inputs and function valued outputs. Specifically, we represent the model output with a truncated Karhunen-Lo\`eve (KL) expansion, screen the structure of the input space with respect to each KL mode via the active subspace method, and finally form an overall surrogate model of the output by combining surrogates of individual output KL modes. To ensure scalable computation of the gradients of the output KL modes, needed in active subspace discovery, we rely on adjoint-based gradient computation. The proposed method combines benefits of active subspace methods for input dimension reduction and KL expansions used for spectral representation of the output field. We provide a mathematical framework for the proposed method and conduct an error analysis of the mixed…

Equations140

y=f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),

y=f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),

\begin{array}[]{cc}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})+\sum\limits_{k=1}^{N}\sqrt{\lambda_{k}}f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}),\\ \end{array}

\begin{array}[]{cc}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})+\sum\limits_{k=1}^{N}\sqrt{\lambda_{k}}f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}),\\ \end{array}

y=g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}\in\mathbb{R}^{N_{p}}.

y=g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}\in\mathbb{R}^{N_{p}}.

\mathbf{{S}}=\int_{\Omega}\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\,\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})^{T}\pi({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}},

\mathbf{{S}}=\int_{\Omega}\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\,\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})^{T}\pi({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}},

S = W Λ W^{T}, Λ = diag (λ_{1}, \dots, λ_{N_{p}}) .

S = W Λ W^{T}, Λ = diag (λ_{1}, \dots, λ_{N_{p}}) .

Λ = [Λ_{1} Λ_{2}], W = [W_{1} W_{2}],

Λ = [Λ_{1} Λ_{2}], W = [W_{1} W_{2}],

{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}=\mathbf{{W}}_{1}\mathbf{{W}}_{1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}+\mathbf{{W}}_{2}\mathbf{{W}}_{2}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}=\mathbf{{W}}_{1}{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}+\mathbf{{W}}_{2}{\mathchoice{\mbox{\boldmath$\displaystyle{z}$}}{\mbox{\boldmath$\textstyle{z}$}}{\mbox{\boldmath$\scriptstyle{z}$}}{\mbox{\boldmath$\scriptscriptstyle{z}$}}}.

{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}=\mathbf{{W}}_{1}\mathbf{{W}}_{1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}+\mathbf{{W}}_{2}\mathbf{{W}}_{2}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}=\mathbf{{W}}_{1}{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}+\mathbf{{W}}_{2}{\mathchoice{\mbox{\boldmath$\displaystyle{z}$}}{\mbox{\boldmath$\textstyle{z}$}}{\mbox{\boldmath$\scriptstyle{z}$}}{\mbox{\boldmath$\scriptscriptstyle{z}$}}}.

G({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})=g(\mathbf{{W}}_{1}{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}\in{\mathbb{R}}^{r},

G({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})=g(\mathbf{{W}}_{1}{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}\in{\mathbb{R}}^{r},

g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx G(\mathbf{{W}}_{1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}).

g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx G(\mathbf{{W}}_{1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}).

\mathbf{{S}}\approx\frac{1}{N_{s}}\sum_{i=1}^{N_{s}}\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}_{i})\,\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}_{i})^{T}.

\mathbf{{S}}\approx\frac{1}{N_{s}}\sum_{i=1}^{N_{s}}\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}_{i})\,\nabla g({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}_{i})^{T}.

\mathbf{{S}}=\frac{1}{N_{s}}\sum_{i=1}^{N_{s}}{\mathchoice{\mbox{\boldmath$\displaystyle{D}$}}{\mbox{\boldmath$\textstyle{D}$}}{\mbox{\boldmath$\scriptstyle{D}$}}{\mbox{\boldmath$\scriptscriptstyle{D}$}}}_{i}{\mathchoice{\mbox{\boldmath$\displaystyle{D}$}}{\mbox{\boldmath$\textstyle{D}$}}{\mbox{\boldmath$\scriptstyle{D}$}}{\mbox{\boldmath$\scriptscriptstyle{D}$}}}_{i}^{T}.

\mathbf{{S}}=\frac{1}{N_{s}}\sum_{i=1}^{N_{s}}{\mathchoice{\mbox{\boldmath$\displaystyle{D}$}}{\mbox{\boldmath$\textstyle{D}$}}{\mbox{\boldmath$\scriptstyle{D}$}}{\mbox{\boldmath$\scriptscriptstyle{D}$}}}_{i}{\mathchoice{\mbox{\boldmath$\displaystyle{D}$}}{\mbox{\boldmath$\textstyle{D}$}}{\mbox{\boldmath$\scriptstyle{D}$}}{\mbox{\boldmath$\scriptscriptstyle{D}$}}}_{i}^{T}.

y=f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\in\mathcal{X},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}\in\Omega,

y=f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\in\mathcal{X},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}\in\Omega,

c({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}):=\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},\cdot)f({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}},\cdot)]-\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},\cdot)]\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}},\cdot)]

c({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}):=\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},\cdot)f({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}},\cdot)]-\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},\cdot)]\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}},\cdot)]

[C_{f}u]({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}):=\int_{\mathcal{X}}c({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})u({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}.

[C_{f}u]({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}):=\int_{\mathcal{X}}c({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})u({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}.

\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})=\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},\xi)]

\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})=\mathbb{E}[f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},\xi)]

\begin{array}[]{cc}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\hat{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})+\sum\limits_{k=1}^{N}\sqrt{\lambda_{k}(C_{f})}f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}).\end{array}

\begin{array}[]{cc}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\hat{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})+\sum\limits_{k=1}^{N}\sqrt{\lambda_{k}(C_{f})}f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}).\end{array}

f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\dfrac{1}{\sqrt{\lambda_{k}(C_{f})}}\int_{\mathcal{X}}(f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})-\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}))\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}.

f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\dfrac{1}{\sqrt{\lambda_{k}(C_{f})}}\int_{\mathcal{X}}(f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})-\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}))\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}.

C_{f}\phi_{k}=\lambda_{k}(C_{f})\phi_{k},\quad\int_{\mathcal{X}}\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})^{2}\,d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}=1,\quad k=1,2,\ldots,

C_{f}\phi_{k}=\lambda_{k}(C_{f})\phi_{k},\quad\int_{\mathcal{X}}\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})^{2}\,d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}=1,\quad k=1,2,\ldots,

\mathbf{{S}}_{k}=\int_{\mathbb{R}^{N_{p}}}\big{(}\nabla f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\big{)}\big{(}\nabla f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\big{)}^{T}\,\pi({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}.

\mathbf{{S}}_{k}=\int_{\mathbb{R}^{N_{p}}}\big{(}\nabla f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\big{)}\big{(}\nabla f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\big{)}^{T}\,\pi({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}.

Λ_{k} = [Λ_{k, 1} Λ_{k, 2}], W_{k} = [W_{k, 1} W_{k, 2}],

Λ_{k} = [Λ_{k, 1} Λ_{k, 2}], W_{k} = [W_{k, 1} W_{k, 2}],

G_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})=f_{k}(\mathbf{{W}}_{k,1}{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}\in{\mathbb{R}}^{r_{k}},k=1,\ldots,N,

G_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}})=f_{k}(\mathbf{{W}}_{k,1}{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}),\quad{\mathchoice{\mbox{\boldmath$\displaystyle{y}$}}{\mbox{\boldmath$\textstyle{y}$}}{\mbox{\boldmath$\scriptstyle{y}$}}{\mbox{\boldmath$\scriptscriptstyle{y}$}}}\in{\mathbb{R}}^{r_{k}},k=1,\ldots,N,

f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx G_{k}(\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad k=1,\ldots,N.

f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx G_{k}(\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad k=1,\ldots,N.

f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\tilde{G}_{k}(\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad k=1,\ldots,N.

f_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\tilde{G}_{k}(\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}),\quad k=1,\ldots,N.

f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\hat{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})+\sum\limits_{k=1}^{N}\sqrt{\lambda_{k}}\tilde{G}_{k}(\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}).

f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\approx\hat{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})+\sum\limits_{k=1}^{N}\sqrt{\lambda_{k}}\tilde{G}_{k}(\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}).

F({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\int_{\mathcal{D}}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})v({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},

F({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\int_{\mathcal{D}}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})v({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},

\left|\frac{\partial f}{\partial\xi_{j}}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\right|\leq b_{j}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}),\quad\text{for all }{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\in\mathcal{D},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}\in\Omega,

\left|\frac{\partial f}{\partial\xi_{j}}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})\right|\leq b_{j}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}),\quad\text{for all }{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\in\mathcal{D},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}\in\Omega,

|\partial_{j}F({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})|\leq\int_{\mathcal{D}}|\partial_{j}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})v({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})|d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\leq\int_{\mathcal{D}}|b_{j}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})||v({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})|\,d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\leq\|b_{j}\|_{L^{2}(\mathcal{D})}\|v\|_{L^{2}(\mathcal{D})},

|\partial_{j}F({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})|\leq\int_{\mathcal{D}}|\partial_{j}f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})v({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})|d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\leq\int_{\mathcal{D}}|b_{j}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})||v({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})|\,d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}\leq\|b_{j}\|_{L^{2}(\mathcal{D})}\|v\|_{L^{2}(\mathcal{D})},

F_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\int_{\mathcal{X}}(f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})-\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}))\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}.

F_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=\int_{\mathcal{X}}(f({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})-\bar{f}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}))\phi_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}})d{\mathchoice{\mbox{\boldmath$\displaystyle{x}$}}{\mbox{\boldmath$\textstyle{x}$}}{\mbox{\boldmath$\scriptstyle{x}$}}{\mbox{\boldmath$\scriptscriptstyle{x}$}}}.

{\mathchoice{\mbox{\boldmath$\displaystyle{F}$}}{\mbox{\boldmath$\textstyle{F}$}}{\mbox{\boldmath$\scriptstyle{F}$}}{\mbox{\boldmath$\scriptscriptstyle{F}$}}}_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=({\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}-\bar{{\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}})^{T}\mathbf{{W}}{\mathchoice{\mbox{\boldmath$\displaystyle{\phi}$}}{\mbox{\boldmath$\textstyle{\phi}$}}{\mbox{\boldmath$\scriptstyle{\phi}$}}{\mbox{\boldmath$\scriptscriptstyle{\phi}$}}}_{k},\quad\text{where}\quad\mathbf{{A}}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}){\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}={\mathchoice{\mbox{\boldmath$\displaystyle{b}$}}{\mbox{\boldmath$\textstyle{b}$}}{\mbox{\boldmath$\scriptstyle{b}$}}{\mbox{\boldmath$\scriptscriptstyle{b}$}}}.

{\mathchoice{\mbox{\boldmath$\displaystyle{F}$}}{\mbox{\boldmath$\textstyle{F}$}}{\mbox{\boldmath$\scriptstyle{F}$}}{\mbox{\boldmath$\scriptscriptstyle{F}$}}}_{k}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}})=({\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}-\bar{{\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}})^{T}\mathbf{{W}}{\mathchoice{\mbox{\boldmath$\displaystyle{\phi}$}}{\mbox{\boldmath$\textstyle{\phi}$}}{\mbox{\boldmath$\scriptstyle{\phi}$}}{\mbox{\boldmath$\scriptscriptstyle{\phi}$}}}_{k},\quad\text{where}\quad\mathbf{{A}}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}){\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}={\mathchoice{\mbox{\boldmath$\displaystyle{b}$}}{\mbox{\boldmath$\textstyle{b}$}}{\mbox{\boldmath$\scriptstyle{b}$}}{\mbox{\boldmath$\scriptscriptstyle{b}$}}}.

\mathcal{L}({\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{q}$}}{\mbox{\boldmath$\textstyle{q}$}}{\mbox{\boldmath$\scriptstyle{q}$}}{\mbox{\boldmath$\scriptscriptstyle{q}$}}})=({\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}-\bar{{\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}})^{T}\mathbf{{W}}{\mathchoice{\mbox{\boldmath$\displaystyle{\phi}$}}{\mbox{\boldmath$\textstyle{\phi}$}}{\mbox{\boldmath$\scriptstyle{\phi}$}}{\mbox{\boldmath$\scriptscriptstyle{\phi}$}}}_{k}+{\mathchoice{\mbox{\boldmath$\displaystyle{q}$}}{\mbox{\boldmath$\textstyle{q}$}}{\mbox{\boldmath$\scriptstyle{q}$}}{\mbox{\boldmath$\scriptscriptstyle{q}$}}}^{T}\big{(}\mathbf{{A}}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}){\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}-{\mathchoice{\mbox{\boldmath$\displaystyle{b}$}}{\mbox{\boldmath$\textstyle{b}$}}{\mbox{\boldmath$\scriptstyle{b}$}}{\mbox{\boldmath$\scriptscriptstyle{b}$}}}\big{)},

\mathcal{L}({\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}},{\mathchoice{\mbox{\boldmath$\displaystyle{q}$}}{\mbox{\boldmath$\textstyle{q}$}}{\mbox{\boldmath$\scriptstyle{q}$}}{\mbox{\boldmath$\scriptscriptstyle{q}$}}})=({\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}-\bar{{\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}})^{T}\mathbf{{W}}{\mathchoice{\mbox{\boldmath$\displaystyle{\phi}$}}{\mbox{\boldmath$\textstyle{\phi}$}}{\mbox{\boldmath$\scriptstyle{\phi}$}}{\mbox{\boldmath$\scriptscriptstyle{\phi}$}}}_{k}+{\mathchoice{\mbox{\boldmath$\displaystyle{q}$}}{\mbox{\boldmath$\textstyle{q}$}}{\mbox{\boldmath$\scriptstyle{q}$}}{\mbox{\boldmath$\scriptscriptstyle{q}$}}}^{T}\big{(}\mathbf{{A}}({\mathchoice{\mbox{\boldmath$\displaystyle{\xi}$}}{\mbox{\boldmath$\textstyle{\xi}$}}{\mbox{\boldmath$\scriptstyle{\xi}$}}{\mbox{\boldmath$\scriptscriptstyle{\xi}$}}}){\mathchoice{\mbox{\boldmath$\displaystyle{u}$}}{\mbox{\boldmath$\textstyle{u}$}}{\mbox{\boldmath$\scriptstyle{u}$}}{\mbox{\boldmath$\scriptscriptstyle{u}$}}}-{\mathchoice{\mbox{\boldmath$\displaystyle{b}$}}{\mbox{\boldmath$\textstyle{b}$}}{\mbox{\boldmath$\scriptstyle{b}$}}{\mbox{\boldmath$\scriptscriptstyle{b}$}}}\big{)},

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

11institutetext: Hayley Guy 22institutetext: Department of Mathematics, North Carolina State University

22email: [email protected]

33institutetext: Alen Alexanderian 44institutetext: Department of Mathematics, North Carolina State University

44email: [email protected]

55institutetext: Meilin Yu 66institutetext: Department of Mechanical Engineering, The University of Maryland, Baltimore County

66email: [email protected]

∎

A distributed active subspace method for scalable surrogate modeling

of function valued outputs

Hayley Guy

Alen Alexanderian

Meilin Yu

Abstract

We present a distributed active subspace method for training surrogate models of complex physical processes with high-dimensional inputs and function valued outputs. Specifically, we represent the model output with a truncated Karhunen–Loève (KL) expansion, screen the structure of the input space with respect to each KL mode via the active subspace method, and finally form an overall surrogate model of the output by combining surrogates of individual output KL modes. To ensure scalable computation of the gradients of the output KL modes, needed in active subspace discovery, we rely on adjoint-based gradient computation. The proposed method combines benefits of active subspace methods for input dimension reduction and KL expansions used for spectral representation of the output field. We provide a mathematical framework for the proposed method and conduct an error analysis of the mixed KL active subspace approach. Specifically, we provide an error estimate that quantifies errors due to active subspace projection and truncated KL expansion of the output. We demonstrate the numerical performance of the surrogate modeling approach with an application example from biotransport.

Keywords

Distributed active subspace; Karhunen–Loève expansion; Dimension reduction; Function valued outputs; Porous medium flow; Biotransport.

1 Introduction

Models with uncertain input parameters are common in modeling of complex systems. Computational studies such as forward uncertainty propagation, optimization, or parameter estimation require repeated evaluation of the model. These tasks become challenging for expensive-to-evaluate complex models. To address this challenge, surrogate models are often used. By approximating the mapping from the uncertain input parameters to output quantities of interest (QoIs), using a surrogate model, one can replace expensive model evaluations by inexpensive surrogate model evaluations. Examples of surrogate modeling tools include polynomial chaos expansion GhanemSpanos90 ; LeMaitreKnio10 , multivariate adaptive regression splines friedman93 , and Gaussian processes RasmussenWilliams06 .

In the present work, we consider surrogate construction for models of the form

[TABLE]

where $\textstyle{x}$ belongs to a spatial domain and ${\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}\in\mathbb{R}^{N_{p}}$ is a vector of uncertain parameters. In our target applications $f$ is defined in terms of the solution of a partial differential equation (PDE) that is parameterized by a high-dimensional input parameter. Albeit, the proposed framework can be adapted to more general settings.

A simple approach is to construct a surrogate model pointwise in $\textstyle{x}$ . Specifically, discretizing the spatial domain by grid points $\{{\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}_{i}\}_{i=1}^{N_{x}}$ , we may consider approximating $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}_{i},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ , by constructing surrogate models $\hat{f}_{i}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})\approx f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}_{i},\cdot)$ , $i=1,\ldots,N_{x}$ . This straightforward approach can be useful in some cases, however, the power of many of the surrogate modeling approaches can be fully realized if one optimizes them for each ${\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}_{i}$ in the computational grid. This is a computationally expensive task and might become prohibitive if the parameter dimension ${N_{p}}$ is large (e.g., ${N_{p}}\geq 100$ ), the model exhibits large variations in its response to parametric uncertainties over the spatial domain, or the grid point number $N_{x}$ is large, as can happen in two or three-dimensional geometries.

We seek an efficient surrogate modeling approach, for models of the form (1), whose complexity in terms of the number of model evaluations does not scale with the dimension ${N_{p}}$ of the input parameters. To enable this, we need a method that is structure exploiting. In particular, we seek to discover and utilize informative low-dimensional subspaces in the input space and low-dimensional spectral representations of the model output. The former uses ideas from active subspace methods Constantine15 and the latter uses Karhunen–Loève (KL) expansions Loeve77 ; GhanemSpanos90 . This proposed approach decouples the spatial (i.e. $\textstyle{x}$ ) dimensions and those of the random variable $\textstyle{\xi}$ and exposes important structures that can be used for building efficient surrogate models.

Related work. KL expansions have been used in many works for representing random field parameters in physics models; see e.g., Ghanem98 ; LeMaitreReaganNajmEtAl02 ; XiuKarniadakis03 ; LeMaitreKnio04 ; BabuvskaNobileTempone07 ; Doostan07 ; SaadGhanem09 ; MatthiesKeese05 ; Graham15KuoKuoNicholsEtAl15 ; Elman17 . KL expansions can also be used to represent random field outputs of physics models, as done in the present work. In models governed by PDEs the output field often exhibits favorable regularity properties and can be represented by a KL expansion with a relatively small number of KL terms. Examples of this appear for instance in our recent works CleavesAlexanderianGuyEtAl19 ; ARSY2019 for models governed by elliptic PDEs.

The active subspace method Russi10 ; constantine2014 ; Constantine15 has become a popular approach in recent years for input parameter dimension reduction and surrogate modeling. This method seeks to identify important linear combinations of the input parameters. A brief summary of the active subspace method, for approximating scalar valued models, has been provided in Section 2. The active subspace method has been successfully used for uncertainty analysis in models with scalar valued responses, in a host of engineering applications; examples include scramjet analysis Scramjet15 , wing shape optimization ShapeOpt14 , hydrologic modeling ConstantineGeo15 , battery modeling ConstantineBattery17 , and kinetic model uncertainty analysis ji2018shared . Recently, there have been efforts in extending the active subspace method to vector valued functions. In zahm2018gradient the authors find an upper bound for the error of a ridge function approximation of the original (vector valued) function, and construct an approximating function by minimzing that upper bound. In ji2018shared an interesting approach is introduced for simultaneously approximating multiple outputs using a single low-dimensional shared subspace. The shared subspace is identified by solving a least-squares system to compute an appropriate combination of single-output active subspaces.

We also mention another related and popular class of methods for parameter dimension reduction: variance based Sobol:1990 ; Sobol:2001 and derivative based SobolKucherenko09 ; KucherenkoIooss17 global sensitivity analysis (GSA). These methods provide means of identifying unimportant model inputs, hence reducing the dimension of the input parameter vector. Derivative based methods are especially attractive, as they can be used to efficiently screen for unimportant input parameters, after which a surrogate model can be computed as a function of a reduced set of parameters. While GSA approaches have traditionally been applied to models with scalar outputs, extensions of GSA methods to vectorial and function valued outputs appear in several recent works GamboaJanonKleinEtAl14 ; AlexanderianGremaudSmith17 ; CleavesAlexanderianGuyEtAl19 .

In practice, the active subspace method tends to be very effective in reducing the input dimension as important directions in the input parameter space are identified. This is, in contrast to seeking reduced parameter subsets, as identified by GSA approaches. We note however that utility of GSA goes beyond input parameter dimension reduction—GSA provides valuable insight regarding a model by identifying key contributors to model variability. Also, we mention an interesting link between derivative based GSA and active subspaces through the idea of activity scores as detailed in ConstantineDiaz17 ; computing active subspaces provides, as a byproduct, approximations to derivative based global sensitivity measures, for scalar valued models.

Our approach and contributions. In our proposed approach, we combine active subspaces and KL expansions to enable efficient surrogate modeling for function valued QoIs. Specifically, we consider a suitably truncated KL decomposition

[TABLE]

where $\bar{f}$ is the mean of the process $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ and $(\lambda_{k},\phi_{k})$ are the eigenpairs of the covariance operator of the process. This way, the model uncertainty is encoded in the KL modes, $f_{k}$ , $k=1,\ldots,N$ . In many cases a small $N$ can be very effective in approximating $f$ . These KL modes can then be approximated efficiently using the active subspace approach. The active subspace approach requires the gradients of $f_{k}$ , which are scalar functions of the random vector $\textstyle{\xi}$ . For such functions, adjoint state approaches provide efficient means of computing the required gradients. Specifically, gradients can be computed as a cost, in terms of model evaluations, that does not scale with the dimension of the input parameter $\textstyle{\xi}$ . Realizing the proposed approach requires (i) a rigorous functional framework upon which efficient computational algorithms can be built; and (ii) a systematic computational procedure that guides efficient implementations. These have been detailed in Section 3.

We deploy our proposed framework in the context of flows in biological tissues. Specifically, in Section 4, we tackle biotransport in porous tumors with high-dimensional random inputs (i.e., the permeability field) and random field outputs (i.e., the pressure distribution in tumors). Biological tissues usually have highly heterogeneous, even uncertain, material properties Frontier:16 . The biotransport process such as drug delivery in tissues thus can exhibit “unpredictable” behaviors due to the uncertainties in tissue material properties Deb:09CPD . Since the unpredictability can adversely affect the effectiveness of therapy, quantifying variability in biotransport due to uncertain material properties has a high impact on clinical trial protocol design. However, conducting useful uncertainty quantification studies on biotransport is challenging due to the high dimension of the input parameter space, e.g. permeability and porosity AlexanderianZhuSalloumEtAl17 . We find that efficient-to-evaluate and accurate surrogate models can be computed at a modest computational cost, with our proposed framework. The presented computational experiments also indicate that the present method can be used successfully in more general porous medium flow problems, where Darcy flow is a reasonable model. Further investigations of the proposed surrogate modeling method in more complex flow models is subject of our future work.

The contributions of this article are summarized as follows:

$\circ$

We present a mathematical framework and computational method for surrogate modeling for models with high-dimensional inputs and function valued outputs that uses KL expansions for output dimension reduction, active subspaces for input dimension reduction, and adjoint based gradient computation as needed in the active subspace method.

$\circ$

We analyze the errors due to active subspace projection and output KL truncation. The presented analysis provides insight on the interplay between these two important sources of errors.

$\circ$

We present comprehensive computational results in the context of a biotransport application problem that test various aspects of the proposed surrogate modeling method. We mention that the computational studies, to our knowledge, are the first of a kind in the area of biotransport modeling in tissues with uncertain heterogeneous material properties.

2 Background on active subspace

One common approach to input dimension reduction is to identify a subset of input model parameters that are the most important to model variability. This is done typically using a local or global sensitivity analysis approach KucherenkoIooss17 ; IoossSaltelli17 ; PrieurTarantola17 . The active subspace approach is different; it identifies a set of important directions in the input parameter space rather than giving importance to one input over another. We can rotate the coordinates to align with the directions of strongest variation. Each direction can be considered as a set of weights that define a linear combination of all of the inputs. Directions where the inputs do not vary much are ignored.

Consider a model

[TABLE]

We assume the uncertain parameters $\textstyle{\xi}$ have an associated probability density function $\pi({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ that is supported on $\Omega\subseteq\mathbb{R}^{N_{p}}$ . Assume $g$ is square integrable and has continuous partial derivatives with respect to $\textstyle{\xi}$ . The following matrix plays a key role in active subspace construction:

[TABLE]

where $\nabla g$ is the gradient of $g$ . The matrix $\mathbf{{S}}$ is symmetric and positive semi-definite with spectral decomposition

[TABLE]

The eigenvalues $\lambda_{i}$ ’s are sorted in descending order $\lambda_{1}\geq\cdots\geq\lambda_{N_{p}}\geq 0$ and $\mathbf{{W}}$ contains the orthonormal eigenvectors of $\mathbf{{S}}$ .

The active subspace is determined by the dominant eigenvectors of $\mathbf{{S}}$ . Specifically, we partition the eigenvalues and eigenvectors according to

[TABLE]

where $\mathbf{{\Lambda}}_{1}\in\mathbb{R}^{r\times r}$ is a diagonal matrix with the dominant eigenvalues on its diagonal; and $\mathbf{{W}}_{1}$ contains the corresponding eigenvectors ${\mathchoice{\mbox{\boldmath$ \displaystyle{w} $}}{\mbox{\boldmath$ \textstyle{w} $}}{\mbox{\boldmath$ \scriptstyle{w} $}}{\mbox{\boldmath$ \scriptscriptstyle{w} $}}}_{1},\ldots{\mathchoice{\mbox{\boldmath$ \displaystyle{w} $}}{\mbox{\boldmath$ \textstyle{w} $}}{\mbox{\boldmath$ \scriptstyle{w} $}}{\mbox{\boldmath$ \scriptscriptstyle{w} $}}}_{r}$ as its columns. These columns span the dominant eigenspace of $\mathbf{{S}}$ —the active subspace.

Given $\textstyle{\xi}$ we define ${\mathchoice{\mbox{\boldmath$ \displaystyle{y} $}}{\mbox{\boldmath$ \textstyle{y} $}}{\mbox{\boldmath$ \scriptstyle{y} $}}{\mbox{\boldmath$ \scriptscriptstyle{y} $}}}=\mathbf{{W}}_{1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}\in\mathbb{R}^{r}$ and ${\mathchoice{\mbox{\boldmath$ \displaystyle{z} $}}{\mbox{\boldmath$ \textstyle{z} $}}{\mbox{\boldmath$ \scriptstyle{z} $}}{\mbox{\boldmath$ \scriptscriptstyle{z} $}}}=\mathbf{{W}}_{2}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}\in\mathbb{R}^{{N_{p}}-r}$ , and note that

[TABLE]

The elements of $\textstyle{y}$ and $\textstyle{z}$ are the sets of active and inactive variables, respectively. As discussed in detail in constantine2014 , the active variables, i.e., elements of $\textstyle{y}$ , are responsible for most of the variations of the function $g$ .

We can consider approximating $g$ in the active subspace, $g({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})\approx g(\mathbf{{W}}_{1}\mathbf{{W}}_{1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . It is convenient to define the function $G:\mathbb{R}^{r}\to\mathbb{R}$ by

[TABLE]

and write the approximation to $g$ as

[TABLE]

In practice, the function $G$ is typically approximated by a surrogate model. Specifically, we can compute a polynomial regression fit, which we denote by $\tilde{G}({\mathchoice{\mbox{\boldmath$ \displaystyle{y} $}}{\mbox{\boldmath$ \textstyle{y} $}}{\mbox{\boldmath$ \scriptstyle{y} $}}{\mbox{\boldmath$ \scriptscriptstyle{y} $}}})$ , for the function $G({\mathchoice{\mbox{\boldmath$ \displaystyle{y} $}}{\mbox{\boldmath$ \textstyle{y} $}}{\mbox{\boldmath$ \scriptstyle{y} $}}{\mbox{\boldmath$ \scriptscriptstyle{y} $}}})$ . Moreover, Monte Carlo sampling is used to approximate the matrix $\mathbf{{S}}$ in (2):

[TABLE]

Typically, a modest Monte Carlo sample is sufficient for computing reliable approximations to the dominant eigenvalues and eigenvectors of $\mathbf{{S}}$ .

For readers’ convenience, we provide the steps for computing an active subspace-based surrogate model for the function $g$ in Algorithm 1. For further details, we refer the readers to constantine2014 .

3 Active subspace-based surrogate models for function valued QoIs

In this section, we outline our approach for computing an active subspace-based surrogate model for a function valued QoI. Specifically, we consider models of the form

[TABLE]

where $\mathcal{X}\subset\mathbb{R}^{n}$ , with $n=2$ or $3$ , is a compact set, and $\Omega\subset\mathbb{R}^{N_{p}}$ is a sample space. The set $\mathcal{X}$ will be a (sub-) region of a computational domain, in our target applications. Here ${\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}=(\xi_{1},\xi_{2},\ldots,\xi_{{N_{p}}})^{T}$ is a vector of uncertain parameters, with probability density function $\pi({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . We make the following assumptions about $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ .

Assumption 1

We assume

(a)

$f\in L^{2}(\mathcal{X}\times\Omega)$ * and $f$ is a mean square continuous process.* 2. (b)

$\frac{\partial f}{\partial\xi_{i}}(x,{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ * exists for all $x\in\mathcal{X},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}\in\Omega,i=1,\ldots,{N_{p}}$ .* 3. (c)

$\frac{\partial f}{\partial\xi_{i}}(x,{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})\in L^{2}(\mathcal{X}\times\Omega),i=1,\ldots,{N_{p}}$ .

Letting $\mathbb{E}[\cdot]$ denote expectation with respect to $\textstyle{\xi}$ , the covariance function, $c:\mathcal{X}\times\mathcal{X}\rightarrow\mathbb{R}$ , of $f$ is defined as

[TABLE]

and the associated covariance operator $C_{f}:L^{2}(\mathcal{X})\rightarrow L^{2}(\mathcal{X})$ is given by

[TABLE]

Assumption 1(a) ensures that the covariance function $c$ and the process mean

[TABLE]

are continuous on $\mathcal{X}\times\mathcal{X}$ and $\mathcal{X}$ , respectively; see e.g., (HsingEubank15, , Theorem 7.3.2).

3.1 Computational method

We reduce the dimension of the output by computing a low-rank approximation using a truncated KL expansion:

[TABLE]

Here $\lambda_{k}(C_{f})$ and $\phi_{k}$ are the eigenvalues and eigenvectors of the covariance operator $C_{f}$ and $f_{k}$ ’s are given by

[TABLE]

We refer to the coefficients $f_{k}$ as the KL modes of $f$ . In the present work, the (generalized) eigenvalue problem,

[TABLE]

is solved using Nyström’s method. In practice, $N$ can be chosen such that $(\sum_{k=1}^{N}\lambda_{k}(C_{f}))/(\sum_{k=1}^{\infty}\lambda_{k}(C_{f}))<\mathrm{tol}$ , where $\mathrm{tol}$ is a user-specified tolerance. In many applications of interest, where $f$ is defined in terms of the solution of a differential equation, the eigenvalues $\lambda_{k}(C_{f})$ exhibit rapid decay, which enables low-rank representations. In particular, this is observed in the application problem considered in the present work.

The next step is to approximate the KL modes $f_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ using active subspaces constantine2014 . For each KL mode $f_{k}$ , $k=1,\ldots,N$ , we compute an active subspace by considering the symmetric positive semidefinite matrix, $\mathbf{{S}}_{k}\in\mathbb{R}^{{N_{p}}\times{N_{p}}}$ , defined by

[TABLE]

As before, we compute the spectral decomposition $\mathbf{{S}}_{k}=\mathbf{{W}}_{k}\mathbf{{\Lambda}}_{k}\mathbf{{W}}_{k}^{T}$ , and partition the eigenpairs according to

[TABLE]

where $\mathbf{{\Lambda}}_{k,1}\in\mathbb{R}^{r_{k}\times r_{k}}$ is a diagonal matrix with the dominant eigenvalues of $\mathbf{{S}}_{k}$ on its diagonal; and $\mathbf{{W}}_{k}$ contains the corresponding eigenvectors. Defining,

[TABLE]

the KL modes can be approximated by

[TABLE]

In practice, the active subspace-based surrogates for the KL modes $f_{k}$ are constructed by following Algorithm 1, with $g$ replaced by $f_{k}$ , $k=1,\ldots,N$ . Thus, we obtain surrogate models

[TABLE]

The overall surrogate model for $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ is then given by

[TABLE]

The computational steps for computing $\hat{f}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ can be divided into three main steps:

Compute the truncated KL expansion of $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . 2. 2.

Compute the active subspace-based approximation to KL modes $f_{k}$ , $k=1,\ldots,N$ . 3. 3.

Form the overall surrogate model as in (7).

The most computationally challenging part of the above process is the first step, in which we require an ensemble of model evaluations $f(\cdot,{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i})$ , $i=1,\ldots,N_{s}$ . These model evaluations will be used to compute the KL expansion of the model $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . For this, we use Algorithm 1 in ARSY2019 , that uses Nyström’s method to compute $\lambda_{k}(C_{f})$ and the corresponding eigenvectors $\phi_{k}(\cdot)$ , $k=1,\ldots,N$ . This process also provides the evaluations of the KL modes, $f_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i}),k=1,\ldots,N,\,i=1,\ldots,N_{s}$ . As shown in our numerical results, often a modest choice of $N_{s}$ is sufficient for obtaining reliable approximations to (i) the dominant eigenpairs, and (ii) KL modes $f_{k}$ .

Notice that for implementing the proposed method, differentiability of the KL modes is required. Moreover, as seen below, for the purposes of error analysis, Lipschitz continuity of the output KL modes is needed. These requirements can be satisfied through suitable boundedness assumptions on the partial derivatives of $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ , as we now explain. For convenience, and with no loss of generality, we consider the case where $\bar{f}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}})\equiv 0$ and consider

[TABLE]

where we use a generic element $v\in L^{2}(\mathcal{D})$ , with $\|v\|_{L^{2}(\mathcal{D})}=1$ , in place of the eigenvectors $\phi_{k}$ . This $F$ can be thought of as a generic unnormalized KL mode. In addition to the earlier assumptions on $f$ , we also require

[TABLE]

where $b_{j}$ , $j=1,\ldots,N$ , are square integrable.

Lemma 1

Suppose the process $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ satisfies Assumption 1 and (8). Then, $F$ is differentiable and is Lipschitz continuous.

Proof

Differentiability of $F$ can be shown using the standard arguments of differentiating under the integral sign; see e.g., CleavesAlexanderianGuyEtAl19 , where it is shown that under the present set of assumptions, $\frac{\partial F}{\partial\xi_{j}}=\int_{\mathcal{D}}\frac{\partial f}{\partial\xi_{j}}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})v({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}})\,d{\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}$ . We also have

[TABLE]

for every ${\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}\in\mathcal{D}$ . Using this, it is straightforward to show, $|F({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{1})-F({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{2})|\leq L\|{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{1}-{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{2}\|$ , for all ${\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{1},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{2}\in\Omega$ , with $L=\left(\sum_{j=1}^{N}\|b_{j}\|_{L^{2}(\mathcal{D})}^{2}\right)^{1/2}$ . ∎

3.2 Gradient computation

The present active subspace-based surrogate modeling approach requires computing the gradient of the output KL modes. Here, it is more convenient to consider the “unnormalized KL modes”, $F_{k}$ , $k=1,\ldots,N$ .

[TABLE]

Finite-difference approximations provide a simple approach, but will be prohibitive when the input dimension ${N_{p}}$ is large and model evaluations are expensive. For field quantities defined in terms of the solution $u$ of a PDE parameterized by $\textstyle{\xi}$ , $F_{k}$ is a functional of $u$ . This is an ideal situation for deploying adjoint based gradient computation. To illustrate this, we assume the model $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ is a function of the solution $u({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ of a PDE. For instance $f$ can be the restriction of $u$ to a sub-domain, or $f$ can be flux of $u$ through a boundary. For illustration, we consider the case where the PDE is of the form $\mathcal{A}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})u=b$ , where $\mathcal{A}$ denotes a differential operator that is parameterized by the uncertain parameter vector $\textstyle{\xi}$ and $b$ is a source term, and assume $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})=u({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . To simplify the presentation further, we consider the discretized problem, where the discretized KL mode is defined by

[TABLE]

Here $\textstyle{u}$ is the discretized state variable, $\mathbf{{W}}$ is a diagonal matrix with quadrature weights on diagonal, ${\mathchoice{\mbox{\boldmath$ \displaystyle{\phi} $}}{\mbox{\boldmath$ \textstyle{\phi} $}}{\mbox{\boldmath$ \scriptstyle{\phi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\phi} $}}}_{k}$ is the discretized $k$ th eigenvector of output covariance, $\mathbf{{A}}$ is the discretized PDE operator, and $\textstyle{b}$ is the discretized source term. Computing the gradient of ${\mathchoice{\mbox{\boldmath$ \displaystyle{F} $}}{\mbox{\boldmath$ \textstyle{F} $}}{\mbox{\boldmath$ \scriptstyle{F} $}}{\mbox{\boldmath$ \scriptscriptstyle{F} $}}}_{k}$ with respect to $\textstyle{\xi}$ can be done using a standard Lagrangian formalism Gunzburger03 . Namely, we define the Lagrangian

[TABLE]

where $\textstyle{q}$ is a Lagrange multiplier. Setting $\frac{\partial\mathcal{L}}{\partial{{{\mathchoice{\mbox{\boldmath$ \displaystyle{q} $}}{\mbox{\boldmath$ \textstyle{q} $}}{\mbox{\boldmath$ \scriptstyle{q} $}}{\mbox{\boldmath$ \scriptscriptstyle{q} $}}}}}}=0$ recovers the state equation, and setting $\frac{\partial\mathcal{L}}{\partial{{{\mathchoice{\mbox{\boldmath$ \displaystyle{u} $}}{\mbox{\boldmath$ \textstyle{u} $}}{\mbox{\boldmath$ \scriptstyle{u} $}}{\mbox{\boldmath$ \scriptscriptstyle{u} $}}}}}}=0$ gives the adjoint equation

[TABLE]

Then, the gradient of ${\mathchoice{\mbox{\boldmath$ \displaystyle{F} $}}{\mbox{\boldmath$ \textstyle{F} $}}{\mbox{\boldmath$ \scriptstyle{F} $}}{\mbox{\boldmath$ \scriptscriptstyle{F} $}}}_{k}$ is given by

[TABLE]

Note that evaluation of $\nabla{\mathchoice{\mbox{\boldmath$ \displaystyle{F} $}}{\mbox{\boldmath$ \textstyle{F} $}}{\mbox{\boldmath$ \scriptstyle{F} $}}{\mbox{\boldmath$ \scriptscriptstyle{F} $}}}_{k}$ requires one state (forward) equation solve and one adjoint equation solve, independently of the dimension of the uncertain parameter vector $\textstyle{\xi}$ . (The forward solves can be reused across the output KL modes.) Thus, to compute $\nabla{\mathchoice{\mbox{\boldmath$ \displaystyle{F} $}}{\mbox{\boldmath$ \textstyle{F} $}}{\mbox{\boldmath$ \scriptstyle{F} $}}{\mbox{\boldmath$ \scriptscriptstyle{F} $}}}_{k}$ , $k=1,\ldots,N$ , we need one solution of the state equation, and $N$ adjoint solves. A small $N$ , is often sufficient for suitable representations of the output field due to the rapid decay of the eigenvalues of the output covariance operator, observed in many applications.

Computing the active subspaces for the output KL modes can be done with a set $N_{s}$ of model evaluations, used across all KL modes. Thus, the computational cost of active subspace discovery for the output KL modes is $N_{s}(1+N)$ , independent of the parameter dimension ${N_{p}}$ . Typically a modest $N_{s}$ is sufficient, as seen in our numerical results. Furthermore, the same set of model evaluations can be used for surrogate model construction for the output KL modes. This enables efficient computation of active subspaces for individual KL modes, at a cost that does not scale with the dimension of $\textstyle{\xi}$ .

Notice that the present illustration uses an equation $\mathbf{{A}}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}){\mathchoice{\mbox{\boldmath$ \displaystyle{u} $}}{\mbox{\boldmath$ \textstyle{u} $}}{\mbox{\boldmath$ \scriptstyle{u} $}}{\mbox{\boldmath$ \scriptscriptstyle{u} $}}}={\mathchoice{\mbox{\boldmath$ \displaystyle{b} $}}{\mbox{\boldmath$ \textstyle{b} $}}{\mbox{\boldmath$ \scriptstyle{b} $}}{\mbox{\boldmath$ \scriptscriptstyle{b} $}}}$ , which is linear in the state variable and nonlinear in the uncertain parameter $\textstyle{\xi}$ . Such an equation can result from discretizing linear (in state) PDEs that are parameterized by uncertain parameters. The adjoint approach can more generally be applied to nonlinear PDE models; see e.g., Gunzburger03 .

In the present work, we use adjoint based gradient computation for computing the gradient of the output KL modes for models governed by elliptic PDEs with a random coefficient function; see section 4.

3.3 Error analysis

The presented computational strategy involves several approximations for computing the active subspace-based surrogate for $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . In this section, we analyze the errors incurred due to (i) truncation of the output KL expansion and (ii) active subspace approximation of the KL modes. For the purposes of the presented analysis, it is more convenient to work with unnormalized KL modes (9), and consider

[TABLE]

The active subspace strategy is flexible regarding the distribution law of the random vector $\textstyle{\xi}$ ; see e.g., Constantine15 . However, in the present work, where we consider models with uncertain coefficient functions that are modeled using log-Gaussian random fields, $\textstyle{\xi}$ is a standard Gaussian random vector, i.e., ${\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}\sim\mathcal{N}({\mathchoice{\mbox{\boldmath$ \displaystyle{0} $}}{\mbox{\boldmath$ \textstyle{0} $}}{\mbox{\boldmath$ \scriptstyle{0} $}}{\mbox{\boldmath$ \scriptscriptstyle{0} $}}},\mathbf{{I}})$ .

As detailed in Constantine15 , while Algorithm 1 provides a practical surrogate modeling framework, it is not directly amenable to theoretical analysis. To enable error analysis, following Constantine15 , we define the functions $G_{k}$ , used for active subpace projection of the KL modes in the following way:

[TABLE]

where $\pi_{Y|Z}$ is the conditional density,

[TABLE]

In practice, the marginalized $G_{k}$ , which for convenience we can denote by

[TABLE]

can be approximated by Monte Carlo sampling,

[TABLE]

However, as seen below, even a very small Monte Carlo Sample (even with $N_{\text{AS}}=1$ ) can be acceptable. This partly justifies and explains the effectiveness of Algorithm 1, which can be seen as a special case of (12), with $N_{\text{AS}}=1$ and ${\mathchoice{\mbox{\boldmath$ \displaystyle{z} $}}{\mbox{\boldmath$ \textstyle{z} $}}{\mbox{\boldmath$ \scriptstyle{z} $}}{\mbox{\boldmath$ \scriptscriptstyle{z} $}}}_{1}=0$ .

Recall the approximation of the KL modes $F_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})\approx G_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{W} $}}{\mbox{\boldmath$ \textstyle{W} $}}{\mbox{\boldmath$ \scriptstyle{W} $}}{\mbox{\boldmath$ \scriptscriptstyle{W} $}}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . As a first step in our error analysis, we quantify the error in this approximation. For each $k\in\{1,\ldots,N\}$ , we define

[TABLE]

where $\{\lambda_{j}(\mathbf{{S}}_{k})\}_{j=1}^{N_{p}}$ are the eigenvalues of $\mathbf{{S}}_{k}$ defined in (6) and $r_{k}$ is the dimension of the active subspace for the $k$ th KL mode $F_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . The following result bounds the error in approximating the individual KL modes.

Lemma 2

Let $\delta_{k}$ be as in (13), and let $G_{k}$ and $\hat{G}_{k}$ be as in (11) and (12). Then,

(a)

$\displaystyle\int_{\Omega}(F_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})-{G}_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{W} $}}{\mbox{\boldmath$ \textstyle{W} $}}{\mbox{\boldmath$ \scriptstyle{W} $}}{\mbox{\boldmath$ \scriptscriptstyle{W} $}}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}))^{2}\pi({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})d{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}\leq\delta_{k}^{2}$ . 2. (b)

$\displaystyle\int_{\Omega}(F_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})-\hat{G}_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{W} $}}{\mbox{\boldmath$ \textstyle{W} $}}{\mbox{\boldmath$ \scriptstyle{W} $}}{\mbox{\boldmath$ \scriptscriptstyle{W} $}}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}))^{2}\pi({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})d{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}\leq(1+N_{\text{AS}}^{-1/2})\delta_{k}^{2}$ .

Proof

The KL modes $F_{k}$ are square integrable, mean zero, and by Lemma 1, they are differentiable and Lipschitz continuous. Thus, the first statement follows from using (Constantine15, , Theorem 4.3) and the second one follows from (Constantine15, , Theorem 4.4). ∎

Next, we consider the error, due to active subspace projection (for individual KL modes), in approximating the KL expansion:

[TABLE]

where $f_{N}$ is as in (10), and

[TABLE]

its active subspace-based approximation where ${G}_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{W} $}}{\mbox{\boldmath$ \textstyle{W} $}}{\mbox{\boldmath$ \scriptstyle{W} $}}{\mbox{\boldmath$ \scriptscriptstyle{W} $}}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ is our approximation of the KL modes $F_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ , as defined before. Then let

[TABLE]

Theorem 3.1

$\mathbb{E}\{\bar{e}\}\leq|\mathcal{X}|^{\frac{1}{2}}\displaystyle\sum_{k=1}^{N}\delta_{k}$ , where $\delta_{k}$ is as in (13).

Proof

See Appendix A. ∎

Remark 1

Note that in view of Lemma 2(b), if we use $\hat{G}_{k}$ defined in (12), instead of $G_{k}$ in (14), we can repeat the argument in proof of Theorem 3.1 to get the following estimate:

[TABLE]

Finally, we consider the overall error of approximating $f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ , due to KL truncation and active subspace projection:

[TABLE]

where $f$ is the original QoI and $\hat{f}_{N}$ is the active subspace-based approximation as before. We consider,

[TABLE]

We have the following result:

Theorem 3.2

[TABLE]

Proof

See Appendix A. ∎

We note that the first term in (15) indicates error due to truncation of the output KL expansion. Recall that $\sum_{k=1}^{\infty}\lambda_{k}(C_{f})=\int_{\mathcal{X}}c_{f}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}})\,d{\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}<\infty$ , where the equality is due to Mercer’s Theorem, and the finiteness of the integral is due to continuity of the covariance operator, which is a consequence of the mean square continuity assumption on the process. Therefore, the first term in (15) can be made arbitrarily small by taking $N$ sufficiently large. However, choosing a large $N$ could entail accumulation of error due to active subspace projection error in the second term in (15); this error, however, can be controlled by increasing $r_{k}$ . In practice, in many applications, a small number of output KL modes (i.e., a small $N$ ), can be used to obtain an accurate KL representation for the output. Moreover, typically low-dimensional (in many cases one- or two-dimensional) active subspaces can be afforded for approximating the dominant KL modes. We demonstrate these issues numerically in Section 4.

Remark 2

In view of Remark 1, if we use $\hat{G}_{k}$ defined in (12), instead of $G_{k}$ in (14), we can repeat the argument in proof of Theorem 3.2 to get the following estimate:

[TABLE]

Additional sources of error. The above error analysis only concerns errors assiciated with active subspace projection and output KL truncation. In practical computations there are a number of other errors. Most of these errors are related to the active subspace approach used for approximating the KL modes $F_{k}$ . These include errors in approximating the eigenvalues and eigenvectors of $\mathbf{{S}}_{k}$ ’s, incurred due to sample average approximation to these matrices and surrogate modeling errors incurred in approximating $G_{k}$ ’s. Errors in approximating $\mathbf{{\Lambda}}_{k}$ and $\mathbf{{W}}_{k}$ are analyzed for instance in Constantine15 . Errors due to surrogate modeling of $G_{k}$ ’s are difficult to quantify in general, as these errors depend on the choice of surrogate modeling framework.

There are also further errors related to KL approximation of the output field. Namely, we need to approximate the mean field $\mathbb{E}\{f({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},\cdot)\}$ . If simple Monte Carlo sampling is used, approximations of the mean field exhibit the usual Monte Carlo convergence behavior. However, quasi-Monte Carlo approaches can provide efficient means of obtaining more accurate estimates. Finally, there will be errors in computing the eigenvalues $\lambda_{k}(C_{f})$ and the corresponding eigenvectors. These errors are due to (i) sample average approximation to the output covariance function, and (ii) errors due to discretizing the generalized eigenvalue problem (5). The discretization errors of course depend on the numerical method used for solving the eigenvalue problem. For example, if Nystrom’s method is used, as done in the present work, the discretization errors can be controlled by the resolution of the computational grid, and the quadrature method used.

We demonstrate numerically that once a suitable output dimension reduction is determined, the active subspace approach can be deployed to obtain approximations to the output KL modes and an overall surrogate model that captures the statistical properties of $f$ reliably. We find that this can be accomplished with an ensemble of function evaluations $\{f(\cdot,{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{j})\}_{j=1}^{N_{s}}$ , with a modest $N_{s}$ .

4 Application to biotransport in tumors

In this section, we present our computational results in the context of a biotransport application problem. We begin by describing the governing model in Section 4.1. Next, we discuss computation of the KL expansion of the output in Section 4.2. This is followed by our results on active subspace discovery and surrogate model construction in Section 4.3. We test the accuracy of the computed surrogate models in Section 4.4.

4.1 The governing model

In this section we describe the biotransport problem we seek to investigate using the proposed method. We are interested in understanding the impact of uncertainty in the material properties of cancerous tumors. Specifically we seek to characterize the uncertainties in the pressure field when a single needle injection occurs at the center of a spherical tumor with uncertain heterogeneous structure. We focus on a 2D cross-section, and consider Darcy’s law constrained by mass conservation in a domain $\mathcal{D}\subset\mathbb{R}^{2}$ given by a circle of radius $R_{\text{tumor}}=5$ mm, centered at the origin, with an inner circle of radius $R_{\text{needle}}=0.25$ mm, modeling the injection site, removed. We denote the inner and outer boundaries of the domain by $\Gamma_{\text{N}}$ and $\Gamma_{\text{D}}$ , respectively. The following elliptic PDE governs the fluid pressure $p$ :

[TABLE]

In this equation, $\kappa$ denotes the absolute permeability field, $\mu$ is the fluid dynamic viscosity, $Q$ is the volume flow rate per unit length, and $\textstyle{n}$ is the outward-pointing normal vector. The nominal values for the above parameters are $\kappa=0.5\ md$ , $\mu=8.9\times 10^{-4}\ Pa\cdot s$ , and $Q=1\ mm^{2}/min$ . These values are chosen according to previous investigations of fluid transport in tumors maher:08 ; ma:12TF ; Chen:07 . As noted in a number of previous works, tumors exhibit complex structures due to their invasive nature. Generally, tumors consist of loosely organized abnormal cells, fibers, vasculature, and lymphatics Clark:91 , resulting in disordered tissues with complex heterogeneous structures.

In the present work, we model the permeability field $\kappa$ by a log-Gaussian random field as follows. Let $z({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},\omega)$ be a centered Gaussian process; here $\omega\in\Omega$ where $\Omega$ is an appropriate sample space. We assume $z$ has unit pointwise variance and has correlation function

[TABLE]

where $\ell>0$ is the correlation length. In the present study we set the correlation length $\ell=1\ mm$ . Then, we define the log-permeability field $a=\log\kappa$ according to

[TABLE]

where $a_{0}$ and $\sigma^{2}_{a}$ represent the pointwise mean and variance, respectively. We can represent $a({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},\omega)$ using a truncated KL expansion:

[TABLE]

where $a_{0}$ is the mean field, $(\lambda_{j}(C_{a}),e_{j})$ are the eigenpairs of the covariance operator $C_{a}$ of $a({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},\omega)$ , ${N_{p}}$ is the input parameter dimension, and $\xi_{j}$ are independent standard normal random variables. With this parameterization, the uncertainty in the (approximate) log permeability field $\hat{a}=\log\kappa$ is completely characterized by the random vector ${\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}=(\xi_{1},\xi_{2},\ldots,\xi_{{N_{p}}})^{T}$ . That is, $\hat{a}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},\omega)=\hat{a}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}(\omega))$ , and thus, we can consider the (approximate) log-permeability field as a random process $\hat{a}:\mathcal{D}\times\Omega\to\mathbb{R}$ . The QoI under study here is the pressure field $p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ .

For illustration, two sets of realizations of the permeability field and the corresponding pressure field with correlation length $\ell=1\ mm$ are presented in Figure 1. We observe large fluctuations in the permeability field and relatively mild fluctuations in the pressure field.

4.2 Spectral representation of the model output

We represent the QoI, $p$ , using a truncated KL expansion

[TABLE]

Here $\lambda_{k},\phi_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}})$ are the eigenvalues and corresponding eigenfunctions of the covariance operator $C_{f}$ of $p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ .

In Figure 2 (left) we see that the eigenvalues of the covariance operator $C_{f}$ show faster decay than those of the log-permeability field $a(x,{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . In Figure 2 (middle) we compute the ratio $\rho_{k}=(\sum_{k=1}^{N}\lambda_{k})/\mathrm{Tr}(C_{f})$ , where $\lambda_{k}$ are the eigenvalues of the covariance operator $C_{f}$ and $N$ is the number of KL modes retained. Using just $N=10$ KL modes gives us $\rho_{k}\approx 0.9$ indicating that $90$ % of the variance is captured. If we add five more modes so that $N=15$ then we capture nearly 95% of the average variance in the model output $p$ . Figure 2 (right) shows the first $40$ eigenvalues of the covariance operator $C_{f}$ , for a number of different sample sizes $N_{s}$ used to approximate the covariance function of $p$ . Note that using $N_{s}=300$ samples we can approximate the dominant eigenvalues reasonably well. In the current study we set ${N_{p}}$ the input parameter dimension to be $200$ and to retain $N=15$ KL modes. From the results that follow we will see that we can achieve significant dimension reduction for the input parameter.

4.3 Active subspace discovery and surrogate model construction

To construct the active subspace-based surrogate model for $p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ , we need the gradients of the output KL modes defined in (19). For this, we use the adjoint method. The adjoint based expression for $p_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ can be derived using a formal Lagrange approach; see, e.g., Gunzburger03 . For a basic derivation of the gradient of the output KL modes for models governed by elliptic PDEs, we refer the reader to CleavesAlexanderianGuyEtAl19 , where output KL expansions are used within the context of derivative based global sensitivity analysis. The adjoint based expression for the partial derivatives of $p_{k}$ ’s are given by,

[TABLE]

where $p$ is the solution of the (forward) PDE (16) and $q$ is the solution of the adjoint equation:

[TABLE]

Note that in the forward and adjoint equation, we let the permeability field be $\kappa({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})=e^{\hat{a}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})}$ .

Guided by the results in the previous subsection, we focus on the first $N=15$ output KL modes. For each $k\in\{1,\ldots,N\}$ , we generate a sample $\{\nabla p_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i})\}_{i=1}^{N_{s}}$ , with $N_{s}=300$ . For $\{{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i}\}_{i=1}^{N_{s}}$ , we used the same set of $300$ parameter samples used in computing the KL expansion of the output. Using (20),

[TABLE]

To compute these, we reuse the model evaluations $\{p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i})\}_{i=1}^{N_{s}}$ , from the computation of the output KL expansion earlier; the adjoint variables $q(\cdot,{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i})$ are computed by solving the adjoint equation (21), with $\kappa=e^{\hat{a}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i})}$ , $i=1,\ldots,N_{s}$ .

Using the gradient samples, we approximate the matrix $\mathbf{{S}}_{k}$ defined in (6) for each KL mode, $k=1,\ldots,N$ :

[TABLE]

In each case, we consider the corresponding spectral decomposition $\mathbf{{\hat{S}}}_{k}=\mathbf{{\hat{W}}}_{k}\mathbf{{\hat{\Lambda}}}_{k}\mathbf{{\hat{W}}}_{k}^{T}$ . To identify the active subspace and where to partition the eigenpairs we examine the spectrum of $\mathbf{{\hat{S}}}_{k}$ . As an illustration, in Figure 3 (top) we present the spectrum for the first three output KL modes. Visually we observe a gap between the first and second eigenvalue for each of the modes indicating one-dimensional active subspaces. In Figure 3 (bottom), we show the corresponding sufficient summary plots (SSPs) for the corresponding modes. A sufficient summary plot here is a scatter plot of the active variables $y=\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}$ versus the output KL modes $p_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . We observe a strong univariate trend which further indicates the presence of one-dimensional active subspaces. To provide a consistent truncation approach, we can use a threshold $\gamma$ and choose the dimension $r_{k}$ of the active subspace according to $\lambda_{k,1}/\lambda_{k,r_{k+1}}>\gamma$ . In the current study we use $\gamma=10$ . Using this approach, we identified a one-dimensional active subspace for each of the first $15$ output modes, expect modes $11$ and $12$ where two-dimensional active subspaces were identified. To illustrate, we report the spectrum of $\mathbf{{S}}_{12}$ and the SSP for $p_{12}$ in Figure 4. Based on the determined values of $r_{k}$ , we partition

[TABLE]

$\mathbf{{\Lambda}}_{k,1}$ contains the dominant eigenvalues and $\mathbf{{W}}_{k,1}$ the corresponding eigenvectors.

Recall that the active subspace approach essentially seeks “important linear combinations” of the input parameters. To illustrate this, in Figure 5, we present the components of the dominant eigenvector for the first three output KL modes $p_{1},p_{2}$ and $p_{3}$ . The smaller inset plot shows the first 50 components of the dominant eigenvector for $p_{1},p_{2}$ and $p_{3}$ . The magnitude of the components give us a sensitivity measure for each of the input parameters. A large component value indicates that that particular input is important in defining the direction of most variation in our function $p$ . We note that the first output KL mode is sensitive to a few components of the input parameter vector. In contrast, the second and especially the third output KL modes show sensitivity to a larger number of components of $\textstyle{\xi}$ .

Finally, we note that the present results indicate a significant dimension reduction. The dominant output KL modes, each a function of 200 parameters, can be approximated in one or two dimensional active subspaces.

Next, we compute the surrogate models for the output KL modes, following the strategy described in Section 3. The surrogate models $\tilde{G}_{k}(\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})\approx p_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ , $k=1,\ldots,N$ are constructed by regression fit. Specifically, letting $p_{i,k}=p_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}_{i})$ and $y_{i}=\mathbf{{W}}_{k,1}^{T}{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}$ , $i=1,\ldots,N_{s}$ , we compute least squares approximating polynomials $\tilde{G}_{k}$ that minimize

[TABLE]

In our computations, we found that linear regressions fits were suitable for the dominant output KL modes, except modes $1$ , $11$ , and $12$ , for which we used a quadratic fit. For illustration, we report the computed surrogate models for the first three output modes in Figure 3 (bottom) and for the mode $p_{12}$ in Figure 4 (right). We now have all the pieces to form the surrogate model for the pressure field $p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ :

[TABLE]

Below, we examine the accuracy of the computed surrogate model and show its effectiveness in capturing the statistical properties of the pressure field.

4.4 The accuracy of the surrogate model

In this section, we provide various tests of accuracy that examine different aspects of the proposed method. To provide a baseline for comparision, we computed $10,000$ realizations of the exact pressure field and its surrogate model approximation.

We begin by examining the success of the low-rank KL approximation of the pressure field in capturing the variance of the process. The variance of the pressure field can be obtained from $\text{Var}(p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}}))=\mathbb{E}\{p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})^{2}\}-\mathbb{E}\{p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})\}^{2}$ , which we approximate using the computed samples of $p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ . The variance field for the truncated KL exansion of $p$ is determined completely by the spectral decomposition of its (approximate) covariance operator:

[TABLE]

where $\lambda_{k}$ and $\phi_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}})$ are the eigenvalues and eigenvectors of the covariance operator $C_{p}$ . Taking the square root we obtain the standard deviation of both the exact pressure field and its approximation; results are shown in Figure 6. We note that for both, the standard deviation is highest at the center of the tumor and decreases to zero as we move away from the center. We also observe that even a low-rank approximation to $p({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ captures the standard deviation of the pressure field well.

Next, we study the accuracy of the computed surrogate model by focusing on the relative error indicator

[TABLE]

where $\hat{p}({\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}},{\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})$ is computed according to (22). In Figure 7 (left), we plot the expected value of $E_{rel}$ as the number of output KL modes increases; the dashed lines indicate the fifth and ninty fifth percentiles. We note that with $N=15$ output KL modes, the relative error is about $5$ % on average. To better understand the behavior of the relative error, we report its distribution in Figure 7 (right), where we used $N=15$ output KL modes in computing $\hat{p}$ . We also show the distribution of the relative error computed over a subdomin $\mathcal{D}^{\prime}=\{{\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}\in\mathcal{D}:\|{\mathchoice{\mbox{\boldmath$ \displaystyle{x} $}}{\mbox{\boldmath$ \textstyle{x} $}}{\mbox{\boldmath$ \scriptstyle{x} $}}{\mbox{\boldmath$ \scriptscriptstyle{x} $}}}\|\leq 2\}$ in Figure 7 (right). This is done to quantify the approximation errors near the injection site. We observe that the distribution of the error is shifted to the left, indicating smaller approximation errors in $\mathcal{D}^{\prime}$ (with high probability). To see more clearly how the distribution of the relative error evolves as the number of KL modes increase, we show the probability density function of $E_{rel}$ corresponding to different number of output KL modes in Figure 8. Note that not only does the mode of the distribution get smaller, the spread of the error also decreases, which can be inferred from Figure 7 (left) as well.

In Figure 9, we report the distribution of the average pressure along concentric circles of various radii. This shows that the computed surrogate captures statistical properties of the pressure well in different parts of the domain.

5 Concluding remarks

We have presented a distributed active subspace method for scalable surrogate modeling of PDE-governed physical processes with high-dimensional inputs and function valued outputs. To save the modeling efforts spent on function valued outputs, we employ the truncated KL expansion to decouple the spatial dimensions with those of the random variables. As a result, the randomness in outputs is fully represented by scalar valued KL modes. For elliptic PDEs, as observed in the work, a low-rank KL representation of the model output is usually sufficient for an accurate representation. To reduce the dimension of inputs, we construct active subspaces for each of the dominant output KL modes. Since output gradients with respect to the random variables need to be calculated when constructing active subspaces, we develop an adjoint-based framework to ensure that the computational cost does not scale with the input dimension. The method development is complemented by a rigorous mathematical formulation as well as theoretical analysis of errors due to active subspace projection and output KL representation.

We then deploy the distributed active subspace method to conduct surrogate modeling of the pressure field in a biotransport model in tumors with an uncertain permeability field. We demonstrate that a low-rank representation of the pressure field (i.e., output) can be achieved with the truncated KL expansion. The input (i.e., random variables used to represent the uncertain log-permeability field) dimension can be very high (e.g., several hundreds) when the correlation length of the log-permeability field is small. However, we observe that dominant output KL modes admit low (one or two) dimensional active subspaces. We observe that the average relative error between the surrogate model and the PDE solution can be controlled under 5% even when a very low-rank representation of the outputs is employed. We also show that the surrogate models can capture statistical properties of the pressure well in different parts of the domain.

In future work, we will investigate extensions of the distributed active subspace method to surrogate modeling of the time-dependent diffusion and convection-diffusion processes in biological and geological flows. We envision that the truncated KL expansion can be used to extract spatiotemporal coherent structures from the physical process, and the low-dimensional active subspaces can then be constructed for the scalar valued KL modes associated with these structures. This surrogate modeling approach will contribute to cost-effective forward uncertainty quantification, and facilitate solution of inverse problems under uncertainty, such as Bayesian inversion of tissue material properties from medical images.

Appendix A Proofs

Proof of Theorem 3.1:

Note that,

[TABLE]

Then,

[TABLE]

where we have used Cauchy–Schwarz inequlity and Lemma 2(a). ∎

Proof of Theorem 3.2:

We have

[TABLE]

Now we consider

[TABLE]

Note that $\mathbb{E}\Bigg{(}\sum_{k=N+1}^{\infty}F_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})^{2}\Bigg{)}=\sum_{k=N+1}^{\infty}\mathbb{E}\big{(}F_{k}({\mathchoice{\mbox{\boldmath$ \displaystyle{\xi} $}}{\mbox{\boldmath$ \textstyle{\xi} $}}{\mbox{\boldmath$ \scriptstyle{\xi} $}}{\mbox{\boldmath$ \scriptscriptstyle{\xi} $}}})^{2}\big{)}=\sum_{k=N+1}^{\infty}\lambda_{k}(C_{f})$ . Therefore, we have our desired result

[TABLE]

∎

Bibliography44

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1(1) Alexanderian, A., Gremaud, P., Smith, R.: Variance-based sensitivity analysis for time-dependent processes. In review (ar Xiv: https://arxiv.org/abs/1711.08030) (2019)
2(2) Alexanderian, A., Reese, W., Smith, R.C., Yu, M.: Model input and output dimension reduction using Karhunen–Loève expansions with application to biotransport. ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems Part B: Mechanical Engineering Accepted (2019). https://ui.adsabs.harvard.edu/#abs/2019 ar Xiv 190306314 A
3(3) Alexanderian, A., Zhu, L., Salloum, M., Ma, R., Yu, M.: Investigation of biotransport in a tumor with uncertain material properties using a non-intrusive spectral uncertainty quantification method. J. Biomech. Eng. pp. 091006–1–091006–11 (2017)
4(4) Babuška, I., Nobile, F., Tempone, R.: A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Journal on Numerical Analysis 45 (3), 1005–1034 (2007)
5(5) Clark, W.H.: Tumour progression and the nature of cancer. Br J Cancer 64 , 631–44 (1991)
6(6) Clark, W.H.: Biphasic finite element model of solute transport for direct infusion into nervous tissue. Annals of Biomedical Engineering 35 , 2145––2158 (2007)
7(7) Cleaves, H.L., Alexanderian, A., Guy, H., Smith, R.C., Yu, M.: Derivative-based global sensitivity analysis for models with high-dimensional inputs and functional outputs. ar Xiv e-prints ar Xiv:1902.04630 (2019)
8(8) Constantine, P.: Active Subspaces: Emerging Ideas in Dimension Reduction for Parameter Studies. SIAM, Philadelphia (2015)