An adaptive stochastic Galerkin tensor train discretization for randomly perturbed domains
Martin Eigel, Manuel Marschall, Michael Multerer

TL;DR
This paper introduces an adaptive stochastic Galerkin method using tensor train formats to efficiently solve PDEs on randomly perturbed domains, with error estimation and refinement capabilities.
Contribution
It develops a novel tensor train-based adaptive Galerkin framework for high-dimensional PDEs on random domains, including an a posteriori error estimator for refinement.
Findings
Efficient handling of high-dimensional randomness via tensor train compression.
Successful numerical benchmarks demonstrating accuracy and adaptivity.
Effective error estimation enabling iterative refinement.
Abstract
A linear PDE problem for randomly perturbed domains is considered in an adaptive Galerkin framework. The perturbation of the domain's boundary is described by a vector valued random field depending on a countable number of random variables in an affine way. The corresponding Karhunen-Lo\`eve expansion is approximated by the pivoted Cholesky decomposition based on a prescribed covariance function. The examined high-dimensional Galerkin system follows from the domain mapping approach, transferring the randomness from the domain to the diffusion coefficient and the forcing. In order to make this computationally feasible, the representation makes use of the modern tensor train format for the implicit compression of the problem. Moreover, an a posteriori error estimator is presented, which allows for the problem-dependent iterative refinement of all discretization parameters and the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Abstract.
A linear PDE problem for randomly perturbed domains is considered in an adaptive Galerkin framework. The perturbation of the domain’s boundary is described by a vector valued random field depending on a countable number of random variables in an affine way. The corresponding Karhunen-Loève expansion is approximated by the pivoted Cholesky decomposition based on a prescribed covariance function. The examined high-dimensional Galerkin system follows from the domain mapping approach, transferring the randomness from the domain to the diffusion coefficient and the forcing. In order to make this computationally feasible, the representation makes use of the modern tensor train format for the implicit compression of the problem. Moreover, an a posteriori error estimator is presented, which allows for the problem-dependent iterative refinement of all discretization parameters and the assessment of the achieved error reduction. The proposed approach is demonstrated in numerical benchmark problems.
Key words and phrases:
P
artial differential equations with random coefficients, random domain, tensor train, uncertainty quantification, stochastic finite element methods, adaptive methods, ALS, low-rank, reduced basis methods
{AMS}
35R60, 47B80, 60H35, 65C20, 65N12, 65N22, 65J10
1. Introduction
Uncertainties in the data for mathematical models are found naturally when dealing with real-world applications in science and engineering. Being able to quantify such uncertainties can greatly improve the relevance and reliability of computer simulations and moreover provide valuable insights into statistical properties of quantities of interest (QoI). This is one of the main motivations for the thriving field of Uncertainty Quantification (UQ).
In the application considered in this work, the computational domain is assumed as randomly perturbed. This e.g. can be an appropriate model to incorporate production tolerances into simulations and extract statistical information about how such uncertainties get transported through the assumed model. Random domain problems have been examined before, see for instance [31, 2, 20]. Often, sampling approaches are used to evaluate QoI as e.g. has been investigated with a multilevel quadrature for the the domain mapping method in [20]. As an alternative, we propose to employ a stochastic Galerkin FEM (SGFEM) to obtain a functional representation of the stochastic solution on the reference domain, which can then be used to evaluate statistical quantities. For the discretization, a Legendre polynomial chaos basis and first order FE are chosen. The expansion of the perturbation vector field in a (finite) countable sequence of random variables gives rise to a high-dimensional coupled algebraic system, which easily becomes intractable to numerical methods or results in very slow convergence. A way to overcome this problem is to utilize model order reduction techniques. In this work, we make use of the modern tensor train (TT) format [26], which provides an efficient hierarchical tensor representation and is able to exploit low-rank properties of the problem at hand. Another important technique to reduce computational cost is the use of an adaptive discretization. In our case, this is based on a reliable a posteriori error estimator, afforded by the quasi-orthogonal approximation obtained by the SGFEM. With the described error estimator, an iterative adaptive selection of optimal discretization parameters (steering mesh refinement, anisotropic polynomial chaos and tensor ranks) is possible.
For the Karhunen-Loève expansion of the random vector field, we employ the pivoted Cholesky decomposition derived in [18, 19]. The random coefficient and right-hand side which arise due to the integral transformation are tackled with a tensor reconstruction method. All evaluations are carried out in the TT format, which in particular allows for the efficient computation of the error estimator as part of the adaptive algorithm.
The paper is structured as follows: The next section introduces the setting and the required assumptions of the random linear model problem. In particular, a description of the perturbation vector field and the variable transformation is given, converting the random domain problem to a problem with random coefficient and forcing. Section 3 defines the Galerkin finite element discretization of the random coefficient problem in Legendre chaos polynomials. Moreover, the framework for residual based a posteriori error estimation is described. The tensor train format used for the efficient computation of the problem is introduced in Section 4. Section 5 lays out the refinement strategy for the Galerkin method, which is based on the evaluation of a reliable a posteriori error estimate in the tensor representation and an appropriate adaptive algorithm. Numerical examples are discussed in Section 6.
2. Diffusion problems on random domains
In this section, we formulate the stationary diffusion problem on random domains as introduced in [20]. Let denote a complete and separable probability space with -algebra and probability measure . Here, complete means that contains all -null sets. Moreover, for a given Banach space , we introduce the Lebesgue-Bochner space , , which consists of all equivalence classes of strongly measurable functions with bounded norm
[TABLE]
Note that for and a separable Hilbert space, is isomorphic to the tensor product space . We henceforth neglect the dependence on the -algebra to simplify the notation. For an exposition of Lebesgue-Bochner spaces we refer to [21].
In this article, we are interested in computing quantities of interest of the solution to the elliptic diffusion problem
[TABLE]
for -almost every . Note that, the randomness is carried by the open and bounded Lipschitz domain . It is also possible to consider non-trivial diffusion coefficients or boundary data, see e.g. [13] for the treatment of non-homogenous Dirichlet data and [25] for random diffusion coefficients. However, we emphasize that, in order to derive regularity results that allow for the data sparse approximation of quantities of interest, the data have to be analytic functions, cf. [20].
In order to guarantee the well posedness of (1), we assume that all data, i.e. the loading and a possible non-trivial diffusion coefficient, are defined with respect to the hold-all domain
[TABLE]
For the modelling of random domains, we employ the concept of random vector fields. To that end, we assume that there exists a reference domain for with Lipschitz continuous boundary and a random vector field
[TABLE]
such that . In addition, we require that is a uniform -diffeomorphism, i.e. there exists a constant such that
[TABLE]
In particular, since {{\boldsymbol{V}}}\in L^{\infty}\big{(}\Omega;C^{1}(\overline{D_{\operatorname{ref}}})\big{)}\subset L^{2}\big{(}\Omega;C^{1}(\overline{D_{\operatorname{ref}}})\big{)}, the random vector field exhibits a Karhunen-Loève expansion of the form
[TABLE]
Herein, the expectation is given in terms of the Bochner integral
[TABLE]
Note that here and henceforth, we denote as material coordinates, in contrast to spatial coordinates . In particular, there holds for some . The anisotropy, which is induced by the spatial contributions , describing the fluctuations around the nominal value , is encoded by
[TABLE]
In our model, we shall also make the following common assumptions. {assumption}
- (i)
The random variables take values in .
- (ii)
The random variables are independent and identically distributed.
- (iii)
The sequence is at least in .
In view of this assumption, the Karhunen-Loève expansion (3) can always be computed if the expectation and the matrix-valued covariance function
[TABLE]
are known. Herein,
[TABLE]
denotes the centered random vector field. The Karhunen-Loève expansion is based on the spectral decomposition of the integral operator associated to the covariance function, which can be computed efficiently by means of the pivoted Cholesky decomposition, if the covariance function is sufficiently smooth, cf. [18, 19].
By an appropriate reparametrization, we can always guarantee that
[TABLE]
Moreover, if we identify the random variables by their image , we end up with the representation
[TABLE]
For later reference, we also introduce the push-forward measure on , which will be assumed as a tensor product measure , where is a probability measure on .
The Jacobian of with respect to the material coordinate is given by
[TABLE]
Introducing the parametric domains , i.e.
[TABLE]
we may now introduce the model problem transported to the reference domain which reads for every :
[TABLE]
Herein, we have
[TABLE]
and
[TABLE]
{remark}
The uniformity condition in (2) implies that the functional determinant in (7) is either uniformly positive or negative, see [20] for the details. We shall assume without loss of generality and hence , i.e. we may just drop the modulus. More precisely, due to (2), we can bound the determinant according to
[TABLE]
for every and almost every . In addition, all singular values of are bounded from below by and from above by . From this, we obtain the bound
[TABLE]
for every and almost every . Hence, the transported model problem is uniformly elliptic.
We conclude this section by summarizing the regularity results for , cp. (6), with respect to the parameter from [20]. For this, denote by the set of finitely supported multi-indices
[TABLE]
{theorem}
Let the right-hand side from (1) satisfy
[TABLE]
for some constants and . Then, for every it holds
[TABLE]
for some constants , which depend on but are independent of the multi-index .
3. Adaptive Galerkin discretisation
In this section we describe the Galerkin discretization of the considered random PDE (6) in a finite dimensional subspace . Determined by the elliptic problem type with homogeneous boundary condition, we assume is discretized by a first order Lagrange FE basis on a mesh representing . Moreover, the randomness is modelled in a truncated version of and represented by Legendre chaos polynomials orthonormal with respect to the joint probability measure associated with the parameter . Consequently, with norm is spanned by the respective tensor basis. Moreover, the residual based a posteriori error estimator of [5, 6] is recalled for the problem at hand.
For efficient computations of the Galerkin projection and the error estimator, the resulting system with inhomogeneous coefficient and right-hand side (7) is represented in the tensor train format as presented in Section 4.
3.1. Parametric and deterministic discretization
To determine a multivariate polynomial basis of , we first define the full tensor index set of order and maximal degree by
[TABLE]
For any such subset , we define . Let denote a basis of , orthogonal with respect to the Lebesgue measure, consisting of Legendre polynomials of degree n\in\mathbb{N}_{0}\ on . Moreover, to obtain a finite dimensional setting, we define the truncated parameter domain and probability measure . By tensorization of the univariate polynomials, an orthogonal basis of is obtained. Then, for any multi-index , the tensor product polynomial in is expressed as the finite product
[TABLE]
Assuming that , after suitable rescaling we can consider as an orthonormal basis of , where denotes the Lebesgue measure and hence is the uniform measure on , see [28].
A discrete subspace of is given by the conforming finite element space of degree on some simplicial regular mesh of the domain with the set of faces (i.e. edges for ) and basis functions . For a convenient presentation, we denote the piecewise constant basis functions of by , where is the number of elements in . In order to circumvent complications due to an inexact approximation of boundary values, we assume that is a polytope. By denoting the space of piecewise polynomials of degree on , the assumed FE discretization with Lagrange elements then satisfies . For any element and face , we set the entity sizes and . Let denote the exterior unit normal on any face . The jump of some on in normal direction is then defined by
[TABLE]
By and we denote the element and facet patches defined by the union of all elements which share at least a vertex with or , respectively. Consequently, the Clément interpolation operator satisfies, respectively for and ,
[TABLE]
where the seminorms and are the restrictions of to and ,
The fully discrete approximation space subject to some mesh with FE order and active set with is given by
[TABLE]
and it holds . We define a tensor product interpolation operator for by setting
[TABLE]
For and all , it holds
[TABLE]
3.2. Random field discretisation
In this paragraph, we highlight the special structure of the random field discretization. We aim at an efficient way to discretize the transformed and parametrized random fields (7) in terms of the piecewise constant finite element functions , pointwise for every .
In [25], it has been shown how the random vector field (5) can efficiently be approximated by means of finite elements. This results in a truncated representation with terms of the form
[TABLE]
where denotes the canonical basis of , is a basis for and are the coefficients in the basis representation of . The length of this expansion depends on the desired approximation error of the random field, which can be rigorously controlled in terms of operator traces, see [19, 29].
For the corresponding Jacobian, we obtain
[TABLE]
More explicitly, the Jacobians are given by
[TABLE]
Since , , are piecewise constant functions, we can represent in an element based fashion according to
[TABLE]
where denotes the characteristic function on the element and are the corresponding coefficients. Hence, we end up with a piecewise constant representation of , which reads
[TABLE]
From this representation, it is straightforward to calculate for a given , also in an element based fashion. Having , , at our disposal, it is then easy to evaluate and , as well.
This procedure can be extended to the general case of order ansatz functions for the random vector field (5), resulting in an order approximation of the desired quantities in (7).
3.3. Variational formulation
Using the transformation in (7), the weak formulation of the model problem (6) reads: find , such that for all there holds
[TABLE]
This characterizes the operator , which gives rise to the energy norm . Employing the finite dimensional spaces of the previous section leads to the discrete weak problem: find , such that for all and
[TABLE]
Here, we define the discrete linear operator
[TABLE]
and the discrete right-hand side
[TABLE]
3.4. Residual based a posteriori error estimates
In the following, we recall the residual based a posteriori error estimator derived in [5, 6], adopted for the problem at hand. An efficient reformulation in the tensor train format is postponed to Section 4. The basis for the estimator is the residual with respect to some and the solution of (6) given by
[TABLE]
It has an -convergent expansion in given by
[TABLE]
with coefficients characterized by
[TABLE]
Here, , and denote the coefficients in the Legendre chaos expansion of , and and
[TABLE]
is the -relevant triple product tuple set.
We recall a central theorem from [5], which enables the derivation of an error bound based on an approximation of the Galerkin projection of the solution in the energy norm. {theorem}
Let be a closed subspace and , and let denote the Galerkin projection of onto . Then, for some , it holds
[TABLE]
{remark}
The constant is related to the Clément interpolation operator in and stems from the spectral equivalence such that . We refer to [5] for further details.
{remark}
We henceforth assume that the data and are exactly expanded in a finite set with , i.e. with the approximation, there is no significant contribution from the neglected modes . The residual can then be split into approximation and truncation contributions
[TABLE]
where denotes the restriction of the expansion to the set . Computable upper bounds for the two residual terms and the algebraic error are recalled in the following.
For any discrete , we define the following error estimators in analogy to the presentation in [5, 6] and [8]:
- •
A deterministic residual estimator for steering the adaptivity of the mesh is given by
[TABLE]
with volume contribution for any
[TABLE]
and facet contribution for any
[TABLE]
- •
The stochastic truncation error estimator stems from splitting the residual in (18), while considering the inactive part over . It is possible to construct the estimator, as in the deterministic case, for every element of the triangulation and consider different mesh discretisations for every stochastic multi-index. Since we want to focus on a closed formulation and avoid technical details, the stochastic estimator is formulated on the whole domain . Nevertheless, for more insight we introduce a collection of suitable tensor sets, which indicate the error portion of every active stochastic dimension (in fact, we could even consider ),
[TABLE]
Then, for every , the stochastic tail estimator on is given by
[TABLE]
where we define for every multi index the residual portion
[TABLE]
The collection of sets is beneficial in the adaptive refinement procedure but it does not cover the whole stochastic contributions of the residual. For this, we need to compute the global stochastic tail estimator over
[TABLE]
which incorporates an infinite sum that becomes finite due to remark 3.4.
- •
The algebraic error denotes the distance of to the best approximation . In particular, this distance can e.g. occur due to an early termination of an iterative solver or an restriction to another solution manifold . This error can be bounded by
[TABLE]
where
[TABLE]
Here, denotes the coefficient tensor of , is the discrete operator from (16) and is the Frobenius norm. Note that the rank-1 operator is a base change operator to orthonormalize the physical basis functions, i.e.
[TABLE]
The combination of these estimators in the context of Theorem 3.4 yields an overall bound for the energy error similar to the references [5, 6, 10] and [8] {corollary}
For any , the solution of the model problem (1) and the Galerkin approximation in (15), there exists constants such that it holds
[TABLE]
{remark}
Observing the residual in (17) it becomes clear that the derived error estimators suffer from the “curse of dimensionality” and are hence not computable for larger problems. However, the hierarchical low-rank tensor representation introduced in the next section alleviates this substantial obstacle and makes possible the adaptive algorithms described in Section 5.
4. Tensor trains
The inherent tensor structure of the involved Bochner function space and the corresponding finite dimensional analogue motivates the use of hierarchical tensor formats which aim at an implicit model order reduction, effectively breaking the curse of dimensionality in case of low-rank approximability of operator and solution.
A representative can be written as
[TABLE]
where is a high dimensional tensor containing for example the projection coefficients
[TABLE]
Setting , the storage cost of is , which grows exponentially with the number of dimensions in the stochastic parameter space. To alleviate this major problem for numerical methods, we impose a low-rank assumption on the involved objects and introduce a popular tensor format as follows.
A tensor is called in tensor train (TT) format if every entry can be represented as the result of a matrix-vector multiplication of the form
[TABLE]
To simplify notation, set . If the vector is minimal in some sense, we call the TT-rank and (30) is the TT-decomposition of . It can be observed that the complexity of now depends only linearly on the number of dimensions, namely . In [27, 24] it was shown that many functions in numerical applications admit a low-rank representation.
Given the full tensor description of , one could compute the tensor train representation by a hierarchical singular value decomposition (HSVD) as described in [14]. However, this is usually unfeasible due to the high dimensionality of or because it is known only implicitly. In that case, the utilization of high dimensional interpolation or regression algorithm is advisible, see e.g. [26, 15].
In this work, we rely on a TT reconstruction approach and employ it to obtain the representation of the transformed coefficient function and the right-hand side (7). Opposite to an explicit (intrusive) discretisation of the linear system in tensor format as e.g. carried out in [8, 23], the reconstruction method relies on a set of random samples of the solution. The non-intrusive algorithm used in the numerical experiments is described in [9]. Similar ideas were presented in [26, 3, 4], where a tensor cross approximation was used for the construction of the algebraic system. In contrast to the tensor reconstruction, a selective sampling of strides in the tensor has to be available to perform a cross approximation. Consider [16] for a survey on the topic of low-rank approximations methods.
To sketch the reconstruction approach, we assume a set of parameter realisations and corresponding measurements of a function
[TABLE]
Recall that is the dimension of the finite element space. We define a linear measurement operator acting on a tensor by
[TABLE]
with a contraction over the stochastic modes and
[TABLE]
The reconstruction problem is to find a tensor with minimal TT-rank such that . Details in particular of the numerical solution algorithm of the optimisation problem by an Alternating Steepest Descent (ASD) can be found in [9].
4.1. Galerkin discretization in tensor train format
In the following, we assume an acessible tensor representations of the right-hand side and the coefficient function . To make this more precise, we denote the low-rank approximations of e.g. in (7) by
[TABLE]
where admits a TT representation of rank and is a tensor multi-index set with local dimension cap . Analogously, every component of the symmetric matrix coefficient
[TABLE]
is approximated by , as in (31) with TT-ranks . Here, the order three component tensors in the TT-representation of the approximated matrix entry are denoted by .
{remark}
Since, for the coefficient, the TT reconstruction is carried out for every matrix entry in (32), the local dimensions and tensor ranks can vary among those tensor trains. Here, we assume that every approximation has the same local dimensions and the tensor multi-index set covering those indices is denoted by , possibly different from (but larger than) the solution active set . As stated in [8], it is beneficial (and in fact necessary) to chose such that for all also . Due to the orthogonality of the polynomial basis , this feature guarantees a well-posed discrete problem since additional approximations are avoided and enables quasi-optimal convergence rates of the Galerkin method.
On , the Galerkin operator resulting from the transformed weak problem in TT format is given as the sum of TT operators such that for all , and ,
[TABLE]
each corresponding to one addend of the resulting matrix-vector product in (16).
In the following, we illustrate the explicit construction of the TT operator for the term . By denoting the -th component of the gradient of a function , for the first low-rank approximated bilinear form addend one obtains
[TABLE]
Using the multi-linear structure of , one can write as
[TABLE]
where the first component tensor depends on the physical discretization in piecewise constant FE functions only, i.e.,
[TABLE]
The remaining tensor operator parts decompose into one dimensional integrals over triple products of orthogonal polynomials of the form
[TABLE]
The evaluation is known explicitly thanks to the recursion formula for orthogonal polynomials, cf. [1, 11].
{remark}
Due to the sum of TT operators in (33), the result can be represented by a tensor with TT-rank: .
With the TT approximations of the data and , we replace the original system of equations that have to be solved for , namely
[TABLE]
with a constrained minimization problem on the low-rank manifold containing all tensor trains of dimensionality represented by and fixed rank ,
[TABLE]
Here, we take and as the TT approximations of and , respectively and is the Frobenius norm.
To solve (39), we chose a preconditioned alternating least squares (ALS) algorithm as described in [22, 10]. This eventually results in an approximation of the Galerkin solution of (15)
[TABLE]
where is a place-holder for the inscribed parameters of the numerical algorithm and is the desired and predefined TT-rank of .
5. Adaptive algorithm
The error estimator of Section 3.4 is formulated in a computable TT representation in Section 5.1. It gives rise to an adaptive algorithm, which refines the spatial discretization, the anisotropic stochastic polynomial set and the representation format iteratively based on local error estimators and indicators. This enables the assessment of the development of the actual (unknown) error . The inherently high computational cost of the error estimators can be overcome by means of the tensor train formalism. In what follows, we examine the efficient computation of the individual error estimator components in the TT format and describe the marking and refinement procedure. For more details and a more general framework, we refer to the presentations in [5, 6, 10, 8].
5.1. Efficient computation of error estimators
We illustrate the efficient computation of the deterministic error estimator . For each element of the triangulation, the error estimator is given by (20). Due to the sum over it suffers from the curse of dimensionality. However, employing the low-rank approximation , and renders the computation feasible. To make this more explicit, recall that
[TABLE]
This is evaluated by expansion of the inner product,
[TABLE]
The first term is a simple inner product of a functional tensor train. It reduces to a simple summation over the tensor components due to the orthonormality of the polynomial basis, i.e.,
[TABLE]
whereas the high-dimensional sum can be evaluated for every tensor dimension in parallel using, for all , that
[TABLE]
Note that the iterated sum over the tensor ranks has to be interpreted as matrix-vector multiplications. Hence, the formula above can be evaluated highly efficiently. In fact, if the employed TT format utilizes a component orthogonalization and is left-orthogonal, the product can be neglected and one only has to sum over and .
For the remaining terms in (42), one has to find a suitable representation of . Since the gradient is a linear operator, one can calculate a tensor representation of this product explicitly, involving multiplied ranks and doubled polynomial degrees. For a detailed construction we refer to [8, Section 5]. The matrix-vector multiplication due to entry-wise TT representation of does not impose any further difficulties but a slight increase in complexity since one needs to cope with a sum of individual parts. Eventually, the mixed and operator terms are computed in the same fashion, using the same arguments as for (45).
5.2. Fully adaptive algorithm
Given an initial configuration consisting of a regular mesh , a finite active tensor multi-index set , a (possibly random) start tensor with TT-rank and solver parameter , consisting e.g. of a termination threshold, rounding parameter, iteration limit or precision arguments, we now present the adaptive refinement procedure summarized in Algorithm 11.
On every level, we generate an approximation of the data and by a tensor reconstruction. The procedure is e.g. described in [9] and referred to as
[TABLE]
where the multi-index set can be chosen arbitrarily, but it is advisable to consider Remark 4.1. The number of samples can be related e.g. to Monte Carlo samples or more structured quadrature techniques such as Quasi Monte Carlo and sparse grid points. In what follows we assume that the obtained approximations become sufficiently accurate.
The procedure for obtaining a numerical approximation is denoted by
[TABLE]
The used preconditioned ALS algorithm is only exemplary to obtain . Alternative alternating methods or Riemannian algorithms are feasible as well.
To obtain the overall estimator , one has to evaluate the individual components by the following methods
[TABLE]
A weighted balancing of the global estimator values and results in the marking and refinement decision.
5.2.1. Deterministic refinement
In case of a dominant deterministic error estimator , one employs a Dörfler marking strategy on the mesh for a ratio constant . In abuse of notation, we use as the local error estimator on every triangle, where the jump components of are distributed among their nearby elements. The method, consisting of the marking process and the conforming refinement of the marked triangles is covered by
[TABLE]
5.2.2. Stochastic refinement
In case of a dominant stochastic error estimator , we apply a Dörfler marking on the set of local estimators until the prescribed ratio is reached. The marked dimensions in are increased by the method
[TABLE]
{remark}
As stated in Section 3.4, the global estimator is not just the sum of the individual estimators since the coupling structure is more involved. Hence, we use in the marking procedure. Due to the high regularity of the solution (Theorem 2), for large enough, one has . Note that in the finite dimensional noise case, we have for .
5.2.3. Representation refinement
In the end, if has the largest contribution in the error estimator we improve the accuracy of the iterative solver. For simplicity, we fix most of the solver parameter such as the number of alternating iteration or the termination value to low values that can be seen as overcautious. Nevertheless, in the low-rank tensor framework, the model class is restricted by the TT-rank . Hence, we then allow and add a random rank 1 tensor onto the solution tensor . We summarize this approach in
[TABLE]
5.2.4. Adaptive algorithm
One global iteration of this algorithm refines either the deterministic mesh , the active stochastic polynomial index-set or the tensor rank . Iteration until the defined estimator in Corollary 3.4 falls below a desired accuracy yields the adaptively constructed low-rank approximation .
{algorithm}
0: Initial guess with solution coefficient ; solver accuracy ; mesh with degrees ; active index set ;sample size for reconstruction; Dörfler marking parameters and ; desired estimator accuracy .
0: New solution with new solution coefficient ; new mesh , or new index set , or new tolerance .
repeat
[TABLE]
if then
else if then
else
end if
until
return Reconstruction based adaptive stochastic Galerkin method
6. Numerical examples
This section is concerned with the demonstration of the performance of the described Galerkin tensor discretisation and the adaptive algorithm depicted in the preceding section. We consider the linear second order model problem with a constant right-hand side and homogeneous Dirichlet boundary conditions
[TABLE]
on two different reference domains in , namely the unit circle and the L-shape. The Karhunen-Loève expansion of the random vector field stems from a Gaussian covariance kernel of the form
[TABLE]
The random variables in the Karhunen-Loève expansion are assumed to be independent and uniformly distributed on , i.e. they have normalized variance. Moreover, the mean is given by the identity, i.e. .
The computed spectral decomposition is truncated at a given threshold , which takes different values in the computational examples. Table 1 summarizes how the choice of the truncation parameter affects the number of involved stochastic dimensions.
We are interested in the correct approximation of the solution mean
[TABLE]
and solution variance
[TABLE]
by means of the adaptive low-rank Galerkin approximation. In order to verify this, all experiments involve the computation of a reference mean and variance, based on a sampling approach. To that end, we employ the anisotropic sparse grid quadrature with Gauss-Legendre points111The implementation can be found online: https://github.com/muchip/SPQR, as described in [17]. The corresponding moments are then calculated on a fine reference mesh, resulting from uniform refinement of the last, adaptively computed, mesh , having at least degrees of freedom. All experiments involve linear finite element spaces, i.e. with . The number of quadrature points is chosen differently for the problems at hand. For we take adaptively chosen nodes. Benchmarking the resulting mean from the sparse quadrature for this choice of samples against an approximation with additional nodes does not significantly improve the approximation quality (data not shown). The same arguments apply for and nodes, as well as for and nodes. We denote the reference mean as and the reference variance as .
{remark}
In the low-rank tensor train format, the mean of a function, given in orthonormal polynomials, is computed highly efficiently, since the set of employed polynomials is orthonormal with respect to the constant function. Since, the corresponding coefficient is already incorporated in the representation, computing the mean is a simple tensor evaluation. More precisely, given we compute
[TABLE]
Here, the evaluation of the tensor train at the multi-index consists of matrix-vector multiplications.
Similarly for the variance, we can compute the second moment as
[TABLE]
This computation reduces even further, since in the tensor train setting of , with and for it is common to impose left-orthogonality, i.e. . Hence, the second moment reads
[TABLE]
where it is advisable to not evaluate the matrix-matrix product over , since the resulting matrix is usually not sparse and exhibits available memory resources.
The considered quantity of interest is the error of the mean and variance to the corresponding reference. Therefore, for any TT approximation using remark 6, we compute the relative error of the mean in norm
[TABLE]
and the relative error of the variance in norm
[TABLE]
Furthermore, we evaluate the rate of convergence of for the adaptive and uniform refinement, as well as the rate of the estimator . This is done by fitting the function to the individual values, with respect to the number of degrees of freedom in the physical mesh. We denote the corresponding rates , and respectively.
The employed tensor reconstruction algorithm is implemented in the open-source library xerus [30]. Every such approximated tensor is constructed on a set of samples as in the computation of the reference mean and polynomial degrees that are determined by the solution approximation as described in remark 4.1. In the considered examples, the tensor train solution employs constant and linear polynomials in all dimensions only. This behaviour is based on the stochastic estimator and reasoned in the complexity of the mesh approximation. For the assembling of the physical part of the bilinear and linear form and the evaluation of sample solutions we make use of the PDE library FEniCS [12]. The entire adaptive stochastic Galerkin method is implemented in the open source framework ALEA [7].
6.1. Example 1
The first example is the random domain problem on the unit circle. We use this problem as a reference, since the adaptive refinement is expected to yield similar results to uniform mesh refinement. Starting with an initial configuration of cells, fixed polynomial degree in the stochastic space of and tensor rank , the described adaptive Galerkin FE algorithm yields the adaptively refined mesh depicted in Figure 2.
For illustration purposes, we show the mean and variance of the solution on the unit disc together with realisations of the transformed reference domain for the adapted discretisation in Figures 4 and 5. In Table 3 we show the corresponding rates of convergence and the minimal reached error and . The maximal involved tensor ranks are displayed in column . The degrees of freedom are shown for the physical mesh in column m-dofs and for the tensor train itself in column tt-dofs. Note that the tensor train degrees of freedom refer to the dimension of the corresponding low-rank manifold. As expected for the unit circle, the adaptive refinement does not improve the already optimal convergence rate.
6.2. Example 2
For the second example, we chose the L-shaped domain . The corner singularity is a typical example where adaptive refinement yields better approximation rates with respect to degrees of freedom than a uniform refinement.
Starting with an initial configuration of cells, fixed polynomial degree in the stochastic space of and tensor rank , the described adaptive Galerkin FE algorithm yields the adaptively refined mesh displayed in Figure 7. Again, we show the approximated mean and variance together with some random realizations of the transformed reference domain in Figures 6 and 8.
The obtained rates of convergence and error quantities are shown in Table 9. Fortunately, the obtained rate for the estimator follows the error rate , in contrast to the uniform refinement strategy, where a slightly slower convergence is achieved.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables . Applied mathematics series. Dover Publications, N. Chemsford, MA, 1964.
- 2[2] J. E. Castrillon-Candas, F. Nobile, and R. F. Tempone. Analytic regularity and collocation approximation for elliptic PD Es with random domain deformations. Computers & Mathematics with Applications , 71(6):1173–1197, 2016.
- 3[3] Sergey Dolgov, Boris N Khoromskij, Alexander Litvinenko, and Hermann G Matthies. Polynomial chaos expansion of random coefficients and the solution of stochastic partial differential equations in the tensor train format. SIAM/ASA Journal on Uncertainty Quantification , 3(1):1109–1135, 2015.
- 4[4] Sergey Dolgov and Robert Scheichl. A hybrid alternating least squares–tt cross algorithm for parametric pdes. ar Xiv preprint ar Xiv:1707.04562 , 2017.
- 5[5] Martin Eigel, Claude Jeffrey Gittelson, Christoph Schwab, and Elmar Zander. Adaptive stochastic Galerkin FEM. Comput. Methods Appl. Mech. Engrg. , 270:247–269, 2014.
- 6[6] Martin Eigel, Claude Jeffrey Gittelson, Christoph Schwab, and Elmar Zander. A convergent adaptive stochastic Galerkin finite element method with quasi-optimal spatial meshes. ESAIM: Mathematical Modelling and Numerical Analysis , 49(5):1367–1398, 2015.
- 7[7] Martin Eigel, Robert Gruhlke, Manuel Marschall, Philipp Trunschke, and Elmar Zander. Alea - a python framework for spectral methods and low-rank approximations in uncertainty quantification.
- 8[8] Martin Eigel, Manuel Marschall, Max Pfeffer, and Reinhold Schneider. Adaptive stochastic galerkin fem for lognormal coefficients in hierarchical tensor representations. ar Xiv preprint ar Xiv:1811.00319 , 2018.
