Large-scale Inversion of Subsurface Flow Using Discrete Adjoint Method

Shu Wang; Satish Karra; Daniel O'Malley

arXiv:1906.01132·physics.comp-ph·June 5, 2019


Open Access

TL;DR

This paper demonstrates the implementation of a parallel discrete adjoint sensitivity analysis method to efficiently solve large-scale subsurface flow inversion problems, significantly reducing computational costs compared to traditional approaches.

Contribution

The paper introduces a parallel implementation of the discrete adjoint method for large-scale subsurface flow inversion, enhancing computational efficiency for high-dimensional parameter spaces.

Findings

01. Efficient sensitivity analysis reduces computational cost.

02. Parallel implementation scales well with problem size.

03. Applicable to realistic heterogeneous subsurface models.

Abstract

Sensitivity analysis plays an important role in searching for constitutive parameters (e.g. permeability) in subsurface flow simulations. The underlying mathematical task is to solve a dynamic constrained optimization problem. Traditional methods such as finite differences and forward sensitivity analysis have a computational cost that increases linearly with the number of parameters times the number of cost functions. Discrete adjoint sensitivity analysis (SA) is gaining popularity due to its computational efficiency. This algorithm requires a forward run followed by a backward run that integrates the adjoint equation backward in time. The forward solve is performed once and snapshots of the solution are stored by checkpointing. Using the checkpoint data, the adjoint equation is then numerically integrated. The computational cost of this algorithm only depends on the number of cost functions and does not depend on the…


Equations

$$\epsilon_{1}\,\mathbf{y}^{T}\mathbf{y}\leq\mathbf{y}^{T}\mathbf{D}(\mathbf{x})\,\mathbf{y}\leq\epsilon_{2}\,\mathbf{y}^{T}\mathbf{y},\quad\mathbf{x}\in\Omega,\ \forall\,\mathbf{y}\in\mathbb{R}^{nd}$$

$$\frac{\partial}{\partial t}(\rho\phi)-\nabla\cdot\left(\frac{\rho k}{\mu}\nabla p\right)=Q_{m},$$

$$\rho\phi\beta\frac{\partial p}{\partial t}-\frac{1}{g}\nabla\cdot(K\nabla p)=Q_{m},$$

$$\dot{u}(\mathbf{x},t)$$

$$u(\mathbf{x},t)$$

$$-\hat{\mathbf{n}}\cdot\mathbf{D}(\mathbf{x})\nabla u(\mathbf{x},t)$$

$$u(\mathbf{x},0)$$

$$\min_{x}$$

$$F(u,p)=0,$$

$$\min_{x}$$

$$F(t,u,\dot{u},p)=0,\quad 0\leq t\leq T$$

$$u(0)=u_{0}(p)$$

$$\text{d}_{p}\Psi_{i}(u,p)=\text{d}_{p}\Phi_{i}(u_{T},p)+\int_{0}^{T}\left[\partial_{u}r_{i}\,\text{d}_{p}u+\partial_{p}r_{i}\right]\text{d}t$$

$$\mathcal{L}_{i}=\Phi_{i}(u_{T},p)+\int_{0}^{T}\left[r_{i}+\nu_{i}^{T}F(t,u,\dot{u},p)\right]\text{d}t+\mu_{i}^{T}\left[u(0)-u_{0}(p)\right],$$

$$\begin{aligned}
\text{d}_{p}\mathcal{L}_{i}&=\text{d}_{p}\Phi_{i}(u_{T},p)+\int_{0}^{T}\left[\partial_{u}r_{i}\,\text{d}_{p}u+\partial_{p}r_{i}+\nu_{i}^{T}\left(\partial_{u}F\,\text{d}_{p}u+\partial_{\dot{u}}F\,\text{d}_{p}\dot{u}+\partial_{p}F\right)\right]\text{d}t+\mu_{i}^{T}\left[\text{d}_{p}u(0)-\partial_{p}u_{0}(p)\right]\\
&=\text{d}_{p}\Phi_{i}(u_{T},p)+\int_{0}^{T}\Big\{\left[\partial_{u}r_{i}+\nu_{i}^{T}\left(\partial_{u}F-\text{d}_{t}\partial_{\dot{u}}F\right)-\dot{\nu}_{i}^{T}\partial_{\dot{u}}F\right]\text{d}_{p}u+\partial_{p}r_{i}+\nu_{i}^{T}\partial_{p}F\Big\}\text{d}t\\
&\quad+\nu_{i}^{T}\partial_{\dot{u}}F\,\text{d}_{p}u\big|_{T}+\left(\mu_{i}^{T}-\nu_{i}^{T}\partial_{\dot{u}}F\right)\big|_{0}\,\text{d}_{p}u(0)-\mu_{i}^{T}\partial_{p}u_{0}(p),
\end{aligned}$$

$$\partial_{u}r_{i}+\nu_{i}^{T}\left(\partial_{u}F-\text{d}_{t}\partial_{\dot{u}}F\right)-\dot{\nu}_{i}^{T}\partial_{\dot{u}}F=0$$

$$\mathcal{M}\dot{u}(t)=f(t,u(t)),\quad u(0)=\beta$$

$$\mathcal{M}u_{n+1}=\mathcal{M}u_{n}+\Delta t\,f(t_{n+1},u_{n+1})$$

$$\mathcal{M}\mathbf{S}_{l,n+1}=\mathcal{M}\mathbf{S}_{l,n}+\Delta t\left(f_{u}(t_{n+1},u_{n+1})\,\mathbf{S}_{l,n+1}+f_{p}(t_{n+1},u_{n+1})\right)$$

$$\Phi=\phi(u(T);p).$$

$$\mathcal{L}=\phi(u_{N})-\nu_{0}^{T}(u_{0}-\beta)-\sum_{n=0}^{N-1}\nu_{n+1}^{T}\left(u_{n+1}-\mathcal{N}(u_{n})\right)$$

$$\frac{\partial\mathcal{L}}{\partial p}=\nu_{0}^{T}\frac{\partial\beta}{\partial p}+\left(\frac{\partial\phi(u_{N})}{\partial u}-\nu_{N}^{T}\right)\frac{\partial u_{N}}{\partial p}-\sum_{n=0}^{N-1}\left(\nu_{n}^{T}-\nu_{n+1}^{T}\frac{\partial\mathcal{N}(u_{n})}{\partial u}\right)\frac{\partial u_{n}}{\partial p}$$

$$\nu_{N}^{T}=\frac{\partial\phi(u_{N})}{\partial u},\quad\nu_{n}^{T}=\nu_{n+1}^{T}\frac{\partial\mathcal{N}(u_{n})}{\partial u},\quad n=N-1,...,0$$

$$\nabla_{p}\phi=\left(\frac{\partial\beta}{\partial p}\right)^{T}\nu_{0}$$

$$\frac{\partial u_{n+1}}{\partial u}=\frac{\partial u_{n}}{\partial u}+\Delta t\,f_{u}(t_{n+1},u_{n+1})\frac{\partial u_{n+1}}{\partial u}=\frac{\partial\mathcal{N}(u_{n})}{\partial u}\frac{\partial u_{n}}{\partial u},$$

$$\nu_{n}^{T}=\nu_{n+1}^{T}+\Delta t\,\nu_{n}^{T}f_{u}(t_{n+1},u_{n+1}).$$

$$\mathcal{U}$$

$$\mathcal{W}$$

$$\mathcal{B}(w;c)=L(w),\quad\forall\,w\in\mathcal{W}$$

$$\mathcal{B}(w(\mathbf{x});c(\mathbf{x},t_{n}))$$

$$L(w(\mathbf{x}))$$

$$w^{T}r(c)\sim\int_{\Omega^{e}}\left[w\cdot\mathcal{F}_{0}(c,\nabla c)+\nabla w\cdot\mathcal{F}_{1}(c,\nabla c)\right]\text{d}\Omega=0,$$

$$r(c)=\mathbf{A}_{e=1}^{Nele}\begin{bmatrix}N^{T}&B^{T}\end{bmatrix}W\begin{bmatrix}\mathcal{F}_{0}(c_{q},\nabla c_{q})\\\mathcal{F}_{1}(c_{q},\nabla c_{q})\end{bmatrix}$$


Taxonomy

Topics: Groundwater flow and contamination studies · Soil and Unsaturated Flow · Advanced Numerical Methods in Computational Mathematics

Full text

Large-scale Inversion of Subsurface Flow using Discrete Adjoint Method

S. Wang (1,2), S. Karra (3,∗) and D. O’Malley (3)

(1) Department of Electrical Engineering, University of New Mexico, Albuquerque, NM 87131

(2) National Security Education Center, Los Alamos National Laboratory, Los Alamos, NM 87545

(3) Computational Earth Science Group, Earth and Environmental Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545

∗Corresponding author, [email protected]

1 Introduction
2 Formulation
2.1 Governing equations for subsurface flow
2.1.1 Maximum principle
2.2 PDE-constrained optimization
2.2.1 Discrete adjoint sensitivity analysis
3 Numerical Implementation
3.0.1 PETSc and TAO
3.0.2 Weak formulation
3.0.3 Parallel finite element assembly
4 Results and Performance
4.1 2D Verification
4.1.1 Heterogeneous diffusion in 2D geometry
4.1.2 Heterogeneous diffusion in a unit cube with spherical holes
5 Parallel Performance
5.1 Strong scaling performance of forward run
5.1.1 Scaling performance of multi-threading
5.1.2 Scaling performance of MPI+OpenMP
5.2 Performance modeling
5.3 Strong scaling model
5.4 Weak scaling model
5.5 Real-world Problem

Abstract

Keywords: subsurface, flow, inversion, adjoint method, sensitivity analysis, parallel, high performance computing.

1. Introduction

Inverse analysis plays a key role in developing realistic models for subsurface hydrogeology. This is largely because state variables such as pressure can be observed, but constitutive parameters such as permeability that are needed to make predictions can only be inferred from observations of the state variables. There is a long history of applying inverse methods in subsurface hydrology [1; 2; 3; 4; 5; 6; 7], often via the geostatistical approach [8; 6; 5; 9; 10] and using a variety of computational techniques such as dimension reduction [11], subspace recycling [12], and even quantum computational methods [13]. Recently, it has become increasingly important to calibrate the model to match large data sets obtained from observing pressure transients at a relatively modest number of wells over a long period of time [14].

As a result of an inverse analysis, subsurface hydrologic modelers can often use the calibrated models to make accurate predictions related to, e.g., the impacts of pumping at one well on the water supply at another well or the fate of contaminants in groundwater [15]. In other cases, the data used to calibrate the model may not be sufficient to use the model in a predictive fashion. When this happens, the calibrated model is often used as a starting point for an uncertainty analysis (e.g., as the starting point in a Markov Chain Monte Carlo or a null space Monte Carlo method [16]). Often these analyses are used to inform decisions (e.g., related to remediating contaminated groundwater [17]).

2. Formulation

Let $\Omega\subset\mathbb{R}^{nd}$ be a bounded open domain, where “nd” is the number of spatial dimensions. The boundary $\partial\Omega=\bar{\Omega}-\Omega$ is assumed to be piecewise smooth and is divided into two parts, $\Gamma^{D}$ and $\Gamma^{N}$, on which Dirichlet and Neumann boundary conditions are prescribed, respectively. For mathematical well-posedness, we assume $\Gamma^{D}\cup\Gamma^{N}=\partial\Omega$ and $\Gamma^{D}\cap\Gamma^{N}=\emptyset$. The unit outward normal to the boundary is denoted by $\hat{\mathbf{n}}$. The permeability tensor is denoted by $\mathbf{D}(\mathbf{x})$, which is assumed to be symmetric, bounded above and uniformly elliptic. That is, there exist two constants $0<\epsilon_{1}\leq\epsilon_{2}<\infty$ such that

$$\epsilon_{1}\,\mathbf{y}^{T}\mathbf{y}\leq\mathbf{y}^{T}\mathbf{D}(\mathbf{x})\,\mathbf{y}\leq\epsilon_{2}\,\mathbf{y}^{T}\mathbf{y},\quad\mathbf{x}\in\Omega,\ \forall\,\mathbf{y}\in\mathbb{R}^{nd}$$

2.1. Governing equations for subsurface flow

The governing equation for subsurface flow is given by

$$\frac{\partial}{\partial t}(\rho\phi)-\nabla\cdot\left(\frac{\rho k}{\mu}\nabla p\right)=Q_{m},$$

where $\phi$ is the porosity (unitless), $\rho$ is the mass density (kg/m$^{3}$), $\mu$ is the dynamic viscosity (Pa·s), $k$ is the permeability (m$^{2}$), $p$ is the pressure (Pa), and $Q_{m}$ is the mass source rate per unit volume (kg/m$^{3}$/s). Assuming $\phi$ is constant and that the spatial gradient of density is small, the above equation reduces to

$$\rho\phi\beta\frac{\partial p}{\partial t}-\frac{1}{g}\nabla\cdot(K\nabla p)=Q_{m},$$

where $\beta$ is the water compressibility ($\frac{1}{\rho}\frac{\partial\rho}{\partial p}$, Pa$^{-1}$) and $K$ is the hydraulic conductivity (m/s). Here permeability is connected to hydraulic conductivity by $k=K\mu/(g\rho)$, and $g$ is the acceleration due to gravity (m/s$^{2}$). We shall denote the pressure field by $c(\mathbf{x})$. Let us consider transient flow in heterogeneous porous media governed by the following diffusion equation and boundary/initial conditions

[TABLE]

where $b(\mathbf{x},t)$ is the volumetric source or sink, and $u^{p}(\mathbf{x},t)$ and $q^{p}(\mathbf{x},t)$ are the prescribed pressure and flux, respectively. $\mathbf{D}(\mathbf{x})$ is the scaled diffusivity, $\mathbf{D}(\mathbf{x})=K(\mathbf{x})/(g\rho\phi\beta)$. For uniqueness, we assume $\Gamma^{D}$ is not empty. This initial-boundary-value problem (IBVP) is a second-order parabolic partial differential equation (PDE). Let $L=\frac{\partial}{\partial t}-\nabla\cdot[\mathbf{D}(\mathbf{x})\nabla]$ denote the operator in Eq. 2.4. It is worthwhile to point out that the adjoint operator is $L^{*}=-\frac{\partial}{\partial t}-\nabla\cdot[\mathbf{D}(\mathbf{x})\nabla]$, in which time runs backwards. The adjoint problem to Eq. 2.4 involves adjoint boundary conditions, which are often non-trivial to formulate, especially when irregular boundary configurations are involved.

2.1.1. Maximum principle

The maximum principle of a transient diffusion equation asserts that the maximum can occur only on the boundary of the domain or in the initial condition if $b(\mathbf{x},t)\leq 0$ and $\Gamma^{D}=\partial\Omega$. Mathematically, a solution to equations (2.18a)–(2.18a) will satisfy:

2.2. PDE-constrained optimization

Determining the parameters of a partial differential equation (PDE) model is often formulated as a PDE-constrained optimization problem in which the field values are required to match observations. This is also referred to as an inverse problem. Such problems take the form,

[TABLE]

where $u$, $p$, $J(u,p)$ and $F(u,p)$ are the field values, the parameters in the PDE, the objective function and the PDE-induced constraints, respectively. From an optimization point of view, $u$ is required to be feasible at every step in $p$ as $J(u,p)$ converges to a minimizer. A capable optimization solver for Eq. 2.2 should: 1) be able to solve $F(u,p)=0$ (the PDE or forward problem solver); 2) evaluate $J(u,p)$; 3) provide the gradient $\text{d}_{p}J$. Among these problems, time-dependent ones attract wide attention because the forward problem is often treated by the method of lines, which yields a system of ODEs. The adjoint equation of the problem is also an ODE, which means that both can be solved by the same standard ODE integrators. The adjoint method for a time-dependent problem takes the form,

[TABLE]

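The $i$th total derivative (gradient) is denoted as,

$$\text{d}_{p}\Psi_{i}(u,p)=\text{d}_{p}\Phi_{i}(u_{T},p)+\int_{0}^{T}\left[\partial_{u}r_{i}\,\text{d}_{p}u+\partial_{p}r_{i}\right]\text{d}t$$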

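The corresponding Lagrangian of Eq. 2.5 can be written as

$$\mathcal{L}_{i}=\Phi_{i}(u_{T},p)+\int_{0}^{T}\left[r_{i}+\nu_{i}^{T}F(t,u,\dot{u},p)\right]\text{d}t+\mu_{i}^{T}\left[u(0)-u_{0}(p)\right],$$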

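where $\nu_{i}$ and $\mu_{i}$ are vectors of Lagrange multipliers that are functions of time. They are also called adjoint vectors. Since only equality constraints are involved, we are free to set the values of $\nu_{i}$ and $\mu_{i}$. Noting that $\text{d}_{p}\mathcal{L}_{i}=\text{d}_{p}\Psi_{i}$, the total derivative is,

$$\begin{aligned}
\text{d}_{p}\mathcal{L}_{i}&=\text{d}_{p}\Phi_{i}(u_{T},p)+\int_{0}^{T}\left[\partial_{u}r_{i}\,\text{d}_{p}u+\partial_{p}r_{i}+\nu_{i}^{T}\left(\partial_{u}F\,\text{d}_{p}u+\partial_{\dot{u}}F\,\text{d}_{p}\dot{u}+\partial_{p}F\right)\right]\text{d}t+\mu_{i}^{T}\left[\text{d}_{p}u(0)-\partial_{p}u_{0}(p)\right]\\
&=\text{d}_{p}\Phi_{i}(u_{T},p)+\int_{0}^{T}\Big\{\left[\partial_{u}r_{i}+\nu_{i}^{T}\left(\partial_{u}F-\text{d}_{t}\partial_{\dot{u}}F\right)-\dot{\nu}_{i}^{T}\partial_{\dot{u}}F\right]\text{d}_{p}u+\partial_{p}r_{i}+\nu_{i}^{T}\partial_{p}F\Big\}\text{d}t\\
&\quad+\nu_{i}^{T}\partial_{\dot{u}}F\,\text{d}_{p}u\big|_{T}+\left(\mu_{i}^{T}-\nu_{i}^{T}\partial_{\dot{u}}F\right)\big|_{0}\,\text{d}_{p}u(0)-\mu_{i}^{T}\partial_{p}u_{0}(p),
\end{aligned}$$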

where integration by parts is used. The term $\text{d}_{p}u|_{T}$ is non-trivial to obtain, thus we set $\nu_{i}|_{T}=0$ to make the whole term vanish. By setting $\mu_{i}^{T}\rvert_{0}=\nu_{i}^{T}\partial_{\dot{u}}F\rvert_{0}$, evaluation of the term $\text{d}_{p}u(0)$ is avoided. Likewise, we can avoid computing $\text{d}_{p}u$ for all $t>0$ by setting

$$\partial_{u}r_{i}+\nu_{i}^{T}\left(\partial_{u}F-\text{d}_{t}\partial_{\dot{u}}F\right)-\dot{\nu}_{i}^{T}\partial_{\dot{u}}F=0$$

The following algorithm describes how $\text{d}_{p}\Psi_{i}$ is computed:

The output $\text{d}_{p}\Psi_{i}$ is the Jacobian matrix, which gives the sensitivity with respect to $p$. Only one forward run and one adjoint (backward) run per cost function are needed to obtain it. By comparison, a finite-difference-based approach needs $\text{dim}(p)$ forward runs. The advantage becomes significant when $\text{dim}(p)\gg n_{\text{obj}}$.

2.2.1. Discrete adjoint sensitivity analysis

There are several ways to solve Eq. 2.4; here the method of lines is adopted, which results in a system of ordinary differential equations (ODEs),

$$\mathcal{M}\dot{u}(t)=f(t,u(t)),\quad u(0)=\beta$$

where $u(t)$ is the spatial discretization of the flow field $u(\mathbf{x},t)$ and $\mathcal{M}$ is the mass matrix, which is usually symmetric positive definite. Here $\mathcal{M}$ is assumed to be the identity for brevity of notation. The right-hand side $f(t,u)$ involves the contribution from the model parameters (the permeability distribution $\mathbf{D}(\mathbf{x})$). Let us consider the simplest time integration scheme, backward Euler, for Eq. 2.10:

$$\mathcal{M}u_{n+1}=\mathcal{M}u_{n}+\Delta t\,f(t_{n+1},u_{n+1})$$

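Now define the sensitivity variable as $\mathbf{S}_{l,n}=\partial u_{n}/\partial p_{l}$, where $p_{l}$ is the $l$th parameter in the model. The sensitivity equation corresponding to $p_{l}$ is obtained immediately by differentiating Eq. 2.11 with respect to $p_{l}$,

$$\mathcal{M}\mathbf{S}_{l,n+1}=\mathcal{M}\mathbf{S}_{l,n}+\Delta t\left(f_{u}(t_{n+1},u_{n+1})\,\mathbf{S}_{l,n+1}+f_{p}(t_{n+1},u_{n+1})\right)$$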

where $f_{u}$ and $f_{p}$ are Jacobian matrices. As we can see, the trajectory of $\mathbf{S}_{l,n}$ follows a trajectory similar to that of the model’s state variable in the forward process. To be general, we use $u_{n+1}=\mathcal{N}_{n}(u_{n}),\ n=0,...,N-1$ to denote any one-step integration scheme. In our implementation, the objective function $\Phi$ is chosen to involve only the terminal term, which depends on the parameters $p$, as

$$\Phi=\phi(u(T);p).$$

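The constraints of the optimization problem are chosen to be the discretized PDE at each time step. Therefore, the Lagrangian is written as

$$\mathcal{L}=\phi(u_{N})-\nu_{0}^{T}(u_{0}-\beta)-\sum_{n=0}^{N-1}\nu_{n+1}^{T}\left(u_{n+1}-\mathcal{N}(u_{n})\right)$$

where $\nu_{0},...,\nu_{N}$ are Lagrange multipliers. We use $\phi(u_{N})$ to approximate $\phi(u(T))$. Differentiating this function with respect to $p$ yields

$$\frac{\partial\mathcal{L}}{\partial p}=\nu_{0}^{T}\frac{\partial\beta}{\partial p}+\left(\frac{\partial\phi(u_{N})}{\partial u}-\nu_{N}^{T}\right)\frac{\partial u_{N}}{\partial p}-\sum_{n=0}^{N-1}\left(\nu_{n}^{T}-\nu_{n+1}^{T}\frac{\partial\mathcal{N}(u_{n})}{\partial u}\right)\frac{\partial u_{n}}{\partial p}$$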

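The terms involving $\partial u_{n}/\partial p$ are eliminated by defining

$$\nu_{N}^{T}=\frac{\partial\phi(u_{N})}{\partial u},\quad\nu_{n}^{T}=\nu_{n+1}^{T}\frac{\partial\mathcal{N}(u_{n})}{\partial u},\quad n=N-1,...,0$$

The gradient of the target objective function is then

$$\nabla_{p}\phi=\left(\frac{\partial\beta}{\partial p}\right)^{T}\nu_{0}$$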

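Now treat $\mathcal{N}(u)$ as an implicit function and use backward Euler as an example. Taking the derivative of Eq. 2.11 with respect to $u$, we get

$$\frac{\partial u_{n+1}}{\partial u}=\frac{\partial u_{n}}{\partial u}+\Delta t\,f_{u}(t_{n+1},u_{n+1})\frac{\partial u_{n+1}}{\partial u}=\frac{\partial\mathcal{N}(u_{n})}{\partial u}\frac{\partial u_{n}}{\partial u},$$

Combining with Eq. 2.16, the discrete adjoint equation is formulated as,

$$\nu_{n}^{T}=\nu_{n+1}^{T}+\Delta t\,\nu_{n}^{T}f_{u}(t_{n+1},u_{n+1}).$$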
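As an illustration of the discrete adjoint recursion for backward Euler, the following is a minimal NumPy sketch, not the paper's PETSc implementation. It assumes a small linear right-hand side $f(t,u)=Au$ (so $f_{u}=A$ is constant and no checkpointed states are needed), a terminal misfit $\phi(u_{N})=\frac{1}{2}\|u_{N}-u_{\text{target}}\|^{2}$, and parameters that enter only through the initial condition, $p=\beta$, so that $\partial\beta/\partial p=I$. The names `forward`, `adjoint_gradient`, `A` and `u_target` are hypothetical.

```python
import numpy as np

def forward(A, beta, dt, n_steps):
    """Backward Euler for u' = A u, u(0) = beta; returns the terminal state u_N."""
    I = np.eye(A.shape[0])
    u = beta.copy()
    for _ in range(n_steps):
        # (I - dt*A) u_{n+1} = u_n
        u = np.linalg.solve(I - dt * A, u)
    return u

def adjoint_gradient(A, beta, dt, n_steps, u_target):
    """Gradient of phi = 0.5*||u_N - u_target||^2 with respect to beta."""
    I = np.eye(A.shape[0])
    u_N = forward(A, beta, dt, n_steps)
    nu = u_N - u_target  # terminal condition: nu_N = d(phi)/d(u_N)
    for _ in range(n_steps):
        # nu_n^T (I - dt*f_u) = nu_{n+1}^T, solved as a transposed linear system
        nu = np.linalg.solve((I - dt * A).T, nu)
    return nu  # d(beta)/d(p) = I here, so the gradient is nu_0

# Assumed toy problem (illustrative values only)
A = np.array([[-2.0, 1.0], [1.0, -2.0]])
beta = np.array([1.0, 0.5])
u_target = np.array([0.2, 0.1])
grad = adjoint_gradient(A, beta, dt=0.1, n_steps=20, u_target=u_target)
print(grad)
```

One backward sweep yields the full gradient regardless of the dimension of `beta`. For a nonlinear $f$, the backward loop would instead need the stored (checkpointed) states $u_{n+1}$ to evaluate $f_{u}$ at each step.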

3. Numerical Implementation

3.0.1. PETSc and TAO

We leverage scientific libraries such as PETSc and TAO to implement the computations for large-scale inversion. PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by PDEs. It is built on the MPI standard and is widely used in the development of parallel finite element codes. It also provides interfaces to several other libraries, such as Metis/ParMETIS for mesh partitioning and HDF5 for binary data handling. To solve the large-scale optimization problem, another important component distributed with PETSc, TAO, is employed. Our non-negative methodology uses the Bounded Limited-Memory Variable-Metric (BLMVM) solver available in TAO, which approximates the Hessian and is efficient in the large-scale context. Other optimization algorithms, such as Conjugate Gradient (CG) and Limited-Memory Variable-Metric (LMVM), are compared in terms of convergence and memory consumption. Further details regarding the implementation of these various methods may be found in and the references within.

3.0.2. Weak formulation

A continuous Galerkin approach is adopted for the finite element (FE) setup. The trial and test function spaces are chosen to be

[TABLE]

The weak form for Eq. 2.4 reads: find $c(\mathbf{x})\in\mathcal{U}$, such that

$$\mathcal{B}(w;c)=L(w),\quad\forall\,w\in\mathcal{W}$$

where the bilinear form and linear functional are, respectively, defined as

[TABLE]

The assembly of the mass and stiffness matrices, Gaussian quadrature and other routines are implemented in-house, while the parallel matrix/vector operations use PETSc’s built-in routines. Following the FE model outlined in [19], we consider a weak form that depends on fields and gradients. The residual evaluation can be expressed as:

$$w^{T}r(c)\sim\int_{\Omega^{e}}\left[w\cdot\mathcal{F}_{0}(c,\nabla c)+\nabla w\cdot\mathcal{F}_{1}(c,\nabla c)\right]\text{d}\Omega=0,$$

where $\mathcal{F}_{0}(c,\nabla c)$ and $\mathcal{F}_{1}(c,\nabla c)$ are point-wise functions that capture the physics. This framework decouples the problem specification from the mesh and degree-of-freedom traversal, which eases the implementation on distributed-memory machines. The discretization of the residual is written as:

$$r(c)=\mathbf{A}_{e=1}^{Nele}\begin{bmatrix}N^{T}&B^{T}\end{bmatrix}W\begin{bmatrix}\mathcal{F}_{0}(c_{q},\nabla c_{q})\\\mathcal{F}_{1}(c_{q},\nabla c_{q})\end{bmatrix}$$

where $\mathbf{A}$ represents the assembly operator, $N$ and $B$ are matrix forms of the basis functions over quadrature points, the diagonal matrix $W$ holds the quadrature weights, and $c_{q}$ is the field value at quadrature point $q$. Mapping back to Eq. 3.3,

[TABLE]

Here the superscript $n$ denotes the time step, and $q^{p}(\mathbf{x})=0$ is assumed for simplicity. Naturally, the Jacobian is the derivative of Eq. 3.5,

[TABLE]

The point wise functions are

[TABLE]

3.0.3. Parallel finite element assembly

In each optimization step, one forward run and one adjoint run are conducted, and each run is the solution of a time-dependent problem. The PETSc interface for solving time-dependent problems assumes the problem is written in the form

[TABLE]

The user has to specify how to evaluate the residual and the Jacobian of $F(t,u,\dot{u})$ through the interface functions "TSSetIFunction" and "TSSetIJacobian". Taking the backward Euler scheme applied to $F(t,u,\dot{u},p)=0$ as an example, the time derivative is $\dot{u}=(u_{n}-u_{n-1})/\Delta t$, which results in the Jacobian $\partial_{u^{n}}F=I/\Delta t+\partial_{u}F(t,u,\dot{u},p)$. The Jacobian must therefore be re-evaluated for each forward/adjoint run, since it is a function of $p$. Within one forward (or adjoint) run, however, it is computed only once if a fixed time step is assumed. Apart from the solution of the matrix systems by Krylov methods, matrix assembly is another bottleneck at large scale.
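The shifted Jacobian described above can be illustrated with a small Python sketch. It mimics the implicit-form interface in plain NumPy rather than calling PETSc; the functions `ifunction`, `ijacobian` and `backward_euler_step`, and the cubic decay residual $F=\dot{u}+p\,u^{3}$, are made up for illustration. The only convention borrowed from PETSc is that the Jacobian callback returns $\partial_{u}F+\text{shift}\cdot\partial_{\dot{u}}F$, with shift $=1/\Delta t$ for backward Euler.

```python
import numpy as np

def ifunction(u, udot, p):
    """Residual F(t, u, udot, p) = udot + p * u**3 (an assumed toy model)."""
    return udot + p * u**3

def ijacobian(u, udot, p, shift):
    """dF/du + shift * dF/dudot, where dF/dudot is the identity here."""
    return np.diag(3.0 * p * u**2) + shift * np.eye(u.size)

def backward_euler_step(u_prev, dt, p, tol=1e-12, max_iter=50):
    """Solve F(t, u, (u - u_prev)/dt, p) = 0 for u with Newton's method."""
    shift = 1.0 / dt  # d(udot)/d(u) for backward Euler
    u = u_prev.copy()
    for _ in range(max_iter):
        udot = (u - u_prev) / dt
        r = ifunction(u, udot, p)
        if np.linalg.norm(r) < tol:
            break
        u = u - np.linalg.solve(ijacobian(u, udot, p, shift), r)
    return u

u_prev = np.array([1.0, 2.0])
u_next = backward_euler_step(u_prev, dt=0.1, p=0.5)
print(u_next)
```

Because the parameter `p` appears in both callbacks, changing it between optimization steps changes the Jacobian, which is why the Jacobian has to be rebuilt for every forward/adjoint run.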

This paper considers a hybrid framework of parallelism at both the shared-memory (OpenMP) and distributed-memory (MPI) levels. On shared-memory machines, the assembly of stiffness matrices in FE encounters a race condition if two adjacent elements are assembled at the same time by two threads. With the help of graph coloring, the elements can be assembled one color at a time, thus preventing race conditions. The coloring requires the indices of neighboring elements, which can be readily obtained from the adjacency graph of the mesh. Taking the triangular mesh in Fig. 1 as an example, the corresponding graph and one possible coloring (4-colored) are shown.

Since the test and trial functions are nodal based, two elements are considered to be connected if they share at least one node. Elements of the same color are then safe to be assembled by different threads.
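The coloring idea can be sketched with a simple greedy algorithm in Python. This is an assumed illustration, not the coloring routine used in the paper: `color_elements` and the four-triangle fan mesh are hypothetical, and a greedy pass does not guarantee the minimum number of colors.

```python
from collections import defaultdict

def color_elements(elements):
    """Greedy coloring: elements sharing at least one node get different colors."""
    # Map each node to the elements that contain it (the mesh adjacency information)
    node_to_elems = defaultdict(list)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems[n].append(e)

    colors = [-1] * len(elements)  # -1 marks an element that is not yet colored
    for e, nodes in enumerate(elements):
        # Colors already taken by neighbors that share a node with element e
        used = {colors[nb] for n in nodes for nb in node_to_elems[n] if colors[nb] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[e] = c
    return colors

# Four triangles fanning around a shared center node 0
triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
colors = color_elements(triangles)
print(colors)
```

All four triangles touch node 0, so each ends up with its own color. In a threaded assembly loop, the outer loop would run over colors sequentially and the inner loop over the elements of one color in parallel.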

4. Results and Performance

4.1. 2D Verification

4.1.1. Heterogeneous diffusion in 2D geometry

Convergence with tao types

4.1.2. Heterogeneous diffusion in a unit cube with spherical holes

Let the computational domain be a unit cube with two spherical holes of radius 0.2 and 0.35. The concentration on the outer boundary is taken to be zero and the concentration on the interior boundary is taken to be unity. The volumetric source is taken as zero (i.e., $f(\mathbf{x})=0$). The velocity vector field for this problem is chosen to be

The estimate of the diffusivity for mesh type B after three optimization steps is shown below.

5. Parallel Performance

5.1. Strong scaling performance of forward run

1 million hex grid forward run

5.1.1. Scaling performance of multi-threading

The multi-threading is implemented with OpenMP on multi-core CPUs.

5.1.2. Scaling performance of MPI+OpenMP

5.2. Performance modeling

Since the inversion process involves both forward and backward runs, as well as optimization steps, there will be a fraction of the code that is not amenable to parallelization. Here we employ Amdahl’s law and Gustafson’s law to model strong and weak scaling, respectively.

5.3. Strong scaling model

Amdahl’s law can be formulated as follows

$$\text{speedup}=\frac{1}{s+\frac{1-s}{N}}$$

where $s$ is the proportion of execution time spent on the serial part and $N$ is the number of processors. Amdahl’s law states that, for a fixed problem, the upper limit of speedup is determined by the serial fraction of the code. Here we tested three meshes with 0.25, 1 and 4 million unknowns. The results and fitted model are shown below.

As depicted in the figure, the serial fraction of the code decreases as the problem size increases. Thus we expect better strong scaling performance for larger problems.
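To make the fitting step concrete, here is a minimal Python sketch of estimating the serial fraction $s$ from strong scaling data with Amdahl's law in its usual form, speedup $=1/(s+(1-s)/N)$. The function names and the synthetic speedups (generated with $s=0.05$) are assumptions for illustration; they are not the paper's measurements or its fitting procedure. The fit uses the fact that $1/\text{speedup}-1/N=s\,(1-1/N)$ is linear in $s$.

```python
import numpy as np

def amdahl_speedup(s, n):
    """Amdahl's law: speedup on n processors with serial fraction s."""
    return 1.0 / (s + (1.0 - s) / n)

def fit_serial_fraction(n, speedup):
    """Least-squares estimate of s from 1/speedup - 1/n = s * (1 - 1/n)."""
    n = np.asarray(n, dtype=float)
    speedup = np.asarray(speedup, dtype=float)
    x = 1.0 - 1.0 / n
    y = 1.0 / speedup - 1.0 / n
    return float(np.dot(x, y) / np.dot(x, x))

procs = np.array([2, 4, 8, 16, 32, 64])
synthetic = amdahl_speedup(0.05, procs)  # made-up data, not measurements
s_fit = fit_serial_fraction(procs, synthetic)
print(f"fitted serial fraction: {s_fit:.3f}, speedup limit: {1.0 / s_fit:.1f}")
```

With real timings the fitted $s$ would carry noise, and a smaller $s$ for larger meshes would show up as a higher speedup limit $1/s$.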

5.4. Weak scaling model

In real applications, the size of the problem scales with the amount of available resources: it is more reasonable to use small amounts of resources for small problems and larger quantities of resources for big problems. Amdahl’s law gives the upper limit of speedup for a problem of fixed size. For weak scaling, where the scaled speedup is calculated from the amount of work done for a scaled problem size, Gustafson’s law is a better choice. It is based on the approximations that the parallel part scales linearly with the amount of resources and that the serial part does not increase with the size of the problem. It provides the formula for scaled speedup as:

$$\text{scaled speedup}=s+(1-s)N$$

where $s$ and $N$ have the same meaning as in Amdahl’s law. Here we fix the number of cells per processor and increase the number of processors. The results and fitted model are shown below.

We observe that, as the workload per processor increases, the weak scaling gets worse and the proportion of the serial part increases. This can be explained by the fact that the optimizer, which is the major serial part, takes more effort to find the next optimization direction. Also notice the discrepancy in $s$ between the strong and weak scaling models. This is attributed to the approximations in the laws: the serial fraction is assumed to remain constant, and the parallel part is assumed to speed up in proportion to the number of processors. In practice, the overhead of parallelization may also increase with the job size (e.g. from the scheduling of threads), and in this case it is understandable that the weak scaling model gives a larger serial fraction $s$.
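For the weak scaling model, a short Python sketch of Gustafson's law in its usual form, scaled speedup $=s+(1-s)N=N-s(N-1)$, shows how a serial fraction can be read off a measured scaled speedup. The function names and the numbers used are hypothetical and are not taken from the paper's results.

```python
def gustafson_speedup(s, n):
    """Gustafson's law: scaled speedup on n processors with serial fraction s."""
    return n - s * (n - 1)

def serial_fraction_from_scaled_speedup(n, speedup):
    """Invert Gustafson's law for s, given a scaled speedup measured on n processors."""
    return (n - speedup) / (n - 1)

scaled = gustafson_speedup(0.1, 64)  # assumed s = 0.1 on 64 processors
s_back = serial_fraction_from_scaled_speedup(64, scaled)
print(f"scaled speedup: {scaled:.1f}, recovered s: {s_back:.2f}")
```

Unlike the fixed-size case, the scaled speedup keeps growing linearly with $N$, which is why the same code can yield different fitted values of $s$ under the two models.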

5.5. Real-world Problem

We consider a real-world problem of subsurface flow in this section. The parameters of the simulation domain are described in the table below. In the first case, a 2D model is considered. The inversion is carried out based on observations of pressure collected at day 1 and day 150. In this simulation, only flow in the x-y plane is considered. The initial pressure is only collected at 25 locations and the background pressure is assumed to be $1.0\times 10^{6}$ Pa. The pressure fields at day 1 and day 150 are shown in Fig. 11(b).

In Fig. 12(b), the diffusivity fields after 50 and 110 TAO iterations are plotted and compared to the true diffusivity. The convergence history is also shown.

Due to the sparsity of the observations, the inversion could only reveal the diffusivity field at the sample locations.

In the second case, a 3D example is considered, with the domain extending 2 kilometers in the z direction. The initial condition is a steady state, obtained by running the simulation for more than 1000 days from a generated initial condition. Sinks are prescribed at 5 locations and the model is run forward for 200 days. The initial condition and observation data are shown in Fig. 13(c).

The parameters for this model are: $g=9.8$ m/s$^{2}$, $\rho=1.0\times 10^{3}$ kg/m$^{3}$, $\phi=0.1$, $\beta=5.0\times 10^{-10}$ Pa$^{-1}$, and $K$ ranging from $9.8\times 10^{-11}$ m/s to $2.94\times 10^{-9}$ m/s.
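As a quick worked check, the scaled diffusivity $D=K/(g\rho\phi\beta)$ defined in Section 2.1 can be evaluated for these parameter values. The helper `scaled_diffusivity` below is a hypothetical name introduced for this sketch.

```python
def scaled_diffusivity(K, g=9.8, rho=1.0e3, phi=0.1, beta=5.0e-10):
    """Scaled diffusivity D = K / (g * rho * phi * beta), in m^2/s for K in m/s."""
    return K / (g * rho * phi * beta)

D_min = scaled_diffusivity(9.8e-11)  # lower bound of the hydraulic conductivity
D_max = scaled_diffusivity(2.94e-9)  # upper bound of the hydraulic conductivity
print(f"D ranges from {D_min:.2e} to {D_max:.2e} m^2/s")
```

The denominator is $4.9\times 10^{-7}$ m$^{-1}$, so the diffusivity spans roughly $2\times 10^{-4}$ to $6\times 10^{-3}$ m$^{2}$/s, a factor of 30 between the least and most conductive regions.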

The inversion is run on 64 nodes for 160 TAO iterations. The inverted diffusivity is compared to the true distribution in Fig. 14.

The convergence history is also plotted.

Conclusions

Acknowledgments

SK thanks BER for support. SW thanks LANL Parallel Computing Summer School for support.

Bibliography

[1] S. P. Neuman and S. Yakowitz. A statistical approach to the inverse problem of aquifer hydrology: 1. Theory. Water Resources Research, 15(4):845–860, 1979.
[2] S. P. Neuman, G. E. Fogg, and E. A. Jacobson. A statistical approach to the inverse problem of aquifer hydrology: 2. Case study. Water Resources Research, 16(1):33–58, 1980.
[3] J. Carrera and S. P. Neuman. Estimation of aquifer parameters under transient and steady state conditions: 1. Maximum likelihood method incorporating prior information. Water Resources Research, 22(2):199–210, 1986.
[4] N. Sun. Inverse problems in groundwater modeling. Kluwer Academic Publishers, 1994.
[5] P. K. Kitanidis. Introduction to Geostatistics: Applications to Hydrogeology. Stanford-Cambridge program. Cambridge University Press, 1997.
[6] J. Zhang and T.-C. J. Yeh. An iterative geostatistical inverse method for steady flow in the vadose zone. Water Resources Research, 33(1):63–71, 1997.
[7] J. Carrera, A. Alcolea, A. Medina, J. Hidalgo, and L. J. Slooten. Inverse problem in hydrogeology. Hydrogeology Journal, 13(1):206–222, 2005.
[8] P. K. Kitanidis. Quasi-linear geostatistical theory for inversing. Water Resources Research, 31(10):2411–2419, 1995.
