An adaptive Newton algorithm for optimal control problems with application to optimal electrode design

Thomas Carraro; Simon Dörsam; Stefan Frei; Daniel Schwarz

arXiv:1706.00632·math.OC·June 5, 2017·J. Optim. Theory Appl.


TL;DR

This paper introduces an adaptive Newton algorithm for nonlinear PDE-constrained optimization, balancing discretization and iteration errors with goal-oriented error estimation, demonstrated through efficient electrode design in neuroscience.

Contribution

It presents a novel adaptive Newton method that integrates goal-oriented error estimation for efficient PDE-constrained optimization.

Findings

1. Efficient balancing of discretization and iteration errors.

2. Local mesh refinement improves computational efficiency.

3. Successful application to optimal electrode design in neuroscience.

Abstract

In this work we present an adaptive Newton-type method to solve nonlinear constrained optimization problems in which the constraint is a system of partial differential equations discretized by the finite element method. The adaptive strategy is based on a goal-oriented a posteriori error estimation for the discretization and for the iteration error. The iteration error stems from an inexact solution of the nonlinear system of first-order optimality conditions by the Newton-type method. This strategy allows balancing the two errors and deriving effective stopping criteria for the Newton iterations. The algorithm proceeds with the search for the optimal point on coarse grids, which are refined only if the discretization error becomes dominant. Using computable error indicators, the mesh is refined locally, leading to a highly efficient solution process. The performance of the algorithm is…

Tables

Table 1: Splitting of the error indicators $η_{h}$ and $η_{KKT}$ for a damped Newton method for the test problem on the slit domain.

Iteration	$ℐ (q) - ℐ (q_{h})$	$η_{KKT}$	$η_{h}$	$ℐ_{eff}$
1	-2.09e-01	-1.82e-01	1.81e-04	0.87
3	-6.69e-02	-6.38e-02	2.15e-04	0.95
5	-1.81e-02	-1.85e-02	2.30e-04	1.01
7	-4.14e-03	-4.86e-03	2.35e-04	1.12
9	-4.90e-04	-1.23e-03	2.36e-04	2.03
11	4.33e-04	-3.08e-04	2.37e-04	-0.16
13	6.64e-04	-7.72e-05	2.37e-04	0.24
15	7.22e-04	-1.93e-05	2.37e-04	0.30
17	7.37e-04	-4.82e-06	2.37e-04	0.31
19	7.41e-04	2.46e-15	2.37e-04	0.32

Table 2: Comparison of Newton steps and computational time for the mesh-adaptive and the fully adaptive algorithms for the modified, nonlinear state equation (6.1).

		mesh adaptive		fully adaptive
#dofs	$ℐ (q) - ℐ (q_{h})$	#steps	time[s]	#steps	time[s]
54	1.3e-01	6	2	3	1
102	6.1e-02	4	5	1	3
176	3.4e-02	5	9	2	6
358	1.8e-02	5	13	2	9
578	9.3e-03	5	18	2	13
742	5.9e-03	5	28	2	20
1880	2.6e-03	5	39	2	27
2232	1.7e-03	5	65	2	44
6498	7.0e-04	5	98	2	65
8186	4.4e-04	5	138	2	91
10024	3.2e-04	5	221	2	161

Table 3: Number of Newton steps and computational times for the mesh-adaptive and the fully adaptive algorithm for optimal electrode design.

		mesh adaptive		fully adaptive
#dofs	$ℐ (q) - ℐ (q_{h})$	#steps	time[s]	#steps	time[s]
2754	-1.0e+01	4	34	1	17
3222	-6.2e+00	4	80	1	41
3670	3.8e-01	3	128	1	70
6426	2.7e-01	3	205	1	119
12506	1.1e-01	3	335	1	197
26506	6.1e-02	2	520	1	338
57068	2.7e-02	2	831	1	575
95894	1.3e-02	2	1313	1	943

Table 4: Simultaneous optimization of size and position of openings. The positions of the holes are optimized in step 'a' and the sizes of the holes in step 'b', alternately, while keeping the other respective parameters fixed.

Step	$m_{1}$	$m_{2}$	$s_{1}$	$s_{2}$	$J (u, q, s)$
0	10.0	20.0	0.41	0.30	6214
1a	4.8	19.7	”	”	6021
1b	”	”	0.39	0.30	6020
2a	5.1	19.7	”	”	6020
2b	”	”	0.38	0.30	6019
⋮	⋮	⋮	⋮	⋮	⋮
OPT	5.9	19.7	0.35	0.30	6018

Equations

$\min_{q \in \mathbb{R}^{s},\, u \in \mathcal{V}} J(u, q)$

$\text{s.t.}\quad A(u, q; \varphi)$

$L : \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V} \to \mathbb{R}, \quad L(u, q, \lambda) = J(u, q) + A(u, q; \lambda).$

$L'_{u}(u, q, \lambda)(\delta u)$

$L'_{q}(u, q, \lambda)(\delta q)$

$L'_{\lambda}(u, q, \lambda)(\delta \lambda)$

$\min_{q \in \mathbb{R}^{s},\, u \in \mathcal{V}} J(u, q)$

$\text{s.t.}\quad \sigma(\nabla u, \nabla \varphi)_{\Omega}$

$L'_{u}(w)(\delta u)$

$L'_{q}(w)(\delta q)$

$L'_{\lambda}(w)(\delta \lambda)$

$\mathcal{A}(w; \delta w) := (\delta u, u - \hat{u})_{\Omega^{s}} + \sigma(\nabla \delta u, \nabla \lambda)_{\Omega} + \alpha(\delta q, q) - (f'(q)(\delta q), \lambda)_{\Omega} + \sigma(\nabla u, \nabla \delta \lambda)_{\Omega} - (f(q), \delta \lambda)_{\Omega}$

$\mathcal{A}(w; \delta w) = 0 \quad \forall \delta w \in \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V}.$

$\mathcal{A}(w_{h}; \delta w) = 0 \quad \forall \delta w \in \mathcal{V}_{h} \times \mathbb{R}^{s} \times \mathcal{V}_{h}.$

$\nabla L(w)(\delta w) \overset{!}{=} 0 \quad \forall \delta w \in \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V},$

$\nabla L(w_{h})(\delta w) \overset{!}{=} 0 \quad \forall \delta w \in \mathcal{V}_{h} \times \mathbb{R}^{s} \times \mathcal{V}_{h}.$

$e(u, q, \lambda) := \mathcal{I}(u, q, \lambda) - \mathcal{I}(u_{h}, q_{h}, \lambda_{h}).$

$\rho(w_{h})(\varphi) :=$

$+ \alpha(\varphi^{q}, q_{h}) - (f'(q_{h})(\varphi^{q}), \lambda_{h})_{\Omega}$

$+ \sigma(\nabla u_{h}, \nabla \varphi^{\lambda})_{\Omega} - (f(q_{h}), \varphi^{\lambda})_{\Omega}$

$(z^{u}, \delta u)_{\Omega^{s}} + \sigma(\nabla z^{\lambda}, \nabla \delta u)_{\Omega}$

$\alpha(z^{q}, \delta q)_{\Omega} - (z^{\lambda}, f'(q)(\delta q))_{\Omega} - (\lambda, f''(q)(\delta q))_{\Omega}$

$\sigma(\nabla z^{u}, \nabla \delta \lambda)_{\Omega} - (\delta \lambda, f'(q)(z^{q}))_{\Omega}$

$\mathcal{A}^{*}(z, w)(\delta w) = -\mathcal{I}'_{w}(w)(\delta w) \quad \forall \delta w \in \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V}$

$\mathcal{A}^{*}(z, w)(\delta w)$

$- (\lambda, f''(q)(\delta q))_{\Omega} + \sigma(\nabla z^{u}, \nabla \delta \lambda)_{\Omega} - (f'(q)(z^{q}), \delta \lambda)_{\Omega}.$

$\mathcal{A}^{*}(z_{h}, w_{h})(\delta w) = -\mathcal{I}'_{w}(w_{h})(\delta w) \quad \forall \delta w \in \mathcal{V}_{h} \times \mathbb{R}^{s} \times \mathcal{V}_{h}.$

$\rho^{*}(w_{h}, z_{h})(\psi) :=$

$+ \alpha(z^{q}_{h}, \psi^{q})_{\Omega} - (z^{\lambda}_{h}, f'(q_{h})(\psi^{q}))_{\Omega} - (\lambda_{h}, f''(q_{h})(\psi^{q}))_{\Omega} + \mathcal{I}'_{q}(w_{h})(\psi^{q})$

$+ \sigma(\nabla z^{u}_{h}, \nabla \psi^{\lambda})_{\Omega} - (f'(q_{h})(z^{q}), \psi^{\lambda})_{\Omega} + \mathcal{I}'_{\lambda}(w_{h})(\psi^{\lambda}),$

$\mathcal{I}(w) - \mathcal{I}(w_{h}) = \frac{1}{2}\rho(w_{h})(z - z_{h}) + \frac{1}{2}\rho^{*}(w_{h}, z_{h})(w - w_{h}) + R$

$R = \frac{1}{2}\int_{0}^{1}\bigl\{\mathcal{I}'''(w_{h} + se)(e, e, e) - \mathcal{A}'''(w_{h} + se; z_{h} + se^{*})(e, e, e) - 3\mathcal{A}''(w_{h} + se; e^{*})(e, e)\bigr\}\, s(s-1)\,\mathrm{d}s$

$\mathcal{L}(u, q, \lambda, z^{u}, z^{q}, z^{\lambda}) = \mathcal{I}(u, q, \lambda) - L'_{u}(u, q, \lambda)(z^{u}) - L'_{\lambda}(u, q, \lambda)(z^{\lambda}) - L'_{q}(u, q, \lambda)(z^{q}).$

$\mathcal{L}(x, z) := \mathcal{I}(x) - \mathcal{A}(x; z)$

$\mathcal{I}(w) - \mathcal{I}(w_{h}) = \mathcal{L}(x) + \mathcal{A}(w; z) - \mathcal{L}(x_{h}) - \mathcal{A}(w_{h}; z_{h}) = \mathcal{L}(x) - \mathcal{L}(x_{h}),$

$\mathcal{L}(x) - \mathcal{L}(x_{h}) = \int_{0}^{1} \mathcal{L}'(x_{h} + s(x - x_{h}))(e)\,\mathrm{d}s,$

$\int_{0}^{1} f(s)\,\mathrm{d}s = \frac{1}{2}\bigl(f(0) + f(1)\bigr) + \frac{1}{2}\int_{0}^{1} f''(s)\,s(s-1)\,\mathrm{d}s.$


Full text

An adaptive Newton algorithm for optimal control problems

with application to optimal electrode design

Thomas Carraro¹, Simon Dörsam¹, Stefan Frei¹ and Daniel Schwarz²

¹Institute for Applied Mathematics, Heidelberg University

Abstract

In this work we present an adaptive Newton-type method to solve nonlinear constrained optimization problems in which the constraint is a system of partial differential equations discretized by the finite element method. The adaptive strategy is based on a goal-oriented a posteriori error estimation for the discretization and for the iteration error. The iteration error stems from an inexact solution of the nonlinear system of first-order optimality conditions by the Newton-type method. This strategy allows balancing the two errors and deriving effective stopping criteria for the Newton iterations. The algorithm proceeds with the search for the optimal point on coarse grids, which are refined only if the discretization error becomes dominant. Using computable error indicators, the mesh is refined locally, leading to a highly efficient solution process. The performance of the algorithm is shown with several examples and in particular with an application in the neurosciences: the optimal electrode design for the study of neuronal networks.

1 Introduction

In this work we consider the optimal design of a glass micro-electrode for the use of reversible in vivo electroporation in neural tissue. Electroporation describes the increase in permeability of the cell membrane by the application of an external electric field beyond a certain threshold [32, 36]. While this technique has been known at least since the 1960’s [19], it has become a standard tool in the neurosciences in more recent years to load single cells and small ensembles of neurons with a range of dyes and molecules, for example for the visualization of neural networks [18, 24, 25], see Figure 1 on the left.

In order to make the plasma membrane permeable for a specific dye, the local voltage has to exceed a certain threshold. On the other hand, the applied stimulus cannot be increased indefinitely, as high peaks of current would cause collateral damage [25, 16]. One way to reduce such unwanted side effects is to modify the shape of the micro-electrodes in order to obtain a more uniform distribution of the electric field. While standard electrodes have a single hole at the tip, adding more holes on the side of the pipette seems a promising approach. Recent work has shown that nanoengineering techniques are indeed available to shape glass micro-electrodes in the tip region using focused ion beam assisted milling [22, 31], see Figure 1 on the right. It has been shown that the part of the neuronal network that can be visualized with these modified pipettes is considerably enlarged in comparison to the standard design [31], see Figure 2 for a numerical demonstration.

The objective of this work is to design an optimal electrode in terms of position and size of holes in the micro-pipette by using methods of numerical optimization. The scientific contribution of this work is twofold: (i) on one side we present a mathematical formulation of the optimal design of a micro-pipette; (ii) on the other side we present an adaptive Newton method for the solution of the corresponding optimization problem.

The model used to describe the electric field is a partial differential equation (PDE). Therefore, we deal with a PDE-constrained optimization problem. In the context of PDE-constrained optimization problems, the two common solution methods are the reduced and the all-at-once approach [20]. We adopt the latter, in which the optimality conditions are expressed as a Karush-Kuhn-Tucker (KKT) system. There is a large body of literature on this topic and we refer for example to the books [15, 20, 23] for a thorough introduction. Regarding the specific application, there are no systematic studies that use a model-based approach to design the micro-pipette used in electroporation. Therefore, the results shown here are of scientific interest even if they are obtained in a simplified setting with a two-dimensional problem. The extension to three-dimensional problems with a more complex model is possible within the same adaptive algorithm.

Mesh adaptivity is in many aspects well established in the context of finite element discretization of linear and nonlinear partial differential equations, see e.g. [2, 33]. Furthermore, goal-oriented a posteriori error estimation has been successfully used in many applications, see the seminal works [6, 4] for an overview of the Dual Weighted Residual (DWR) technique and, as examples, [9, 30, 34, 8] for some specific applications. A posteriori error estimation methods have been used to control the discretization error either in global norms, e.g. the $L^{2}$ or energy norm, or in specific functionals in the context of goal-oriented techniques.

To solve the nonlinear system arising from the discretization of the underlying problem, typically a Newton-type method is used. If the Newton iteration is stopped after reaching a given tolerance, there is an iteration error that has to be taken into account in addition to the discretization error. In particular, it is advantageous to control the iteration error and allow the Newton iteration to stop before full convergence (i.e. to machine precision), because each Newton iteration comes at the cost of the solution of a large linear system. The latter might be badly conditioned, especially in the context of multi-physics and optimization problems, leading to a large number of iterations of an iterative linear solver. There are only a few results on a posteriori error estimation that combine an estimation of the discretization error and of the iteration error, resulting in algorithms that have stopping criteria based on balancing the two sources of error.

In the last few years increasing attention has been given to adaptive strategies for solving nonlinear problems, including those arising from discretizations of partial differential equations. Ziems and Ulbrich have presented in [37] a class of inexact multilevel trust-region sequential quadratic programming (SQP) methods for the solution of nonlinear PDE-constrained optimization problems, in which the discretization error in global norms is controlled by local error estimators, including control of the inexactness of the iterative solvers. Further works can be found outside the optimization context. A list of relevant publications is given here:

Bernardi and coauthors have presented an a posteriori analysis of iterative algorithms for nonlinear problems [7]; Rannacher and Vihharev have balanced the discretization error and the iteration error in a Newton-type solver [27]; Ern and Vohralík have developed an adaptive strategy for inexact Newton methods based on a posteriori error analysis [14]; and Wihler and Amrein have presented an adaptive Newton-Galerkin method for semi-linear elliptic PDEs which combines an error estimation for the Newton step and an error estimation for the discretization with finite elements [1].

Since the goal of a simulation is the computation of a specific quantity of interest, for example in our case the optimal micro-pipette design (i.e. the position and dimension of the side holes), it is desirable to optimize the mesh refinement in a goal-oriented fashion. Furthermore, the stopping criterion for the Newton iteration should also be goal-oriented. This allows, for example in the context of optimization, approximating the optimal point on coarse meshes and refining only once the discretization error becomes dominant. In this way we reach the full balance of error sources with respect to the quantity of interest, and the algorithm performs the costly iterations (on fine meshes) only after the nonlinearities have been adequately resolved on cheaper meshes. Consequently, the computational costs are reduced while keeping the precision of the simulation at the desired level. The new contribution of our work in this context is the derivation of a goal-oriented strategy for the adaptive control of a Newton-type algorithm to solve a nonlinear PDE-constrained optimization problem.

This work is organized as follows. In Section 2 we formulate the general optimization problem; in Section 3 we present our adaptive strategy; in Section 4 we introduce the application in optimal electrode design; in Section 5 we delineate the algorithms and in Section 6 we present some numerical results. Finally, in Section 7, an outlook to possible extensions of the presented method is given.

2 Optimization problem

We consider the following optimization problem with parameters $q\in\mathbb{R}^{s},s\in\mathbb{N}$

[TABLE]

We assume that ${\cal V}$ is a reflexive Banach space. Let $A:{\cal V}\times\mathbb{R}^{s}\times{\cal V}\to\mathbb{R}$ be a semi-linear form and $f(q)\in{\cal V}^{*}$ for every $q\in\mathbb{R}^{s}$ , where $\cal{V}^{*}$ denotes the dual space of $\cal V$ . Furthermore, we assume that $J$ and $A$ are twice (Fréchet) differentiable and that for each $q\in\mathbb{R}^{s}$ the state equation (2) has a unique solution $u$ . Let us denote the (nonlinear) control-to-state map by $S:\mathbb{R}^{s}\to{\cal V}$ .

Under these assumptions we can consider a reduced formulation of the optimization problem, with a reduced objective functional $j(q):=J(q,S(q)):\mathbb{R}^{s}\rightarrow\mathbb{R}$ . If the reduced objective functional is coercive the existence of local minimizers to (1)-(2) follows by standard arguments, see e.g. [15, 20]. The coercivity assumption is needed in case of unconstrained optimization problems to assure boundedness of the minimizing sequence. Therefore, for the practical solution of the problem, we consider a Tikhonov regularization term in the objective functional. If in addition the functional is convex, the optimization problem has a unique solution. Since in this work we allow nonlinearities in the model, we cannot assume convexity of the reduced functional. Therefore, the theoretical results assure only the existence of local minimizers.

To derive the optimality conditions, we introduce the Lagrange functional

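$$L : \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V} \to \mathbb{R}, \quad L(u, q, \lambda) = J(u, q) + A(u, q; \lambda).$$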

The first-order necessary optimality conditions are given by the KKT system

[TABLE]

The first equation corresponds to the dual equation for the adjoint variable $\lambda$ , the second equation is called the control equation and the third equation is the state equation for the primal variable $u$ .

2.1 Model problem

To simplify the notation in the introduction of the error estimator in the next section, we consider a model problem of the form

[TABLE]

where ${\cal V}:=H^{1}_{0}(\Omega)$ , $\Omega\subset\mathbb{R}^{2}$ , $\alpha$ and $\sigma$ are positive real numbers, $(\cdot,\cdot)_{\Omega}$ denotes the $L^{2}$ scalar product and $\Omega^{s}\subset\Omega$ . The corresponding KKT system reads

Problem 2.1 (KKT system of the model problem)

Find $w:=(u,q,\lambda)\in\mathcal{V}\times\mathbb{R}^{s}\times\mathcal{V}$ such that

[TABLE]

By introducing the semi-linear form

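$$\mathcal{A}(w; \delta w) := (\delta u, u - \hat{u})_{\Omega^{s}} + \sigma(\nabla \delta u, \nabla \lambda)_{\Omega} + \alpha(\delta q, q) - (f'(q)(\delta q), \lambda)_{\Omega} + \sigma(\nabla u, \nabla \delta \lambda)_{\Omega} - (f(q), \delta \lambda)_{\Omega} \tag{5}$$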

we can write the KKT system in compact form as

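$$\mathcal{A}(w; \delta w) = 0 \quad \forall \delta w \in \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V}. \tag{6}$$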

The derivation of a corresponding adaptive Newton method for other functionals $J$ and semi-linear forms $A$ fulfilling the assumptions made above is straight-forward given that the KKT system is solvable with a Newton-type solver. The modification of the optimization problem to the specific application presented in this paper will be made later in Section 4.
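To make the structure of the KKT system concrete, the following minimal sketch applies a plain Newton iteration to a scalar analogue of the model problem, in which the gradient terms collapse to multiplication by $\sigma$. Everything here is an illustrative assumption rather than the authors' setup: the nonlinearity `f`, the values of `SIGMA`, `ALPHA`, `U_HAT`, and the helper names are hypothetical.

```python
# Scalar toy analogue of the model-problem KKT system (illustrative values only).
SIGMA, ALPHA, U_HAT = 1.0, 0.1, 1.0

def f(q):   return q + q**3 / 3.0      # hypothetical nonlinear right-hand side
def df(q):  return 1.0 + q**2          # f'(q)
def d2f(q): return 2.0 * q             # f''(q)

def kkt_residual(w):
    u, q, lam = w
    return [
        (u - U_HAT) + SIGMA * lam,     # dual equation    (derivative w.r.t. u)
        ALPHA * q - df(q) * lam,       # control equation (derivative w.r.t. q)
        SIGMA * u - f(q),              # state equation   (derivative w.r.t. lambda)
    ]

def kkt_hessian(w):
    u, q, lam = w
    return [
        [1.0, 0.0, SIGMA],
        [0.0, ALPHA - d2f(q) * lam, -df(q)],
        [SIGMA, -df(q), 0.0],
    ]

def solve_dense(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            fac = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= fac * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def newton_kkt(w0, tol=1e-10, max_steps=50):
    """Full Newton iteration on the KKT residual; returns (solution, steps)."""
    w = list(w0)
    for step in range(max_steps):
        res = kkt_residual(w)
        if max(abs(r) for r in res) < tol:
            return w, step
        dw = solve_dense(kkt_hessian(w), [-r for r in res])
        w = [wi + di for wi, di in zip(w, dw)]
    raise RuntimeError("Newton iteration did not converge")

w_opt, n_steps = newton_kkt([0.0, 0.0, 0.0])
```

The three residual components mirror the dual, control and state equations of the KKT system. In the PDE setting each Newton step requires a large sparse linear solve instead of the 3x3 elimination used here, which is why stopping the iteration early is attractive.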

2.2 Discretization

We choose conforming finite element spaces ${\cal V}_{h}\subset{\cal V}$ for the state variable $u_{h}$ and the dual variable $\lambda_{h}$ . The control space $\mathbb{R}^{s}$ is already finite dimensional, therefore we do not need a discretization of the control variable. The discrete optimality system reads

Problem 2.2 (Discrete KKT system of the model problem)

Find $u_{h}\in\mathcal{V}_{h}$ , $q_{h}\in\mathbb{R}^{s}$ and $\lambda_{h}\in\mathcal{V}_{h}$ , such that

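$$\mathcal{A}(w_{h}; \delta w) = 0 \quad \forall \delta w \in \mathcal{V}_{h} \times \mathbb{R}^{s} \times \mathcal{V}_{h}. \tag{7}$$

An essential problem in solving a discretized PDE system is the choice of the computational mesh, on which the discretization error depends, i.e. the error due to the finite-dimensional approximation given by the finite elements.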

3 Adaptive strategy

In the case of optimization problems it is of interest to control the accuracy of the solution of the first-order optimality conditions. The accuracy depends on the discretization error and it “measures” the quality of the approximation of the optimal point, i.e. of the optimal control and optimal state. In the context of PDE-constrained optimization problems, the two typical methods to solve the problem are the reduced approach and the all-at-once approach. Here we use the all-at-once approach, in which the optimality conditions are expressed in terms of the gradient of the Lagrangian functional $L$ defined in the previous section. In particular, in the absence of control and/or state constraints the optimality conditions are given by

$$\nabla L(w)(\delta w) \overset{!}{=} 0 \quad \forall \delta w \in \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V},$$

and the discrete counterpart is

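$$\nabla L(w_{h})(\delta w) \overset{!}{=} 0 \quad \forall \delta w \in \mathcal{V}_{h} \times \mathbb{R}^{s} \times \mathcal{V}_{h}.$$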

Since the discrete approximation $(u_{h},q_{h},\lambda_{h})$ is accurate only up to a certain tolerance that depends on the actual mesh refinement, it makes sense for efficiency reasons to solve the optimality system only up to a certain accuracy as well.

The idea of our adaptive inexact Newton-type method is to balance the accuracy of the first-order optimality conditions, i.e. of the KKT system, with the accuracy of its discrete approximation with respect to a goal functional, rather than with respect to some (global) norms of the solution or of the residuals. This is possible by exploiting the flexibility of the DWR method, which allows controlling the error with respect to an arbitrary functional.

In Section 3.1, we briefly introduce the DWR method and in Section 3.2 we explain how to split the error into two contributions: one from the mesh discretization and the other from the inexact solution of the KKT system.

3.1 Dual weighted residual (DWR) method

We are interested in estimating the error $e(u,q,\lambda)$ measured in a quantity of interest:

$$e(u, q, \lambda) := \mathcal{I}(u, q, \lambda) - \mathcal{I}(u_{h}, q_{h}, \lambda_{h}).$$

Following the seminal work of Becker and Rannacher [6] we obtain the error identity by weighting the residual of the KKT system by an appropriate dual problem. Let $w=(u,q,\lambda)$ be the solution of the KKT system (2). For the DWR error representation we need the residual of the system, $\rho(w_{h})(\cdot):\mathcal{V}\times\mathbb{R}^{s}\times\mathcal{V}\rightarrow\mathbb{R}$ , defined by

[TABLE]

with $\varphi=(\varphi^{u},\varphi^{q},\varphi^{\lambda})\in\mathcal{V}\times\mathbb{R}^{s}\times\mathcal{V}$ . Furthermore, we need the following adjoint problem to define the error estimator

Problem 3.1 (Dual problem)

Find $z:=(z^{u},z^{q},z^{\lambda})\in\mathcal{V}\times\mathbb{R}^{s}\times\mathcal{V}$ such that

[TABLE]

By setting $\delta w=(\delta u,\delta q,\delta\lambda)$ , the dual system reads

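$$\mathcal{A}^{*}(z, w)(\delta w) = -\mathcal{I}'_{w}(w)(\delta w) \quad \forall \delta w \in \mathcal{V} \times \mathbb{R}^{s} \times \mathcal{V}$$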

with the adjoint bilinear form $\mathcal{A}^{*}(\cdot,\cdot)(\cdot):\bigl{(}\mathcal{V}\times\mathbb{R}^{s}\times\mathcal{V}\bigr{)}^{3}\rightarrow\mathbb{R}$ defined as

[TABLE]

Its discretized counterpart is

Problem 3.2 (Discretized dual problem)

Find $z_{h}:=(z^{u}_{h},z^{q}_{h},z^{\lambda}_{h})\in\mathcal{V}_{h}\times\mathbb{R}^{s}\times\mathcal{V}_{h}$ such that

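$$\mathcal{A}^{*}(z_{h}, w_{h})(\delta w) = -\mathcal{I}'_{w}(w_{h})(\delta w) \quad \forall \delta w \in \mathcal{V}_{h} \times \mathbb{R}^{s} \times \mathcal{V}_{h}.$$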

Since the model problem is nonlinear in $q$ we need to define the following dual residual $\rho^{*}(w_{h},z_{h})(\cdot):\mathcal{V}\times\mathbb{R}^{s}\times\mathcal{V}\rightarrow\mathbb{R}$ to derive the error estimator

[TABLE]

with $\psi=(\psi^{u},\psi^{q},\psi^{\lambda})\in\mathcal{V}\times\mathbb{R}^{s}\times\mathcal{V}$ .

With these definitions, following [4, Proposition 6.2], we get the error estimator

Theorem 3.1 (A posteriori error estimator)

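Let $w$ , $w_{h}$ be the solutions of Problems 2.1 and 2.2 and let $z$ , $z_{h}$ be the solutions of the continuous dual problem 3.1 and its discretized version 3.2. Then the following error identity holds:

$$\mathcal{I}(w) - \mathcal{I}(w_{h}) = \frac{1}{2}\rho(w_{h})(z - z_{h}) + \frac{1}{2}\rho^{*}(w_{h}, z_{h})(w - w_{h}) + R \tag{12}$$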

with the residual $\rho(w_{h})(\cdot)$ and the adjoint residual $\rho^{*}(w_{h},z_{h})(\cdot)$ defined in (3.1) and (3.1). The remainder term is given by

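$$R = \frac{1}{2}\int_{0}^{1}\bigl\{\mathcal{I}'''(w_{h} + se)(e, e, e) - \mathcal{A}'''(w_{h} + se; z_{h} + se^{*})(e, e, e) - 3\mathcal{A}''(w_{h} + se; e^{*})(e, e)\bigr\}\, s(s-1)\,\mathrm{d}s$$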

where $\mathcal{A}$ is the semi-linear form (5) and the primal and dual errors are $e:=w-w_{h}$ and $e^{*}:=z-z_{h}$ .

Proof 3.1

The proof follows by application of Proposition 6.1 from [4] with the following Lagrange functional

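$$\mathcal{L}(u, q, \lambda, z^{u}, z^{q}, z^{\lambda}) = \mathcal{I}(u, q, \lambda) - L'_{u}(u, q, \lambda)(z^{u}) - L'_{\lambda}(u, q, \lambda)(z^{\lambda}) - L'_{q}(u, q, \lambda)(z^{q}).$$

We sketch it here for later purposes. Introducing the notation $x=(w,z)$ and $x_{h}=(w_{h},z_{h})$ and recalling the definition of the semi-linear form $\mathcal{A}$ , see expression (5), we can rewrite it as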

[TABLE]

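Furthermore, we have

$$\mathcal{I}(w) - \mathcal{I}(w_{h}) = \mathcal{L}(x) + \mathcal{A}(w; z) - \mathcal{L}(x_{h}) - \mathcal{A}(w_{h}; z_{h}) = \mathcal{L}(x) - \mathcal{L}(x_{h}),$$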

where we have used the fact that $w$ and $w_{h}$ satisfy (6) and (7) respectively. Considering the relation

$$\mathcal{L}(x) - \mathcal{L}(x_{h}) = \int_{0}^{1} \mathcal{L}'(x_{h} + s(x - x_{h}))(e)\,\mathrm{d}s,$$

the error identity follows from the error representation of the trapezoidal rule

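$$\int_{0}^{1} f(s)\,\mathrm{d}s = \frac{1}{2}\bigl(f(0) + f(1)\bigr) + \frac{1}{2}\int_{0}^{1} f''(s)\,s(s-1)\,\mathrm{d}s.$$

In fact, since $\mathcal{L}^{\prime}(x)(e)=0$ , we obtain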

[TABLE]

where $R$ is the remainder term of the trapezoidal rule. From this relation, using the definitions (3.1) and (3.1), the identity (12) can be deduced observing that

[TABLE]
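The error representation of the trapezoidal rule used in the proof, $\int_{0}^{1}f(s)\,\mathrm{d}s=\frac{1}{2}(f(0)+f(1))+\frac{1}{2}\int_{0}^{1}f''(s)s(s-1)\,\mathrm{d}s$, can be checked numerically. The sketch below does so for the test integrand $f(s)=e^{s}$, which is an arbitrary choice made for illustration; the helper `simpson` is likewise not part of the paper.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

f = math.exp      # test integrand (assumption for illustration)
d2f = math.exp    # its second derivative is again exp

lhs = simpson(f, 0.0, 1.0)                        # integral of f over [0, 1]
trapezoid = 0.5 * (f(0.0) + f(1.0))               # trapezoidal approximation
remainder = 0.5 * simpson(lambda s: d2f(s) * s * (s - 1.0), 0.0, 1.0)
rhs = trapezoid + remainder                       # trapezoid plus exact remainder
```

Because $s(s-1)\le 0$ on $[0,1]$, the remainder is negative for a convex integrand, which matches the trapezoidal rule overestimating the integral of $e^{s}$.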

3.2 Balancing of discretization and iteration error

In this work, we consider an inexact Newton-type method to solve the nonlinear KKT system (2.2). We introduce the notation $\widetilde{w}_{h}$ to indicate the inexact solution of the KKT system, which is obtained when the stopping criterion

[TABLE]

is reached and the notation $\widetilde{z}_{h}$ to indicate the “perturbed” dual solution obtained by solving exactly (up to machine precision) the “perturbed dual equation”

[TABLE]

We use the term “perturbed dual equation” for the adjoint equation in which we set the inexact primal solution $\widetilde{w}_{h}$ as coefficient.

Since $\widetilde{w}_{h}$ and $\widetilde{z}_{h}$ are approximations of $w_{h}$ and $z_{h}$ , an additional term appears in the error identity (12) that accounts for the inexact Galerkin projection (14).

Following [27, Proposition 3.1] we have the error estimator

Theorem 3.2 (Error estimator with inexact Galerkin projection)

[TABLE]

with the residuals of the primal problem (3.1) and of the dual problem (3.1) and the remainder term as in Theorem 3.1.

Proof 3.2

Introducing the notation $x=(w,z)$ , $\widetilde{x}_{h}=(\widetilde{w}_{h},\widetilde{z}_{h})$ and the Lagrangian as in Theorem 3.1, the proof follows from [27, Proposition 3.1]. Let us consider the Lagrangian

[TABLE]

It follows that

[TABLE]

where we have used the fact that $w$ satisfies (6), while equation (7) is solved only approximately. Considering the trapezoidal rule and its remainder term, we get analogously to Theorem 3.1 the identity

[TABLE]

from which the error representation (15) can be deduced.

Definition 1 (Splitting of the error estimator)

For ease of presentation of the results and to derive the adaptive Newton strategy we split the error estimator into two parts. These are identified with the discretization error $\eta_{h}$ and the error due to the inexact Newton solution of the discrete KKT system $\eta_{KKT}$ :

[TABLE]

Furthermore, using the error identity (15) we define

[TABLE]

To evaluate the quality of the error estimator, we use the effectivity index

[TABLE]

An index close to one means that the estimator is reliable. In the numerical examples in Section 6, we will observe that the indicator $\eta_{h}$ has a good effectivity index already at the beginning of the Newton iterations, when the solution approximation is inaccurate.
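As a worked check, the effectivity index can be recomputed from the values listed in Table 1. The sketch assumes the definition $I_{eff}=(\eta_{h}+\eta_{KKT})/(\mathcal{I}(q)-\mathcal{I}(q_{h}))$; the defining formula is not preserved in this text, so this is an assumption, chosen because it reproduces the tabulated values up to the rounding of the table entries.

```python
# Selected rows of Table 1:
# (iteration, I(q) - I(q_h), eta_KKT, eta_h, tabulated effectivity index)
ROWS = [
    (1, -2.09e-01, -1.82e-01, 1.81e-04, 0.87),
    (5, -1.81e-02, -1.85e-02, 2.30e-04, 1.01),
    (9, -4.90e-04, -1.23e-03, 2.36e-04, 2.03),
    (11, 4.33e-04, -3.08e-04, 2.37e-04, -0.16),
    (19, 7.41e-04, 2.46e-15, 2.37e-04, 0.32),
]

def effectivity_index(true_error, eta_kkt, eta_h):
    """Assumed definition: estimated error divided by the true error."""
    return (eta_h + eta_kkt) / true_error

computed = {it: effectivity_index(err, kkt, h) for it, err, kkt, h, _ in ROWS}
```

In these rows the index is close to one while the iteration error dominates, and it degrades near iteration 11, where the true error changes sign and the two indicators partly cancel.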

We conclude this section by anticipating that our adaptive strategy defined in the algorithms in Section 5 exploits the error splitting and attempts to balance the two error contributions, i.e. to reach the balance $\eta_{h}\approx\eta_{KKT}$ , during the Newton iterations. In this way the adaptive strategy attempts to reduce the goal functional of the optimization problem on coarse meshes and it proceeds with mesh refinement only if the discretization error is dominating.
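The balancing idea can be sketched as a small decision rule. This is a hypothetical illustration only: the function name `next_action`, the parameter `balance` and the stopping test are assumptions, and the actual algorithms are those given in Section 5.

```python
def next_action(eta_h, eta_kkt, tol, balance=1.0):
    """Choose the next step from the two error indicators (hypothetical rule).

    eta_h   : discretization error indicator
    eta_kkt : iteration error indicator of the inexact Newton solve
    tol     : target accuracy for the quantity of interest
    balance : how strongly the iteration error may exceed the discretization error
    """
    if abs(eta_h) + abs(eta_kkt) <= tol:
        return "stop"
    if abs(eta_kkt) > balance * abs(eta_h):
        # Iteration error dominates: keep iterating on the current (cheap) mesh.
        return "newton_step"
    # Discretization error dominates: further Newton steps would not pay off.
    return "refine_mesh"

# Indicator values taken from Table 1 (iterations 1 and 13).
early = next_action(eta_h=1.81e-04, eta_kkt=-1.82e-01, tol=1e-05)
late = next_action(eta_h=2.37e-04, eta_kkt=-7.72e-05, tol=1e-05)
```

With the Table 1 values, the rule keeps iterating at iteration 1 and switches to mesh refinement once $|\eta_{KKT}|$ has dropped below $|\eta_{h}|$, which in that table happens well before the Newton iteration has fully converged.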

Remark 1

In the residual (3.1) the term related to the residual of the control equation is always zero because the control is finite dimensional. We keep it in the error representation because the same term in the residual $\rho(\widetilde{w}_{h})(\widetilde{z}_{h})$ in (15) is nonzero due to the inexact Galerkin projection. In fact, this term is essential to get a reliable error estimator. In the dual residual (3.1) the control term is also zero. We keep the full expression here for the sake of completeness in the case of an infinite-dimensional control space.

4 Application: Optimal electrode design

As already mentioned in the introduction, we apply our adaptive strategy to the optimal design of a micro-pipette to be used in electroporation. The objective is to maximize the area around the micro-pipette, where the voltage exceeds a certain threshold $\overline{u}$ , while on the other hand an upper bound for the voltage $u_{\infty}$ shall not be reached.

The micro-pipette is covered by an isolating material such that the current can only flow to the biological tissue through the micro-pipette holes. While standard micro-pipettes have only one hole at the tip, holes can also be created on the sides of the micro-pipette using nanoengineering techniques [31].

As some of the parameters, such as the conductivity of the medium $\sigma$ , are known only to a very rough approximation, we cannot expect to obtain quantitative results at this stage. Therefore, it seems justified for a qualitative study to restrict the setting to a two-dimensional simplification. Furthermore, we restrict the possible design of the holes to a symmetric setting with the same size of the openings on both sides. In this context we are interested in the position and size of the openings as design parameters.

The domain of interest is a region around the micro-pipette $\Omega^{s}\subset\mathbb{R}^{2}$ , where neuronal cells might be activated. To reduce the influence of exterior boundary conditions, we use a larger box $\Omega\supset\Omega^{s}$ around the micro-pipette as simulation domain (Figure 3), excluding the micro-pipette itself. If we choose the box $\Omega$ large enough, we can assume without loss of generality zero voltage at the outer boundary.

We assume that a fixed current $\overline{I}$ is applied at the top of the micro-pipette. Knowing the resistances of the electrode, approximations of the fluxes through the holes of the micro-pipette can be derived from physical laws (see the appendix) and are imposed as flux (Neumann) conditions at the boundary of $\Omega$ . The fluxes are expressed as functions of the radii and positions of the holes (which are the control variables for the optimal design). The control variables therefore define the Neumann boundary conditions through a finite-dimensional parametrization.

4.1 Governing equations

The governing equations can be derived as follows. In the absence of electric charge the Gauss law states

[TABLE]

for the electric field $E:\Omega\to\mathbb{R}^{2}$ and the conductivity $\sigma\in\mathbb{R}_{+}$ . With the electrostatic potential $\varphi$ , where $E=-\nabla\varphi$ , equation (18) reads

[TABLE]

We assume zero Dirichlet conditions at the outer boundaries of the box denoted by $\Gamma^{d}$ , see Figure 3. Then, we can rewrite (19) in terms of the voltage $u$

[TABLE]

The boundary condition at the micro-pipette is a flux condition

[TABLE]

The flux $g$ is zero at the insulated parts $\Gamma^{\rm iso}$ of the micro-pipette. At the holes $\Gamma_{0},\dots,\Gamma_{n}$, the flux is given by

[TABLE]

where $J_{k}$ denotes the current density at hole $\Gamma_{k}$. Let us assume that the number of holes is fixed. Our design parameters $q\in\mathbb{R}^{s}$ are the vertical positions of the holes, given by their midpoints $m_{k}$, and their sizes $s_{k}$, $k=0,\dots,s/2$. The fluxes $J_{k}$ depend in a highly nonlinear way on $q$ (see the appendix). As the derivation of the analytical expressions $J_{k}(q)$ for more than two sets of holes is complex, we restrict our study to a maximum of two additional sets of holes besides the hole at the tip. A possible extension of this work would be to not rely on analytical expressions for the fluxes, but to extend the computational domain to the interior of the micro-pipette and approximate the fluxes with a finite element discretization. Since the restriction to two sets of holes does not limit the demonstration of the effectiveness of our approach, we use the analytical expressions derived in the appendix to reduce the computational effort.

4.2 Objective functional

Our aim is to maximize the region where the voltage exceeds a certain threshold $\overline{u}$. At the same time, we have to ensure that the voltage does not exceed a critical value $u_{\infty}\gg\overline{u}$, above which the biological tissue might be damaged. The corresponding objective functional would be

[TABLE]

where $\chi_{M}$ denotes the characteristic function of the set $M={\{x\in\Omega\,|\,u(x)\geq\overline{u}\}}$ . The main issue of the objective functional $J_{\chi}$ is its non-differentiability. For the gradient-based optimization algorithm we present here, the functional is required to be at least differentiable.

Therefore, we consider another objective functional

[TABLE]

where we choose a constant function $\hat{u}>\overline{u}$ that is used as a tracking term to reach the desired threshold. Moreover, to avoid the influence of a far-away region, where we cannot expect that any cell can be activated, we consider for the functional definition the previously defined domain of interest, i.e. the sub-domain $\Omega^{s}\subset\Omega$ around the micro-pipette.

Using this functional, the voltage $u$ does not exceed the critical value $u_{\infty}$ in the numerical simulations conducted for this paper. In fact, we have found that a value $\hat{u}$ slightly larger than the threshold $\overline{u}$ is a good choice to get above the threshold and to stay significantly below the critical value $u_{\infty}$.

The application poses additional restrictions for the design parameters $q$ . Each hole has to lie above the tip of the micro-pipette and below the upper end of the bounding box $y_{\rm tip}+s_{k}<m_{k}<y_{\rm up}-s_{k}$ and the holes should not overlap $m_{k}+s_{k}<m_{k+1}-s_{k+1}$ (otherwise the formulas derived in the appendix for the fluxes at the boundary are not valid anymore). In the numerical experiments conducted for this paper, however, these conditions were never violated. The addition of point-wise state constraints $u\leq u_{\infty}$ and/or control constraints $q\in Q_{\rm ad}\subset\mathbb{R}^{s}$ , using an admissible set $Q_{\rm ad}$ , can be done without significant changes in our approach using a penalty method and/or an active set strategy [21]. Hence to simplify the exposition, we do not incorporate state and control constraints in this work.
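The geometric restrictions above are simple inequalities and can be checked directly. The following is a minimal sketch, assuming the side holes are ordered from bottom to top; the function name `is_admissible` and the numerical values for `y_tip` and `y_up` are illustrative assumptions and not taken from the implementation used for the experiments.

```python
from typing import Sequence


def is_admissible(m: Sequence[float], s: Sequence[float],
                  y_tip: float, y_up: float) -> bool:
    """Check the geometric restrictions on midpoints m_k and sizes s_k.

    The holes are assumed to be ordered from bottom to top.
    """
    if len(m) != len(s):
        raise ValueError("m and s must have the same length")
    # Each hole has to lie above the tip and below the upper end of the box:
    # y_tip + s_k < m_k < y_up - s_k.
    for m_k, s_k in zip(m, s):
        if not (y_tip + s_k < m_k < y_up - s_k):
            return False
    # Neighbouring holes must not overlap: m_k + s_k < m_{k+1} - s_{k+1}.
    for k in range(len(m) - 1):
        if not (m[k] + s[k] < m[k + 1] - s[k + 1]):
            return False
    return True


# Hypothetical values in micrometres (y_tip and y_up are assumptions).
print(is_admissible([6.1, 15.8], [1.0, 2.0], y_tip=0.0, y_up=35.0))
print(is_admissible([6.0, 8.0], [1.0, 2.0], y_tip=0.0, y_up=35.0))
```

Such a check could serve as a safeguard after each Newton update when the constraints are not enforced by a penalty or active set method.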

Finally, we add a regularization term with parameter $\alpha>0$ to the objective functional as explained in Section 2. The optimization problem reads in variational formulation

Problem 4.1 (Optimization of the micro-pipette)

Find the pair $u\in\mathcal{V}=H^{1}_{0}(\Omega;\Gamma^{d})$ and $q\in\mathbb{R}^{s}$ that minimizes the goal functional $J$ under the PDE constraint:

[TABLE]

4.3 Karush-Kuhn-Tucker system

The Lagrange functional corresponding to problem 4.1 reads

[TABLE]

with an adjoint variable $\lambda\in\mathcal{V}=H^{1}_{0}(\Omega;\Gamma^{d})$. The first-order optimality conditions are given by:

Problem 4.2 (First-order optimality conditions)

Find $u\in\mathcal{V}$ , $q\in\mathbb{R}^{s}$ , $\lambda\in\mathcal{V}$ , such that

[TABLE]

4.4 Discretization and approximation of the flux boundary conditions

We use the space $\mathcal{V}_{h}$ of standard $Q_{1}$ finite elements on a quasi-uniform finite element mesh $\mathcal{T}_{h}$ . Altogether, the discrete optimality system reads:

Problem 4.3 (Discrete first-order optimality conditions)

Find $u_{h}\in\mathcal{V}_{h}$ , $q_{h}\in\mathbb{R}^{s}$ , $\lambda_{h}\in\mathcal{V}_{h}$ such that

[TABLE]

Denoting by $y$ the vertical position on the micro-pipette (see Figure 3) and by $y_{\rm tip}$ the position of the tip, it holds for the flux function $g$ that

[TABLE]

Note that $g$ is discontinuous in both the vertical coordinate $y$ and the parameter vector $q$. This is a problem, since the derivative $g^{\prime}(q)$ appears in the optimality system (4.3). Furthermore, Newton-type methods for solving (4.3) require at least the first derivative of the system, which includes $g^{\prime\prime}(q)$. To deal with this issue, we introduce a smooth approximation of $g$. Let $\chi_{[-1,1]}$ be the characteristic function of the interval $[-1,1]$. A smooth approximation of $\chi_{[-1,1]}$ is given by $\chi_{[-1,1]}(x)\approx\exp(-x^{2\beta})$. Based on this approximation, we define a regularized flux function $\tilde{g}$ by

[TABLE]

for some $\beta\in\mathbb{N}$ , see Figure 4.
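The effect of the exponent $\beta$ on the smooth approximation $\exp(-x^{2\beta})$ can be illustrated with a few lines of code. This is a sketch only: the helper `regularized_hole_profile` assumes that a hole with midpoint $m_{k}$ and size $s_{k}$ is mapped to $[-1,1]$ by the scaling $(y-m_{k})/s_{k}$, which is an assumption made here for illustration and may differ from the exact definition of $\tilde{g}$.

```python
import math


def smooth_indicator(x: float, beta: int = 2) -> float:
    """Smooth approximation exp(-x^(2*beta)) of the indicator of [-1, 1]."""
    return math.exp(-(x ** (2 * beta)))


def regularized_hole_profile(y: float, m_k: float, s_k: float,
                             beta: int = 2) -> float:
    """Profile of one hole; the scaling (y - m_k) / s_k is an assumption."""
    return smooth_indicator((y - m_k) / s_k, beta)


# The profile is close to 1 inside the interval and decays rapidly outside.
for x in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(x, smooth_indicator(x, beta=2), smooth_indicator(x, beta=8))
```

Larger values of $\beta$ give a sharper transition at $x=\pm 1$, at the price of larger derivatives with respect to the design parameters.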

Furthermore, we use a summed quadrature formula with sufficient integration points to evaluate the boundary integrals such that the decay of $\tilde{g}$ at the boundary of the openings is appropriately approximated.
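To indicate why enough integration points are needed, the following sketch applies a composite (summed) two-point Gauss rule to the profile $\exp(-x^{4})$, i.e. $\beta=2$. The choice of the two-point Gauss rule, the interval $[-3,3]$ and the numbers of subintervals are assumptions for illustration. The reference value uses the known identity $\int_{-\infty}^{\infty}\exp(-x^{4})\,dx=2\,\Gamma(5/4)$; the tails outside $[-3,3]$ are negligible.

```python
import math


def composite_gauss2(f, a: float, b: float, n: int) -> float:
    """Composite two-point Gauss-Legendre rule on n equal subintervals."""
    h = (b - a) / n
    offset = 0.5 * h / math.sqrt(3.0)  # Gauss points: midpoint +/- offset
    total = 0.0
    for i in range(n):
        mid = a + (i + 0.5) * h
        total += 0.5 * h * (f(mid - offset) + f(mid + offset))
    return total


def profile(x: float) -> float:
    """Smooth hole profile exp(-x^4), i.e. beta = 2."""
    return math.exp(-(x ** 4))


reference = 2.0 * math.gamma(1.25)
for n in (2, 8, 64):
    approx = composite_gauss2(profile, -3.0, 3.0, n)
    print(n, approx, abs(approx - reference))
```

With only a few subintervals the steep decay near $x=\pm 1$ is missed entirely, whereas a moderately fine subdivision resolves it well.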

4.5 Dual problem for the error estimation

As explained in Section 3.1, to estimate the discretization and iteration errors we need an approximation of the solution of an ad-hoc dual problem. In the specific case of the micro-pipette optimization, the dual problem reads

Problem 4.4 (Dual micro-pipette problem)

Find $z^{u},z^{\lambda}\in H^{1}_{0}(\Omega;\Gamma^{\text{d}})$ and $z^{q}\in\mathbb{R}^{s}$ such that

[TABLE]

This problem is discretized with finite elements to get the approximation $z_{h}=(z^{u}_{h},z^{\lambda}_{h},z^{q}_{h})$ . Furthermore, we observe that the system matrix of the dual problem is exactly the same matrix used in the Newton method to solve the primal problem, i.e. to solve the discrete KKT system (4.3). It is the Hessian of the Lagrange functional (21). It follows that the solution of the dual problem for the DWR method corresponds to one additional Newton step with a different right-hand side, which will be explicitly shown later in the algorithmic section 5.
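Because the Newton step and the dual problem share the same system matrix, a direct solver can in principle factorize the matrix once and reuse the factors for both right-hand sides. The sketch below illustrates this idea with a small dense LU factorization in pure Python. The $3\times 3$ matrix and the right-hand sides are made-up stand-ins for the Hessian, the residual and the dual right-hand side; it is an assumption of this sketch, not a statement about the implementation used for the experiments, that the factorization is reused.

```python
def lu_factor(A):
    """In-place style LU factorization with partial pivoting: P A = L U."""
    n = len(A)
    LU = [row[:] for row in A]
    perm = list(range(n))
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(LU[i][j]))
        if LU[p][j] == 0.0:
            raise ValueError("matrix is singular")
        LU[j], LU[p] = LU[p], LU[j]
        perm[j], perm[p] = perm[p], perm[j]
        for i in range(j + 1, n):
            LU[i][j] /= LU[j][j]
            for k in range(j + 1, n):
                LU[i][k] -= LU[i][j] * LU[j][k]
    return LU, perm


def lu_solve(LU, perm, b):
    """Solve A x = b using the factors computed by lu_factor."""
    n = len(LU)
    x = [b[perm[i]] for i in range(n)]
    for i in range(n):                    # forward substitution (unit L)
        for k in range(i):
            x[i] -= LU[i][k] * x[k]
    for i in reversed(range(n)):          # backward substitution (U)
        for k in range(i + 1, n):
            x[i] -= LU[i][k] * x[k]
        x[i] /= LU[i][i]
    return x


# Hypothetical symmetric indefinite stand-in for the Hessian of the Lagrangian.
H = [[2.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
rho = [1.0, 2.0, 3.0]          # stand-in for the Newton residual
dual_rhs = [0.0, 0.0, 1.0]     # stand-in for the dual right-hand side

factors = lu_factor(H)                               # factorize once
delta_w = lu_solve(*factors, [-r for r in rho])      # Newton update
z = lu_solve(*factors, dual_rhs)                     # dual solution
print(delta_w, z)
```

In this setting the dual solve costs only one additional pair of triangular solves, which is cheap compared to the factorization.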

4.6 A posteriori error estimators

We conclude this section by specifying the concrete error estimators for the KKT system (4.3). Following the derivation in Section 3, it holds that

[TABLE]

with $\eta_{h}$ and $\eta_{KKT}$ specified in (16) and (17). An evaluation of the integrals over the cells leads in general to poor local error indicators due to the oscillatory nature of the residuals, see [11]. To avoid this behavior we integrate the residuals cell-wise by parts obtaining boundary terms (jump terms) to distribute the error on the inner cell-edges. To simplify the notation we use the symbols without tilde implicitly considering that all quantities $w_{h}$ and $z_{h}$ are perturbed. Furthermore, we separate in (16) the contribution from the primal and the dual residuals, i.e. $\eta_{h}=\eta_{h}^{p}+\eta_{h}^{d}$ with

[TABLE]

with the boundary residuals $r_{h}(\cdot)$ defined on $V_{h}$ by

[TABLE]

In the error representation (15) both the continuous primal and dual solutions $w$ and $z$ are used as weights. In fact, using the continuous weights and considering the remainder term, this expression is an identity. It becomes an estimation after substituting the unknown continuous solutions with computable quantities. As shown in several applications [9, 30, 29, 26, 10, 5] one efficient method to produce computable quantities is the use of a patch-wise higher-order approximation of the terms $z_{h}$ and $w_{h}$ . In particular, we have used the following interpolation operator with $v_{h}\in V_{h}$

[TABLE]

where $I_{2h}^{(2)}:V_{h}\to V_{2h}^{(2)}$ denotes the nodal interpolation into the space of quadratic polynomials on the patch mesh $\mathcal{T}_{2h}$ obtained by joining together four cells patchwise, as shown in Figure 5.

In the case of uniformly refined meshes and smooth primal and dual solutions, this weight-approximation has been justified analytically in [4].
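The idea of the patch-wise higher-order interpolation can be illustrated in one space dimension, where a patch consists of two neighbouring cells and the piecewise linear function is replaced by the quadratic through the three patch nodes. The code below is a one-dimensional analogue written for illustration only; the setting of this paper uses bilinear elements and biquadratic interpolation on patches of four cells.

```python
def linear_interp(x, x0, x1, v0, v1):
    """Linear interpolation on the cell [x0, x1]."""
    t = (x - x0) / (x1 - x0)
    return (1.0 - t) * v0 + t * v1


def quadratic_patch_interp(x, nodes, values):
    """Lagrange quadratic through the three nodes of a two-cell patch."""
    (x0, x1, x2), (v0, v1, v2) = nodes, values
    l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
    l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
    l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
    return v0 * l0 + v1 * l1 + v2 * l2


def weight(x, nodes, values):
    """Approximate weight I_2h^(2) z_h - z_h at the point x."""
    x0, x1, x2 = nodes
    v0, v1, v2 = values
    if x <= x1:
        z_h = linear_interp(x, x0, x1, v0, v1)
    else:
        z_h = linear_interp(x, x1, x2, v1, v2)
    return quadratic_patch_interp(x, nodes, values) - z_h


# Nodal values of z(x) = x^2 on the patch [0, 1] with nodes 0, 0.5, 1.
nodes = (0.0, 0.5, 1.0)
values = (0.0, 0.25, 1.0)
print(weight(0.25, nodes, values))  # equals z - z_h here, since z is quadratic
```

For a quadratic function the approximate weight coincides with the exact weight $z-z_{h}$; for general smooth functions it is only an approximation, which is one source of effectivity indices different from one.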

The estimator for the iteration error is defined by

[TABLE]

5 Algorithms

In this section we introduce the algorithms that are compared in Section 6. In addition to the fully adaptive algorithm, we will also specify a global refinement strategy and a purely mesh adaptive algorithm. The latter is the standard Dual Weighted Residual method applied to (4.2). The residual $\rho^{k}$ of the k-th iterate $w^{k}$ is defined by

[TABLE]

The corresponding Hessian matrix is given by

[TABLE]

Let a tolerance $TOL_{KKT}$ for the Newton residual $\rho^{k}$ be given. We formulate the algorithms for a damped Newton method with a damping parameter $\alpha_{N}\in(0,1]$ . The full Newton step is obtained for $\alpha_{N}=1$ .
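The damped Newton iteration with a residual-based stopping criterion can be sketched on a scalar model problem. The residual $\rho(w)=w^{3}+w-2$ with root $w=1$ is a hypothetical stand-in for the KKT residual, and the function and argument names are chosen here for illustration.

```python
def damped_newton(residual, hessian, w0, alpha_n=1.0,
                  tol_kkt=1e-10, max_iter=200):
    """Damped Newton iteration, stopped when |residual| < tol_kkt."""
    w = w0
    history = [abs(residual(w))]
    for _ in range(max_iter):
        if history[-1] < tol_kkt:
            break
        delta = -residual(w) / hessian(w)   # Newton direction
        w = w + alpha_n * delta             # damped update
        history.append(abs(residual(w)))
    return w, history


def rho(w):
    """Scalar stand-in for the KKT residual (an assumption of this sketch)."""
    return w ** 3 + w - 2.0


def hess(w):
    """Derivative of rho, the stand-in for the Hessian."""
    return 3.0 * w ** 2 + 1.0


w_full, hist_full = damped_newton(rho, hess, 2.0, alpha_n=1.0)
w_damped, hist_damped = damped_newton(rho, hess, 2.0, alpha_n=0.5)
print(len(hist_full) - 1, len(hist_damped) - 1)  # number of Newton steps
```

With $\alpha_{N}=1$ the iteration converges quadratically near the solution, whereas $\alpha_{N}=0.5$ yields only linear convergence and therefore needs many more steps to reach the same tolerance.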

In Algorithm 1 we present the global refinement strategy with a given number of refinement steps $n_{ref}$ .

In this algorithm, neither the discretization nor the iteration error is estimated. Thus, we have no control over these errors, and the usual stopping criterion is based on a tolerance for a norm of the residual. This generic criterion gives no control over the inexactness of the optimal solution, which would be needed to proceed with the optimization on finer grids before machine precision is reached.

Next, we introduce the purely mesh adaptive Algorithm 2. We introduce a further tolerance $TOL$ for the discretization error. In addition to the notation introduced above, we denote the right-hand side of the dual problem by

[TABLE]

Based on the error indicators several refinement strategies can be derived. We refer to [6] for an overview. In this work we use a refinement strategy based on a minimization of the expected error and the computational effort required for the solution on the refined mesh, see [28].

Finally, we concretize the fully adaptive Algorithm 3 which is the novelty of this contribution.

It remains to specify the constant $c_{b}$ in Algorithm 3 to determine the balancing between $\eta_{h}^{k}$ and $\eta_{KKT}^{k}$ . A straightforward choice would be $c_{b}=1$ , i.e. to stop the Newton iteration, as soon as the iteration error is smaller than the discretization error. Nevertheless, we have decided in the numerical examples conducted for this paper to use a smaller value of $c_{b}$ . In this way, we obtain that the Newton method remains in the region of quadratic convergence on the next finer grid, once this convergence rate is achieved on the previous coarser one. In the numerical examples below, we have used $c_{b}=0.1$ . In general, the optimal choice for $c_{b}$ depends on the specific application.
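The role of the constant $c_{b}$ can be made concrete with a small sketch of the balancing test. It is assumed here that the criterion compares absolute values, $|\eta_{KKT}^{k}|\leq c_{b}\,|\eta_{h}^{k}|$; the function names and the estimator values are hypothetical and serve only to illustrate the effect of $c_{b}$.

```python
def iteration_error_balanced(eta_h, eta_kkt, c_b=0.1):
    """Balancing test, assumed here as |eta_KKT| <= c_b * |eta_h|."""
    return abs(eta_kkt) <= c_b * abs(eta_h)


def newton_steps_until_balanced(eta_h, eta_kkt_sequence, c_b=0.1):
    """Number of Newton steps after which the balancing test first holds."""
    for step, eta_kkt in enumerate(eta_kkt_sequence, start=1):
        if iteration_error_balanced(eta_h, eta_kkt, c_b):
            return step
    return None  # criterion not reached within the given sequence


# Hypothetical estimator values on one mesh level.
eta_h = 1e-3
eta_kkt_per_step = [1e-2, 5e-4, 1e-6]
print(newton_steps_until_balanced(eta_h, eta_kkt_per_step, c_b=1.0))
print(newton_steps_until_balanced(eta_h, eta_kkt_per_step, c_b=0.1))
```

In this made-up example the choice $c_{b}=1$ stops after two steps, while $c_{b}=0.1$ requires a third step, so that the iterate passed to the next mesh level is closer to the discrete solution.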

At first sight, the two adaptive algorithms look very similar. The main difference is the stopping criterion of the second while-loop, which depends on the balancing of the error estimators in the fully adaptive algorithm and on the Newton residual in the mesh-adaptive algorithm. The balancing criterion lets the Newton method iterate on the current grid only as long as needed to bring the iteration error down to the level of the discretization error, and thus saves unnecessary iterations on each level. However, in the fully adaptive algorithm a dual problem has to be solved within the second while-loop in every iteration, while in the mesh-adaptive algorithm a dual problem is solved only once on each mesh level. We investigate in the next section how this additional computational effort compares to the computational savings due to the improved stopping criterion.

6 Numerical results

In this section, we study different numerical examples to test the algorithms. First, we study a simple test problem on a slit domain in Section 6.1. The purpose of this test problem is to test the error estimators $\eta_{h}$ and $\eta_{KKT}$ . To this end, we compute effectivity indices and investigate if the estimators are relatively independent of each other. Then we test the algorithms for optimal electrode design in Section 6.2. We compare the fully adaptive algorithm to the two other refinement algorithms introduced in Section 5 with respect to errors, degrees of freedom and computational times. Finally, we compare the optimal results for 0, 1 and 2 sets of holes in addition to the opening at the tip.

All the numerical results presented in this section are obtained using the finite element library deal.II [3]. To calculate the lengthy first and second derivatives of $g(q)$ with respect to the design parameter $q$ (see the appendix) we have used the automatic differentiation tool ADOL-C [35]. The runtimes shown were obtained on a desktop computer with a 2.66 GHz Intel Core 2 Quad processor (Q9400). The linear systems within the Newton algorithm are solved with the direct solver UMFPACK [12].

6.1 Test problem on a slit domain

We start by studying a test problem on the slit domain

[TABLE]

see Figure 6 on the right. We set homogeneous Dirichlet data on the outer boundary except for the upper part $\Gamma^{\rm top}$ , where the Neumann condition $\partial_{n}u=q^{2}\pi\sin(\pi x)$ is imposed. As objective functional, we set $J(u,q)=\frac{1}{2}\int_{\Omega}(u-u_{0})^{2}dx+\frac{\alpha}{2}q^{2}$ , where $u_{0}=\frac{1}{\sigma}\sin(\pi x)\sin(\pi y)$ and $\sigma=1.72$ . The quantity of interest is $\mathcal{I}(u,q)=q^{2}$ .

To test the estimator $\eta_{h}$, we first show results of the global refinement strategy in Figure 6 on the left. The optimal solution $u$ shows a singularity in the gradient at the top of the slit. Therefore, the error in the goal functional ${\cal I}(q)$ as well as the error estimator decrease only linearly in the mesh size, as expected (see e.g. [13], Section 6.3.2). As the Newton iteration is run until the Newton residual satisfies $\rho^{k}<10^{-10}$, the iteration error is negligible and it holds $\eta\approx\eta_{h}$. The effectivity indices $I_{\rm eff}=\eta/(\mathcal{I}(q)-\mathcal{I}(q_{h}))$ on the finer grids are $I_{\rm eff}\approx 0.32$. These values show that the error estimator captures the order of magnitude of the error correctly, the deviation between estimator and true error being mainly due to the approximation of the weight $z-z_{h}$ by the interpolation $I_{2h}^{(2)}z_{h}-z_{h}$.

In Figure 7, we compare the results obtained by global mesh refinement to the adaptive strategies introduced in Section 5. We plot the error in ${\cal I}(q)$ against the number of degrees of freedom $N$ (left) and against the computational time (right). The errors of the two adaptive methods differ only marginally, hence we plot only the fully adaptive result in the left plot. In this example the error decreases approximately as $\mathcal{O}(N^{-1/2})=\mathcal{O}(h)$ for global refinement and as $\mathcal{O}(N^{-1})$ for adaptive mesh refinement. For example, the global refinement strategy yields an error of $7.4\cdot 10^{-4}$ with $526{,}850$ degrees of freedom, while the adaptive strategy reaches an error of $5.6\cdot 10^{-4}$ with only $7{,}722$ degrees of freedom. The solution $u_{\rm opt}$ on an adaptively refined mesh is shown in Figure 8 on the right.
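Convergence orders such as $\mathcal{O}(N^{-1/2})$ are typically read off from pairs of degrees of freedom and errors. The helper below is a generic sketch of this computation; the function name is an assumption, and the first example uses synthetic data (in two dimensions one global refinement step roughly quadruples $N$, and an $\mathcal{O}(h)$ error is halved). The last line only restates the two data points quoted above as a ratio of degrees of freedom.

```python
import math


def observed_order(n_coarse, err_coarse, n_fine, err_fine):
    """Exponent p in err ~ N^(-p), estimated from two (N, error) pairs."""
    return math.log(err_coarse / err_fine) / math.log(n_fine / n_coarse)


# Synthetic example: N is quadrupled and the error is halved, i.e. p = 1/2.
print(observed_order(1000, 2.0e-2, 4000, 1.0e-2))

# Ratio of the degrees of freedom quoted in the text for comparable errors.
dof_ratio = 526850 / 7722
print(round(dof_ratio, 1))
```

In this example the adaptive strategy thus needs roughly 68 times fewer degrees of freedom than global refinement to reach an error of comparable (even slightly smaller) size.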

With respect to computational times, the fully adaptive and the mesh-adaptive algorithm show asymptotically the same convergence behavior, with a clear advantage of the fully adaptive algorithm. To study this difference quantitatively, we show the number of Newton steps and the computational times in Figure 8 on the left.

From the third mesh level on, the mesh-adaptive algorithm needs two Newton steps on each mesh level, while the fully adaptive one needs only one. As mentioned in Section 5, the latter strategy requires the solution of a dual problem after each Newton step, while in the mesh-adaptive algorithm a dual problem has to be solved only once after the last Newton step on each mesh level. As the system matrices of the primal and dual problems have the same structure, the cost of solving them is comparable. Hence, two KKT systems have to be solved in the fully adaptive algorithm on each of the finer mesh levels, compared to three for the mesh-adaptive algorithm. This ratio of 2:3 can be observed in the computational times. To reduce the error in ${\cal I}(q)$ below $10^{-4}$, the fully adaptive algorithm needs for example 131 s, in contrast to 193 s for the mesh-adaptive one. Using global refinement, 1157 s are needed to reach this tolerance, see Figure 6.
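The counting argument can be written as a simple cost model, assuming that primal and dual KKT solves cost the same and dominate the runtime. The function name is chosen here for illustration.

```python
def kkt_solves_per_level(newton_steps: int, fully_adaptive: bool) -> int:
    """Number of KKT-type linear systems solved on one mesh level."""
    if fully_adaptive:
        # One dual solve after every Newton step.
        return 2 * newton_steps
    # One dual solve after the last Newton step only.
    return newton_steps + 1


fully = kkt_solves_per_level(1, fully_adaptive=True)    # 2 systems
mesh = kkt_solves_per_level(2, fully_adaptive=False)    # 3 systems
print(fully, mesh, fully / mesh)

# Measured times quoted in the text for reaching an error below 1e-4.
print(131 / 193)
```

The measured ratio $131/193\approx 0.68$ is close to the predicted ratio $2/3$.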

Finally, we want to test the iteration error indicator $\eta_{KKT}$ . As the iteration error of the Newton method decreases very quickly in the first Newton steps, this is barely visible in the previous calculations. To investigate the iteration error indicator in more detail, we use a damped Newton iteration with a damping factor that reduces the convergence rate of the iteration significantly.

Given an iterate $x^{k}$ , the next iterate is defined by

$x^{k+1}=x^{k}+\alpha_{N}\,\delta x,$

where $\delta x$ is the Newton direction and the damping parameter is chosen as $\alpha_{N}=0.5$. To keep the discretization error small, we choose a very fine mesh with $526{,}850$ nodes. The results are shown in Table 1. We observe that the discretization error estimator $\eta_{h}$ varies by less than 25% over the first iterates and then stays nearly constant. This indicates that the two error estimators are asymptotically independent. Moreover, we see that the iteration error dominates the overall error until the ninth iteration. Up to then, the effectivity index lies between $0.87$ and $1.16$, which indicates that the iteration error indicator is very reliable in this test problem. After about 14 iterations, the discretization error becomes dominant and the effectivity indices are about $0.32$, as observed in the previous test. In between, the two error contributions are of comparable size but of opposite sign, i.e. $\eta_{KKT}\approx-\eta_{h}$. Cancellation occurs and the effectivity index in step 11 becomes negative. This is typical when splitting the error into different contributions and cannot be avoided.
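The cancellation effect can be reproduced with made-up numbers. In the sketch below all values are hypothetical and not taken from Table 1; it assumes that $\eta_{h}$ underestimates the discretization error by the factor $0.32$ observed above, while $\eta_{KKT}$ is nearly exact and of opposite sign.

```python
def effectivity_index(eta_h, eta_kkt, true_error):
    """Effectivity index I_eff = (eta_h + eta_KKT) / true error."""
    return (eta_h + eta_kkt) / true_error


# Discretization error dominant: the index reflects the factor 0.32.
print(effectivity_index(3.2e-4, 0.0, 1.0e-3))

# Hypothetical intermediate iterate: true discretization error 1.0e-3,
# true iteration error -3.6e-4, hence a true total error of 6.4e-4.
i_eff = effectivity_index(3.2e-4, -3.5e-4, 1.0e-3 - 3.6e-4)
print(i_eff)
```

Although both estimators are individually of the right order of magnitude in this example, their sum has the wrong sign, so the effectivity index is negative.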

As the last example in this section, we study a problem in which the nonlinearity requires more Newton steps, in order to test the mesh-adaptive algorithm. To this end, we introduce a further nonlinearity in the state equation

[TABLE]

We set the right-hand side to $f(x,y)=2\pi^{2}\sin(\pi x)\sin(\pi y)$ and use the same objective functional $J(u,q)$ and the quantity of interest $\mathcal{I}(q)$ as above. We compare the number of Newton steps and the computational times of the fully adaptive and the mesh-adaptive algorithm in Table 2. On the finer mesh levels, the mesh-adaptive algorithm requires 5 Newton steps per mesh level, which means that 6 KKT systems have to be solved, while the fully adaptive algorithm refines the mesh after 2 Newton steps, i.e. 4 KKT systems have to be solved. Therefore, we observe again a ratio of roughly 2:3 in the computational times.

6.2 Optimal electrode design

In this section we consider the micro-pipette geometry described in Section 4. We use the Neumann boundary condition

[TABLE]

with $J_{k}(q)$ as described in the appendix, see (39) and (40). The optimization problem is given in (20), the continuous and discrete KKT systems in (4.2) and (4.3). First, we study the configuration with two openings on each side, keeping the sizes $s_{1}$ and $s_{2}$ fixed. The design parameters are thus the vertical positions of the holes, in terms of their midpoints $m_{1}$ and $m_{2}$.

The objective functional is given by

[TABLE]

where $\hat{u}=5$ , $\alpha=10^{-8}$ and a sub-domain $\Omega^{s}$ as shown in Figure 3. The domain $\Omega$ has a size of $40\mu m\times 60\mu m$ , the sub-domain $\Omega^{s}$ of $32\mu m\times 35\mu m$ and the sizes of the openings are fixed as $s_{0}=1.5\mu m,s_{1}=1\mu m,s_{2}=2\mu m$ . Moreover, the thickness of the micro-pipette wall is $d=0.5\mu m$ , the inclination angle of the micro-pipette is $\theta=22^{\circ}$ , the conductivity $\sigma\approx 1.72(\Omega m)^{-1}$ and the applied current on the top of the micro-pipette is $\overline{I}=50\mu A$ . We approximate the function $g$ by the differentiable function $\tilde{g}$ defined in (24) with $\beta=2$ . As goal functional, we consider again the error in the design parameter $\mathcal{I}(u,q)=q^{2}$ .

In Figure 9 we show the error estimators $\eta,\eta_{h}$ and $\eta_{KKT}$ as well as the effectivity index $I_{eff}$ in the course of the fully adaptive algorithm on the left side. On each mesh level one Newton step was enough to reduce the iteration error $\eta_{KKT}$ below the discretization error $\eta_{h}$ . On the other hand, $\eta_{KKT}$ is around $10^{-3}$ on the coarse mesh levels and the corresponding Newton residual $\rho_{Newton}$ is far away from being below the tolerance $TOL_{KKT}$ . Therefore, the purely mesh-adaptive algorithm needs more Newton steps (2-4) on each mesh level, see Table 3. Note that although the contribution of the iteration error to the goal functional is very small from the fourth mesh level ( $\eta_{KKT}<10^{-9}$ ) the Newton residual $\rho^{k}$ might still be much larger, such that the purely mesh adaptive algorithm needs at least a second Newton step before $\rho_{Newton}<TOL_{KKT}$ .

The effectivity indices are close to 1 on all fine mesh levels. The error is well estimated on all mesh levels besides the third one with 3670 nodes. Here, the error $\mathcal{I}(q)-\mathcal{I}(q_{h})$ changes its sign, which is not yet captured by the estimator. On the right, we show the optimal solution on an adaptively refined mesh. Note that most of the refinement takes place around the two upper openings of the micro-pipette. The optimal positions of the openings found by the adaptive algorithm for initial values $m_{1}^{0}=10\mu m$ and $m_{2}^{0}=20\mu m$ are $m_{1}=6.1\mu m$ and $m_{2}=15.8\mu m$ above the tip of the micro-pipette, see also Figure 2.

In Table 3, we compare the mesh adaptive against the fully adaptive algorithm in terms of Newton steps and computational times. As mentioned before, the iteration error $\eta_{KKT}$ is reduced below the discretization error $\eta_{h}$ already after the first Newton step in the fully adaptive algorithm, while 2 to 4 Newton steps are necessary in the mesh-adaptive algorithm to reduce the Newton residual below the tolerance $TOL_{KKT}=10^{-10}$ . On the finer meshes we have to solve 2 primal and 1 dual KKT systems for the mesh adaptive algorithm and 1 primal and 1 dual system for the fully adaptive algorithm. Thus, the computational times show again a ratio of roughly 2:3 on the finer meshes.

On the left side of Figure 10, we compare the global refinement algorithm against the adaptive ones by plotting the error against degrees of freedom. As the plots of the two adaptive algorithms are indistinguishable, we plot again only the errors for the fully adaptive algorithm. The global refinement strategy converges with a rate slightly smaller than ${\cal O}(N^{-1})={\cal O}(h^{2})$ , while the adaptive algorithms converge significantly faster. On the right side, we plot the error $|\mathcal{I}(q)-\mathcal{I}(q_{h})|$ over the computational time for the two adaptive algorithms. Due to the observations made above for the number of KKT systems to be solved, the two adaptive algorithms show again a very similar asymptotic behavior in terms of computational times, with an advantage of roughly $33\%$ for the fully adaptive algorithm.

Finally, we want to compare the effect of a different number of openings. In the case of only one opening at the bottom, nothing is to be optimized. Solving the state equation to obtain the electric field yields an objective value of $J(u)\approx 11360$ . The voltage distribution is shown in Figure 11 on the left. The black contour line corresponds to the threshold $\overline{u}=4$ .

For a fair comparison of micro-pipettes with one and two sets of openings, we have to optimize not only the position $m_{k}$ , but also the size $s_{k}$ of the openings. Due to the complicated structure of the boundary fluxes $g_{k}(q)$ (see the appendix), we decided not to implement the additional derivatives that would be necessary for a simultaneous optimization of size and position. Instead, we alternately optimize size and position by keeping the respective other parameters fixed, see Table 4 for the case of two openings.

The optimal parameters found by the optimization algorithm are the sizes $s_{1}=0.35\mu m$, $s_{2}=0.3\mu m$ and the positions $m_{1}=5.9\mu m$ and $m_{2}=19.7\mu m$ in the case of two openings, and $s_{1}=0.31\mu m$ and $m_{1}=18.2\mu m$ for one opening. The optimal functional value is $J(u,q)\approx 6217$ for one set of holes and $J(u,q)\approx 6018$ for two sets. While we obtain a reduction of more than $45\%$ between the case without a hole and the case with one set of holes, the reduction between one and two sets of holes is only around $3.2\%$. This can also be seen from the voltage distribution in Figure 11.
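The quoted reductions follow directly from the functional values reported above, as the following short computation shows. The variable names are chosen here for illustration.

```python
def relative_reduction(before: float, after: float) -> float:
    """Relative reduction of the objective value."""
    return (before - after) / before


j_tip_only = 11360.0   # only the opening at the tip
j_one_set = 6217.0     # one additional set of holes
j_two_sets = 6018.0    # two additional sets of holes

print(f"{relative_reduction(j_tip_only, j_one_set):.1%}")   # 45.3%
print(f"{relative_reduction(j_one_set, j_two_sets):.1%}")   # 3.2%
```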

These results show that the modified micro-pipettes yield a significantly larger region where cell membranes are made permeable. Moreover, the results indicate that more than one set of holes does not bring a significant further advantage, as a relatively large region around the micro-pipette is already activated by one hole per side. The second hole, which the optimization algorithm places relatively close to the tip of the micro-pipette, seems to have a much smaller effect. However, the overall voltage distribution is more uniform and the peak potential regions at the holes (red) are reduced, which may be advantageous for the health of the cells, with little or no loss of overall electroporation volume.

7 Conclusion & Outlook

We have presented an adaptive optimization algorithm for the optimal design of a micro-pipette for electroporation used for neuronal network tracing. The main contribution is the derivation of a goal-oriented strategy that makes it possible to steer the number of Newton steps by balancing the discretization error and the iteration error with respect to a (nonlinear) functional that represents a quantity of interest for the specific optimization problem.

A possible extension of this approach is the balancing of the linearization error and the error due to the linear solver as in [27]. Furthermore, more sophisticated optimization algorithms such as the one presented in [37] can be considered in this framework to increase the robustness of the optimization method. In addition, globalization techniques such as the one presented in [1], which make it possible to control the convergence behavior of the Newton method, become essential in certain practical cases and should be considered in future work.

Another possible extension is to include control and/or state constraints in the formulation. This is possible, as already mentioned, without significant changes in the approach presented here. Furthermore, the algorithm presented can be used and adapted to many other applications for which a Newton-type method can be applied.

Using the adaptive algorithm we have shown that quantitative improvement of the micro-pipette design can be obtained by a model-based optimization method. This approach is a promising tool with a significant impact in the neurosciences.

8 Appendix

In the appendix, we derive the flux function $g$ and its dependency on the sizes $s_{k}$ and positions $m_{k}$ of the $k$-th hole, $k=0,\dots,s/2$. A scheme of the electric circuit is shown in Figure 12. For simplicity, we present only the case of a micro-pipette with two holes on each side. The derivation of the corresponding formulas for a different number of holes is analogous.

We assume that a fixed current $\overline{I}$ is applied at the top of the micro-pipette. We calculate the current $I_{k}$ that flows out of the micro-pipette at the holes $k=0,\dots,2$ . The flux function $g_{k}$ at hole $k$ is then given by the current density $J_{k}$

[TABLE]

The micro-pipette is filled with a conducting liquid. We assign a specific resistance $R_{j}$ , $j=0,\dots,3$ , to each of the parts of the micro-pipette. The resistances of the conducting liquid in the small holes on the left and right in between the isolating wall are denoted by $R_{1}$ and $R_{2}$ , the resistances of the parts in the interior of the micro-pipette by $R_{3}$ and $R_{0}$ . Denoting the thickness of the wall by $d$ , the resistance of a hole is given by

[TABLE]

where $\rho=1/\sigma$ is the electrical resistivity. To calculate the resistance of the conical part below $x_{2}$ , we introduce the notation $a(x)$ for the area inside the micro-pipette at position $x$ , see Figure 13. The area $a_{0}=a(0)$ of $\Gamma_{0}$ at the tip of the micro-pipette is given by $a(0)=\pi s_{0}^{2}$ , the area $a_{1}=a(m_{1})$ of $\Gamma_{1}$ at the first hole by

[TABLE]

where $\theta$ is the inclination angle of the micro-pipette. The resistance of the conical part below $x_{2}$ is then given by (see e.g. [17])

[TABLE]
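The resistance of a conducting segment with variable cross-section can be checked numerically. The sketch below assumes the standard relation $R=\rho\int dx/a(x)$ and a circular cross-section whose radius grows linearly as $r(x)=r_{a}+x\tan\theta$; for this geometry the integral has the closed form $R=\rho L/(\pi r_{a}r_{b})$. Whether $\theta$ is measured in this way is an assumption, and the segment length is an illustrative value; the numbers are converted to SI units.

```python
import math


def resistance_numeric(rho, area, x_a, x_b, n=2000):
    """Approximate rho * integral of dx / a(x) by the composite midpoint rule."""
    h = (x_b - x_a) / n
    return rho * sum(h / area(x_a + (i + 0.5) * h) for i in range(n))


def resistance_truncated_cone(rho, r_a, r_b, length):
    """Closed form for a circular cross-section with linearly varying radius."""
    return rho * length / (math.pi * r_a * r_b)


rho = 1.0 / 1.72                 # resistivity in Ohm m (sigma = 1.72 (Ohm m)^-1)
theta = math.radians(22.0)       # inclination angle (convention assumed)
r_a = 1.5e-6                     # radius at the tip in m (s_0 = 1.5 micrometres)
length = 6.1e-6                  # illustrative segment length in m
r_b = r_a + length * math.tan(theta)


def area(x):
    """Cross-sectional area a(x) = pi * r(x)^2 of the assumed cone."""
    return math.pi * (r_a + x * math.tan(theta)) ** 2


r_num = resistance_numeric(rho, area, 0.0, length)
r_exact = resistance_truncated_cone(rho, r_a, r_b, length)
print(r_num, r_exact)
```

The agreement of both values indicates that closed-form expressions of this type can be validated with a few lines of numerical integration before they are differentiated with respect to the design parameters.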

Similarly, we get for the resistance $R_{3}$ of the part between the points $x_{1}$ and $x_{2}$

[TABLE]

The voltage difference between point $x_{2}$ and a point $\hat{x}$ far off the micro-pipette can be used to derive the following formula by using Ohm’s law (see Figure 12)

[TABLE]

Here, $R_{0,1,3}$ stands for the total resistance of the parts $R_{0}$ , $R_{1}$ and $R_{3}$ which is given by

[TABLE]

Furthermore, by Kirchhoff’s current law the current $\overline{I}$ splits at point $x_{2}$ to

[TABLE]

(35) and (36) can be solved for the two unknowns $I_{2}$ and $I_{3}$ . In the same way, it holds at point $x_{1}$

[TABLE]

and

[TABLE]

Given $I_{3}$ , (37) and (38) define $I_{0}$ and $I_{1}$ . Inserting the formulas for the resistances, a direct calculation results in

[TABLE]

with $c:=\cot(\theta)$ and

[TABLE]

As we use a two-dimensional setting, the size of $\Gamma_{k}$ is given by $|\Gamma_{k}|=s_{k}$. The flux function $g_{k}$ at hole $k$ is thus given by

[TABLE]

Acknowledgements

We are grateful to Prof. Andreas T. Schaefer for generous collaborative support and fruitful discussions of this work.

T.C. was supported by the Deutsche Forschungsgemeinschaft (DFG) through the project CA 633/2-1.

Bibliography

[1] M. Amrein and T. P. Wihler. Fully adaptive Newton–Galerkin methods for semilinear elliptic partial differential equations. SIAM Journal on Scientific Computing, 37(4):A1637–A1657, 2015.
[2] I. Babuška, J. R. Whiteman, and T. Strouboulis. Finite elements. An introduction to the method and error estimation. Oxford: Oxford University Press, 2011.
[3] W. Bangerth, R. Hartmann, and G. Kanschat. deal.II – a general purpose object oriented finite element library. ACM Trans. Math. Softw., 33(4):24/1–24/27, 2007.
[4] W. Bangerth and R. Rannacher. Adaptive Finite Element Methods for Differential Equations. Birkhäuser Verlag, 2003.
[5] R. Becker, M. Braack, D. Meidner, R. Rannacher, and B. Vexler. Adaptive finite element methods for PDE-constrained optimal control problems. In W. Jäger, R. Rannacher, and J. Warnatz, editors, Reactive Flows, Diffusion and Transport, pages 177–205. Springer Berlin Heidelberg, 2007.
[6] R. Becker and R. Rannacher. An optimal control approach to a posteriori error estimation in finite element methods. Acta Numerica, 10:1–102, 2001.
[7] C. Bernardi, J. Dakroub, G. Mansour, and T. Sayah. A posteriori analysis of iterative algorithms for a nonlinear problem. Journal of Scientific Computing, 65(2):672–697, 2015.
[8] M. Braack and A. Ern. A posteriori control of modeling errors and discretization errors. Multiscale Modeling & Simulation, 1(2):221–238, 2003.
