A multi-level ADMM algorithm for elliptic PDE-constrained optimization problems

Xiaotong Chen; Xiaoliang Song; Zixuan Chen; Bo Yu

arXiv:1908.04652·math.OC·August 14, 2019·Comput. Appl. Math.

TL;DR

This paper introduces a multi-level ADMM algorithm for efficiently solving elliptic PDE-constrained optimization problems with box constraints, combining mesh refinement, inexact subproblem solutions, and convergence guarantees.

Contribution

It proposes a novel multi-level ADMM method with mesh refinement and inexact subproblem solving, along with convergence analysis and complexity results.

Findings

01

The mADMM algorithm converges, with an iteration complexity of o(1/k).

02

Numerical experiments demonstrate high efficiency of the proposed method.

03

The multi-level strategy improves computational performance over fixed mesh approaches.

Abstract

In this paper, the elliptic PDE-constrained optimization problem with box constraints on the control is studied. To numerically solve the problem, we apply the 'optimize-discretize-optimize' strategy. Specifically, the alternating direction method of multipliers (ADMM) algorithm is applied in function space first, then the standard piecewise linear finite element approach is employed to discretize the subproblems in each iteration. Finally, some efficient numerical methods are applied to solve the discretized subproblems based on their structures. Motivated by the idea of the multi-level strategy, instead of fixing the mesh size before the computation process, we propose the strategy of gradually refining the grid. Moreover, the subproblems in each iteration are solved inexactly. Based on the strategies above, an efficient convergent multi-level ADMM (mADMM) algorithm is proposed. We…

Tables

Table 1. The convergence behavior of our mADMM, the ihADMM and the classical ADMM for Example 4.1.

h	#dofs	E	EOC	Index	mADMM	ihADMM	classical ADMM
$2^{-5}$	635	0.00168	-	residual $\eta$	8.46e-07	9.29e-07	8.32e-07
				CPU time (s)	0.21	0.61	0.95
				#iter	14	27	63
$2^{-6}$	2629	5.57e-04	1.5927	residual $\eta$	6.90e-07	8.78e-07	2.47e-07
				CPU time (s)	0.43	1.73	2.13
				#iter	15	26	30
$2^{-7}$	10697	2.05e-04	1.4420	residual $\eta$	7.14e-07	8.83e-07	9.03e-07
				CPU time (s)	1.23	7.40	30.46
				#iter	13	25	54
$2^{-8}$	43153	1.13e-04	0.8593	residual $\eta$	4.29e-07	7.16e-07	6.93e-07
				CPU time (s)	4.45	43.13	755.84
				#iter	13	25	119
$2^{-9}$	173345	6.68e-05	0.7584	residual $\eta$	1.22e-07	8.69e-07	2.68e-07
				CPU time (s)	39.15	384.39	39646.84
				#iter	14	24	380
$2^{-10}$	694849	-	-	residual $\eta$	8.7e-08	9.29e-07	3.58e-05
				CPU time (s)	553.37	6451.98	289753.72
				#iter	15	23	500
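
The EOC column of Table 1 appears to be consistent with the usual experimental order of convergence for mesh halving, EOC = log2(E_h / E_{h/2}). This excerpt does not state the definition, so that formula is an assumption. A minimal Python sketch that recomputes the column from the tabulated errors (the function name `eoc` is illustrative):

```python
import math

def eoc(errors):
    """Experimental order of convergence for successive mesh halvings.

    Assumes errors[i] belongs to mesh size h_i and h_{i+1} = h_i / 2.
    """
    return [math.log2(e_coarse / e_fine)
            for e_coarse, e_fine in zip(errors, errors[1:])]

# Error column E of Table 1 for h = 2^-5, ..., 2^-9.
table1_errors = [0.00168, 5.57e-04, 2.05e-04, 1.13e-04, 6.68e-05]
table1_eoc = eoc(table1_errors)
print([round(v, 4) for v in table1_eoc])
```

The printed values can be compared directly with the EOC entries listed for $h=2^{-6},\dots,2^{-9}$.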

Table 2. The convergence behavior of our mADMM, the ihADMM and the classical ADMM for Example 4.2.

h	#dofs	E	EOC	Index	mADMM	ihADMM	classical ADMM
$\sqrt{2}/2^{4}$	225	0.0172	-	residual $\eta$	9.26e-07	9.88e-07	9.98e-07
				CPU time (s)	0.31	0.28	0.43
				#iter	22	25	120
$\sqrt{2}/2^{5}$	961	6.71e-03	1.3580	residual $\eta$	7.83e-07	9.19e-07	7.25e-07
				CPU time (s)	0.85	0.69	0.87
				#iter	23	26	32
$\sqrt{2}/2^{6}$	3969	2.11e-03	1.6691	residual $\eta$	8.62e-07	7.10e-07	3.34e-07
				CPU time (s)	2.97	3.85	4.45
				#iter	24	30	32
$\sqrt{2}/2^{7}$	16129	8.02e-04	1.3956	residual $\eta$	9.80e-08	7.80e-08	8.86e-08
				CPU time (s)	14.92	37.39	91.15
				#iter	23	28	78
$\sqrt{2}/2^{8}$	65025	3.58e-04	1.1636	residual $\eta$	9.04e-07	9.62e-07	7.05e-07
				CPU time (s)	22.80	151.57	2457.02
				#iter	21	28	183
$\sqrt{2}/2^{9}$	261121	1.81e-04	0.9840	residual $\eta$	3.09e-07	8.43e-07	5.56e-07
				CPU time (s)	168.58	1469.13	40739.35
				#iter	22	30	283

Equations

$\min_{(y,u)\in Y\times U}$

s.t.

$y=0\ \text{on}\ \partial\Omega,$

$u\in U_{ad}=\{v(x)\mid a\leq v(x)\leq b,\ \text{a.e. on}\ \Omega\}\subseteq U,$

$Ly:=-\sum_{i,j=1}^{n}(a_{ij}y_{x_{i}})_{x_{j}}+c_{0}y,$

$Ly$

$y$

$\min_{(x,y)\in X\times Y}$

s.t.

$L_{\rho}(x,y,\lambda;\rho)=f(x)+g(y)+(\lambda,Ax+By-c)+\frac{\rho}{2}\|Ax+By-c\|^{2},$

$\left\{\begin{aligned} x^{k+1}&=\arg\min L_{\rho}(x,y^{k},\lambda^{k};\rho),\\ y^{k+1}&=\arg\min L_{\rho}(x^{k+1},y,\lambda^{k};\rho),\\ \lambda^{k+1}&=\lambda^{k}+\tau\rho(Ax^{k+1}+By^{k+1}-c).\end{aligned}\right.$

$\min_{(u,z)\in U\times Z}$

s.t.

$\delta_{U_{ad}}(z)=\begin{cases}0,& z\in U_{ad},\\ \infty,& z\notin U_{ad}.\end{cases}$

$L_{\sigma}(u,z,\lambda;\sigma)=\hat{J}(u)+\delta_{U_{ad}}(z)+\langle\lambda,u-z\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u-z\|_{L^{2}(\Omega)}^{2},$

$\left\{\begin{aligned} \bar{u}^{k+1}&=\arg\min\ \hat{J}(u)+\langle\bar{\lambda}^{k},u-\bar{z}^{k}\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u-\bar{z}^{k}\|_{L^{2}(\Omega)}^{2},\\ \bar{z}^{k+1}&=\arg\min\ \delta_{U_{ad}}(z)+\langle\bar{\lambda}^{k},\bar{u}^{k+1}-z\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|\bar{u}^{k+1}-z\|_{L^{2}(\Omega)}^{2},\\ \bar{\lambda}^{k+1}&=\bar{\lambda}^{k}+\sigma(\bar{u}^{k+1}-\bar{z}^{k+1}).\end{aligned}\right.$

$\left\{\begin{aligned} &\text{Step 1: Compute an approximate solution } u^{k+1} \text{ of } \min_{u}\ \hat{J}(u)+\langle\lambda^{k},u-z^{k}\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u-z^{k}\|_{L^{2}(\Omega)}^{2}\\ &\qquad\text{such that the error vector } \delta_{u}^{k+1}:=\nabla\hat{J}(u^{k+1})+\lambda^{k}+\sigma(u^{k+1}-z^{k}) \text{ satisfies } \|\delta_{u}^{k+1}\|_{L^{2}(\Omega)}\leq\epsilon_{k}.\\ &\text{Step 2: } z^{k+1}=\arg\min\ \delta_{U_{ad}}(z)+\langle\lambda^{k},u^{k+1}-z\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u^{k+1}-z\|_{L^{2}(\Omega)}^{2}.\\ &\text{Step 3: } \lambda^{k+1}=\lambda^{k}+\tau\sigma(u^{k+1}-z^{k+1}).\end{aligned}\right.$

$\frac{\rho_{T}}{\sigma_{T}}\leq\kappa,\qquad\frac{h}{\rho_{T}}\leq\tau$

$|\Omega\setminus\Omega_{h}|\leq ch^{2},$

$Y_{h}:=\{y_{h}\in C(\bar{\Omega})\mid y_{h|T}\in\mathcal{P}_{1},\ \forall T\in\mathcal{T}_{h},\ y_{h}=0\ \text{in}\ \bar{\Omega}\setminus\Omega_{h}\},$

$U_{h}:=\{u_{h}\in C(\bar{\Omega})\mid u_{h|T}\in\mathcal{P}_{1},\ \forall T\in\mathcal{T}_{h},\ u_{h}=0\ \text{in}\ \bar{\Omega}\setminus\Omega_{h}\},$

$\int_{\Omega_{h}}\Big(\sum_{i,j=1}^{n}a_{ij}(y_{h})_{x_{i}}(v_{h})_{x_{j}}+c_{0}y_{h}v_{h}\Big)dx=\int_{\Omega_{h}}(u+y_{r})v_{h}\,dx,\quad\forall v_{h}\in Y_{h}.$

$\|y-y_{h}\|_{L^{2}(\Omega)}+h\|\nabla y-\nabla y_{h}\|_{L^{2}(\Omega)}\leq ch^{2}(\|u\|_{L^{2}(\Omega)}+\|y_{r}\|_{L^{2}(\Omega)}).$

$\phi_{i}(x)\geqslant 0,\quad\|\phi_{i}(x)\|_{\infty}=1\quad\forall i=1,\dots,N_{h},\qquad\sum_{i=1}^{N_{h}}\phi_{i}(x)=1.$

$I_{h}w(x):=\sum_{i=1}^{N_{h}}w(x_{i})\phi_{i}(x)\quad\text{for any}\ w\in L^{2}(\Omega).$

$\|w-I_{h}w\|_{W^{m,q}(\Omega)}\leq c_{I}h^{k+1-m}\|w\|_{W^{k+1,p}(\Omega)},$

$\hat{J}_{h_{k+1}}(u_{h_{k+1}}):=\frac{1}{2}\|S_{h_{k+1}}(u_{h_{k+1}}+I_{h_{k+1}}y_{r})-I_{h_{k+1}}y_{d}\|_{L^{2}(\Omega)}^{2}+\frac{\alpha}{2}\|u_{h_{k+1}}\|_{L^{2}(\Omega)}^{2}.$

$U_{ad,h_{k+1}}:=U_{h_{k+1}}\cap U_{ad}=\Big\{z_{h_{k+1}}=\sum_{i=1}^{N_{h_{k+1}}}z_{i}\phi_{i}(x)\ \Big|\ a\leq z_{i}\leq b,\ \forall i=1,\dots,N_{h_{k+1}}\Big\}\subset U_{ad}.$

$\min_{u_{h_{k+1}}}\ \hat{J}_{h_{k+1}}(u_{h_{k+1}})+\langle\lambda_{h_{k+1}}^{k},u_{h_{k+1}}-z_{h_{k+1}}^{k}\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u_{h_{k+1}}-z_{h_{k+1}}^{k}\|_{L^{2}(\Omega)}^{2}$

$z_{h_{k+1}}^{k+1}=\arg\min\ \delta_{U_{ad,h_{k+1}}}(z_{h_{k+1}})+\langle\lambda_{h_{k+1}}^{k},u_{h_{k+1}}^{k+1}-z_{h_{k+1}}\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u_{h_{k+1}}^{k+1}-z_{h_{k+1}}\|_{L^{2}(\Omega)}^{2}.$

$\lambda_{h_{k+1}}^{k+1}=I_{h_{k+1}}\lambda_{h_{k}}^{k}+\tau\sigma(u_{h_{k+1}}^{k+1}-z_{h_{k+1}}^{k+1}).$

$K_{h}:=\big(a(\phi_{i},\phi_{j})\big)_{i,j=1}^{N_{h}},\qquad M_{h}:=\Big(\int_{\Omega_{h}}\phi_{i}\phi_{j}\,dx\Big)_{i,j=1}^{N_{h}}.$

$f(u)$

$g(z)$

Taxonomy

Topics: Advanced Numerical Methods in Computational Mathematics · Matrix Theory and Algorithms · Numerical methods for differential equations

Full text

A multi-level ADMM algorithm for elliptic PDE-constrained optimization problems

Xiaotong Chen School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116025, China. ([email protected], [email protected], [email protected]).

Xiaoliang Song Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China. ([email protected]).

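Zixuan Chen School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116025, China.

Bo Yu School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116025, China.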

Abstract

In this paper, the elliptic PDE-constrained optimization problem with box constraints on the control is studied. To numerically solve the problem, we apply the ‘optimize-discretize-optimize’ strategy. Specifically, the alternating direction method of multipliers (ADMM) algorithm is applied in function space first, then the standard piecewise linear finite element approach is employed to discretize the subproblems in each iteration. Finally, some efficient numerical methods are applied to solve the discretized subproblems based on their structures. Motivated by the idea of the multi-level strategy, instead of fixing the mesh size before the computation process, we propose the strategy of gradually refining the grid. Moreover, the subproblems in each iteration are solved inexactly. Based on the strategies above, an efficient convergent multi-level ADMM (mADMM) algorithm is proposed. We present the convergence analysis and the iteration complexity results $o(1/k)$ of the proposed algorithm for the PDE-constrained optimization problems. Numerical results show the high efficiency of the mADMM algorithm.

keywords:

PDE-constrained optimization, ADMM, multi-level, convergence analysis

AMS:

49N05, 49M25, 65N12, 68W15

1 Introduction

In this paper, we consider the elliptic PDE-constrained optimization problem with box constraints on the control:

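$$\begin{aligned}\min_{(y,u)\in Y\times U}\ & J(y,u)=\frac{1}{2}\|y-y_{d}\|_{L^{2}(\Omega)}^{2}+\frac{\alpha}{2}\|u\|_{L^{2}(\Omega)}^{2}\\ \text{s.t.}\ & Ly=u+y_{r}\ \text{in}\ \Omega,\\ & y=0\ \text{on}\ \partial\Omega,\\ & u\in U_{ad}=\{v(x)\mid a\leq v(x)\leq b,\ \text{a.e. on}\ \Omega\}\subseteq U,\end{aligned}\tag{P}$$

where $Y:=H_{0}^{1}(\Omega)$, $U:=L^{2}(\Omega)$, $\Omega\subseteq\mathbb{R}^{n}$ $(n=2,3)$ is a convex, open and bounded domain with $C^{1,1}$- or polygonal boundary; the desired state $y_{d}\in L^{2}(\Omega)$ and the source term $y_{r}\in L^{2}(\Omega)$ are given; the parameters satisfy $\alpha>0$ and $-\infty<a<b<+\infty$; and $L$ is the uniformly elliptic differential operator given by

$$Ly:=-\sum_{i,j=1}^{n}(a_{ij}y_{x_{i}})_{x_{j}}+c_{0}y,$$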

where $a_{ij},c_{0}\in L^{\infty}(\Omega),c_{0}\geqslant 0,a_{ij}=a_{ji},\sum\limits_{i,j=1}^{n}a_{ij}(x)\xi_{i}\xi_{j}\geqslant\theta\|\xi\|^{2},\ \rm a.a.\ x\in\Omega,\forall\xi\in\mathbb{R}^{n}.$

By classical PDE theory, for a given $y_{r}\in L^{2}(\Omega)$ and every $u\in L^{2}(\Omega)$, the elliptic PDE involved in ( $\mathrm{P}$ ),

$$Ly=u+y_{r}\ \text{in}\ \Omega,\qquad y=0\ \text{on}\ \partial\Omega,\tag{1.1}$$

has a unique weak solution $y=y(u):=S(u+y_{r})$, where $S:L^{2}(\Omega)\rightarrow H_{0}^{1}(\Omega)$ denotes the solution operator, which is well defined, linear, continuous and injective [18, Theorem B.4].

There are two common approaches to tackling optimization problems with PDE constraints numerically: First discretize, then optimize, and First optimize, then discretize [16]. In the first approach, the discretization is applied to the original PDE-constrained optimization problem, while in the second one, the discretization is applied to the KKT system of the PDE-constrained optimization problem. In contrast to both, the ‘optimize-discretize-optimize’ strategy proposed by Song in [22] does not discretize problem ( $\mathrm{P}$ ) or its KKT system directly. First, the optimization algorithm is formulated in the continuous function space; then the subproblems in each iteration are discretized by the standard piecewise linear finite element approach; finally, numerical optimization methods are applied to solve the discretized subproblems. The advantage of this strategy is that the function-space algorithm exposes the structure of the PDE-constrained optimization problem, which is important for choosing an appropriate discretization of the subproblems. Moreover, the strategy gives the freedom to discretize the subproblems differently, so that each of them can be solved more efficiently.

In this paper, we focus on the alternating direction method of multipliers (ADMM), which was originally proposed by Glowinski et al. and Gabay et al. in [9, 11] and has been broadly used in many areas. Motivated by the success of ADMM in solving large-scale finite-dimensional optimization problems [2, 7, 20, 21], ADMM-type methods have been used to solve PDE-constrained optimization problems in function space in [4, 19, 23, 27]. However, these works all follow the First discretize, then optimize approach and, to the best of our knowledge, very little work has been done on applying the First optimize, then discretize approach to PDE-constrained optimization problems with ADMM-type methods. We use ADMM as the outer optimization algorithm of the ‘optimize-discretize-optimize’ strategy, propose a new algorithm that solves PDE-constrained optimization problems efficiently, and give its convergence analysis.

First, we briefly give the iterative format of the classical ADMM for the 2-block convex optimization problem with linear constraints:

$$\min_{(x,y)\in X\times Y}\ f(x)+g(y)\quad\text{s.t.}\quad Ax+By=c,$$

where $X$ , $Y$ , $\Lambda$ are finite dimensional real Euclidean spaces, $f(x):X\rightarrow(-\infty,+\infty]$ , $g(y):Y\rightarrow(-\infty,+\infty]$ are convex (possibly nonsmooth) functions, $A:X\rightarrow\Lambda$ , $B:Y\rightarrow\Lambda$ are linear mappings, and $c\in\Lambda$ is given. The augmented Lagrangian function is given by

$$L_{\rho}(x,y,\lambda;\rho)=f(x)+g(y)+(\lambda,Ax+By-c)+\frac{\rho}{2}\|Ax+By-c\|^{2},$$

where $\lambda\in\Lambda$ denotes the Lagrange multiplier, $\rho>0$ is the penalty parameter.

Given $(x^{0},y^{0},\lambda^{0})\in{\rm dom}f\times{\rm dom}g\times\Lambda$ , penalty parameter $\rho>0$ and the step size parameter $\tau\in\left(0,\frac{\sqrt{5}+1}{2}\right)$ , then the iterative format of the classical ADMM is as follows:

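$$\left\{\begin{aligned} x^{k+1}&=\arg\min L_{\rho}(x,y^{k},\lambda^{k};\rho),\\ y^{k+1}&=\arg\min L_{\rho}(x^{k+1},y,\lambda^{k};\rho),\\ \lambda^{k+1}&=\lambda^{k}+\tau\rho(Ax^{k+1}+By^{k+1}-c).\end{aligned}\right.$$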

The advantage of ADMM is that it separates $f(x)$ and $g(y)$ into two subproblems. In each subproblem, there is only one variable, the other one is fixed. Thus each subproblem could be solved easily and efficiently. For the classical ADMM, the convergence analysis was first conducted by [8, 9, 10], and for the recent interesting new developments on the convergence analysis of ADMM-type method, see [14, 25, 26].
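
The splitting described above can be illustrated on a toy problem that is not taken from the paper: $f(x)=\frac{1}{2}\|x-d\|^{2}$, $g(y)=\mu\|y\|_{1}$, $A=I$, $B=-I$, $c=0$. Both subproblems then have closed-form solutions, and the minimizer is the soft-thresholding of $d$. The following Python sketch assumes this toy setting; the names `admm_l1` and `soft_threshold` are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_l1(d, mu, rho=1.0, tau=1.0, iters=200):
    """Classical 2-block ADMM for min 0.5*||x-d||^2 + mu*||y||_1  s.t.  x - y = 0."""
    x = np.zeros_like(d)
    y = np.zeros_like(d)
    lam = np.zeros_like(d)
    for _ in range(iters):
        # x-step: minimize the augmented Lagrangian in x with (y, lam) fixed.
        x = (d - lam + rho * y) / (1.0 + rho)
        # y-step: minimize in y with the new x fixed (a shrinkage step).
        y = soft_threshold(x + lam / rho, mu / rho)
        # Multiplier update with step size tau in (0, (1 + sqrt(5)) / 2).
        lam = lam + tau * rho * (x - y)
    return x, y, lam

d = np.array([2.0, 0.3, -1.0])
x, y, lam = admm_l1(d, mu=0.5)
print(y)  # compare with soft_threshold(d, 0.5)
```

Each step touches only one block of variables, which is the property the paragraph above refers to.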

To apply ADMM-type method to solve the problem ( $\mathrm{P}$ ), we use the solution operator $S$ and introduce an artificial variable $z$ , then we equivalently rewrite problem ( $\mathrm{P}$ ) as the following reduced form:

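$$\min_{(u,z)\in U\times Z}\ \hat{J}(u)+\delta_{U_{ad}}(z)\quad\text{s.t.}\quad u=z,\tag{RP}$$

where $\hat{J}(u):=\dfrac{1}{2}\|S(u+y_{r})-y_{d}\|_{L^{2}(\Omega)}^{2}+\dfrac{\alpha}{2}\|u\|_{L^{2}(\Omega)}^{2}$ denotes the reduced cost function and $\delta_{U_{ad}}(z)$ denotes the indicator function of $U_{ad}$ ,

$$\delta_{U_{ad}}(z)=\begin{cases}0,& z\in U_{ad},\\ \infty,& z\notin U_{ad}.\end{cases}$$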

The augmented Lagrangian function of (RP) is defined as follows:

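$$L_{\sigma}(u,z,\lambda;\sigma)=\hat{J}(u)+\delta_{U_{ad}}(z)+\langle\lambda,u-z\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u-z\|_{L^{2}(\Omega)}^{2},$$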

where $\lambda\in L^{2}(\Omega)$ denotes the Lagrangian multiplier, $\sigma>0$ is the penalty parameter.

The classical ADMM in finite dimensional spaces can be extended directly in Hilbert space for problem (RP). Given $(u^{0},z^{0},{\lambda}^{0})\in L^{2}(\Omega)\times{\rm dom}(\delta_{U_{ad}}(\cdot))\times L^{2}(\Omega)$ , parameters $\sigma>0$ , $\tau\in\left(0,\dfrac{1+\sqrt{5}}{2}\right)$ , the iterative scheme is presented as follows:

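$$\left\{\begin{aligned} \bar{u}^{k+1}&=\arg\min\ \hat{J}(u)+\langle\bar{\lambda}^{k},u-\bar{z}^{k}\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u-\bar{z}^{k}\|_{L^{2}(\Omega)}^{2},\\ \bar{z}^{k+1}&=\arg\min\ \delta_{U_{ad}}(z)+\langle\bar{\lambda}^{k},\bar{u}^{k+1}-z\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|\bar{u}^{k+1}-z\|_{L^{2}(\Omega)}^{2},\\ \bar{\lambda}^{k+1}&=\bar{\lambda}^{k}+\sigma(\bar{u}^{k+1}-\bar{z}^{k+1}).\end{aligned}\right.$$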
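
For the box set $U_{ad}$, the $z$-subproblem of ADMM applied to (RP), namely minimizing $\delta_{U_{ad}}(z)+\langle\bar{\lambda}^{k},\bar{u}^{k+1}-z\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|\bar{u}^{k+1}-z\|_{L^{2}(\Omega)}^{2}$ over $z$, has a closed-form solution. The following short derivation completes the square; the projection symbol $\Pi_{U_{ad}}$ is notation introduced here for illustration.

```latex
\begin{aligned}
\bar z^{k+1}
 &= \arg\min_{z\in U_{ad}}\ \langle \bar\lambda^{k}, \bar u^{k+1}-z\rangle_{L^2(\Omega)}
    + \frac{\sigma}{2}\,\|\bar u^{k+1}-z\|_{L^2(\Omega)}^2 \\
 &= \arg\min_{z\in U_{ad}}\ \frac{\sigma}{2}\,
    \Big\| z-\Big(\bar u^{k+1}+\frac{\bar\lambda^{k}}{\sigma}\Big)\Big\|_{L^2(\Omega)}^2 \\
 &= \Pi_{U_{ad}}\Big(\bar u^{k+1}+\frac{\bar\lambda^{k}}{\sigma}\Big),
 \qquad
 \Pi_{U_{ad}}(v)(x)=\max\{a,\min\{v(x),b\}\}\ \text{ for a.a. } x\in\Omega .
\end{aligned}
```

The second line differs from the first only by the constant $\|\bar\lambda^{k}\|_{L^{2}(\Omega)}^{2}/(2\sigma)$, so the minimizer is unchanged.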

Computing the exact solution of each subproblem is usually expensive and unnecessary, even when it is feasible. It is therefore natural to use iterative methods, such as Krylov-based methods, to solve the subproblems, which are equivalent to large-scale or ill-conditioned linear systems. The inexact ADMM algorithm in finite-dimensional spaces and its convergence under certain error criteria have been studied extensively in recent years (see [2, 20]). Taking the inexactness of the solutions in function space into account, Song et al. extended the inexact ADMM algorithm to Hilbert spaces for PDE-constrained optimization problems and presented convergence results in [24]. Let $(u^{0},z^{0},{\lambda}^{0})\in L^{2}(\Omega)\times{\rm dom}(\delta_{U_{ad}}(\cdot))\times L^{2}(\Omega)$ , $\sigma>0$ and $\tau\in\left(0,\dfrac{1+\sqrt{5}}{2}\right)$ be given, and let $\{\epsilon_{k}\}_{k=0}^{\infty}\subseteq[0,+\infty)$ be a sequence satisfying $\sum_{k=0}^{\infty}\epsilon_{k}<\infty$ . The iterative scheme of the inexact ADMM in function space for (RP) is as follows:

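Step 1: Compute an approximate solution $u^{k+1}$ of

$$\min_{u}\ \hat{J}(u)+\langle\lambda^{k},u-z^{k}\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u-z^{k}\|_{L^{2}(\Omega)}^{2}$$

such that the error vector $\delta_{u}^{k+1}:=\nabla\hat{J}(u^{k+1})+\lambda^{k}+\sigma(u^{k+1}-z^{k})$ satisfies $\|\delta_{u}^{k+1}\|_{L^{2}(\Omega)}\leq\epsilon_{k}$ .

Step 2: $z^{k+1}=\arg\min\ \delta_{U_{ad}}(z)+\langle\lambda^{k},u^{k+1}-z\rangle_{L^{2}(\Omega)}+\frac{\sigma}{2}\|u^{k+1}-z\|_{L^{2}(\Omega)}^{2}$ .

Step 3: $\lambda^{k+1}=\lambda^{k}+\tau\sigma(u^{k+1}-z^{k+1})$ .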
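
A finite-dimensional surrogate can make the inexact scheme concrete. In the Python sketch below, the solution operator is replaced by a hypothetical matrix `S`, the $L^{2}$ inner product by the Euclidean one, and the $u$-subproblem (a linear system) is solved by a hand-written conjugate gradient loop that stops once the residual norm is at most $\epsilon_{k}=1/(k+1)^{2}$. For this quadratic $\hat{J}$, that residual coincides with the error vector $\delta_{u}^{k+1}$. The matrix, the data and all names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def cg(apply_H, rhs, x0, tol, max_iter=500):
    """Conjugate gradients for H x = rhs, stopped once ||rhs - H x|| <= tol."""
    x = x0.copy()
    r = rhs - apply_H(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:
            break
        Hp = apply_H(p)
        step = rs / (p @ Hp)
        x += step * p
        r -= step * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def inexact_admm(S, y_d, alpha, a, b, sigma=1.0, tau=1.0, iters=300):
    """Inexact ADMM for min 0.5*||S u - y_d||^2 + 0.5*alpha*||u||^2, a <= u <= b."""
    n = S.shape[1]
    u = np.zeros(n)
    z = np.clip(np.zeros(n), a, b)      # feasible starting point z^0
    lam = np.zeros(n)

    def apply_H(v):
        # Hessian of the u-subproblem: S^T S + (alpha + sigma) I.
        return S.T @ (S @ v) + (alpha + sigma) * v

    for k in range(iters):
        eps_k = 1.0 / (k + 1) ** 2      # summable tolerance sequence
        rhs = S.T @ y_d - lam + sigma * z
        u = cg(apply_H, rhs, u, eps_k)  # Step 1: inexact u-update
        z = np.clip(u + lam / sigma, a, b)  # Step 2: projection onto the box
        lam = lam + tau * sigma * (u - z)   # Step 3: multiplier update
    return u, z, lam

# Hypothetical test data: a symmetric smoothing matrix standing in for S.
idx = np.arange(20)
S = 0.5 ** np.abs(idx[:, None] - idx[None, :]) / 3.0
y_d = np.linspace(-2.0, 2.0, 20)
u, z, lam = inexact_admm(S, y_d, alpha=1.0, a=-0.2, b=0.2)
grad = S.T @ (S @ u - y_d) + 1.0 * u
print(np.linalg.norm(u - z), np.linalg.norm(u - np.clip(u - grad, -0.2, 0.2)))
```

The two printed quantities are the primal residual $\|u-z\|$ and a projected-gradient optimality residual; both should be small if the iteration has converged.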

In [24], the discretized problem is considered and the level of discretization is fixed. In this paper, instead of considering the discretized problem, we employ the finite element method to discretize the subproblems in each iteration of the inexact ADMM algorithm. This strategy gives the freedom and flexibility to discretize the subproblems by different discretization schemes. The total error of the proposed algorithm for (RP) consists of two parts: the discretization error and the iteration error resulting from solving the discretized subproblems inexactly. Because of these two errors, our algorithm can be viewed as an approximation of the exact ADMM in function space, and we therefore regard it as an inexact ADMM algorithm in function space. To guarantee convergence, we control both the mesh size and the inexactness with which the subproblems are solved.

In classical finite element based ADMM-type algorithms for PDE-constrained optimization problems, the subproblems are always discretized on a fixed mesh, which results in large-scale optimization problems. It is therefore important to reduce the computational cost. The multi-grid method is a modern field of research that started with the works of Brandt [1] and Hackbusch [12, 13], and it is well known that multi-grid methods solve elliptic problems with optimal computational complexity. Motivated by the idea of applying multi-grid methods to infinite-dimensional problems solved by Newton's method in [6, 12], we apply a multi-level strategy to our algorithm. In the initial stage of the algorithm, only a relatively low iteration precision is required, so a coarse mesh is sufficient. As the iteration proceeds, higher and higher precision is required, and a finer mesh becomes necessary. Gradually refining the grid in this way can substantially reduce the computational cost compared with computing on a fixed fine mesh throughout.

The main contribution of this paper is to give reasonable strategies for gradually refining the grid and for solving the subproblems inexactly, and to propose an efficient, convergent multi-level ADMM (mADMM) algorithm. Specifically, we apply the ADMM algorithm in function space, then employ the standard piecewise linear finite element approach and gradually refine the grid for the subproblems appearing in each iteration. The discretized subproblems are solved inexactly; for example, the smooth subproblems can be solved by iterative methods such as Krylov-based methods. With these strategies, the problem can be solved more efficiently and the computational cost is reduced significantly. Moreover, we give the convergence analysis as well as the iteration complexity result $o(1/k)$ for the mADMM algorithm.

The paper is organized as follows. In section 2, we apply the inexact ADMM algorithm in function space first, then use finite element method to discretize the associated subproblems and propose the multi-level ADMM (mADMM) algorithm. Section 3 gives the convergence analysis and the iteration complexity of the proposed mADMM algorithm. The numerical results are given in section 4. Section 5 contains a brief summary of this paper.

2 A multi-level ADMM algorithm

In this section, we apply the ‘optimize-discretize-optimize’ strategy and propose an efficient convergent multi-level ADMM (mADMM) algorithm in Algorithm 1. The strategies of gradually refining the grid and inexactly solving the subproblems are given to guarantee the convergence and efficiency of the mADMM algorithm.

2.1 The mADMM algorithm

Based on the inexact ADMM in function space, to numerically solve problem (RP), we consider the finite element method. Specifically, piecewise linear functions are utilized to discretize the variables in the related subproblems appearing in each iteration of the inexact ADMM in function space. We first consider a family of regular and quasi-uniform triangulations $\{\mathcal{T}_{h}\}$ of $\bar{\Omega}$ , i.e. $\bar{\Omega}=\bigcup_{T\in\mathcal{T}_{h}}\bar{T}$ . With each element $T\in\mathcal{T}_{h}$ , we associate its diameter $\rho_{T}:={\rm diam}\ T$ , and we let $\sigma_{T}$ denote the diameter of the largest ball contained in $T$ . The mesh size of the grid is defined by $h:={\rm max}_{T\in\mathcal{T}_{h}}\ \rho_{T}$ . We suppose that the following standard assumption from the context of error estimates holds (see [15, 16]).

Assumption 2.1.

(Regular and quasi-uniform triangulations) There exist two positive constants $\kappa$ and $\tau$ such that

$$\frac{\rho_{T}}{\sigma_{T}}\leq\kappa,\qquad\frac{h}{\rho_{T}}\leq\tau$$

hold for all $T\in\mathcal{T}_{h}$ and all $h>0$ . Moreover, let us define $\bar{\Omega}_{h}=\bigcup_{T\in\mathcal{T}_{h}}\bar{T}$ and let $\Omega_{h}\subseteq\Omega$ and $\Gamma_{h}$ denote its interior and its boundary, respectively. In the case that $\Omega$ is a convex polyhedral domain, we have $\Omega=\Omega_{h}$ . In the case that $\Omega$ is a domain with a $C^{1,1}$ - boundary $\Gamma$ , we assume that $\bar{\Omega}_{h}$ is convex and that all boundary vertices of $\bar{\Omega}_{h}$ are contained in $\Gamma$ , such that

$$|\Omega\setminus\Omega_{h}|\leq ch^{2},$$

where $|\cdot|$ denotes the measure of the set, and $c>0$ is a constant.
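
The two element quantities in Assumption 2.1 can be computed directly for a triangle: $\rho_{T}$ is the longest edge, and $\sigma_{T}$ is twice the inradius, where the inradius equals twice the area divided by the perimeter. A small Python sketch for the planar case (the function name `triangle_shape` is illustrative):

```python
import math

def triangle_shape(p0, p1, p2):
    """Return (rho_T, sigma_T) for the triangle with vertices p0, p1, p2.

    rho_T is the diameter of T (its longest edge); sigma_T is the diameter
    of the largest ball contained in T (twice the inradius).
    """
    edges = [math.dist(p0, p1), math.dist(p1, p2), math.dist(p2, p0)]
    perimeter = sum(edges)
    # Twice the signed area via the 2D cross product of two edge vectors.
    cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])
    area = 0.5 * abs(cross)
    inradius = 2.0 * area / perimeter
    return max(edges), 2.0 * inradius

rho_T, sigma_T = triangle_shape((0.0, 0.0), (3.0, 0.0), (0.0, 4.0))
print(rho_T, sigma_T, rho_T / sigma_T)
```

For a mesh, the regularity condition asks that the ratio $\rho_{T}/\sigma_{T}$ stays below a fixed $\kappa$ over all elements and all refinement levels.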

For the state variable $y$ and the control variable $u$ , we choose the same discretized state space and discretized control space defined by

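$$Y_{h}:=\{y_{h}\in C(\bar{\Omega})\mid y_{h|T}\in\mathcal{P}_{1},\ \forall T\in\mathcal{T}_{h},\ y_{h}=0\ \text{in}\ \bar{\Omega}\setminus\Omega_{h}\},$$

$$U_{h}:=\{u_{h}\in C(\bar{\Omega})\mid u_{h|T}\in\mathcal{P}_{1},\ \forall T\in\mathcal{T}_{h},\ u_{h}=0\ \text{in}\ \bar{\Omega}\setminus\Omega_{h}\},$$

where $\mathcal{P}_{1}$ denotes the space of polynomials of degree less than or equal to 1. For a given source term $y_{r}$ and $u\in L^{2}(\Omega)$ , let $y_{h}(u)$ denote the discretized state associated with $u$ , which is defined as the unique solution of the discretized weak formulation of the state equation (1.1):

$$\int_{\Omega_{h}}\Big(\sum_{i,j=1}^{n}a_{ij}(y_{h})_{x_{i}}(v_{h})_{x_{j}}+c_{0}y_{h}v_{h}\Big)dx=\int_{\Omega_{h}}(u+y_{r})v_{h}\,dx,\quad\forall v_{h}\in Y_{h}.\tag{2.3}$$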

Moreover, $y_{h}$ can be expressed by $y_{h}(u)=S_{h}(u+y_{r})$ , where $S_{h}$ denotes the discretized version of the solution operator $S$ . Then we have the following well-known conclusion about error estimates.

Lemma 1.

([5], Thm. 4.4.6) For a given $u\in L^{2}(\Omega)$ , let $y$ be the unique weak solution of the state equation (1.1) and $y_{h}$ be the unique solution of (2.3). Then there exists a constant $c>0$ independent of $h$ , $u$ and $y_{r}$ such that

$$\|y-y_{h}\|_{L^{2}(\Omega)}+h\|\nabla y-\nabla y_{h}\|_{L^{2}(\Omega)}\leq ch^{2}(\|u\|_{L^{2}(\Omega)}+\|y_{r}\|_{L^{2}(\Omega)}).$$

In particular, this implies $\|S-S_{h}\|_{L^{2}(\Omega)\rightarrow L^{2}(\Omega)}\leq ch^{2}$ and $\|S-S_{h}\|_{L^{2}(\Omega)\rightarrow H^{1}(\Omega)}\leq ch$ .

For the given triangulation $\mathcal{T}_{h}$ with nodes $\{x_{i}\}_{i=1}^{N_{h}}$ , let $\{\phi_{i}(x)\}_{i=1}^{N_{h}}$ be a set of nodal basis functions associated with nodes $\{x_{i}\}_{i=1}^{N_{h}}$ which spans $Y_{h}$ , $U_{h}$ and satisfies the properties:

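$$\phi_{i}(x)\geqslant 0,\quad\|\phi_{i}(x)\|_{\infty}=1\quad\forall i=1,\dots,N_{h},\qquad\sum_{i=1}^{N_{h}}\phi_{i}(x)=1.$$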

Then $y_{h}\in Y_{h}$ , $u_{h}\in U_{h}$ can be represented as $y_{h}=\sum\limits_{i=1}^{N_{h}}y_{i}\phi_{i},$ $u_{h}=\sum\limits_{i=1}^{N_{h}}u_{i}\phi_{i}$ and we have $y_{h}(x_{i})=y_{i}$ , $u_{h}(x_{i})=u_{i}$ . The other variables and operators in the subproblems of the inexact ADMM in function space are all discretized by piecewise linear functions similarly.

Before we give the proposed algorithm to solve the problem (RP), we first introduce the definition of node interpolation operator $I_{h}$ for the following convergence analysis.

Definition 2.

For a given regular and quasi-uniform triangulation $\mathcal{T}_{h}$ of $\Omega$ with nodes $\{x_{i}\}_{i=1}^{N_{h}}$ , let $\{\phi_{i}(x)\}_{i=1}^{N_{h}}$ denote a set of associated nodal basis functions. Then the interpolation operator is defined as

$$I_{h}w(x):=\sum_{i=1}^{N_{h}}w(x_{i})\phi_{i}(x)\quad\text{for any}\ w\in L^{2}(\Omega).$$

Moreover, about the interpolation error estimate, we have the following result.

Lemma 3.

([5], Thm. 3.1.6) For all $w\in W^{k+1,p}(\Omega),k\geq 0,p,q\in[0,+\infty)$ and $0\leq m\leq k+1$ , we have

$$\|w-I_{h}w\|_{W^{m,q}(\Omega)}\leq c_{I}h^{k+1-m}\|w\|_{W^{k+1,p}(\Omega)},$$

where $c_{I}$ is a constant which is independent of the mesh size $h$ .
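
The interpolation estimate can be checked numerically in a simple setting. The Python sketch below assumes a one-dimensional domain $(0,1)$, a uniform mesh and the smooth function $w(x)=\sin(\pi x)$, none of which come from the paper. It approximates the $L^{2}$ error of the piecewise linear nodal interpolant and shows that halving $h$ reduces the error by a factor close to 4, in line with the case $k=1$, $m=0$.

```python
import numpy as np

def interpolation_error(w, n_cells, n_eval=10001):
    """Approximate L2(0,1) error of the piecewise linear nodal interpolant of w."""
    nodes = np.linspace(0.0, 1.0, n_cells + 1)
    x = np.linspace(0.0, 1.0, n_eval)
    # np.interp evaluates sum_i w(x_i) * phi_i(x) for the 1D hat functions phi_i.
    err = w(x) - np.interp(x, nodes, w(nodes))
    # Root mean square over a uniform grid approximates the L2 norm on (0, 1).
    return np.sqrt(np.mean(err ** 2))

def w(x):
    return np.sin(np.pi * x)

errors = [interpolation_error(w, n) for n in (8, 16, 32)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(errors, ratios)
```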

In classical finite element based algorithms, the mesh size is fixed in advance. On a finer mesh, the discretized problem becomes larger and the computational cost increases, so it is important to reduce this cost. In the first few iterations of the algorithm, the iterates are still inaccurate, so using a coarse mesh does not degrade the precision but does reduce the amount of computation. As the iteration proceeds, the iteration precision becomes higher and higher, and a finer mesh becomes necessary. We therefore gradually refine the grid. Specifically, in the initial iteration we choose a coarse mesh and obtain a solution, then use the interpolation operator to transfer the obtained solution to a finer mesh. We then apply appropriate numerical methods to solve the subproblems on the finer mesh and obtain a more precise solution, and so on.

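Let $h_{k}$ , $k\in\mathbb{Z},k\geq 1$ , denote the mesh size of the $k$ th iterate. Then the discretized reduced cost function is defined as

$$\hat{J}_{h_{k+1}}(u_{h_{k+1}}):=\frac{1}{2}\|S_{h_{k+1}}(u_{h_{k+1}}+I_{h_{k+1}}y_{r})-I_{h_{k+1}}y_{d}\|_{L^{2}(\Omega)}^{2}+\frac{\alpha}{2}\|u_{h_{k+1}}\|_{L^{2}(\Omega)}^{2}.$$

Let $U_{ad,h_{k+1}}$ denote the discretized feasible set,

$$U_{ad,h_{k+1}}:=U_{h_{k+1}}\cap U_{ad}=\Big\{z_{h_{k+1}}=\sum_{i=1}^{N_{h_{k+1}}}z_{i}\phi_{i}(x)\ \Big|\ a\leq z_{i}\leq b,\ \forall i=1,\dots,N_{h_{k+1}}\Big\}\subset U_{ad}.$$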

Moreover, we define $\lambda_{h_{k+1}}^{k}:=I_{h_{k+1}}\lambda_{h_{k}}^{k}$ and $z_{h_{k+1}}^{k}:=I_{h_{k+1}}z_{h_{k}}^{k}$ , where $I_{h_{k+1}}$ denotes the node interpolation operator. Based on the above representations, we present the iterative scheme of the multi-level ADMM algorithm (mADMM) in Algorithm 1.
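
The transfer of $\lambda$ and $z$ from one mesh to the next finer one by nodal interpolation can be sketched in one space dimension. The example below assumes nested uniform meshes on $[0,1]$ and uses `np.interp`; the function name `prolong` and the sample values are hypothetical. Because a piecewise linear function attains its extreme values at the nodes, the transferred function keeps the bounds of the coarse nodal values, so box constraints $a\leq z\leq b$ are preserved.

```python
import numpy as np

def prolong(values, n_coarse_cells, n_fine_cells):
    """Nodal interpolation of a P1 function on [0, 1] from a coarse to a fine mesh.

    values holds the nodal values on the uniform coarse mesh (boundary nodes
    included); the result holds the nodal values on the uniform fine mesh.
    """
    coarse_nodes = np.linspace(0.0, 1.0, n_coarse_cells + 1)
    fine_nodes = np.linspace(0.0, 1.0, n_fine_cells + 1)
    return np.interp(fine_nodes, coarse_nodes, values)

z_coarse = np.array([0.0, 1.0, 0.5, 2.0, 0.0])   # nodal values for h = 1/4
z_fine = prolong(z_coarse, 4, 8)                 # nodal values for h = 1/8
print(z_fine)
```

On nested meshes the coarse nodal values are reproduced exactly and each new node receives the average of its two coarse neighbours.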

Remark 2.1.

To guarantee that the sequence $\{\xi_{k+1}\}_{k=0}^{\infty}\subseteq[0,+\infty)$ satisfies $\sum_{k=0}^{\infty}\xi_{k+1}<\infty$ , in the numerical experiments we can choose, for example, $\xi_{k+1}=\frac{1}{(k+1)^{2}}$ .

Remark 2.2.

To guarantee that the mesh sizes $\{h_{k}\}_{k=0}^{\infty}$ of the iterations satisfy $\sum_{k=0}^{\infty}h_{k+1}<\infty$ , in the numerical experiments we choose $h_{k}=2^{-(k+4)}$ in Example 4.1 and $h_{k}=\sqrt{2}/2^{(k+3)}$ in Example 4.2.
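
For the concrete choices $\xi_{k+1}=1/(k+1)^{2}$, $h_{k}=2^{-(k+4)}$ and $h_{k}=\sqrt{2}/2^{k+3}$, the summability conditions in Remarks 2.1 and 2.2 can be verified by hand. The first series is the Basel series and the other two are geometric:

```latex
\sum_{k=0}^{\infty}\xi_{k+1}=\sum_{k=0}^{\infty}\frac{1}{(k+1)^{2}}=\frac{\pi^{2}}{6}<\infty,
\qquad
\sum_{k=0}^{\infty}h_{k+1}=\sum_{k=0}^{\infty}2^{-(k+5)}=2^{-4}
\quad\text{(Example 4.1)},
\qquad
\sum_{k=0}^{\infty}h_{k+1}=\sum_{k=0}^{\infty}\frac{\sqrt{2}}{2^{k+4}}=\frac{\sqrt{2}}{8}
\quad\text{(Example 4.2)}.
```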

2.2 Numerical computation of the subproblems in Algorithm 1

For $u=\sum_{i=1}^{N_{h}}u_{i}\phi_{i}\in U_{h}$ , $y=\sum_{i=1}^{N_{h}}y_{i}\phi_{i}\in Y_{h}$ , let $\boldsymbol{\rm u}=(u_{1},...,u_{N_{h}})$ , $\boldsymbol{\rm y}=(y_{1},...,y_{N_{h}})$ be the corresponding coefficient vectors. Let the $L^{2}$ -projections of $y_{r}$ and $y_{d}$ onto $Y_{h}$ be $y_{r,h}:=\sum_{i=1}^{N_{h}}y_{r}^{i}\phi_{i}(x)$ and $y_{d,h}:=\sum_{i=1}^{N_{h}}y_{d}^{i}\phi_{i}(x)$ , respectively. Similarly, $\boldsymbol{\rm y}_{r}=(y_{r}^{1},y_{r}^{2},...,y_{r}^{N_{h}})$ and $\boldsymbol{\rm y}_{d}=(y_{d}^{1},y_{d}^{2},...,y_{d}^{N_{h}})$ denote their coefficient vectors. The stiffness matrix and the mass matrix are defined as:

$$K_{h}:=\big(a(\phi_{i},\phi_{j})\big)_{i,j=1}^{N_{h}},\qquad M_{h}:=\Big(\int_{\Omega_{h}}\phi_{i}\phi_{j}\,dx\Big)_{i,j=1}^{N_{h}}.$$
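
To make the stiffness and mass matrices concrete, the following Python sketch assembles them for a one-dimensional model case that is assumed here for illustration: $L=-\mathrm{d}^{2}/\mathrm{d}x^{2}$ on $(0,1)$ with homogeneous Dirichlet conditions, a uniform mesh, and piecewise linear hat functions at the interior nodes. It then solves a discrete state equation of the form $K_{h}\boldsymbol{\rm y}=M_{h}\boldsymbol{\rm f}$ for a right-hand side whose exact solution is $\sin(\pi x)$. The function name `assemble_1d` is illustrative.

```python
import numpy as np

def assemble_1d(n_cells):
    """P1 stiffness and mass matrices for -y'' on (0, 1) with y(0) = y(1) = 0.

    Uniform mesh with h = 1 / n_cells; only interior nodes carry unknowns.
    """
    h = 1.0 / n_cells
    n = n_cells - 1
    off = np.ones(n - 1)
    # K_ij = integral of phi_i' * phi_j'  ->  tridiag(-1, 2, -1) / h
    K = (2.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1)) / h
    # M_ij = integral of phi_i * phi_j    ->  tridiag(1, 4, 1) * h / 6
    M = (4.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1)) * h / 6.0
    return K, M

n_cells = 64
K, M = assemble_1d(n_cells)
x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]   # interior nodes
f = np.pi ** 2 * np.sin(np.pi * x)             # nodal values of the right-hand side
y = np.linalg.solve(K, M @ f)                  # discrete state: K y = M f
print(np.max(np.abs(y - np.sin(np.pi * x))))   # nodal error against sin(pi x)
```

Dense matrices keep the sketch short; for realistic mesh sizes a sparse format and an iterative or sparse direct solver would be the natural choice.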

Furthermore, we define

[TABLE]

Then we can obtain the matrix-vector form of Algorithm 1 in Algorithm 2.

Let $\boldsymbol{\rm y}^{k+1}:=K_{h}^{-1}M_{h}(\boldsymbol{\rm u}^{k+1}+\boldsymbol{\rm y}_{r}),\ \boldsymbol{\rm p}^{k+1}:=K_{h}^{-1}M_{h}(\boldsymbol{\rm y}_{d}-\boldsymbol{\rm y}^{k+1})$ denote the discretized state and the discretized adjoint state, respectively. Then the $\boldsymbol{\rm u}$ -subproblem at the $k$ -th iterate is equivalent to solving the following linear system:

[TABLE]

Since $\boldsymbol{\rm p}^{k+1}=(\alpha+\sigma)\boldsymbol{\rm u}^{k+1}+\boldsymbol{\rm I}_{h_{k+1}}\boldsymbol{\rm\lambda}^{k}-\sigma\boldsymbol{\rm I}_{h_{k+1}}\boldsymbol{\rm z}^{k}$ , we eliminate the variable $\boldsymbol{\rm p}$ , then the above problem can be rewritten in the following reduced form without any computational cost:

[TABLE]

The equivalent linear system (2.11) can be solved inexactly by some Krylov-based method.

Finally, we give a termination criterion for Algorithm 2. Let $\epsilon$ be a given accuracy tolerance. We terminate the algorithm when $\eta<\epsilon$ , where the accuracy of a numerical solution is measured by the residual $\eta:={\rm max}\{\eta_{1},\eta_{2},\eta_{3},\eta_{4},\eta_{5}\},$ where

[TABLE]

3 Convergence analysis

In this section, based on the convergence results of the inexact ADMM in function space in Theorem 4, we give the convergence analysis and the iteration complexity of the proposed mADMM algorithm.

Theorem 4.

([24], Thm. 2.5) Suppose that the operator $L$ is uniformly elliptic. Let $(y^{\ast},u^{\ast},z^{\ast},p^{\ast},\lambda^{\ast})$ be the KKT point of (P), and let the sequence $\{(u^{k},z^{k},\lambda^{k})\}$ be generated by the inexact ADMM algorithm for (RP) with the associated states $\{y_{k}\}$ and adjoint states $\{p_{k}\}$ . Then we have

[TABLE]

Moreover, there exists a constant $C$ depending only on the initial point $(u^{0},z^{0},\lambda^{0})$ and the optimal solution $(u^{\ast},z^{\ast},\lambda^{\ast})$ such that for $k\geq 1$ ,

[TABLE]

where the function $R:(u,z,\lambda)\rightarrow[0,\infty)$ is defined as

[TABLE]

For the convenience of proving the convergence results, we give the iterative scheme of the multi-level discretized ADMM for (RP) and present a lemma to measure the gap between the solution sequences obtained by the ADMM in function space and the multi-level discretized ADMM.

Let $(u^{0},z^{0},{\lambda}^{0})\in L^{2}(\Omega)\times{\rm dom}(\delta_{U_{ad}}(\cdot))\times L^{2}(\Omega)$ and the parameters $\sigma>0$ , $\tau\in\left(0,\dfrac{1+\sqrt{5}}{2}\right)$ be given, and let the mesh sizes $\{h_{k}\}_{k=0}^{\infty}$ satisfy $\sum_{k=0}^{\infty}h_{k+1}<\infty.$ Then the iterative scheme of the multi-level discretized ADMM algorithm is as follows:

[TABLE]
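To make the structure of one ADMM sweep concrete, here is a minimal single-level sketch on a separable toy problem, $\min_{u,z}\ \frac{1}{2}\|su-d\|^{2}+\frac{\alpha}{2}\|u\|^{2}+\delta_{[a,b]}(z)$ subject to $u=z$ . The update formulas (a linear solve for $u$ , a pointwise projection for $z$ , and the multiplier step $\lambda\leftarrow\lambda+\tau\sigma(u-z)$ ) are the standard ADMM steps for this splitting and are assumed here rather than copied from the scheme above. The scalar "solution operator" $s$ , the data $d$ and the choice $\sigma=1$ are hypothetical, and no mesh refinement or interpolation is involved.

```python
import math

def clip(v, lower, upper):
    """Pointwise projection onto the box [lower, upper]."""
    return min(upper, max(lower, v))

def toy_admm(d, s, alpha, sigma, tau, lower, upper, iterations):
    """ADMM for min 0.5*(s*u - d)^2 + 0.5*alpha*u^2 + indicator(z in box), u = z,
    applied componentwise to the entries of d."""
    golden = (1.0 + math.sqrt(5.0)) / 2.0
    assert 0.0 < tau < golden, "step length must lie in (0, (1 + sqrt(5)) / 2)"
    n = len(d)
    u, z, lam = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(iterations):
        # u-step: (s^2 + alpha + sigma) u = s*d - lam + sigma*z
        u = [(s * di - li + sigma * zi) / (s * s + alpha + sigma)
             for di, li, zi in zip(d, lam, z)]
        # z-step: projection of u + lam/sigma onto the admissible box
        z = [clip(ui + li / sigma, lower, upper) for ui, li in zip(u, lam)]
        # multiplier step with step length tau
        lam = [li + tau * sigma * (ui - zi) for li, ui, zi in zip(lam, u, z)]
    return u, z, lam

d = [1.0, 0.1, -1.0]
u, z, lam = toy_admm(d, s=1.0, alpha=0.1, sigma=1.0, tau=1.618,
                     lower=-0.2, upper=0.2, iterations=200)
# For this separable problem the minimizer is the clipped unconstrained solution.
expected = [clip(di / (1.0 + 0.1), -0.2, 0.2) for di in d]
```

In the multi-level variant the same three steps are carried out on a mesh of size $h_{k+1}$ , with $z^{k}$ and $\lambda^{k}$ first interpolated from the previous mesh.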

Lemma 5.

Let the initial point be $(u^{0},z^{0};\lambda^{0})\in L^{2}(\Omega)\times{\rm dom}(\delta_{U_{ad}}(\cdot))\times L^{2}(\Omega)$ , then $\|\bar{u}^{k+1}-\bar{u}^{k+1}_{h_{k+1}}\|_{L^{2}(\Omega)}=\|\bar{z}^{k+1}-\bar{z}^{k+1}_{h_{k+1}}\|_{L^{2}(\Omega)}=\|\bar{\lambda}^{k}-\bar{\lambda}^{k}_{h_{k+1}}\|_{L^{2}(\Omega)}=O(h_{k+1})$ , $\forall k\geqslant 1,$ and

[TABLE]

Proof.

We prove the conclusion by mathematical induction. For $k=1$ , by the definition $\lambda_{h_{k+1}}^{k}:=I_{h_{k+1}}\lambda_{h_{k}}^{k}$ and the interpolation error estimate in Lemma 3, we have

[TABLE]

Then we can easily obtain that $\|\bar{u}^{1}-\bar{u}^{1}_{h_{1}}\|_{L^{2}(\Omega)}=O(h_{1})$ . The argument is the same as for the case $k>1$ below, so we omit it.

For $k>1$ , assume that for all $j\leq k$ we have $\|\bar{u}^{j}-\bar{u}^{j}_{h_{j}}\|_{L^{2}(\Omega)}=\|\bar{\lambda}^{j-1}-\bar{\lambda}^{j-1}_{h_{j}}\|_{L^{2}(\Omega)}=O(h_{j})$ . Then, for the $z$ -subproblems in the exact ADMM in function space and in the multi-level discretized ADMM, $\bar{z}^{k}$ and $\bar{z}^{k}_{h_{k}}$ satisfy the following optimality conditions:

[TABLE]

Then subtracting the above two equalities we obtain

[TABLE]

For the multiplier $\bar{\lambda}^{k}$ and $\bar{\lambda}^{k}_{h_{k}}$ ,

[TABLE]

we can get the estimate

[TABLE]

For $u$ -subproblems, $\bar{u}^{k+1}$ and $\bar{u}^{k+1}_{h_{k+1}}$ satisfy the following optimality conditions respectively,

[TABLE]

Then we know from the above two equalities that

[TABLE]

so

[TABLE]

where we define

[TABLE]

For the term $E_{1}$ , we make use of the decomposition,

[TABLE]

From the well-known error estimate $\|S-S_{h}\|_{L^{2}(\Omega)\rightarrow L^{2}(\Omega)}=O(h^{2})$ in Lemma 1 and the property that $S^{\ast},S_{h_{k+1}}$ are bounded linear operators, we have

[TABLE]

Hence, there exists a constant $\hat{C}$ such that

[TABLE]
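For orientation, one decomposition that uses exactly the two ingredients named above (the $O(h^{2})$ estimate for $S-S_{h}$ and the boundedness of $S^{\ast}$ and $S_{h_{k+1}}$ ) is the following. It is a sketch under the assumption that $E_{1}$ involves the operator difference $S^{\ast}S-S^{\ast}_{h_{k+1}}S_{h_{k+1}}$ , and it may differ in detail from the authors' own splitting. Writing $h=h_{k+1}$ ,

$$S^{\ast}S-S^{\ast}_{h}S_{h}=S^{\ast}(S-S_{h})+(S^{\ast}-S^{\ast}_{h})S_{h},$$

$$\|S^{\ast}S-S^{\ast}_{h}S_{h}\|_{L^{2}(\Omega)\rightarrow L^{2}(\Omega)}\leq\|S^{\ast}\|\,\|S-S_{h}\|+\|S^{\ast}-S^{\ast}_{h}\|\,\|S_{h}\|=O(h^{2}),$$

where $\|S^{\ast}-S^{\ast}_{h}\|=\|S-S_{h}\|$ because an operator and its adjoint have the same norm.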

Similarly, based on the property of the projection operator $\|y_{r}-I_{h_{k+1}}y_{r}\|_{L^{2}(\Omega)}=\|y_{d}-I_{h_{k+1}}y_{d}\|_{L^{2}(\Omega)}=O(h_{k+1})$ in Lemma 3, for the term $E_{2}$ , we have

[TABLE]

and

[TABLE]

For the term $E_{4}$ ,

[TABLE]

where we used (3.7), the property of the projection operator $\|\bar{\lambda}^{k}_{h_{k}}-I_{h_{k+1}}\bar{\lambda}^{k}_{h_{k}}\|_{L^{2}(\Omega)}=O(h_{k+1})$ and the property of the mesh size $h_{k}>h_{k+1}$ . Moreover, as the mesh sizes satisfy $\sum_{k=0}^{\infty}h_{k+1}<\infty,$ there exists a constant $C_{k+1}$ such that $h_{k}<C_{k+1}h_{k+1}$ , then we have

[TABLE]

Similarly,

[TABLE]

Then, since $S^{\ast}_{h_{k+1}}$ and $S_{h_{k+1}}$ are bounded linear operators, we know from the equality (3.11) and the above estimates of the $L^{2}$ norms of $\{E_{i}\}_{i=1}^{5}$ that

[TABLE]

Moreover, we have

[TABLE]

Hence the conclusion holds for the case $k+1$ and we can get the assertion

[TABLE]

∎

Similarly to Lemma 5, we have the following lemma.

Lemma 6.

Let the initial point be $(u^{0},z^{0};\lambda^{0})\in L^{2}(\Omega)\times{\rm dom}(\delta_{U_{ad}}(\cdot))\times L^{2}(\Omega)$ , then $\|u^{k+1}-{u}^{k+1}_{h_{k+1}}\|_{L^{2}(\Omega)}=\|z^{k+1}-{z}^{k+1}_{h_{k+1}}\|_{L^{2}(\Omega)}=\|{\lambda}^{k}-{\lambda}^{k}_{h_{k+1}}\|_{L^{2}(\Omega)}=O(h_{k+1}+\delta_{u,h_{k+1}}^{k+1})$ , $\forall k\geqslant 1,$ and

[TABLE]

Proof.

We again argue by mathematical induction. The proof is similar to that of Lemma 5, so we omit the details. ∎

To prove the convergence of the mADMM algorithm, let $(\tilde{u}_{h_{k+1}}^{k+1},\tilde{z}^{k+1}_{h_{k+1}})$ denote the exact solution of the $(k+1)$ th iteration of Algorithm 1:

[TABLE]

The following lemma gives the gap between $(\tilde{u}_{h_{k+1}}^{k+1},\tilde{z}_{h_{k+1}}^{k+1})$ and $({u}_{h_{k+1}}^{k+1},{z}_{h_{k+1}}^{k+1})$ .

Lemma 7.

([24], Lemma 4.4)

For any $k\geqslant 0$ , we have

[TABLE]

where $\rho:=\|[S^{\ast}_{h_{k+1}}S_{h_{k+1}}+(\alpha+\sigma)I]^{-1}\|_{{L^{2}(\Omega)}\rightarrow{L^{2}(\Omega)}}$ .
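A short worked bound, not stated in the source, shows that $\rho$ is controlled independently of the mesh size. Since $S^{\ast}_{h_{k+1}}S_{h_{k+1}}$ is self-adjoint and positive semidefinite on $L^{2}(\Omega)$ , the spectrum of $S^{\ast}_{h_{k+1}}S_{h_{k+1}}+(\alpha+\sigma)I$ lies in $[\alpha+\sigma,\infty)$ , and therefore

$$\rho=\left\|\left[S^{\ast}_{h_{k+1}}S_{h_{k+1}}+(\alpha+\sigma)I\right]^{-1}\right\|_{L^{2}(\Omega)\rightarrow L^{2}(\Omega)}\leq\frac{1}{\alpha+\sigma}.$$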

For the convenience of analyzing the non-ergodic iteration complexity, let $(u^{\ast}_{h_{k+1}},z^{\ast}_{h_{k+1}},\lambda^{\ast}_{h_{k+1}})$ denote the KKT point of the discretized reduced problem with the mesh size $h_{k+1}$

[TABLE]

Moreover, we provide a lemma and two propositions which are essential for analyzing the iteration complexity of our mADMM.

Lemma 8.

([2], Lemma 6.1)

If a sequence $\left\{a_{i}\right\}\subset\mathbb{R}$ satisfies the following conditions:

[TABLE]

Then we have $\min_{i=1,2,\cdots,k}\left\{a_{i}\right\}\leq\frac{\overline{a}}{k}$ and $\lim_{k\rightarrow\infty}\left\{k\cdot\min_{i=1,2,\cdots,k}\left\{a_{i}\right\}\right\}=0$ .
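The lemma can be checked numerically on a simple example. The sketch below assumes that the hidden conditions are the usual ones for this result, namely $a_{i}\geq 0$ and $\sum_{i=1}^{\infty}a_{i}=\overline{a}<\infty$ ; under them, $k\cdot\min_{i\leq k}a_{i}\leq\sum_{i\leq k}a_{i}\leq\overline{a}$ gives the first claim. The test sequence $a_{i}=1/i^{2}$ with $\overline{a}=\pi^{2}/6$ and the function name are illustrative choices.

```python
import math

def scaled_running_min(a, a_bar):
    """Return [k * min_{i<=k} a_i for k = 1..len(a)],
    asserting min_{i<=k} a_i <= a_bar / k along the way."""
    scaled = []
    running_min = math.inf
    for k, a_k in enumerate(a, start=1):
        running_min = min(running_min, a_k)
        assert running_min <= a_bar / k
        scaled.append(k * running_min)
    return scaled

a = [1.0 / i**2 for i in range(1, 1001)]      # nonnegative and summable
scaled = scaled_running_min(a, math.pi**2 / 6)
```

For this sequence `scaled[k-1]` equals $1/k$ , which tends to zero; this is the $o(1/k)$ behaviour used later for the non-ergodic complexity bound.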

Proposition 9.

Let $\left\{\left(u^{k}_{h_{k}},z^{k}_{h_{k}},\lambda^{k}_{h_{k}}\right)\right\}$ be the sequence generated by Algorithm 1 and let $\left\{\left(u^{\ast}_{h_{k}},z^{\ast}_{h_{k}},\lambda^{\ast}_{h_{k}}\right)\right\}$ denote the KKT point of the discretized reduced problem. Then for $k\geq 0$ we have

[TABLE]

Proof.

First, for any $f_{1},f_{1}^{\prime},f_{2},f_{2}^{\prime}\in L^{2}(\Omega)$ , the following two equalities hold:

[TABLE]

Both equalities follow directly from the definition of the $L^{2}$ -norm.

By the optimality conditions of the $u$ -subproblem and $z$ -subproblem corresponding to $u_{h_{k+1}}^{k+1}$ and $z_{h_{k+1}}^{k+1}$ , we have

[TABLE]

Moreover, since $\left\{\left(u^{\ast}_{h_{k}},z^{\ast}_{h_{k}},\lambda^{\ast}_{h_{k}}\right)\right\}$ is the KKT point of the discretized reduced problem, it satisfies the following KKT system

[TABLE]

Then by combining (3.29) and (3.31), we obtain

[TABLE]

Moreover, the subdifferential operator $\partial\delta_{U_{ad,h_{k+1}}}(z)$ is a maximal monotone operator, so the following inequality holds,

[TABLE]

For convenience, we define $r_{h_{k+1}}^{k+1}=u_{h_{k+1}}^{k+1}-z_{h_{k+1}}^{k+1}$ . Therefore, we can derive that

[TABLE]

Then adding the above two equalities we obtain

[TABLE]

Next, we estimate the last two terms on the left side separately,

[TABLE]

where we used the equality (3.27).

Moreover, by employing the equality (3.28) and using $u^{\ast}_{h_{k+1}}=z^{\ast}_{h_{k+1}}$ , we have

[TABLE]

Then, substituting the two estimates above into (3.38), we obtain the assertion of Proposition 9. ∎

Proposition 10.

Let $\left\{\left(u^{k}_{h_{k}},z^{k}_{h_{k}},\lambda^{k}_{h_{k}}\right)\right\}$ be the sequence generated by Algorithm 1, let $\left\{\left(u^{\ast}_{h_{k}},z^{\ast}_{h_{k}},\lambda^{\ast}_{h_{k}}\right)\right\}$ denote the KKT point of the discretized reduced problem, and let $\tilde{u}_{h_{k+1}}^{k+1}$ , $\tilde{z}^{k+1}_{h_{k+1}}$ be defined in (3.24), (3.25), respectively. Then for $k\geq 0$ we have

[TABLE]

Proof.

For the proof of Proposition 10, by substituting $\tilde{u}^{k+1}_{h_{k+1}}$ and $\tilde{z}^{k+1}_{h_{k+1}}$ for $u^{k+1}_{h_{k+1}}$ and $z^{k+1}_{h_{k+1}}$ in the proof of Proposition 9, we can get the assertion. ∎

Finally, based on the above results, the convergence results of Algorithm 1 are given by the following theorem.

Theorem 11.

Suppose that the operator $L$ is uniformly elliptic. Let $(y^{\ast},u^{\ast},z^{\ast},p^{\ast},\lambda^{\ast})$ be the KKT point of ( $\mathrm{P}$ ), and let $(u^{k}_{h_{k}},z^{k}_{h_{k}},\lambda^{k}_{h_{k}})$ be the iterate obtained in the $k$ th iteration of Algorithm 1. Suppose that the mesh sizes $\{h_{k}\}_{k=0}^{\infty}$ satisfy $\sum_{k=0}^{\infty}h_{k+1}<\infty$ and that the error vector $\delta_{u,h_{k+1}}^{k+1}$ satisfies $\|\delta_{u,h_{k+1}}^{k+1}\|_{L^{2}(\Omega)}\leq\xi_{k+1}$ with $\sum_{k=0}^{\infty}\xi_{k+1}<\infty.$ Then we have

[TABLE]

Moreover, there exists a constant $\tilde{C}$ depending only on the initial point $(u^{0},z^{0},\lambda^{0})$ and the optimal solution $(u^{\ast},z^{\ast},\lambda^{\ast})$ such that for $k\geq 1$ ,

[TABLE]

where $R_{h_{i}}:(u^{i}_{h_{i}},z^{i}_{h_{i}},\lambda^{i}_{h_{i}})\rightarrow[0,\infty)$ is defined as

[TABLE]

Proof.

By the optimality condition of the $u$ -subproblem in the ADMM in function space, we have

[TABLE]

The error between the inexact solution and the exact solution consists of two parts: the error from gradually refining the grid and the error from solving the subproblems inexactly. We treat them together as a total error. Let $u_{h_{k+1}}^{k+1}$ denote the inexact solution of the $(k+1)$ th iteration; then, from the optimality condition of the $u$ -subproblem, we have

[TABLE]

Moreover, by the optimality conditions of the $u-$ subproblem corresponding to $u_{h_{k+1}}^{k+1}$ and $\bar{u}_{h_{k+1}}^{k+1}$ in Algorithm 1 and multi-level discretized ADMM, we have

[TABLE]

Then we know from the four equalities above that

[TABLE]

Moreover, we have the estimate

[TABLE]

Then we know from the estimate above, Lemma 5 and Lemma 6 that there exists a constant $C$ such that

[TABLE]

We know from Algorithm 1 that the mesh sizes $\{h_{k}\}_{k=0}^{\infty}$ of each iteration satisfy $\sum_{k=0}^{\infty}h_{k+1}<\infty.$ The error vector of the multi-level ADMM satisfies $\sum_{k=0}^{\infty}\|\delta_{u,h_{k+1}}^{k+1}\|_{L^{2}(\Omega)}\leq\sum_{k=0}^{\infty}\xi_{k+1}<\infty,$ where $\xi_{k+1}$ is the upper bound of the norm of $\delta_{u,h_{k+1}}^{k+1}$ , i.e. $\|\delta_{u,h_{k+1}}^{k+1}\|_{L^{2}(\Omega)}\leq\xi_{k+1}$ . Thus we have

[TABLE]
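The two summability requirements are easy to satisfy in practice. As a hedged illustration, the sketch below uses a geometric schedule in which both the mesh size and the subproblem tolerance are halved at every iteration; the starting values $h_{1}=0.5$ and $\xi_{1}=10^{-2}$ and the function name are hypothetical and are not taken from the paper's experiments.

```python
def geometric_schedule(first, ratio, count):
    """Return [first, first*ratio, first*ratio**2, ...] with `count` entries."""
    return [first * ratio**k for k in range(count)]

h = geometric_schedule(first=0.5, ratio=0.5, count=40)     # mesh sizes h_1, h_2, ...
xi = geometric_schedule(first=1e-2, ratio=0.5, count=40)   # tolerances xi_1, xi_2, ...

# Closed-form limits of the two geometric series: first / (1 - ratio).
h_bound = 0.5 / (1.0 - 0.5)
xi_bound = 1e-2 / (1.0 - 0.5)

# Partial sum of the total per-iteration error h_{k+1} + xi_{k+1}.
total_error_budget = sum(hk + xk for hk, xk in zip(h, xi))
```

Every partial sum stays below `h_bound + xi_bound`, so the accumulated discretization and iteration errors remain finite. Such a schedule also satisfies $h_{k}=2h_{k+1}$ , so consecutive mesh sizes are comparable in the sense used in the proof of Lemma 5.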

Since both the discretization error and the iteration error are summable, Algorithm 1 can be regarded as an inexact ADMM algorithm in function space, so its convergence is guaranteed by Theorem 4.

At last, we establish the proof of the iteration complexity results for the sequence generated by the mADMM. First, by the optimality condition for $(u^{k+1}_{h_{k+1}},z^{k+1}_{h_{k+1}})$ , we have

[TABLE]

Then by the definition of $R_{h_{k}}$ , we derive

[TABLE]

Next, in order to give an upper bound of $R_{h_{k+1}}(u^{k+1}_{h_{k+1}},z^{k+1}_{h_{k+1}},\lambda^{k+1}_{h_{k+1}})$ , we define the following sequences $\theta^{k}$ , $\bar{\theta}^{k}$ and $\tilde{\theta}^{k}$ :

[TABLE]

First, we give an upper bound of $\theta^{k}$ . We know from Proposition 10 that $\|\tilde{\theta}^{k+1}\|_{L^{2}(\Omega)}\leq\|\theta^{k}\|_{L^{2}(\Omega)},$ so

[TABLE]

We know from the definition of $\tilde{\theta}^{k+1}$ , $\theta^{k+1}$ and Lemma 7 that

[TABLE]

where $\tilde{C}_{1}$ is a constant. So there exists a constant $C^{\prime}_{1}$ such that for every $k$

[TABLE]

By Proposition 9, we have

[TABLE]

so there exists a constant $C^{\prime}_{2}$ such that $\|\bar{\theta}^{k+1}\|_{L^{2}(\Omega)}\leq C^{\prime}_{2}.$ Hence,

[TABLE]

By the definition of $\theta^{k}$ and $\bar{\theta}^{k}$ , we have

[TABLE]

thus $\|\theta^{k+1}-\bar{\theta}^{k+1}\|_{L^{2}(\Omega)}=O(h_{k+2}).$

Moreover, we have the estimate that

[TABLE]

where $\bar{\eta}:=\sqrt{\frac{2C_{1}^{\prime}}{\tau\sigma}}+\sqrt{\left(1+\frac{1}{\tau}\right)\frac{2C_{2}^{\prime}}{\sigma}}$ is a constant, we used (3.56), (3.57) and the property $u^{\ast}_{h_{k+1}}=z^{\ast}_{h_{k+1}}$ .

Then we know from Proposition 9 that

[TABLE]

where we used the property $\left\|\theta^{k+1}\right\|^{2}_{L^{2}(\Omega)}-\left\|\bar{\theta}^{k+1}\right\|^{2}_{L^{2}(\Omega)}\leq\|\bar{\theta}^{k+1}+\theta^{k+1}\|_{L^{2}(\Omega)}\cdot\|\bar{\theta}^{k+1}-\theta^{k+1}\|_{L^{2}(\Omega)}$ and (3.55). Hence, there exist constants $\tilde{C}_{1},\tilde{C}_{2}$ such that

[TABLE]
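The norm inequality used in this step follows in one line from the Cauchy-Schwarz inequality in the real Hilbert space $L^{2}(\Omega)$ ; the derivation is added here for completeness and is not part of the source:

$$\left\|\theta^{k+1}\right\|^{2}_{L^{2}(\Omega)}-\left\|\bar{\theta}^{k+1}\right\|^{2}_{L^{2}(\Omega)}=\left\langle\theta^{k+1}+\bar{\theta}^{k+1},\,\theta^{k+1}-\bar{\theta}^{k+1}\right\rangle_{L^{2}(\Omega)}\leq\|\bar{\theta}^{k+1}+\theta^{k+1}\|_{L^{2}(\Omega)}\cdot\|\bar{\theta}^{k+1}-\theta^{k+1}\|_{L^{2}(\Omega)}.$$

Combined with the uniform bounds on $\|\theta^{k+1}\|_{L^{2}(\Omega)}$ and $\|\bar{\theta}^{k+1}\|_{L^{2}(\Omega)}$ and with $\|\theta^{k+1}-\bar{\theta}^{k+1}\|_{L^{2}(\Omega)}=O(h_{k+2})$ , the difference of the squared norms is itself $O(h_{k+2})$ .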

Finally, substituting (3.62) into (3.50), we find that there exists a constant $\tilde{C}$ such that

[TABLE]

Thus, by Lemma 8, we know that

[TABLE]

holds. Therefore, combining the obtained global convergence results, we complete the whole proof of Theorem 11. ∎

4 Numerical experiments

In this section we illustrate the numerical performance of the proposed multi-level ADMM algorithm for PDE-constrained optimization problems. All our computational results are obtained by MATLAB R2017b running on a computer with a 64-bit Windows 7.0 operating system, an Intel(R) Core(TM) i7-6700U CPU (3.40 GHz), and 32 GB of memory.

First, we introduce the algorithmic details that are common to all examples. The discretization was carried out by using the standard piecewise linear finite element approach. To present numerical results, it is convenient to introduce the experimental order of convergence (EOC), which for some positive error function $E(h):=\|u-u_{h}\|_{L^{2}(\Omega)},h>0$ is defined by

[TABLE]

We note that if $E(h)=O(h^{\beta})$ , then $EOC\approx\beta$ . In the numerical experiments, we measure the accuracy of an approximate optimal solution by the corresponding KKT residual error for each algorithm. To show the efficiency of our mADMM, we compare its results with those obtained by the ihADMM (see [24] for details) and the classical ADMM method. We terminate all the algorithms when $\eta<10^{-6}$ , with the maximum number of iterations set to 500. For all numerical examples and all algorithms, we use zero initial values and the penalty parameter $\sigma=0.1\alpha$ . The step length is $\tau=1.618$ .
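The EOC can be computed from two consecutive error values with a few lines of code. The sketch below assumes the standard definition $EOC:=\dfrac{\log E(h_{1})-\log E(h_{2})}{\log h_{1}-\log h_{2}}$ , which is consistent with the remark that $E(h)=O(h^{\beta})$ gives $EOC\approx\beta$ ; the function name and the synthetic error values are illustrative only.

```python
import math

def eoc(h1, e1, h2, e2):
    """Experimental order of convergence from errors e1, e2 on mesh sizes h1, h2."""
    return (math.log(e1) - math.log(e2)) / (math.log(h1) - math.log(h2))

# Synthetic errors E(h) = 3 * h^2 on successively halved meshes.
mesh_sizes = [2.0**-j for j in range(3, 8)]
errors = [3.0 * h**2 for h in mesh_sizes]
orders = [eoc(mesh_sizes[j], errors[j], mesh_sizes[j + 1], errors[j + 1])
          for j in range(len(mesh_sizes) - 1)]
```

For these synthetic data every entry of `orders` is 2 up to rounding, because the constant factor cancels in the difference of logarithms.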

Example 4.1.

([16], Example 3.3) Consider

[TABLE]

where the domain is the unit disk $\Omega=B_{1}(0)\subseteq\mathbb{R}^{2}$ . Set the desired state $y_{d}=(1-(x_{1}^{2}+x_{2}^{2}))x_{1}$ , the parameters $\alpha=0.1,a=-0.2,b=0.2$ .

In this example, the exact solutions of the problem are unknown in advance. Instead we use the numerical solutions computed on the grid with $h=2^{-10}$ as reference solutions. As an example, the discretized optimal control on the grid with $h=2^{-7}$ is presented in Figure 1.

The error of the control $u$ w.r.t. the $L^{2}$ -norm, the EOC for the control, the numerical results for the accuracy of solution, the CPU time and the number of iterations obtained by our mADMM, the ihADMM and the classical ADMM are shown in Table 1. We can see from Table 1 that our mADMM is highly efficient in obtaining an approximate solution compared with the ihADMM and the classical ADMM in terms of the CPU time, especially on fine meshes. Furthermore, the iteration counts illustrate the mesh-independent performance of the mADMM and the ihADMM, whereas the number of iterations of the classical ADMM increases as the mesh is refined.

Example 4.2.

([17], Example 4.1) Consider

[TABLE]

where $\Omega=(0,1)^{2}$ , the lower bound is $a=0.3$ , the upper bound is $b=1$ , and the regularization parameter is $\alpha=0.001$ . We choose $y_{d}=-4\pi^{2}\alpha\sin(\pi x)\sin(\pi y)+Sr$ , where $r=\min(1,\max(0.3,2\sin(\pi x)\sin(\pi y)))$ and $S$ denotes the solution operator. In addition, from the choice of parameters, it is easy to see that $u=r$ is the unique solution of the continuous problem.
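The exact control of this example can be evaluated directly from the formula for $r$ , which is useful for computing the $L^{2}$ error of a discrete control. A minimal sketch follows; the function name and the sampling grid are illustrative, and the clipping bounds 0.3 and 1 are read off from the formula itself.

```python
import math

def exact_control(x1, x2, lower=0.3, upper=1.0):
    """r(x1, x2) = min(upper, max(lower, 2 sin(pi x1) sin(pi x2)))."""
    return min(upper, max(lower, 2.0 * math.sin(math.pi * x1) * math.sin(math.pi * x2)))

# Sample r on a uniform 33 x 33 grid of the unit square.
m = 32
samples = [exact_control(i / m, j / m) for i in range(m + 1) for j in range(m + 1)]
active_upper = sum(1 for v in samples if v == 1.0)   # nodes where r hits the upper bound
active_lower = sum(1 for v in samples if v == 0.3)   # nodes where r hits the lower bound
```

Both active sets are nonempty: the upper bound is attained around the centre of the square and the lower bound near the boundary, with an inactive region in between.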

The exact control and the discretized optimal control on the grid with $h=\sqrt{2}/2^{7}$ are presented in Figure 2.

The error of the control $u$ w.r.t. the $L^{2}$ -norm, the EOC for the control, the numerical results for the accuracy of solution, the CPU time and the number of iterations obtained by our mADMM, the ihADMM and the classical ADMM are shown in Table 2. The experimental results show that our mADMM has a clear advantage in CPU time over the ihADMM and the classical ADMM. Furthermore, the iteration counts again illustrate the mesh-independent performance of the mADMM.

5 Conclusion

In this paper, we employ a multi-level ADMM algorithm to solve optimization problems with PDE constraints. Instead of solving the discretized problems directly, we apply the ‘optimize-discretize-optimize’ strategy. This approach offers the flexibility to discretize the subproblems of the inexact ADMM algorithm by different discretization schemes. Motivated by the multi-level strategy, we propose gradually refining the grid and solving the subproblems inexactly. The resulting convergent multi-level ADMM (mADMM) algorithm can significantly reduce the computational cost. The convergence analysis and the $o(1/k)$ iteration complexity results are presented. Numerical results demonstrate the efficiency of the proposed mADMM algorithm.

Acknowledgements

We would like to thank Prof. Long Chen very much for his FEM package iFEM [3] in Matlab.

Bibliography


[1] A. Brandt, Multi-level adaptive solutions to boundary-value problems. Math. Comp. 31 (1977) 333-390.
[2] L. Chen, D.F. Sun and K.C. Toh, An efficient inexact symmetric Gauss-Seidel based majorized ADMM for high-dimensional convex composite conic programming. Math. Program. 161 (2017) 237-270.
[3] L. Chen, iFEM: An Integrated Finite Element Methods Package in MATLAB. Technical Report. University of California at Irvine, Irvine (2009).
[4] Z.X. Chen, X.L. Song, X.P. Zhang and B. Yu, A FE-ADMM algorithm for Lavrentiev-regularized state-constrained elliptic control problem. ESAIM: COCV 25 (2019) E 5.
[5] P.G. Ciarlet, The Finite Element Method for Elliptic Problems. Volume 40 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics, Philadelphia (2002).
[6] P. Deuflhard, Newton Methods for Nonlinear Problems: Affine Invariance and Adaptive Algorithms. Springer, Berlin (2011).
[7] M. Fazel, T.K. Pong, D.F. Sun and P. Tseng, Hankel matrix rank minimization with applications to system identification and realization. SIAM J. Matrix Anal. Appl. 34 (2013) 946-977.
[8] M. Fortin and R. Glowinski, On decomposition-coordination methods using an augmented Lagrangian. Augmented Lagrangian Methods: Applications to the Solution of Boundary Problems. Elsevier, Amsterdam (1983).
