Quasi-best approximation in optimization with PDE constraints

Fernando Gaspoz, Christian Kreuzer, Andreas Veeser, and Winnifried Wollner

arXiv:1904.07049·math.NA·October 3, 2019


TL;DR

This paper establishes quasi-best approximation bounds for finite element solutions to PDE-constrained quadratic optimization problems, linking error bounds to best approximation errors and analyzing parameter dependencies.

Contribution

It introduces a quasi-best approximation result for PDE-constrained optimization, including bounds that are independent of regularization parameters under certain conditions.

Findings

01

Error in state and adjoint state bounded by best approximation error

02

Constant depends on inverse square-root of Tikhonov parameter

03

Independence of approximation constant when operators are compact

Abstract

We consider finite element solutions to quadratic optimization problems, where the state depends on the control via a well-posed linear partial differential equation. Exploiting the structure of a suitably reduced optimality system, we prove that the combined error in the state and adjoint state of the variational discretization is bounded by the best approximation error in the underlying discrete spaces. The constant in this bound depends on the inverse square-root of the Tikhonov regularization parameter. Furthermore, if the operators of control-action and observation are compact, this quasi-best-approximation constant becomes independent of the Tikhonov parameter as the meshsize tends to $0$ and we give quantitative relationships between meshsize and Tikhonov parameter ensuring this independence. We also derive generalizations of these results when the control variable is discretized…

Equations

$$\min_{(q,u)\in L^{2}\times H^{1}_{0}}\ \frac{1}{2}\left|u-u_{d}\right|_{0}^{2}+\frac{\alpha}{2}\left|q\right|_{0}^{2}\quad\text{subject to}\quad-\Delta u=q$$

$$\left|u\right|_{0}+\sqrt{\alpha}\left|q\right|_{0}.$$

$$\left\|x\right\|^{2}:=\left|u\right|_{1}^{2}+\frac{1}{\alpha}\left|p\right|_{1}^{2},\qquad x=(u,z),$$

$$\left\|x-x_{h}\right\|\leq\nu_{h}\inf_{v_{h}\in V_{h}\times V_{h}}\left\|x-v_{h}\right\|$$

$$\nu_{h}\leq\kappa_{\alpha}:=2\Big(1+C_{F}\Big(1+\frac{2C_{F}}{\sqrt{\alpha}}\Big)\Big)\quad\text{and}\quad\left|\nu_{h}-1\right|\leq C_{\mathcal{I}}\kappa_{\alpha}h\quad\text{as }h\to 0.$$

$$Au=Cq$$

$$M_{a}:=\sup_{\left\|v_{1}\right\|_{1}=1,\ \left\|v_{2}\right\|_{2}=1}a(v_{1},v_{2})<\infty,$$

$$\forall v_{1}\in V_{1}\quad\Big(\forall v_{2}\in V_{2}\;\;a(v_{1},v_{2})=0\Big)\implies v_{1}=0,$$

$$m_{a}:=\inf_{\left\|v_{2}\right\|_{2}=1}\ \sup_{\left\|v_{1}\right\|_{1}=1}a(v_{1},v_{2})>0,$$

$$\min_{(q,u)\in Q\times V_{1}}\ \frac{1}{2}\left\|Iu-u_{d}\right\|_{W}^{2}+\frac{\alpha}{2}\left\|q\right\|_{Q}^{2}\quad\text{subject to}\quad Au=Cq$$

$$(u,q)\mapsto\Big(\left\|Iu\right\|_{W}^{2}+\alpha\left\|q\right\|_{Q}^{2}\Big)^{1/2}$$

$$A^{*}v_{2}=a(\cdot,v_{2}),\qquad(q,C^{*}v_{2})_{Q}=\left\langle Cq,v_{2}\right\rangle_{2},\qquad\left\langle I^{*}w,v_{1}\right\rangle_{1}=(Iv_{1},w)_{W}$$

$$Au=Cq,\qquad A^{*}p=I^{*}(Iu-u_{d}),\qquad\alpha q=-C^{*}p.$$

$$\begin{pmatrix}-\beta I^{*}I&\beta A^{*}\\ A&\frac{1}{\alpha}CC^{*}\end{pmatrix}\begin{pmatrix}u\\ p\end{pmatrix}=\begin{pmatrix}-\beta I^{*}u_{d}\\ 0\end{pmatrix}.$$

$$Au=Cq,\qquad A^{*}z=\frac{1}{\sqrt{\alpha}}I^{*}(Iu-u_{d}),\qquad\sqrt{\alpha}\,q=-C^{*}z$$

$$\begin{pmatrix}-\frac{1}{\sqrt{\alpha}}I^{*}I&A^{*}\\ A&\frac{1}{\sqrt{\alpha}}CC^{*}\end{pmatrix}\begin{pmatrix}u\\ z\end{pmatrix}=\begin{pmatrix}-\frac{1}{\sqrt{\alpha}}I^{*}u_{d}\\ 0\end{pmatrix}.$$

$$\forall\varphi_{1}\in V_{1}\qquad a(\varphi_{1},z)-\frac{1}{\sqrt{\alpha}}(Iu,I\varphi_{1})_{W}=-\frac{1}{\sqrt{\alpha}}(u_{d},I\varphi_{1})_{W},$$

$$\forall\varphi_{2}\in V_{2}\qquad a(u,\varphi_{2})+\frac{1}{\sqrt{\alpha}}(C^{*}z,C^{*}\varphi_{2})_{Q}=0,$$

$$V:=V_{1}\times V_{2}\quad\text{with}\quad\left\|v\right\|:=\Big(\left\|v_{1}\right\|_{1}^{2}+\left\|v_{2}\right\|_{2}^{2}\Big)^{1/2},\quad v=(v_{1},v_{2})\in V,$$

$$b(v,\varphi):=a(v,\varphi)+\frac{1}{\sqrt{\alpha}}c(v,\varphi)$$

$$a(v,\varphi):=a(v_{1},\varphi_{2})+a(\varphi_{1},v_{2})$$

$$c(v,\varphi):=(C^{*}v_{2},C^{*}\varphi_{2})_{Q}-(Iv_{1},I\varphi_{1})_{W}$$

$$\text{find }x\in V\text{ such that}\quad\forall\varphi\in V\quad b(x,\varphi)=-\frac{1}{\sqrt{\alpha}}(u_{d},I\varphi_{1})_{W}.$$

$$a_{|V\times V},\ c,\ \text{and so }b\text{ are symmetric},$$

$$\left\|Iu\right\|_{W}^{2}+\left\|C^{*}z\right\|_{Q}^{2}=\left\|Iu\right\|_{W}^{2}+\alpha\left\|q\right\|_{Q}^{2},$$

$$\left|v\right|:=\Big(\left\|Iv_{1}\right\|_{W}^{2}+\left\|C^{*}v_{2}\right\|_{Q}^{2}\Big)^{1/2}$$

$$\sup_{\left|v\right|=1,\ \left|\varphi\right|=1}\left|c(v,\varphi)\right|=1=\inf_{\left|v\right|=1}\ \sup_{\left|\varphi\right|=1}c(v,\varphi),$$

$$c\big((v_{1},v_{2}),(-v_{1},v_{2})\big)=\left\|C^{*}v_{2}\right\|_{Q}^{2}+\left\|Iv_{1}\right\|_{W}^{2}=\left|v\right|^{2}.$$

$$\forall v\in V\qquad\left|v\right|\leq M\left\|v\right\|$$

$$M:=\max\{M_{I},M_{C}\},$$

$$\sup_{\left\|v\right\|=1,\ \left\|\varphi\right\|=1}\left|a(v,\varphi)\right|=M_{a}\quad\text{and}\quad\inf_{\left\|v\right\|=1}\ \sup_{\left\|\varphi\right\|=1}a(v,\varphi)=m_{a}$$

$$\inf_{\left\|v_{1}\right\|_{1}=1}\ \sup_{\left\|\varphi_{2}\right\|_{2}=1}a(v_{1},\varphi_{2})=\inf_{\left\|v_{2}\right\|_{2}=1}\ \sup_{\left\|\varphi_{1}\right\|_{1}=1}a(\varphi_{1},v_{2})$$

$$\left|b(v,\varphi)\right|\leq M_{a}\left\|v\right\|\left\|\varphi\right\|+\frac{M}{\sqrt{\alpha}}\left\|v\right\|\left|\varphi\right|\leq\left\|v\right\|\left\|\varphi\right\|_{\alpha}$$


Full text

Quasi-best approximation in optimization with PDE constraints

Fernando Gaspoz

Technische Universität Dortmund, Fakultät für Mathematik, Vogelpothsweg 87, 44227 Dortmund, Germany.

Christian Kreuzer

Technische Universität Dortmund, Fakultät für Mathematik, Vogelpothsweg 87, 44227 Dortmund, Germany.

Andreas Veeser

Dipartimento di Matematica 'F. Enriques', Università degli Studi di Milano, Via C. Saldini, 50, 20133 Milano, Italy.

Winnifried Wollner

Technische Universität Darmstadt, Fachbereich Mathematik, Dolivostr. 15, 64293 Darmstadt, Germany.

Abstract.

We consider finite element solutions to quadratic optimization problems, where the state depends on the control via a well-posed linear partial differential equation. Exploiting the structure of a suitably reduced optimality system, we prove that the combined error in the state and adjoint state of the variational discretization is bounded by the best approximation error in the underlying discrete spaces. The constant in this bound depends on the inverse square-root of the Tikhonov regularization parameter. Furthermore, if the operators of control-action and observation are compact, this quasi-best-approximation constant becomes independent of the Tikhonov parameter as the meshsize tends to $0$, and we give quantitative relationships between meshsize and Tikhonov parameter ensuring this independence. We also derive generalizations of these results when the control variable is discretized or when it is taken from a convex set.

1. Introduction

Optimization problems with PDE constraints are ubiquitous. A basic, and regularly considered, example is

$$\min_{(q,u)\in L^{2}\times H^{1}_{0}}\ \frac{1}{2}\left|u-u_{d}\right|_{0}^{2}+\frac{\alpha}{2}\left|q\right|_{0}^{2}\quad\text{subject to}\quad-\Delta u=q,\tag{1.1}$$

where $\left|{\cdot}\right|_{0}$ denotes the $L^{2}$-norm over some underlying domain, $u_{d}$ is the desired state, and $\alpha>0$ scales the cost of the control. Additionally, constraints on the control $q$ and/or the state $u$ can be added. The error due to a discretization of the state equation, and possibly the control, has been analyzed. For piecewise constant discretizations of the control, this has been done in [9, 12], including possible box constraints on the control variable; see also the summary of obtainable convergence orders, including Neumann control, in [16]. Elementwise linear functions for the control have been considered in [3, 21] in the presence of control constraints.

In [14] it was observed that the minimization problem can be solved without prescribing a discretization of the control: the control can be recovered from the optimality condition, and thus a discretization of the control is induced by the discretization of the state equation. With this approach, $O(h^{2})$ convergence for the control in $L^{2}$ could be shown even in the presence of box constraints on the control. It was observed in [18] that the same convergence order can be obtained if a discretized control is used and a post-processing step based on the optimality conditions is applied.

Due to the structure of the objective in (1.1), the above-mentioned estimates make use of the ‘natural norm’

$$\left|u\right|_{0}+\sqrt{\alpha}\left|q\right|_{0}.$$

Although this norm is natural due to the functional, it induces a scaling $\sqrt{\alpha}$ in all estimates involving the control. Further estimates, for instance of $H^{1}$-norms of the state, thereby also contain this scaling. Moreover, the above ‘natural norm’ is not balanced in terms of approximation accuracy, i.e., the error of the state in $L^{2}$ will typically decay at least as fast as the error of the control.

The latter effect, however, is invisible as long as the approximation accuracy of both terms is limited by the selected discrete spaces, and not by the regularity of the solutions, as is typically the case for the model (1.1). However, in the presence of pointwise constraints on the state, see, e.g., [2, 7, 17, 8, 19], or on the gradient of the state [6, 13, 20, 25], optimal order estimates can only be obtained for the control variable, while numerical experiments show a faster convergence of the error in the state variable in $L^{2}$.

As an alternative to the aforementioned works, one may combine the error in the state with the error in the (suitably rescaled) adjoint state, measuring both in the norms that are given by the functional analytic set-up of the PDE constraint. For problem (1.1), this leads to the norm

$$\left\|x\right\|^{2}:=\left|u\right|_{1}^{2}+\frac{1}{\alpha}\left|p\right|_{1}^{2},\qquad x=(u,z),\tag{1.2}$$

where $\left|{\cdot}\right|_{1}$ denotes the $H^{1}_{0}$-norm. For respective counterparts of (1.2), Chrysafinos and Karatzas [5, 4] prove so-called symmetric error estimates or quasi-best approximation results. The growth of the quasi-best-approximation constant is limited by $\alpha^{-2}$ and $\alpha^{-3/2}$, respectively.

In this article, we prove abstract quasi-best approximation results, where the discretization error is measured in a counterpart of (1.2). In order to illustrate our results, assume that the underlying domain is convex, let $(V_{h})_{h}$ be a sequence of conforming finite-dimensional spaces that approximates $H^{1}_{0}$ , and consider the variational discretization of (1.1). If we denote by $x_{h}=(u_{h},p_{h})$ the pairs of approximate primal and dual states, our results yield (cf. Theorem 3.2 and Example 3.8)

$$\left\|x-x_{h}\right\|\leq\nu_{h}\inf_{v_{h}\in V_{h}\times V_{h}}\left\|x-v_{h}\right\|$$

with

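$$\nu_{h}\leq\kappa_{\alpha}:=2\Big(1+C_{F}\Big(1+\frac{2C_{F}}{\sqrt{\alpha}}\Big)\Big)\quad\text{and}\quad\left|\nu_{h}-1\right|\leq C_{\mathcal{I}}\kappa_{\alpha}h\quad\text{as }h\to 0.$$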

Here $C_{F}$ is the constant in the Friedrichs inequality and $C_{\mathcal{I}}$ is an interpolation constant depending on the shape regularity of the underlying meshes. In contrast to the first, non-asymptotic relationship, the second, asymptotic one exploits the compactness of the observation and control-action operators and elliptic regularity theory. Notably, the latter reveals that Céa’s lemma, which holds for the constraint discretization, is recovered as $h\to 0$ and, in particular, ensures an approximation quality independent of $\alpha$ for $h=O(\sqrt{\alpha})$.

The rest of the paper proceeds as follows. In Section 2, we state precisely the considered problem class, allowing for any linear, bounded, and inf-sup-stable operator in the constraint. Furthermore, we reduce the optimality system by eliminating the control, and we lay the groundwork for our results by a careful discussion of the continuity and nondegeneracy properties of the associated bilinear form.

Section 3 constitutes the core of this work and establishes quasi-best approximation for the variational discretization. To this end, the variational discretization is viewed as a Petrov-Galerkin method and we employ the formula for the quasi-best-approximation constant in Tantardini and Veeser [23]. For the asymptotic behavior of the quasi-best-approximation constant, we additionally invoke a duality argument, which is similar to, but simpler than, Schatz [22].

The last two sections center on generalizations of these results. In Section 4, we consider approximate control-action operators, covering in particular the discretization of the control variable. Finally, Section 5 deals with nonlinear optimality systems arising from additional convex constraints for the control. The derived results complement those of the linear case, and the simplification of Schatz’ argument proves quite useful.

2. Model optimization problem and reduced optimality system

We introduce our model optimization problem. Assume that the control variable $q$ is taken from a real Hilbert space $Q$ with scalar product $\left({\cdot},{\cdot}\right)_{Q}$ and induced norm $\left\|{\cdot}\right\|_{Q}$ . Its corresponding state $u\in V_{1}$ is determined by solving a linear boundary value problem of the form

$$Au=Cq\tag{2.1}$$

with the following setting:

•

The state space $V_{1}$ is a Hilbert space with scalar product $\left({\cdot},{\cdot}\right)_{1}$ and induced norm $\left\|{\cdot}\right\|_{1}$ . Its dual and the corresponding duality pairing are indicated with $V_{1}^{*}$ and $\left\langle{\cdot},{\cdot}\right\rangle_{1}$ , respectively.

•

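The differential operator $A$ is induced by the bilinear form $a\colon V_{1}\times V_{2}\to{\mathbb{R}}$, where $V_{2}$ is a second Hilbert space with scalar product $\left({\cdot},{\cdot}\right)_{2}$, induced norm $\left\|{\cdot}\right\|_{2}$, dual space $V_{2}^{*}$, and dual pairing $\left\langle{\cdot},{\cdot}\right\rangle_{2}$. We assume that the bilinear form $a$ is bounded and satisfies the following inf-sup conditions:

$$M_{a}:=\sup_{\left\|v_{1}\right\|_{1}=1,\ \left\|v_{2}\right\|_{2}=1}a(v_{1},v_{2})<\infty,\tag{2.2a}$$

$$\forall v_{1}\in V_{1}\quad\Big(\forall v_{2}\in V_{2}\;\;a(v_{1},v_{2})=0\Big)\implies v_{1}=0,\tag{2.2b}$$

$$m_{a}:=\inf_{\left\|v_{2}\right\|_{2}=1}\ \sup_{\left\|v_{1}\right\|_{1}=1}a(v_{1},v_{2})>0.\tag{2.2c}$$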

Employing well-known inf-sup theory (cf., e.g., Babuška [1]), we see that the operator $A\colon V_{1}\to V_{2}^{*},v_{1}\mapsto a(v_{1},\cdot)$ is linear and boundedly invertible.

•

The control-action operator $C\colon Q\to V_{2}^{*}$ is linear and bounded with constant $M_{C}$ .

Our goal is then to numerically solve the constrained optimization problem

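$$\min_{(q,u)\in Q\times V_{1}}\ \frac{1}{2}\left\|Iu-u_{d}\right\|_{W}^{2}+\frac{\alpha}{2}\left\|q\right\|_{Q}^{2}\quad\text{subject to}\quad Au=Cq,\tag{2.3}$$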

where we assume in addition:

•

The desired “state” $u_{d}$ is an element of a Hilbert space $W$ with scalar product $\left({\cdot},{\cdot}\right)_{W}$ and induced norm $\left\|{\cdot}\right\|_{W}$ .

•

The observation operator $I\colon V_{1}\to W$ is linear and bounded with constant $M_{I}$.

•

The cost of the control, which can be viewed as a Tikhonov regularization, is scaled with the parameter $\alpha>0$ .

Problem (2.3) is a quadratic minimization problem with a linear constraint. The objective function is convex with respect to

$$(u,q)\mapsto\Big(\left\|Iu\right\|_{W}^{2}+\alpha\left\|q\right\|_{Q}^{2}\Big)^{1/2}\tag{2.4}$$

and strictly convex in $q$ . Consequently, standard arguments ensure the existence of a unique solution; see, e.g., Lions [15, Theorem 1.1] or Tröltzsch [24, Chapter 2.5].

If $Q=L^{2}=W$ , $V_{1}=V_{2}=H^{1}_{0}$ , $A=-\Delta$ is the (weak) Laplacian, and $C$ and $I$ are the canonical compact immersions $L^{2}\to\big{(}H^{1}_{0}\big{)}^{*}$ and $H^{1}_{0}\to L^{2}$ , then (2.3) simplifies to the optimization problem (1.1) in the introduction. Notice that, in this case, the operators $C$ and $I$ are related by $C^{*}=I$ .

To formulate the optimality system for (2.3), it is useful to define the adjoint operators $A^{*}$ , $C^{*}$ , $I^{*}$ of $A$ , $C$ , $I$ by

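$$A^{*}v_{2}=a(\cdot,v_{2}),\qquad(q,C^{*}v_{2})_{Q}=\left\langle Cq,v_{2}\right\rangle_{2},\qquad\left\langle I^{*}w,v_{1}\right\rangle_{1}=(Iv_{1},w)_{W}$$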

for all $v_{1}\in V_{1}$ , $v_{2}\in V_{2}$ , $q\in Q$ , $w\in W$ . Thanks to the convexity of the problem (2.3), a pair $(q,u)\in Q\times V_{1}$ is a minimum point if and only if there exists $p\in V_{2}$ such that

$$Au=Cq,\qquad A^{*}p=I^{*}(Iu-u_{d}),\qquad\alpha q=-C^{*}p.\tag{2.5}$$

We may eliminate $q$ by inserting the last equation into the first one and multiplying the second equation by $\beta>0$ . We thus obtain the following reduced optimality system for the pair $(u,p)\in V_{1}\times V_{2}$ :

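$$\begin{pmatrix}-\beta I^{*}I&\beta A^{*}\\ A&\frac{1}{\alpha}CC^{*}\end{pmatrix}\begin{pmatrix}u\\ p\end{pmatrix}=\begin{pmatrix}-\beta I^{*}u_{d}\\ 0\end{pmatrix}.\tag{2.6}$$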

Notice that the second row of equations, $Au+\tfrac{1}{\alpha}CC^{*}p=0$ , suggests scaling the adjoint state $p$ by the factor $\frac{1}{\alpha}$ , while the first row, $-\beta I^{*}Iu+\beta A^{*}p=-\beta I^{*}u_{d}$ , suggests no scaling at all. As a compromise, we propose to use $z=\tfrac{1}{\sqrt{\alpha}}p$ and $\beta=\tfrac{1}{\sqrt{\alpha}}$ .

We thus transform the optimality system (2.5) into

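$$Au=Cq,\qquad A^{*}z=\frac{1}{\sqrt{\alpha}}I^{*}(Iu-u_{d}),\qquad\sqrt{\alpha}\,q=-C^{*}z\tag{2.7}$$

and the reduced optimality system (2.6) into

$$\begin{pmatrix}-\frac{1}{\sqrt{\alpha}}I^{*}I&A^{*}\\ A&\frac{1}{\sqrt{\alpha}}CC^{*}\end{pmatrix}\begin{pmatrix}u\\ z\end{pmatrix}=\begin{pmatrix}-\frac{1}{\sqrt{\alpha}}I^{*}u_{d}\\ 0\end{pmatrix}.\tag{2.8}$$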

This rescaled and reduced optimality system deviates from the usual KKT formulation, but has an interesting structure. Like the KKT formulation, it is symmetric even for non-symmetric $A$. The off-diagonal consists of two interrelated invertible operators, while the diagonal entries are (semi-)definite, symmetric operators. Notice that, upon exchanging the rows, the roles of the diagonal and off-diagonal can be exchanged. For the optimization problem (1.1), the operator matrix is then diagonally dominant in that $CC^{*}$ and $I^{*}I$ are compact operators.

Let us give a weak formulation of the rescaled and reduced optimality system. Its rows are equivalently written as

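$$\begin{aligned}\forall\varphi_{1}\in V_{1}\qquad&a(\varphi_{1},z)-\frac{1}{\sqrt{\alpha}}(Iu,I\varphi_{1})_{W}=-\frac{1}{\sqrt{\alpha}}(u_{d},I\varphi_{1})_{W},\\ \forall\varphi_{2}\in V_{2}\qquad&a(u,\varphi_{2})+\frac{1}{\sqrt{\alpha}}(C^{*}z,C^{*}\varphi_{2})_{Q}=0,\end{aligned}\tag{2.9}$$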

and so we are led to introduce the Hilbert space

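$$V:=V_{1}\times V_{2}\quad\text{with}\quad\left\|v\right\|:=\Big(\left\|v_{1}\right\|_{1}^{2}+\left\|v_{2}\right\|_{2}^{2}\Big)^{1/2},\quad v=(v_{1},v_{2})\in V,$$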

and the bilinear form $b\colon V\times V\to{\mathbb{R}}$ given by

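$$b(v,\varphi):=a(v,\varphi)+\frac{1}{\sqrt{\alpha}}c(v,\varphi),\tag{2.10a}$$

$$a(v,\varphi):=a(v_{1},\varphi_{2})+a(\varphi_{1},v_{2}),\tag{2.10b}$$

$$c(v,\varphi):=(C^{*}v_{2},C^{*}\varphi_{2})_{Q}-(Iv_{1},I\varphi_{1})_{W}\tag{2.10c}$$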

for $v=(v_{1},v_{2}),\varphi=(\varphi_{1},\varphi_{2})\in V$ . Note that we use the same letter $a$ for the bilinear form inducing the operator $A$ and for the one in (2.10b); this “operator overloading” should not cause confusion when the domain is clear. If not, we shall distinguish the two forms by writing $a_{|V_{1}\times V_{2}}$ or $a_{|V\times V}$ . In this notation, the variational formulation of the rescaled and reduced optimality system (2.8) simply reads

$$\text{find }x\in V\text{ such that}\quad\forall\varphi\in V\quad b(x,\varphi)=-\frac{1}{\sqrt{\alpha}}(u_{d},I\varphi_{1})_{W}.\tag{2.11}$$

A pair $x=(u,z)\in V$ is a solution of (2.11) if and only if $(u,z)$ is a solution of (2.9) if and only if the triple $(u,z,-\tfrac{1}{\sqrt{\alpha}}C^{*}z)\in V\times Q$ verifies the rescaled optimality system (2.7). Consequently, thanks to the convexity of (2.3), if $x=(u,z)\in V$ is a solution of (2.11), then $(-\tfrac{1}{\sqrt{\alpha}}C^{*}z,u)\in Q\times V_{1}$ is a solution of the original optimization problem (2.3).
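The equivalence between the reduced system and the original minimization problem can be checked in a small finite-dimensional sketch. Everything specific here is an assumption made for illustration only: all spaces are ${\mathbb{R}}^{2}$ with Euclidean norms, the matrices standing in for $A$, $C$, $I$ and the data $u_{d}$, $\alpha$ are made-up values, adjoints are transposes, and the helper names (`solve`, `reduced_system_control`, `direct_control`) are hypothetical. The code assembles the block system (2.8), recovers $q=-\tfrac{1}{\sqrt{\alpha}}C^{*}z$, and compares it with the minimizer of the reduced objective $q\mapsto\tfrac12\|IA^{-1}Cq-u_{d}\|^{2}+\tfrac{\alpha}{2}\|q\|^{2}$ obtained from its normal equations.

```python
from math import sqrt

def solve(mat, rhs):
    """Solve mat x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(X, v):
    return [sum(x * y for x, y in zip(row, v)) for row in X]

def reduced_system_control(A, C, I, u_d, alpha):
    """Solve the block system (2.8) for (u, z) and recover q = -C^T z / sqrt(alpha)."""
    n = len(A)
    s = 1.0 / sqrt(alpha)
    ItI, CCt, At = matmul(transpose(I), I), matmul(C, transpose(C)), transpose(A)
    top = [[-s * ItI[i][j] for j in range(n)] + At[i] for i in range(n)]
    bottom = [A[i] + [s * CCt[i][j] for j in range(n)] for i in range(n)]
    rhs = [-s * r for r in matvec(transpose(I), u_d)] + [0.0] * n
    x = solve(top + bottom, rhs)
    u, z = x[:n], x[n:]
    q = [-s * r for r in matvec(transpose(C), z)]
    return q, u, z

def direct_control(A, C, I, u_d, alpha):
    """Minimize 0.5*|S q - u_d|^2 + 0.5*alpha*|q|^2 with S = I A^{-1} C."""
    n = len(A)
    cols = [solve(A, [C[i][j] for i in range(n)]) for j in range(n)]
    S = matmul(I, transpose(cols))          # columns of A^{-1} C mapped by I
    St = transpose(S)
    N = matmul(St, S)
    for i in range(n):
        N[i][i] += alpha                    # normal equations (S^T S + alpha) q = S^T u_d
    return solve(N, matvec(St, u_d))

# Made-up illustrative data (assumption, not from the paper).
A = [[2.0, 1.0], [0.5, 3.0]]                # non-symmetric, invertible
C = [[1.0, 0.0], [0.0, 2.0]]
I_obs = [[1.0, 1.0], [0.0, 1.0]]
u_d = [1.0, -1.0]
alpha = 0.01

q, u, z = reduced_system_control(A, C, I_obs, u_d, alpha)
q_direct = direct_control(A, C, I_obs, u_d, alpha)
state_residual = max(abs(r - t) for r, t in zip(matvec(A, u), matvec(C, q)))
control_gap = max(abs(r - t) for r, t in zip(q, q_direct))
print("control gap:", control_gap, "state residual:", state_residual)
```

Both printed quantities should be at the level of rounding errors, which reflects that the pair recovered from the reduced system satisfies the constraint $Au=Cq$ and minimizes the objective in this toy setting.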

Let us analyze the bilinear form $b=a+\frac{1}{\sqrt{\alpha}}c$ . We readily see that

$$a_{|V\times V},\ c,\ \text{and so }b\text{ are symmetric},$$

but $b$ is not coercive in general. Consider, for example, a set-up where there exists $v=(v_{1},v_{2})\in V$ such that $\left\|{Iv_{1}}\right\|_{W}>\left\|{C^{*}v_{2}}\right\|_{Q}$ . Then $c$ is not coercive and so, even for $a$ coercive, also $b$ is not coercive for $\alpha>0$ sufficiently small.
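The loss of coercivity for small $\alpha$ can be seen in the simplest possible setting. The following sketch assumes $V_{1}=V_{2}=Q=W={\mathbb{R}}$ with $A$, $C$, $I$ all equal to the identity (illustrative choices, not taken from the text), and uses the forms $a(v,\varphi)=a(v_{1},\varphi_{2})+a(\varphi_{1},v_{2})$ and $c(v,\varphi)=(C^{*}v_{2},C^{*}\varphi_{2})_{Q}-(Iv_{1},I\varphi_{1})_{W}$ on $V\times V$ as read off from the rows of (2.8). For the fixed vector $v=(1,\tfrac12)$ one gets $b(v,v)=1-\tfrac{3}{4\sqrt{\alpha}}$, which changes sign at $\alpha=9/16$.

```python
from math import sqrt

def b_form(v, phi, alpha, a=1.0, c_op=1.0, i_op=1.0):
    """b(v, phi) = a(v, phi) + c(v, phi) / sqrt(alpha) for scalar spaces.

    a, c_op, i_op are scalars standing in for A, C, I (illustrative assumption).
    """
    v1, v2 = v
    p1, p2 = phi
    a_part = a * v1 * p2 + a * p1 * v2                          # a(v_1, phi_2) + a(phi_1, v_2)
    c_part = (c_op * v2) * (c_op * p2) - (i_op * v1) * (i_op * p1)
    return a_part + c_part / sqrt(alpha)

v = (1.0, 0.5)  # here |I v_1| = 1 > |C^* v_2| = 0.5
values = {alpha: b_form(v, v, alpha) for alpha in (1.0, 0.5625, 0.25, 0.01)}
for alpha, value in values.items():
    print(f"alpha = {alpha:<7} b(v, v) = {value:+.4f}")
```

The sign change of $b(v,v)$ as $\alpha$ decreases is exactly the obstruction to coercivity described above; the symmetry of $b$ is unaffected.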

In order to obtain further properties, let us first consider the contributions $a$ and $c$ separately. The bilinear form $c$ is closely related to the original minimization problem (2.3) and its “energy seminorm” (2.4). To see this, observe that, if $(u,z)\in V$ and $\sqrt{\alpha}q=-C^{*}z$ , we have the correspondence

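$$\left\|Iu\right\|_{W}^{2}+\left\|C^{*}z\right\|_{Q}^{2}=\left\|Iu\right\|_{W}^{2}+\alpha\left\|q\right\|_{Q}^{2},$$

which motivates introducing the seminorm

$$\left|v\right|:=\Big(\left\|Iv_{1}\right\|_{W}^{2}+\left\|C^{*}v_{2}\right\|_{Q}^{2}\Big)^{1/2}$$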

on $V$ . Thus, denoting by $Z$ the kernel of $\left|{\cdot}\right|$ and realizing that the bilinear form $c$ is well-defined on the quotient space $V/Z$ , we see that

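$$\sup_{\left|v\right|=1,\ \left|\varphi\right|=1}\left|c(v,\varphi)\right|=1=\inf_{\left|v\right|=1}\ \sup_{\left|\varphi\right|=1}c(v,\varphi),$$

where the second identity relies on

$$c\big((v_{1},v_{2}),(-v_{1},v_{2})\big)=\left\|C^{*}v_{2}\right\|_{Q}^{2}+\left\|Iv_{1}\right\|_{W}^{2}=\left|v\right|^{2}.$$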

Since

$$\forall v\in V\qquad\left|v\right|\leq M\left\|v\right\|$$

with

$$M:=\max\{M_{I},M_{C}\},$$

the form $c$ is also continuous in $V$ , with constant $M$ .
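The properties of $c$ and of the seminorm $\left|{\cdot}\right|$ can be spot-checked numerically. The sketch below assumes $V_{1}=V_{2}=Q=W={\mathbb{R}}^{2}$ and diagonal matrices for $I$ and $C$ (so that $C^{*}=C$ and the operator norms $M_{I}$, $M_{C}$ are the largest diagonal entries); these choices and all function names are illustrative assumptions. For random vectors it tests the identity $c\big((v_{1},v_{2}),(-v_{1},v_{2})\big)=\left|v\right|^{2}$, the bound $\left|c(v,\varphi)\right|\leq\left|v\right|\left|\varphi\right|$, and $\left|v\right|\leq M\left\|v\right\|$.

```python
from math import sqrt
import random

I_DIAG = (2.0, 0.5)   # observation operator I = diag(2, 0.5), hence M_I = 2
C_DIAG = (1.0, 3.0)   # control-action operator C = diag(1, 3), hence M_C = 3
M = max(max(I_DIAG), max(C_DIAG))

def apply(diag, w):
    return [d * wi for d, wi in zip(diag, w)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def c_form(v, phi):
    """c(v, phi) = (C^* v_2, C^* phi_2) - (I v_1, I phi_1)."""
    (v1, v2), (p1, p2) = v, phi
    return dot(apply(C_DIAG, v2), apply(C_DIAG, p2)) - dot(apply(I_DIAG, v1), apply(I_DIAG, p1))

def seminorm(v):
    """|v| = (|I v_1|^2 + |C^* v_2|^2)^(1/2)."""
    v1, v2 = v
    iv, cv = apply(I_DIAG, v1), apply(C_DIAG, v2)
    return sqrt(dot(iv, iv) + dot(cv, cv))

def norm(v):
    v1, v2 = v
    return sqrt(dot(v1, v1) + dot(v2, v2))

rng = random.Random(0)

def rand_pair():
    return ([rng.uniform(-1, 1) for _ in range(2)], [rng.uniform(-1, 1) for _ in range(2)])

checks = []
for _ in range(100):
    v, phi = rand_pair(), rand_pair()
    flipped = ([-x for x in v[0]], v[1])                  # test function (-v_1, v_2)
    identity_ok = abs(c_form(v, flipped) - seminorm(v) ** 2) < 1e-12
    cauchy_ok = abs(c_form(v, phi)) <= seminorm(v) * seminorm(phi) + 1e-12
    bound_ok = seminorm(v) <= M * norm(v) + 1e-12
    checks.append(identity_ok and cauchy_ok and bound_ok)

all_ok = all(checks)
print("all checks passed:", all_ok)
```

Random sampling of course proves nothing; it only illustrates the three statements used in the argument above.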

The bilinear form $a_{|V\times V}$ inherits its continuity and nondegeneracy properties from $a_{|V_{1}\times V_{2}}$ . More precisely, we have

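$$\sup_{\left\|v\right\|=1,\ \left\|\varphi\right\|=1}\left|a(v,\varphi)\right|=M_{a}\quad\text{and}\quad\inf_{\left\|v\right\|=1}\ \sup_{\left\|\varphi\right\|=1}a(v,\varphi)=m_{a}\tag{2.17}$$

with $M_{a}$ and $m_{a}$ from (2.2). While the first identity is straightforward, the second one hinges on the inf-sup duality (cf. Babuška [1])

$$\inf_{\left\|v_{1}\right\|_{1}=1}\ \sup_{\left\|\varphi_{2}\right\|_{2}=1}a(v_{1},\varphi_{2})=\inf_{\left\|v_{2}\right\|_{2}=1}\ \sup_{\left\|\varphi_{1}\right\|_{1}=1}a(\varphi_{1},v_{2})\tag{2.18}$$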

for $a$ with domain $V_{1}\times V_{2}$ .

Turning to the complete bilinear form $b$ , we may sum up the continuity properties as follows: for all $v,\varphi\in V$ , we have

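$$\left|b(v,\varphi)\right|\leq M_{a}\left\|v\right\|\left\|\varphi\right\|+\frac{M}{\sqrt{\alpha}}\left\|v\right\|\left|\varphi\right|\leq\left\|v\right\|\left\|\varphi\right\|_{\alpha}$$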

with

[TABLE]

Here we have equipped $V$ as trial space with $\left\|{\cdot}\right\|$ and as test space with $\left\|{\cdot}\right\|_{\alpha}$ . The former is in accordance with our scopes in the error analyses below and the latter avoids in particular a dependence on $M/\sqrt{\alpha}$ of the continuity constant of $b$ and in the following bound for the right-hand side in (2.11): for all $\varphi=(\varphi_{1},\varphi_{2})\in V$ ,

[TABLE]

The derivation of the nondegeneracy properties of the bilinear form $b$ is more subtle. In order to establish the crucial inf-sup condition (2.2c), let $\varphi=(\varphi_{1},\varphi_{2})\in V$ be given.

In order to find a suitable $v=(v_{1},v_{2})\in V$ , we combine the nondegeneracy properties of $a$ and $c$ in the ansatz

[TABLE]

thanks to the continuity (2.14) of $c$ and $m_{a}\leq M_{a}$. Using the inequality $2st\leq\epsilon s^{2}+t^{2}/\epsilon$ with $\epsilon=\frac{L}{1+2L}m_{a}>0$ and

[TABLE]

and recalling (2.22b), we arrive at

[TABLE]

where the norms on the right-hand side coincide with those in the continuity bound (2.19). We therefore have the following basic result.

Theorem 2.1 (Bilinear form of reduced optimality system).

If we equip $V$ as trial space with $\left\|{\cdot}\right\|$ and as test space with $\left\|{\cdot}\right\|_{\alpha}$ , then the inf-sup constant $m_{b}$ and the continuity constant $M_{b}$ of the bilinear form (2.10) satisfy

[TABLE]

where $\kappa$ is defined by the relations (2.23).

The inequalities of Theorem 2.1 yield for the condition number of the bilinear form $b$ (i.e., the ratio of its continuity constant to its inf-sup constant)

[TABLE]

The second factor, the condition number of the bilinear form $a$ associated with the constraint, is expected to be a kind of lower bound. In this vein, we may view the first factor $\kappa$ as a bound for the possible amplification of the constraint conditioning, resulting from the interplay of constraint and the objective in the constrained optimization problem (2.3). Inspecting (2.23), we see that $\kappa$ is a function of the parameters $\alpha$ , $M$ , $m_{a}$ , and $M_{a}$ . The next three remarks discuss asymptotic behaviors of $\kappa$ that will play major roles in what follows or are of independent interest.

Remark 2.2 (Amplification for pure constraint case).

Consider the special case $C=0$ and $I=0$ . Then the rescaled and reduced optimality system (2.8) is a well-posed ‘double’ boundary value problem. Its condition number with respect to $(V,\left\|{\cdot}\right\|)\times(V,\left\|{\cdot}\right\|)$ is $M_{a}/m_{a}$ ; cf. (2.17). As $C=0$ and $I=0$ imply $M=0$ , $L=0$ , and so $\gamma=0$ and $\kappa=1$ , this is reproduced by Theorem 2.1.

It is worth mentioning that this limiting case of “pure constraint” is attained in a continuous manner:

[TABLE]

where $L=M/\sqrt{\alpha}$ is essentially the operator norm of the perturbation.

Remark 2.3 (Amplification for degenerating constraint).

While the continuity constant $M_{a}$ of the bilinear form $a$ does not enter $\kappa$ , its inf-sup constant $m_{a}$ does, in a critical manner. More precisely, we have

[TABLE]

Notice that the fraction involving $L$ has only values in the interval $[1,2]$ .

Remark 2.4 (Amplification for vanishing regularization).

Consider the limit $\alpha\to 0$ of the Tikhonov regularization parameter (while $I$ and $C$ are fixed). Then $L\to\infty$ so that

[TABLE]

Let us see with a simple example that the inf-sup constant $m_{b}$ in Theorem 2.1 can blow up with this rate and so the lower bound therein cannot be improved for small $\alpha$ without further assumptions on the structure of $b$ .

Consider $V_{1}=V_{2}={\mathbb{R}}^{2}$ , where $\left\|{\cdot}\right\|_{1}$ and $\left\|{\cdot}\right\|_{2}$ are the Euclidean norm in ${\mathbb{R}}^{2}$ ,

[TABLE]

and $\alpha>0$ . The symmetric bilinear form $b$ of the optimality system is then given by the matrix

[TABLE]

For $\varphi_{0}=(\sqrt{\alpha},0,1,0)\in V={\mathbb{R}}^{4}$ , we have $\left\|{\varphi_{0}}\right\|_{\alpha}=\sqrt{1+\alpha}+1$ and

[TABLE]

so that

[TABLE]

Hence, the asymptotic behavior of $\alpha$ in (2.25) is attained.

The chosen norms for $V$ as trial and test space are not always the most convenient ones. This is illustrated by the following remark, which considers a special case.

Remark 2.5 (Coercive constraints with $C^{*}=I$).

Suppose that $V_{1}=V_{2}$ and $Q=W$ with coinciding scalar products and norms and that the bilinear form $a_{|V_{1}\times V_{1}}$ is coercive with constant $\tilde{m}_{a}$ and $C^{*}=I$ . It is worth noting that, as $a_{|V_{1}\times V_{1}}$ is not necessarily symmetric, the best coercivity constant $\tilde{m}_{a}$ may be much smaller than the inf-sup constant $m_{a}$ . Given $\varphi\in V$ , we proceed as in (2.22) taking $w=\varphi$ , $\gamma=0$ , and obtain

[TABLE]

Hence, in this case, the condition number of $b$ with respect to the norms in (2.27) is independent of the Tikhonov regularization parameter $\alpha$ . Nevertheless, if $C^{*}\neq I$ , also this choice of norms cannot offer in general an asymptotic behavior better than $1/\sqrt{\alpha}$ as $\alpha\to 0$ . In fact, re-computing the example in Remark 2.4 with the norms in (2.27) does not change the behavior of its inf-sup constant.

Let us conclude this section with the following side product of our discussion of the bilinear form $b$ .

Corollary 2.6 (Existence and uniqueness).

The rescaled and reduced optimality system (2.11), and thus (2.5), has a unique solution.

Proof.

Inequality (2.24) ensures (2.2c) for the bilinear form $b$ and, thanks to the algebraic symmetry of $b$ , also (2.2b). ∎

3. Analysis for variational discretization

In this section, we analyze the error of the variational discretization of the optimization problem (2.3) according to Hinze [14]. Our key tool is the rescaled and reduced optimality system (2.8), whose Galerkin solution coincides with the approximate solution of the variational discretization.

3.1. Variational discretization and reduced optimality system

We start by discretizing the PDE constraint (2.1) of the optimization problem (2.3). Recalling its variational formulation

[TABLE]

we choose some conforming finite-dimensional spaces $V_{h,i}\subset V_{i}$ , $i=1,2$ , such that the restriction of the bilinear form $a$ on $V_{h,1}\times V_{h,2}$ is nondegenerate. The corresponding Petrov-Galerkin method then reads

[TABLE]

Using this for the constraint in (2.3), we arrive at the (semi-)discrete optimization problem

[TABLE]

where we, in addition, assume that $I$ can be exactly evaluated for any function from $V_{h,1}$ . As in the continuous case, $(\tilde{q},u_{h})\in Q\times V_{h,1}$ is the unique solution of (3.1) if and only if there exists $z_{h}\in V_{h,2}$ such that

[TABLE]

Also here, we may eliminate the approximate control $\tilde{q}$ by inserting the third equation into the first one. Setting $V_{h}:=V_{h,1}\times V_{h,2}$ , the variational formulation of the ensuing discrete rescaled and reduced optimality system is

[TABLE]

Its solution $x_{h}$ is the Galerkin approximation in $V_{h}$ to the solution $x$ of the variational formulation (2.11) of the rescaled and reduced optimality system. Applying Corollary 2.6 to the discrete spaces therefore yields the following approach to uniqueness and existence of the variational discretization of (2.11).

Lemma 3.1 (Discrete well-posedness).

The discrete reduced optimality system (3.3) has a unique variational solution $x_{h}=(u_{h},z_{h})\in V_{h}$ . Consequently, the pair $(\tilde{q},u_{h})$ with $\tilde{q}=-\tfrac{1}{\sqrt{\alpha}}C^{*}z_{h}$ is the unique solution of the semidiscrete optimization problem (3.1).

Remarkably, the approximate solutions $(\tilde{q},u_{h},z_{h})$ of the variational discretization (3.2) are computable whenever $C^{*}_{|V_{h,2}}$ and $I_{|V_{h,1}}$ can be evaluated exactly.
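For the model problem (1.1), the discrete reduced system (3.3) can be assembled with a few lines of code. The following sketch is an illustration under stated assumptions, not the authors' implementation: the domain is $(0,1)$, $V_{h,1}=V_{h,2}$ consists of continuous piecewise linear functions on a uniform mesh with zero boundary values, $u_{d}$ is replaced by its nodal interpolant when forming $(u_{d},\varphi_{i})$, and a dense Gaussian elimination stands in for a proper sparse solver. With stiffness matrix $K$ and mass matrix $M$, the two rows of (3.3) read $Kz-\tfrac{1}{\sqrt{\alpha}}Mu=-\tfrac{1}{\sqrt{\alpha}}Md$ and $Ku+\tfrac{1}{\sqrt{\alpha}}Mz=0$, and the control is recovered as $\tilde{q}=-\tfrac{1}{\sqrt{\alpha}}z_{h}$. All function names are hypothetical.

```python
from math import sqrt, sin, pi

def solve(mat, rhs):
    """Solve mat x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def tridiag(n, off, diag):
    """Dense n x n tridiagonal matrix with constant diagonals."""
    return [[diag if i == j else off if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def matvec(X, v):
    return [sum(x * y for x, y in zip(row, v)) for row in X]

def variational_discretization(n_cells, alpha, u_d):
    """Solve the discrete reduced system for -u'' = q on (0, 1) with P1 elements."""
    h = 1.0 / n_cells
    n = n_cells - 1                                   # interior nodes
    K = tridiag(n, -1.0 / h, 2.0 / h)                 # stiffness matrix
    M = tridiag(n, h / 6.0, 4.0 * h / 6.0)            # mass matrix (represents C C^* and I^* I)
    s = 1.0 / sqrt(alpha)
    d = [u_d((i + 1) * h) for i in range(n)]          # nodal values of u_d (interpolant)
    f = matvec(M, d)                                  # approximates (u_d, phi_i)
    top = [[-s * M[i][j] for j in range(n)] + K[i] for i in range(n)]
    bottom = [K[i] + [s * M[i][j] for j in range(n)] for i in range(n)]
    x = solve(top + bottom, [-s * fi for fi in f] + [0.0] * n)
    u, z = x[:n], x[n:]
    q = [-s * zi for zi in z]                         # nodal values of the control
    return u, z, q, K, M

n_cells, alpha = 8, 1e-6
u, z, q, K, M = variational_discretization(n_cells, alpha, lambda x: sin(pi * x))

# Discrete state equation K u = M q and distance of u_h to the desired state at the nodes.
state_residual = max(abs(r - t) for r, t in zip(matvec(K, u), matvec(M, q)))
max_err = max(abs(ui - sin(pi * (i + 1) / n_cells)) for i, ui in enumerate(u))
print("state residual:", state_residual, "max nodal distance to u_d:", max_err)
```

In this example the control is never discretized explicitly; it inherits its discrete form from $z_{h}$, which is the point of the variational discretization. For the small value of $\alpha$ chosen here, the computed state is close to the desired state at the nodes.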

3.2. Non-asymptotic quasi-best approximation

We shall assess the quality of the Galerkin approximation $x_{h}=(u_{h},z_{h})\in V_{h}$ from (3.3), assuming that we are interested particularly in the $\left\|{\cdot}\right\|_{1}$ -error of the approximate state $u_{h}$ . For this purpose, we compare it with a suitable best error in $V_{h}$ .

Let us first recall some basic results in Petrov-Galerkin approximation, which we already formulate for the discretization of the constraint. Let $R_{h,1}v_{1}\in V_{h,1}$ be the generalized Ritz projection of $v_{1}\in V_{1}$ given by $a(R_{h,1}v_{1},\varphi_{h,2})=a(v_{1},\varphi_{h,2})$ for all $\varphi_{h,2}\in V_{h,2}$ . Since $a_{|V_{1}\times V_{2}}$ satisfies (2.2) and is nondegenerate on $V_{h,1}\times V_{h,2}$ , there exists a constant $\mu_{h}\geq 1$ such that

[TABLE]

see, e.g., Babuška [1]. We refer to the smallest possible choice of $\mu_{h}$ as the quasi-best-approximation constant of the constraint discretization. Xu and Zikatanov [26] show the identities

[TABLE]

and Tantardini and Veeser [23, Theorem 2.1] give the formula

[TABLE]

where $v_{1}$ varies in $V_{1}$ and $v_{h,1}$ varies in $V_{h,1}$ and, for the sake of notational simplicity, a tedious $\varphi_{h,2}\neq 0$ is avoided.

A perhaps striking feature of these formulas is that they are not affected by the choices of the norms in the test spaces $V_{h,2}$ and $V_{2}$ . This comes in quite useful in our context, as the adjoint state is an auxiliary variable and, in the original approximation problem (2.3), the norm $\left\|{\cdot}\right\|_{2}$ is free as long as (2.2) continues to hold with $\left\|{\cdot}\right\|_{1}$ . Exploiting this freedom, we propose to (possibly) redefine the norm on the space $V_{2}$ by

[TABLE]

and so, in particular, to measure the error of the approximate adjoint state $z_{h}$ in this norm. This redefinition affects the constants that we associated with the constrained optimization problem (2.3). The new continuity and inf-sup constants of the bilinear forms $a_{|V_{1}\times V_{2}}$ are

[TABLE]

The constant $M_{I}$ is not affected, while we have

[TABLE]

where we indicate quantities before the redefinition by an additional index “old”. As in addition

[TABLE]

the results below hold also with the original norm in $V_{2}$ , but the constants have to be revisited.

The convenience of the choice (3.6) lies in the following consequences of (3.7). The numerator in (3.5) is $\left\|{\varphi_{h,2}}\right\|_{2}$ , which, together with the inf-sup-duality, cf. (2.18), yields

[TABLE]

for the inf-sup constant of $a_{|V_{h,1}\times V_{h,2}}$ . Accordingly, the generalized Ritz projection $R_{h,2}v_{2}\in V_{h,2}$ of $v_{2}\in V_{2}$ given by $a(\varphi_{h,1},R_{h,2}v_{2})=a(\varphi_{h,1},v_{2})$ for all $\varphi_{h,1}\in V_{h,1}$ verifies

[TABLE]

Setting $R_{h}=(R_{h,1},R_{h,2})$ , we also have

[TABLE]

After these preparations, we are ready to derive a first result about quasi-best approximation of the variational discretization (3.1).

Theorem 3.2 (Non-asymptotic quasi-best approximation).

Let $x=(u,z)$ be any solution of the optimality system (2.11) and choose (3.6) as norm in $V_{2}$ . The combined error in the corresponding approximate state $u_{h}$ and its adjoint $z_{h}$ of the variational discretization is quasi-best in $V_{h}$ with

[TABLE]

Here

[TABLE]

and $\mu_{h}$ is the quasi-best-approximation constant of the constraint discretization.

Proof.

Thanks to Theorem 2.1 and Lemma 3.1, we can use the counterpart of (3.5) for the characterization (3.3) of the variational discretization. Let $\varphi_{h}\in V_{h}$ . The continuity bound (2.19) and (3.7) give for the numerator

[TABLE]

For the denominator, we use (2.22), where $V$ is replaced by $V_{h}$ and, therefore, with $1/\mu_{h}$ in place of $m_{a}$ in view of (3.9). We thus obtain

[TABLE]

and the proof is finished. ∎

In the special situation of Remark 2.5, we can obtain the following quasi-best approximation result.

Remark 3.3 (Quasi-best approximation for coercive constraints and $C^{*}=I$ ).

Suppose that $V_{1}=V_{2}$ and $Q=W$ with coinciding scalar products and norms and that the bilinear form $a$ is $V_{1}$ -coercive with constant $\tilde{m}_{a}$ and $C^{*}=I$ . Exploiting the coercivity and continuity properties of Remark 2.5, we derive for the error of the variational discretization (2.11)

[TABLE]

The quasi-best approximation constant in the preceding Remark 3.3 does not blow up for vanishing regularization. Nonetheless, when measuring the error merely with $\left\|{\cdot}\right\|$ , it does not exclude an $\alpha^{-1/4}$ -blow up of the quasi-best approximation constant even in the special case $C^{*}=I$ considered in Remark 2.4 and, in the light of the example therein, it does not exclude an $\alpha^{-3/4}$ -blow up for general operators $I$ and $C$ . As we shall see, the $\alpha$ -dependence in Theorem 3.2 is less severe.

Remark 3.4 (Vanishing regularization and quasi-best approximation).

As in Remark 2.4, we consider the limit $\alpha\to 0$ for the Tikhonov regularization parameter. Similarly to there, we have

[TABLE]

This blow up arises from the lower bound of the inf-sup constant in Theorem 2.1, which cannot be improved because of (2.26). Note however, that the equivalence of the norms $\left\|{\cdot}\right\|_{\alpha}$ and $\sup_{\left\|{v}\right\|=1}b(v,\cdot)$ is not uniform in $\alpha$ . In the light of (3.5), it is therefore conceivable that (3.12) could be improved by using the latter as test space norm. However, the determination of the discrete inf-sup constant with respect to this abstract norm appears to be much more involved than the approach (2.22), which directly carries over to discrete spaces.

In any case, we shall show below that, under refinement, the $\alpha$ -dependence disappears for many instances of the optimality system (2.7).

3.3. Asymptotic quasi-best approximation

In this section, we complement Theorem 3.2. To be more precise, let $\nu_{h}$ be the quasi-best-approximation constant of the variational discretization therein and consider a sequence $(V_{h})_{h}$ of discrete spaces leading to a uniformly stable constraint discretization in that

[TABLE]

which is equivalent to discrete inf-sup stability in view of (3.9). Theorem 3.2 then ensures the existence of a constant $\bar{\nu}$ such that

[TABLE]

This upper bound may be pessimistic. To motivate this assessment, represent the bilinear form $b$ by the operator matrix

[TABLE]

which is the one in (2.8) with inverted rows. If $C$ and $I$ are compact, this matrix is diagonally dominant in an operator sense and can be viewed as a compact perturbation of the diagonal matrix with the entries $A$ and $A^{*}$ . Therefore, in order to improve on (3.14), we mimic the argument of Schatz [22], adding a new twist.

Let us first observe that, in accordance with Remark 2.2, Theorem 3.2 yields $\nu_{h}\leq\mu_{h}$ whenever $M_{I}=0=M_{C}$ . More precisely and generally, we have the following relationship between the two quasi-best-approximation constants.

Lemma 3.5 (Quasi-best-approximation constants).

The quasi-best-approximation constants $\nu_{h}$ and $\mu_{h}$ are related by

[TABLE]

where $\kappa_{h}$ is as in Theorem 3.2 and $R_{h}$ is the generalized Ritz projection in (3.10).

Proof.

As in the proof of Theorem 3.2, we will make use of (3.5) with $a$ replaced by $b$ . Given $v\in V$ and $\varphi_{h}\in V_{h}$ , we can write

[TABLE]

because of $a(v-R_{h}v,\varphi_{h})=0$ . Hence,

[TABLE]

As

[TABLE]

with equality for some $\varphi_{h}\in V_{h}$ , we obtain

[TABLE]

Thanks to (2.14), (2.20), and (3.11) this proves the claimed inequality. ∎

In order to deploy Lemma 3.5, we need additional assumptions for our optimization problem and its discretization. We shall consider two settings: a “qualitative” and a “quantitative” one. The former assumes in addition

[TABLE]

for the constraint discretization. Notice that, owing to (3.8), the condition (3.15a) is independent of our choice to equip $V_{2}$ with the norm (3.6).

Lemma 3.6 (Qualitative asymptotic quasi-best approximation).

Under the assumptions (3.13) and (3.15), the quasi-best-approximation constant $\nu_{h}$ satisfies

[TABLE]

where

[TABLE]

Proof.

In the light of Lemma 3.5 and (3.13), it suffices to verify the uniform convergence

[TABLE]

This follows from a standard argument; we provide details for the sake of completeness. Let $(h_{k})_{k}$ be any sequence with $\lim_{k\to\infty}h_{k}=0$ and choose $v_{k}$ such that

[TABLE]

where we write $k$ instead of $h_{k}$ whenever the latter is an index. Exploiting (3.13) once more, we see that the sequence given by $d_{k}:=v_{k}-R_{k}v_{k}$ is bounded in the Hilbert space $V$ . Owing to (3.15b), its weak limit $d\in V$ satisfies

[TABLE]

for any $\varphi\in V$ and $\varphi_{k}\in V_{k}$ . Choosing $\varphi_{k}$ by means of (3.15b), we derive $a(d,\varphi)=0$ by letting $k\to\infty$ . Consequently, (2.17) yields $d=0$ . Thanks to (3.15a), the operator $I:V_{1}\to W$ and the adjoint $C^{*}:V_{2}\to Q$ are compact. This turns the weak convergence $d_{k}\to 0$ in $V$ into the strong convergence $\left|{d_{k}}\right|\to 0$ and the proof is finished. ∎
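The role of compactness in the last step can be illustrated by a minimal sketch. It assumes the one-dimensional analogue of problem (1.1), where $I$ is the embedding $H^{1}_{0}(0,1)\to L^{2}(0,1)$ . The functions $v_{k}(x)=\sqrt{2}\sin(k\pi x)/(k\pi)$ are chosen here purely for illustration and do not appear in the analysis above: they have unit $H^{1}$ -seminorm, converge weakly to $0$ in $H^{1}_{0}$ , and their $L^{2}$ norms $1/(k\pi)$ tend to $0$ .

```python
import math

def l2_norm(f, n=4000):
    """L2(0,1) norm of f by the composite midpoint rule with n cells."""
    h = 1.0 / n
    return math.sqrt(sum(f((i + 0.5) * h) ** 2 for i in range(n)) * h)

def v(k):
    """Illustrative sequence: unit H1-seminorm, weakly convergent to 0."""
    return lambda x: math.sqrt(2.0) * math.sin(k * math.pi * x) / (k * math.pi)

def dv(k):
    """Derivative of v(k)."""
    return lambda x: math.sqrt(2.0) * math.cos(k * math.pi * x)

for k in (1, 2, 4, 8, 16):
    # second column stays at 1 (bounded in V), third column decays like 1/(k*pi)
    print(k, l2_norm(dv(k)), l2_norm(v(k)))
```

The sequence is bounded but not convergent in the $H^{1}$ -seminorm, while its image under the compact embedding converges strongly, which is the mechanism exploited for $d_{k}$ .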

In order to quantify the convergence in Lemma 3.6, we shall use a duality argument. This requires a second, more specific setting of additional assumptions involving the Sobolev spaces $H^{s}$ , $s\geq 0$ , and their norms $\left|{\cdot}\right|_{s}$ over some domain. We use $\left|{\cdot}\right|_{s}$ instead of $\left\|{\cdot}\right\|_{s}$ in order to avoid confusion with the norms $\left\|{\cdot}\right\|_{1}$ and $\left\|{\cdot}\right\|_{2}$ of $V_{1}$ and $V_{2}$ . For $s<0$ , we denote by $H^{s}$ the (topological) dual space of $H^{-s}$ and $\left|{\cdot}\right|_{s}$ stands for the dual norm of $\left|{\cdot}\right|_{-s}$ .

We suppose that spaces $V_{1}$ and $V_{2}$ relate to Sobolev spaces in the following way: There are $s_{i}\in{\mathbb{R}}$ , $i=1,2$ , and a constant $C_{S}\geq 1$ such that

[TABLE]

for some constant $C_{\mathcal{I}}>0$ , which quantifies the approximation property (3.15b).

Theorem 3.7 (Quantitative asymptotic best approximation).

Under the assumptions (3.13) and (3.17), the quasi-best-approximation constant $\nu_{h}$ satisfies

[TABLE]

where $\bar{\kappa}$ is as in Lemma 3.6. For the $\alpha$ -dependence of $\bar{\kappa}$ , cf. Remark 3.4.

Proof.

Similarly as in the first step of the proof of Lemma 3.6, inserting (3.13) and

[TABLE]

into Lemma 3.5 establishes the claim. To show (3.18), let $v\in V$ with $\left\|{v}\right\|=1$ and define $\varphi\in V$ as the solution of the following “dual” problem associated with the bilinear form $a_{|V\times V}$ :

[TABLE]

where $d=(d_{1},d_{2}):=v-R_{h}v$ . We thus have

[TABLE]

where $\varphi_{h}\in V_{h}$ is arbitrary. For the first factor, (3.10) and (3.13) imply

[TABLE]

For the second factor, we employ (3.17d) with suitable $\varphi_{h}\in V_{h}$ to obtain

[TABLE]

and it remains to show that the norms on the right-hand side are suitably bounded. Let us consider the first one. Making use of the regularity estimate (3.17c) and the definition of $\varphi_{1}$ , we deduce

[TABLE]

where $\bar{M}_{C}$ is the operator norm of $C$ from (3.17b). A similar argument yields

[TABLE]

where $\bar{M}_{I}$ is the operator norm of $I$ in (3.17b). We insert the previous estimates in the first one and conclude

[TABLE]

with $\bar{M}:=\max\{\bar{M}_{I},\bar{M}_{C}\}$ , i.e., (3.18). ∎
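The gain of a power of $h$ from the duality argument can be checked numerically in the simplest situation. The following sketch assumes the one-dimensional Laplacian on $(0,1)$ with continuous piecewise linear elements on a uniform mesh, where the Ritz projection for $a(v,w)=\int v'w'$ is the nodal interpolant. The test function $\sin(\pi x)$ and the helper `ritz_errors` are illustrative choices. Since the error vanishes at the nodes, a Poincaré inequality on each cell bounds the true ratio of the $L^{2}$ error to the $H^{1}$ -seminorm error by $h/\pi$ , which corresponds to $\delta=1$ .

```python
import math

# 3-point Gauss rule on [-1, 1]
GAUSS = [(-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9)]

def ritz_errors(v, dv, n):
    """Return (L2 error, H1-seminorm error) of v - R_h v on n uniform cells.

    Assumption: in 1D, R_h v is the piecewise linear nodal interpolant of v.
    """
    h = 1.0 / n
    l2_sq = h1_sq = 0.0
    for i in range(n):
        a = i * h
        va, vb = v(a), v(a + h)
        slope = (vb - va) / h
        for xi, w in GAUSS:
            x = a + 0.5 * h + 0.5 * h * xi
            interp = va + slope * (x - a)
            l2_sq += 0.5 * h * w * (v(x) - interp) ** 2
            h1_sq += 0.5 * h * w * (dv(x) - slope) ** 2
    return math.sqrt(l2_sq), math.sqrt(h1_sq)

v = lambda x: math.sin(math.pi * x)
dv = lambda x: math.pi * math.cos(math.pi * x)

for n in (4, 8, 16, 32):
    l2, h1 = ritz_errors(v, dv, n)
    print(f"n = {n:3d}  ratio = {l2 / h1:.5f}  h/pi = {1 / (n * math.pi):.5f}")
```

Halving $h$ should roughly halve the printed ratio, in line with the factor $h^{\delta}$ in (3.18).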

Let us exemplify Theorem 3.7 by two applications. The first one considers the optimization problem (1.1) of the introduction, while the second one is more involved in that the constraint does not allow for a coercive set-up.

Example 3.8 (Simple model optimization).

Discretize the optimization problem (1.1) of the introduction with linear finite elements on quasi-uniform meshes with meshsize $h$ . We have $V_{1}=H^{1}_{0}=V_{2}$ and, if we choose $\left\|{\cdot}\right\|_{1}=\left|{\nabla\cdot}\right|_{0}$ , we already have $m_{a}=1=M_{a}$ and (3.6) does not change the norm in $V_{2}$ . Further, $M_{I}=C_{F}=M_{C}$ , where $C_{F}$ is the constant in the Poincaré-Friedrichs inequality. Moreover, we have $s_{1}=1=s_{2}$ and, assuming that the underlying domain is convex, $\delta=1$ . Taking Sobolev seminorm instead of norms in (3.17a), we then have $C_{S}=1$ for the relevant cases and $C_{R}=1$ thanks to elliptic regularity as well as $\bar{M}_{I}=1=\bar{M}_{C}$ . Standard approximation theory shows (3.17d) with $C_{\mathcal{I}}$ depending on the shape regularity of the underlying meshes. Since $\mu_{h}=1$ , we conclude

[TABLE]

for the quasi-best-approximation constant of the variational discretization in this case.
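A small experiment makes this behavior tangible. The sketch below is not taken from the analysis above and rests on several assumptions: a one-dimensional version of (1.1) on $(0,1)$ , the rescaled adjoint $z$ with $q=-z/\sqrt{\alpha}$ and $-z''=(u-u_{d})/\sqrt{\alpha}$ as sign and scaling conventions, the manufactured solution $u=\sin(\pi x)$ with $u_{d}=(1+\alpha\pi^{4})\sin(\pi x)$ , and the product norm given by the two $H^{1}$ -seminorms. In this setting the best approximation in $V_{h}$ is the pair of nodal interpolants, so the ratio of the two errors is a computable lower bound for $\nu_{h}$ .

```python
import math

# 3-point Gauss rule on [-1, 1]
GAUSS = [(-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9)]

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def h1_dist_sq(exact_derivative, nodal, n):
    """Squared H1-seminorm distance to the P1 function with given nodal values."""
    h, total = 1.0 / n, 0.0
    for i in range(n):
        slope = (nodal[i + 1] - nodal[i]) / h
        for xi, w in GAUSS:
            x = (i + 0.5) * h + 0.5 * h * xi
            total += 0.5 * h * w * (exact_derivative(x) - slope) ** 2
    return total

def quasi_best_ratio(n, alpha):
    """Error of the variational discretization divided by the best error in V_h."""
    h, s, m = 1.0 / n, 1.0 / math.sqrt(alpha), n - 1
    sin_nodes = [math.sin(math.pi * i * h) for i in range(n + 1)]
    A = [[0.0] * (2 * m) for _ in range(2 * m)]
    rhs = [0.0] * (2 * m)
    for i in range(m):
        for j in range(max(i - 1, 0), min(i + 2, m)):
            stiff = 2.0 / h if i == j else -1.0 / h
            mass = 2.0 * h / 3.0 if i == j else h / 6.0
            A[i][j], A[i][m + j] = stiff, s * mass            # K u + s M z = 0
            A[m + i][j], A[m + i][m + j] = -s * mass, stiff   # -s M u + K z = -s F
        # exact value of (sin(pi x), phi) for the hat function at interior node i + 1
        load = (2 * sin_nodes[i + 1] - sin_nodes[i] - sin_nodes[i + 2]) / (math.pi ** 2 * h)
        rhs[m + i] = -s * (1.0 + alpha * math.pi ** 4) * load
    sol = solve(A, rhs)
    u_h, z_h = [0.0] + sol[:m] + [0.0], [0.0] + sol[m:] + [0.0]
    amp = -math.sqrt(alpha) * math.pi ** 2   # exact rescaled adjoint: amp * sin(pi x)
    du = lambda x: math.pi * math.cos(math.pi * x)
    dz = lambda x: amp * math.pi * math.cos(math.pi * x)
    err = h1_dist_sq(du, u_h, n) + h1_dist_sq(dz, z_h, n)
    best = h1_dist_sq(du, sin_nodes, n) + h1_dist_sq(dz, [amp * t for t in sin_nodes], n)
    return math.sqrt(err / best)

for alpha in (1.0, 1e-2, 1e-4):
    print(alpha, [round(quasi_best_ratio(n, alpha), 4) for n in (8, 16, 32)])
```

For each fixed $\alpha$ the printed ratios are expected to approach $1$ under refinement, while for fixed $h$ they may grow as $\alpha$ decreases. A single manufactured solution only probes the constant from below and does not replace the bound.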

Example 3.9 (Point source control).

We consider the following modification of the optimization problem (1.1), where the distributed control is replaced by a finite number of point sources:

[TABLE]

where the underlying domain $\Omega\subset{\mathbb{R}}^{2}$ is planar, polygonal, Lipschitz, but not necessarily convex, $\{x_{j}\}_{j=1}^{\ell}\subset\Omega$ are $\ell$ distinct points, $\delta_{x_{j}}$ denotes the Dirac functional at the point $x_{j}$ , and $0<\sigma<\frac{1}{2}$ . The bilinear form $a(v,w)=\int_{\Omega}\nabla v\cdot\nabla w\,dx$ , $v,w\in C^{\infty}_{0}(\Omega)$ , has a continuous and inf-sup-stable extension on $V_{1}\times V_{2}$ with $V_{1}=H^{1-\sigma}_{0}(\Omega)$ and $V_{2}=H^{1+\sigma}_{0}(\Omega)$ and allows for a standard discretization with linear finite elements $S_{h}$ for both trial and test space; see, e.g., [11]. For the verification of the discrete inf-sup condition, denote by $R_{h}$ and $\Lambda_{h}$ the Ritz projection and the Scott-Zhang interpolation operator, respectively. As

[TABLE]

and

[TABLE]

the continuous inf-sup-condition yields, for any $s_{h}\in S_{h}$ ,

[TABLE]

and so

[TABLE]

where $\mu_{h}$ depends only on continuous inf-sup constant and on the shape regularity of the underlying mesh and we switched to (3.6) for the norm on $V_{2}$ . To complete the setting, we set $W=L^{2}(\Omega)$ , $Q={\mathbb{R}}^{\ell}$ , and let $I$ be the canonical embedding $H^{1-\sigma}(\Omega)\to L^{2}(\Omega)$ and $C:{\mathbb{R}}^{\ell}\to H^{-(1+\sigma)}(\Omega)$ be given by $Cq=\sum_{j=1}^{\ell}q_{j}\delta_{x_{j}}$ . The continuity constants $M_{I}$ and $M_{C}$ are of order $1$ and $\ell$ , respectively. Notice that, for $\sigma=0$ , $C$ is not continuous because functions in $H^{1}_{0}(\Omega)$ do not have point values in general. Choosing $\delta\in(0,\sigma)$ , we have (3.17) with $s_{1}=1-\sigma$ , $s_{2}=1+\sigma$ and therefore

[TABLE]

4. Analysis with approximate control-action operator

In this section, we shall analyze the approximation properties of a variational discretization, where the control-action operator is approximated. This includes the case of a discretized control space.

4.1. Approximate variational discretization

Let $V_{h,i}\subset V_{i}$ , $i=1,2$ , be the same finite-dimensional conforming spaces introduced in Section 3.1 and assume that the linear operator $C_{h}^{*}:V\to Q$ approximates $C^{*}$ . Then the (semi-)discrete optimization

[TABLE]

generalizes (3.1). It has the solution $(\tilde{q}_{h},\tilde{u}_{h})\in Q\times V_{h,1}$ if and only if there exists $\tilde{z}_{h}\in V_{h,2}$ such that

[TABLE]

As before, we may eliminate $\tilde{q}_{h}$ . If we define

[TABLE]

with

[TABLE]

for $v,\varphi\in V=V_{1}\times V_{2}$ , then the reduced version of (4.2) is the following perturbation of the optimality system (3.3):

[TABLE]

where $V_{h}=V_{h,1}\times V_{h,2}$ . Before we proceed to analyze its discretization error, let us give an important class of examples.

Example 4.1 (Discretized controls).

We consider a conforming discretization of the control variable. More precisely, replacing $Q$ in (3.1) with a finite-dimensional subspace $Q_{h}\subset Q$ leads to the discrete optimality system

[TABLE]

If we denote by $P_{h}$ the $Q$ -orthogonal projection onto $Q_{h}$ , then the third equations mean

[TABLE]

and, therefore, the right-hand side of the first equation can be rewritten as follows:

[TABLE]

Hence, the reduced version of (4.4) is a special case of (4.3) with

[TABLE]
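To make the action of $P_{h}$ concrete, consider the following sketch. It assumes $Q=L^{2}(0,1)$ , a uniform mesh, and $Q_{h}$ the piecewise constants, so that $P_{h}$ takes cell averages; all helper names are illustrative. For a continuous piecewise linear function $v$ given by its nodal values, the code checks the Pythagorean identity $\|v\|_{0}^{2}=\|P_{h}v\|_{0}^{2}+\|v-P_{h}v\|_{0}^{2}$ of the orthogonal projection and the first-order identity $\|v-P_{h}v\|_{0}=h\,|v|_{1}/\sqrt{12}$ , which holds for such $v$ on a uniform mesh.

```python
import math

def cell_averages(nodal):
    """L2-projection of a P1 function onto piecewise constants: the cell means."""
    return [0.5 * (a + b) for a, b in zip(nodal, nodal[1:])]

def l2_norm_sq_p1(nodal, h):
    """Exact squared L2 norm of a P1 function on a uniform mesh of size h."""
    return sum(h / 3.0 * (a * a + a * b + b * b) for a, b in zip(nodal, nodal[1:]))

def l2_norm_sq_p0(cells, h):
    """Exact squared L2 norm of a piecewise constant function."""
    return h * sum(c * c for c in cells)

def projection_error_sq(nodal, h):
    """Exact squared L2 norm of v - P_h v; on each cell v minus its mean is
    linear with endpoint values -(b - a)/2 and +(b - a)/2."""
    return sum(h * (b - a) ** 2 / 12.0 for a, b in zip(nodal, nodal[1:]))

def h1_seminorm_sq(nodal, h):
    """Exact squared H1-seminorm of a P1 function."""
    return sum((b - a) ** 2 / h for a, b in zip(nodal, nodal[1:]))

for n in (8, 16, 32):
    h = 1.0 / n
    v = [math.sin(math.pi * i * h) for i in range(n + 1)]
    full = l2_norm_sq_p1(v, h)
    proj = l2_norm_sq_p0(cell_averages(v), h)
    err = projection_error_sq(v, h)
    # first column: Pythagoras defect; second: error / (h |v|_1), equal to 1/sqrt(12)
    print(n, full - proj - err, math.sqrt(err / h1_seminorm_sq(v, h)) / h)
```

In particular, the projection never increases the $L^{2}$ norm, and its error on discrete functions is of first order in $h$ .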

As the bilinear form $b_{h}$ coincides with $b$ except for using $C_{h}^{*}$ in place of $C^{*}$ , the non-asymptotic continuity and nondegeneracy properties of $b$ in Sections 2 and 3 immediately carry over by replacing $M_{C}$ with the operator norm $M_{C_{h}}$ of $C_{h}^{*}$ . In particular, setting $\tilde{M}_{h}:=\max\{M_{I},M_{C_{h}}\}$ and defining

[TABLE]

inequality (2.19) yields

[TABLE]

for all $v,\varphi\in V$ . Furthermore, (3.11) and the inf-sup duality (2.18) for $b_{h}{}_{|V_{h}\times V_{h}}$ imply

[TABLE]

for all $v_{h}\in V_{h}$ , where

[TABLE]

and $\mu_{h}$ is the quasi-best-approximation constant of the constraint discretization.

Since the structures of the discrete problems (4.3) and (3.3) are the same, well-posedness of (4.3) follows from Lemma 3.1.

4.2. Approximation

As in the error analysis of Section 3.2, we adopt the convenient choice

[TABLE]

Here we start our analysis by splitting the error into an approximation part and a consistency part.

Lemma 4.2 (Approximation and consistency error).

Let $x=(u,z)$ be any solution of the optimality system (2.11) and let $\tilde{x}_{h}$ be its approximation from (4.3). Then the error satisfies

[TABLE]

Here $\tilde{\kappa}_{h}$ is defined by (4.8) and $\mu_{h}$ is the quasi-best-approximation constant of the constraint discretization from (3.10).

Proof.

Define $x_{h}^{*}\in V_{h}$ by

[TABLE]

Then Theorem 3.2 with $b_{h}$ , $x^{*}_{h}$ , $\tilde{\kappa}_{h}$ in place of $b$ , $x_{h}$ , $\kappa_{h}$ gives

[TABLE]

and we have the identities

[TABLE]

for all $\varphi_{h}\in V_{h}$ . In view of (4.6) and (4.7), these identities imply

[TABLE]

The claim follows from the obvious inequalities $\left\|{x-\tilde{x}_{h}}\right\|\leq\left\|{x-x^{*}_{h}}\right\|+\left\|{x^{*}_{h}-\tilde{x}_{h}}\right\|$ and $\inf_{v_{h}\in V_{h}}\left\|{x-v_{h}}\right\|\leq\left\|{x-\tilde{x}_{h}}\right\|$ . ∎

For the next corollary it is necessary to consider a sufficiently large class of optimization problems, e.g., the class $\mathcal{P}$ of optimization problems, where a constraint can be of the form $Au=Cq+f$ for some $f\in V_{2}^{*}$ and $I^{*}$ may be surjective.

Corollary 4.3 (Necessary condition for quasi-best approximation).

If the approximate variational discretization (4.3) is quasi-best in the class $\mathcal{P}$ , then

[TABLE]

Proof.

Let $v_{h,2}\in V_{h,2}$ be arbitrary and take some $v_{h,1}\in V_{h,1}$ . Then $v_{h}=(v_{h,1},v_{h,2})\in V_{h}\subset V$ is a possible solution in the class $\mathcal{P}$ . Since (4.3) is quasi-best in $\mathcal{P}$ , the discrete solution is exactly $v_{h}\in V_{h}$ . Hence, by Lemma 4.2 we have $(C_{h}C_{h}^{*}-CC^{*})v_{h,2}=0$ , which yields $\left\|{C_{h}^{*}v_{h,2}}\right\|_{Q}=\left\|{C^{*}v_{h,2}}\right\|_{Q}$ . ∎

Although possible, it is difficult to imagine that a practical approximation $C_{h}^{*}$ satisfies the condition in Corollary 4.3 without coinciding with $C^{*}$ . We therefore consider in what follows only assumptions on $C_{h}^{*}$ that lead to asymptotic quasi-best approximation. In view of Lemma 4.2, this requires that the consistency error vanishes at least as fast as the best approximation error, i.e.,

[TABLE]

Moreover, to capture in the limit the compactness of $C^{*}$ resulting from assumption (3.15a), we assume that

[TABLE]

This implies that the operator norms $\|C_{h}^{*}\|_{L(V_{2},Q)}=\tilde{M}_{h}=\max\{M_{I},M_{C_{h}}\}$ are uniformly bounded. Indeed, suppose that $\tilde{M}_{h}\to\infty$ as $h\to 0$ and, for each $h>0$ , let $\varphi_{2}^{h}\in V_{2}$ be such that $\|C_{h}^{*}\varphi_{2}^{h}\|_{Q}=\tilde{M}_{h}$ and $\|\varphi_{2}^{h}\|_{2}=1$ . Then $\varphi_{2}^{h}/\tilde{M}_{h}\to 0$ in $V_{2}$ as $h\to 0$ , which, in view of (4.10), yields a contradiction. Consequently,

[TABLE]

is finite.

Lemma 4.4 (Qualitative asymptotic quasi-best approximation with approximate control-action).

Let $x=(u,z)\in V$ be a solution to problem (2.11) and let $\tilde{x}_{h}=(\tilde{u}_{h},\tilde{z}_{h})\in V_{h}$ , $h>0$ , be the corresponding approximations given by (4.3). Furthermore, assume uniform stability (3.13), approximability (3.15b), limiting compactness (4.10), and that $I:V_{1}\to W$ is compact. If the exact solution $x$ satisfies (4.9), we have

[TABLE]

where

[TABLE]

Proof.

As in the proof of Lemma 4.2, define $x_{h}^{*}\in V_{h}$ by

[TABLE]

We deduce

[TABLE]

by replacing $b$ with $b_{h}$ and $x_{h}$ with $x_{h}^{*}$ in Lemma 3.5 and using the limiting compactness (4.10) instead of the compactness of $C^{*}:V_{2}\to Q$ in the proof of Lemma 3.6. Next, proceeding as in the proof of Lemma 4.2, assumption (4.9) on the exact solution gives

[TABLE]

We therefore conclude by inserting the two preceding relationships into the triangle inequality $\left\|{x-\tilde{x}_{h}}\right\|\leq\left\|{x-x_{h}^{*}}\right\|+\left\|{x_{h}^{*}-\tilde{x}_{h}}\right\|.$ ∎

We turn to prove a quantitative quasi-best approximation result. To this end, we need to specify the qualitative assumptions (4.9) and (4.10) by quantitative ones. We shall assume that

[TABLE]

and that

[TABLE]

where $\delta>0$ is suitably chosen. Note that (4.13) reduces for $C_{h}=C$ to the part regarding $C$ in the quantitative counterpart (3.17b) of the qualitative compactness (3.15a).

Theorem 4.5 (Quantitative asymptotic quasi-best approximation with approximate control-action).

Let $x$ , $\tilde{x}_{h}$ , $h>0$ , and $\tilde{\kappa}$ be as in Lemma 4.4. In addition, assume uniform stability (3.13) and that there exists $\delta>0$ such that we have (3.17), where (4.13) replaces the assumption on $C$ in (3.17b). If the exact solution $x$ satisfies also (4.12) with the same $\delta$ , we have

[TABLE]

Proof.

We follow the lines of the proof of Lemma 4.4, but replacing (4.9) with (4.12) and (4.11) with a quantitative argument in the spirit of Theorem 3.7. To this end, it suffices to use (4.13) instead of (3.17b). ∎

We conclude this section by assessing the key assumptions (4.9) and (4.12) by a remark and an example.

Remark 4.6 (Ensuring dominated consistency error).

As

[TABLE]

for

[TABLE]

we may verify assumptions (4.9) and (4.12) using relationships for $\|C_{h}C_{h}^{*}-CC^{*}\|_{L(V_{2},V_{2}^{*})}$ .

Example 4.7 (Simple model optimization and piecewise constant controls).

Consider the setting of Example 3.8, but with problem (1.1) with linear finite elements for the constraint and piecewise constants for the control variable. In the light of Example 4.1, this full discretization can be cast into (4.3) with $C_{h}=P_{h}C$ , where $P_{h}$ is the $L^{2}$ -projection onto piecewise constants. By duality, we have

[TABLE]

where $c_{1}$ depends on the shape regularity of the underlying meshes. Suppose that there is a constant $c_{2}$ such that

[TABLE]

This holds for example if the matrix norm of the Hessian of the exact state or its adjoint state are bounded away from 0 in a fixed subdomain. We conclude

[TABLE]

i.e., (4.12) with $\delta=1$ and a constant depending on the exact solution under consideration.

5. Analysis with Control Constraints

This section generalizes our approach to optimization problems that are nonlinear because of constraints on the control.

5.1. Control constraints and discretization

Let $K\subset Q$ be the set of admissible controls. We assume that

[TABLE]

and denote by $\Pi_{K}:Q\to K$ the projection operator onto $K$ which is characterized by $\left\|{q-\Pi_{K}q}\right\|_{Q}=\inf_{p\in K}\left\|{q-p}\right\|_{Q}$ or, equivalently, by

[TABLE]

The latter characterization implies

[TABLE]

for all $q,p\in Q$ , which in turn shows that the operator $\Pi_{K}$ is strongly monotone and Lipschitz continuous, in both cases with constant 1.
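For a concrete instance, take $Q={\mathbb{R}}^{n}$ with the Euclidean scalar product and let $K$ be a box, so that $\Pi_{K}$ is componentwise clipping. The following sketch checks, on random pairs, the standard inequality $\|\Pi_{K}q-\Pi_{K}p\|_{Q}^{2}\leq\langle\Pi_{K}q-\Pi_{K}p,\,q-p\rangle_{Q}$ , which we take to be the content of (5.2), together with the resulting Lipschitz bound with constant 1. The box $[-1,1]^{n}$ , the dimension, and the sampling range are illustrative assumptions.

```python
import math
import random

def project_box(q, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box K = [lo, hi]^n (componentwise clipping)."""
    return [min(max(x, lo), hi) for x in q]

def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))

def check_projection(trials=1000, dim=5, seed=0):
    """Return True if both inequalities hold for all sampled pairs (q, p)."""
    rng = random.Random(seed)
    for _ in range(trials):
        q = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        p = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        d = [a - b for a, b in zip(q, p)]
        dpi = [a - b for a, b in zip(project_box(q), project_box(p))]
        # ||Pi q - Pi p||^2 <= <Pi q - Pi p, q - p>
        if dot(dpi, dpi) > dot(dpi, d) + 1e-12:
            return False
        # ||Pi q - Pi p|| <= ||q - p||
        if math.sqrt(dot(dpi, dpi)) > math.sqrt(dot(d, d)) + 1e-12:
            return False
    return True

print(check_projection())
```

The Lipschitz bound follows from the first inequality by the Cauchy-Schwarz inequality, which is the reasoning referred to in the text.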

The generalization of problem (2.3) incorporating convex control constraints is then the convex optimization problem

[TABLE]

Thanks to (5.1), a solution $(q,u)$ is characterized by the existence of $z\in V$ such that the following counterpart of the rescaled optimality system (2.7) is satisfied:

[TABLE]

As in Section 2, we insert the third equation into the first one and consider the corresponding weak formulation of the rescaled and reduced optimality system:

[TABLE]

where $b_{K}:=a+c_{K,\alpha}$ and

[TABLE]

which already incorporates the $1/\sqrt{\alpha}$ -scaling. In contrast to the previous sections, $c_{K,\alpha}$ and so $b_{K}$ are in general not linear in the first argument. Nonetheless, if we introduce the pseudometric

[TABLE]

inequality (5.2) leads to the following replacement of the properties (2.14) of the bilinear form $c$ : if $v,w\in V$ and $\varphi=\big{(}{-}(v_{1}-w_{1}),v_{2}-w_{2}\big{)}$ , then

[TABLE]

In addition, we have, for $v,w\in V$ ,

[TABLE]

The continuity bound (5.6b) leads to

[TABLE]

with the metric

[TABLE]

Notice that the role of the two arguments of $c$ and $b_{K}$ cannot be interchanged. We adapt (2.22) to this new situation in the following way: given $v,w\in V$ , we choose $\varphi=T_{K}(v-w)$ , where $T_{K}:V\to V$ is the linear operator given by

[TABLE]

$\gamma$ as in (2.23b), and $J_{i}:V_{i}\to V_{i}^{*}$ is the Riesz map for $V_{i}$ , $i=1,2$ . In view of (2.24), we thus obtain the following counterpart of Theorem 2.1.

Theorem 5.1 (Properties of form $b_{K}$ ).

If we equip $V$ as trial space with $d_{K,\alpha}$ and as test space with $\left\|{\cdot}\right\|$ , then we have, for any $v,w,\varphi\in V$ ,

[TABLE]

and

[TABLE]

where $\kappa$ is defined by (2.23).

Also here, we can conclude existence and uniqueness as a side-product.

Corollary 5.2 (Well-posedness with control constraints).

The optimization problem (5.5) has a unique solution.

Proof.

We shall apply Zarantonello’s theorem on strongly monotone operators [27, Theorem 25.B] in the Hilbert space $(V,\left\|{\cdot}\right\|)$ . To prepare for this, we first observe that

[TABLE]

Indeed, it is continuous with constant $1+\gamma$ owing to (2.22b) and boundedly invertible on account of the consequence

[TABLE]

of (2.19) and (2.24) for the bilinear form $b$ . Let us consider the nonlinear operator $\widetilde{B}_{K}:V\to V^{*}$ defined by

[TABLE]

where $\langle\cdot,\cdot\rangle$ denotes the duality pairing associated with $(V,\left\|{\cdot}\right\|)$ . Making use of Theorem 5.1, (2.19) and (5.7), we see that, for all $v,w\in V$ ,

[TABLE]

and

[TABLE]

Hence, $\widetilde{B}_{K}$ is strongly monotone and Lipschitz continuous and therefore boundedly invertible by [27, Theorem 25.B]. In light of (5.10), we can conclude by noting $T_{K}^{-*}\widetilde{B}_{K}v=b_{K}(v,\cdot)$ for all $v\in V$ . ∎
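The classical proof of Zarantonello's theorem is constructive: if $B$ is strongly monotone with constant $m$ and Lipschitz continuous with constant $L$ , then $x\mapsto x-\rho(Bx-f)$ is a contraction for $0<\rho<2m/L^{2}$ , with factor $\sqrt{1-m^{2}/L^{2}}$ for $\rho=m/L^{2}$ . The following sketch runs this iteration for a toy operator on ${\mathbb{R}}^{2}$ , a non-symmetric linear part plus a projection onto a box. The operator, the right-hand side, and the constants $m=2$ , $L=\sqrt{5}+1$ are assumptions made for illustration; they mimic the structure of $\widetilde{B}_{K}$ but are not derived from it.

```python
import math

def B(x):
    """Toy operator: A x + Pi(x) with A = [[2, 1], [-1, 2]] and Pi the
    projection onto [-1, 1]^2. Then <A x, x> = 2 |x|^2 and |A| = sqrt(5)."""
    clip = [min(max(t, -1.0), 1.0) for t in x]
    return [2.0 * x[0] + x[1] + clip[0], -x[0] + 2.0 * x[1] + clip[1]]

def zarantonello(f, m=2.0, L=math.sqrt(5.0) + 1.0, steps=200):
    """Fixed-point iteration x <- x - rho (B(x) - f) with rho = m / L^2."""
    rho, x = m / L ** 2, [0.0, 0.0]
    for _ in range(steps):
        r = [bx - fx for bx, fx in zip(B(x), f)]
        x = [xi - rho * ri for xi, ri in zip(x, r)]
    return x

f = [3.0, 1.0]
x = zarantonello(f)
residual = math.hypot(B(x)[0] - f[0], B(x)[1] - f[1])
print(x, residual)
```

The residual decays geometrically, so a few hundred steps suffice here; in practice one would rather use a semismooth Newton method, which is outside the scope of this sketch.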

In order to discretize the optimization problem (5.3) with control constraints, we proceed as in Section 3.1. Introducing the discrete space $V_{h}=V_{h,1}\times V_{h,2}$ as therein, the variational discretization can be characterized as follows:

[TABLE]

Here we need that $\Pi_{K}(-C^{*}v_{h,2}/\sqrt{\alpha})$ can be evaluated exactly for $v_{h,2}\in V_{h,2}$ . This occurs, for example, when we consider (1.1) with box constraints and discretize with linear finite elements. If $\Pi_{K}$ has to be approximated, the subsequent error analysis involves additional technicalities, similar to those addressed in Section 4.
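For box constraints in the setting of (1.1), $C^{*}$ is an embedding and $\Pi_{K}$ acts pointwise, so the control is obtained by clipping $-z_{h}/\sqrt{\alpha}$ . The sketch below, in one dimension and with hypothetical names, returns an exact piecewise linear representation of this function for a continuous piecewise linear $z_{h}$ . The kinks generally fall between mesh nodes, so the control lives on a refined partition that depends on $z_{h}$ , and this is why it can be evaluated exactly without being discretized.

```python
import math

def projected_control(nodes, z_nodal, alpha, lo, hi):
    """Breakpoints and values of clip(-z_h / sqrt(alpha), lo, hi).

    nodes   : increasing mesh points
    z_nodal : nodal values of the piecewise linear adjoint z_h
    Returns (xs, qs): the control is piecewise linear between the points xs.
    """
    g = [-z / math.sqrt(alpha) for z in z_nodal]
    clip = lambda t: min(max(t, lo), hi)
    xs, qs = [nodes[0]], [clip(g[0])]
    for x0, x1, g0, g1 in zip(nodes, nodes[1:], g, g[1:]):
        kinks = []
        for bound in (lo, hi):
            # g crosses the bound strictly inside the cell
            if (g0 - bound) * (g1 - bound) < 0.0:
                t = (bound - g0) / (g1 - g0)
                kinks.append((x0 + t * (x1 - x0), bound))
        for x, q in sorted(kinks):
            xs.append(x)
            qs.append(q)
        xs.append(x1)
        qs.append(clip(g1))
    return xs, qs

# Two cells, one interior node; the upper bound becomes active around x = 0.5.
xs, qs = projected_control([0.0, 0.5, 1.0], [0.0, -1.0, 0.0], alpha=0.25, lo=-1.0, hi=1.0)
print(list(zip(xs, qs)))
```

In this small example the kinks appear at the midpoints of the two cells, which are not mesh nodes.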

Existence and uniqueness of solutions to (5.11) can be established in a similar way as Corollary 5.2. Using (3.6) as norm in $V_{2}$ , the major change is to replace the operator (5.9) by $T_{K,h}:V_{h}\to V_{h}$ given by

[TABLE]

where $A_{h}v_{h,1}:=a(v_{h,1},\cdot)_{|V_{h,2}}$ , $v_{h,1}\in V_{h,1}$ , is the discrete counterpart of $A$ , $1/\mu_{h}$ is its inf-sup-constant, $\gamma$ is as in (2.23), and $J_{h,i}:V_{h,i}\to V_{h,i}^{*}$ is the Riesz map for $V_{h,i}$ , $i=1,2$ .

5.2. Quasi-best approximation

We analyze the quasi-best-approximation properties of the nonlinear variational discretization (5.11), adopting again

[TABLE]

The following non-asymptotic result draws heavily on Theorem 5.1, which needed an $\alpha$ -dependent error notion for $V$ as trial space.

Theorem 5.3 (Non-asymptotic quasi-best approximation with control constraints).

If $x_{h}$ is the approximation given by (5.11) to an arbitrary solution $x$ of (5.5), then its error is quasi-best in $V_{h}$ in that

[TABLE]

where $\kappa_{h}$ and $\mu_{h}$ are as in Theorem 3.2.

Proof.

Given any $v_{h}\in V_{h}$ , we first write

[TABLE]

To bound the second term, we employ Theorem 5.1 with, respectively, $V_{h}$ , $T_{K,h}$ , $1/\mu_{h}$ , $1$ , and $\kappa_{h}$ in place of $V$ , $T_{K}$ , $m_{a}$ , $M_{a}$ , and $\kappa$ . Writing $\varphi_{h}=T_{K,h}(v_{h}-x_{h})$ , the definitions of $x$ and $x_{h}$ thus yield,

[TABLE]

and the claimed inequality is established as $T_{K,h}$ is invertible. ∎

The “ $+1$ ” in the bound for the quasi-best-approximation constant in Theorem 5.3 arises from the triangle inequality (5.13), which is avoided in deriving (3.5). Yet, the following asymptotic quasi-best approximation results involving the generalized Ritz projection from (3.10) are not affected by such an augmentation.

Lemma 5.4 (Nonlinear variational and generalized Ritz approximations).

Let $x$ and $x_{h}$ be as in Theorem 5.3. The generalized Ritz projection $R_{h}x$ of $x$ and $x_{h}$ are related by

[TABLE]

where $\kappa_{h}$ and $\mu_{h}$ are as in Theorem 3.2.

Proof.

Applying Theorem 5.1 with the setting as in Theorem 5.3, writing $\varphi_{h}=T_{K,h}(x_{h}-R_{h}x)$ , and recalling (5.7), we derive

[TABLE]

and, again thanks to the invertibility of $T_{K,h}$ , the proof is finished. ∎

Let us sharpen Lemma 5.4 with the help of the additional assumptions and arguments from Section 3.3 regarding the linear optimality system.

Theorem 5.5 (Supercloseness to the generalized Ritz approximation).

Let $x$ , $x_{h}$ , and $R_{h}x$ be as in Lemma 5.4. Moreover, assume (3.13) and define $\bar{\kappa}$ as in Lemma 3.6. If (3.15) holds, then

[TABLE]

More specifically, if (3.17) holds, then

[TABLE]

For the $\alpha$ -dependence of $\bar{\kappa}$ , cf. Remark 2.4.

Proof.

In view of Lemma 5.4, it suffices to show $\left|{x-R_{h}x}\right|=o(\left\|{x-R_{h}x}\right\|)$ . To this end, we modify the argument in Lemma 3.6 slightly; a similar argument has been used by [10] under weaker assumptions on $(V_{h})_{h}$ . Let $(h_{k})_{k}$ be any sequence with $\lim_{k\to\infty}h_{k}=0$ and, writing $k$ whenever $h_{k}$ is an index, consider

[TABLE]

The sequence $(d_{k})_{k}$ is bounded in the Hilbert space $V$ by definition. For its weak limit $d\in V$ , we have

[TABLE]

for arbitrary $\varphi\in V$ and $\varphi_{k}\in V_{k}$ . Consequently, (3.15b), $k\to\infty$ , and (2.17) yield $d=0$ . In view of (3.15a), $d_{k}\to 0$ weakly in $V$ then implies $\left|{d_{k}}\right|\to 0$ .

For the second statement, we just note that the main step of the proof of Theorem 3.7 with $v=x-R_{h}x$ leads to $\left|{v-R_{h}v}\right|=O(h^{\delta}\left\|{x-R_{h}x}\right\|)$ . ∎

In view of the inverse triangle inequality

[TABLE]

Theorem 5.5 readily yields the following asymptotic quasi-best approximation result.

Corollary 5.6 (Asymptotic quasi-best approximation with control constraints).

Let $\nu_{K,h}$ be the quasi-best-approximation constant for the nonlinear variational discretization (5.11) with respect to $\left\|{\cdot}\right\|$ . Moreover, assume (3.13) and define $\bar{\kappa}$ as in Lemma 3.6. If (3.15) holds, then

[TABLE]

More specifically, if (3.17) holds, then

[TABLE]

For the $\alpha$ -dependence of $\bar{\kappa}$ , cf. Remark 2.4.

In comparison with Lemma 3.6 and Theorem 3.7, Corollary 5.6 features an additional $M/\sqrt{\alpha}$ -factor. This factor stems from the fact that the derivation we went through used an error notion that also incorporates it.

Bibliography

[1] I. Babuška, Error-bounds for finite element method, Numer. Math., 16 (1971), pp. 322–333.
[2] E. Casas and M. Mateos, Uniform convergence of the FEM. Applications to state constrained control problems, Comput. Appl. Math., 21 (2002), pp. 67–100.
[3] E. Casas and F. Tröltzsch, Error estimates for linear-quadratic elliptic control problems, in Analysis and Optimization of Differential Systems (Constanta, 2002), Kluwer Acad. Publ., Boston, MA, 2003, pp. 89–100.
[4] K. Chrysafinos and E. N. Karatzas, Symmetric error estimates for discontinuous Galerkin approximations for an optimal control problem associated to semilinear parabolic PDE’s, Mar. 2012.
[5] K. Chrysafinos and E. N. Karatzas, Symmetric error estimates for discontinuous Galerkin time-stepping schemes for optimal control problems constrained to evolutionary Stokes equations, Comput. Optim. Appl., 60 (2015), pp. 719–751.
[6] K. Deckelnick, A. Günther, and M. Hinze, Finite element approximation of elliptic control problems with constraints on the gradient, Numer. Math., 111 (2009), pp. 335–350.
[7] K. Deckelnick and M. Hinze, Convergence of a finite element approximation to a state-constrained elliptic control problem, SIAM J. Numer. Anal., 45 (2007), pp. 1937–1953.
[8] K. Deckelnick and M. Hinze, Numerical analysis of a control and state constrained elliptic control problem with piecewise constant control approximations, in Numerical Mathematics and Advanced Applications, 2008, pp. 597–604.

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos
