TL;DR
This paper develops a framework for solving optimal control problems involving elliptic equations with positive measure controls, establishing existence, deriving optimality conditions, and proposing a numerical solution method.
Contribution
It introduces a novel approach to ensure existence of solutions using Radon measures and Fenchel duality, and presents a discretization and semismooth Newton method for computation.
Findings
Existence of optimal controls in Radon measure space under certain constraints.
Derivation of optimality conditions via Fenchel duality.
Numerical method combining discretization and semismooth Newton algorithm.
Abstract
Optimal control problems without control costs in general do not possess solutions due to the lack of coercivity. However, unilateral constraints, together with the assumption of existence of strictly positive solutions of a pre-adjoint state equation, are sufficient to obtain existence of optimal solutions in the space of Radon measures. Optimality conditions for these generalized minimizers can be obtained using Fenchel duality, which requires a non-standard perturbation approach if the control-to-observation mapping is not continuous (e.g., for Neumann boundary control in three dimensions). Combining a conforming discretization of the measure space with a semismooth Newton method allows the numerical solution of the optimal control problem.
Optimal control of elliptic equations with positive measures
Christian Clason, Faculty of Mathematics, University Duisburg-Essen, 45117 Essen, Germany
Anton Schiela, Institute of Mathematics, University of Bayreuth, 95440 Bayreuth, Germany
(September 7, 2015)
1 Introduction
This work is concerned with the following optimal control problem, stated formally as
[TABLE]
where is a second-order elliptic differential operator and is a given target. Furthermore, is the observation domain with corresponding restriction operator , and the control is defined on a control domain with corresponding extension operator . (This setting includes boundary control and observation; for details we refer to Section 2.)
Problem (1) differs from standard control-constrained optimal control problems by the fact that no control cost term, e.g., of the form or with and a suitable Banach space , appears in the functional. This term is usually necessary to guarantee existence of an optimal solution , since it provides us with coercivity of the objective functional in the appropriate topology. Consequently, one of the major issues in this work will be the discussion of existence of minimizers of this problem. As we will show, the non-negativity together with the tracking term is sufficient (under an appropriate assumption on the operator ) to obtain coercivity with respect to , albeit only in the space of measures. Intuitively, boundedness of in implies boundedness of only in , which is all one can expect in general without control constraints. It is thus surprising that in many cases optimal controls exist in the more regular space of Radon measures if merely unilateral constraints are present, thus allowing to formulate, analyze and numerically solve the limit problem as in the above-mentioned standard problems with unilateral constraints, which is the main motivation of this work.
Once existence of optimal controls is established, first-order optimality conditions can be derived via Fenchel duality. This is relatively straightforward in those cases where the control-to-observation mapping is continuous as a mapping . However, due to the low regularity of the control, this assumption is not satisfied for all relevant applications (e.g., Neumann control in three dimensions; similar difficulties are to be expected for parabolic problems). These cases require special care since they involve unbounded operators. A second motivation of this work is therefore to extend the Fenchel duality theorem to this setting.
Let us remark on some related problems. Recently, a class of elliptic problems came into the focus of interest, where control costs of the form were used and which possess generalized solutions ; see [Clason:2010a, Clason:2011a, Clason:2012, Casas:2013]. In particular, we rely on the first three works for the numerical computation of our optimal measure space controls using a semismooth Newton method and a conforming finite element discretization of . Often such functionals are still augmented by an additional -type control cost as well as bilateral control constraints, and the limit is considered; see, e.g., [Stadler:2007a, Wachsmuth:2009]. A second related problem class is that of so-called bang-bang-problems [Hinze:2012], where no control costs are present, but the control constraints are bilateral, so that optimal solutions exist in . Finally, due to the presence of measure-valued controls, we will have to define the operator in a way that has a unique solution for each . This requires an extension of the usual variational setting in . In this respect, our paper draws from results in the literature; see [Schiela:2010] and the references therein. It also provides a link to the study of state-constrained problems [Casas85], where measure-valued right-hand sides appear in first-order optimality conditions.
This work is organized as follows. Section 2 discusses well-posedness of the state equation for measure-valued right-hand sides. In Section 3, we give a rigorous statement of Problem (1) and show that under a strict positivity assumption on the adjoint control-to-observation mapping, a minimizer to (1) exists in the space of Radon measures; we discuss the validity of this assumption in the context of second-order elliptic equations in Section 3.1. Section 3.2 gives some examples as well as a counterexample that shows the necessity of our assumption. Optimality conditions for these minimizers are derived in Section 4 based on a Fenchel duality theorem for an unbounded operator. In Section 5, we remark on the relation of Problem (1) to the corresponding problems including additional or measure-space control costs. The numerical solution based on a variational discretization and a semismooth Newton method is discussed in Section 6. Finally, numerical examples are presented in Section 7.
2 State equation
We first discuss well-posedness of the control-to-observation mapping . Since is only a Radon measure and need not be continuous, this requires some technicalities. In particular, due to the presence of the non-reflexive spaces and it will be useful to start with defining the pre-adjoint operators of and .
Elliptic differential operator
Consider a bounded domain (i.e., an open connected subset) with Lipschitz boundary , so that the trace operator is well-defined. Let be a continuous and elliptic bilinear form, defined by
[TABLE]
where subsequently we assume that the coefficients are symmetric (i.e., ) and bounded on , and that and are non-negative bounded functions in and , respectively. Furthermore, assume that there exists such that
[TABLE]
We assume further that not both and are identically zero. As usual, it follows by the Poincaré inequality that is coercive, i.e., there exists such that
[TABLE]
Alternatively, we could impose Dirichlet boundary conditions on (part of) to obtain coercivity. However, in the following discussion we stick to the case , mainly for simplicity of presentation.
It then follows from the Lax–Milgram theorem that for each , there is a unique , such that for all . In this way, the well-known isomorphism is constructed via .
Extension to measure-valued right-hand sides
Our next aim is to define a version of this operator that covers elliptic PDEs with measure-valued right-hand sides. For , this does not fit into the classical variational framework. Following the method of Stampacchia [Stampacchia:1965a], we will therefore first construct an unbounded pre-dual operator with domain , and then consider its adjoint whose co-domain is then – by definition – the dual of , which can be identified by the Riesz representation theorem with the space of Radon measures . The following construction is similar to the one given in [Schiela:2010]; our main reference concerning unbounded operators is [Goldberg:2006].
Consider an index (the spatial dimension), so that , and its dual index which satisfies . By Hölder’s inequality applied to the derivatives, is still well-defined and continuous as a bilinear form
[TABLE]
Let us define a domain (often called “maximal domain of definition”) and a bijective mapping in the following way:
[TABLE]
Let us stress that here (and in similar occasions) the bound may depend on but not on .
By (5), we conclude that , and under relatively mild assumptions on the smoothness of the coefficients and on the domain, regularity theory even yields if is sufficiently close to ; see, e.g., [Troianiello:1987a, Theorem 3.16]. This is called the case of “maximal regularity”. In fact, for , it is always possible to find an appropriate . In this case we can define as follows:
[TABLE]
Otherwise, if is a proper superset of , the bilinear form is not defined anymore for all and due to lack of integrability of the principal part. However, by the definition of in (6), we can extend to a bilinear form via the unique continuous extension
[TABLE]
where is a sequence in such that in . By density of in , such a sequence always exists, and by definition of in (6), the limit of always exists and depends only on the limit .
Under very mild assumptions, it is still possible to show (see, e.g., [Rehberg:2009, Theorem 3.3, Corollary 3.5, Corollary 3.6]), so that we obtain:
[TABLE]
In both cases is a bijective, closed, unbounded operator (cf. [Schiela:2010]) and thus has continuous inverse by the open mapping theorem for closed operators; see, e.g., [Goldberg:2006, II.1.8]. In what follows only this – more general – setting is required, keeping in mind, however, that (and thus also its adjoint, defined next) corresponds to , which only coincides with if , cf. [Schiela:2010].
Since is dense in , the Banach space adjoint (also called conjugate) of is well-defined as a linear operator (cf., e.g., [Goldberg:2006, Def. II.2.2])
[TABLE]
where is canonically defined as
[TABLE]
Then for any , the mapping defines a continuous linear functional on the dense subspace . It can thus be extended uniquely to a continuous functional on satisfying for all . By the Riesz representation theorem, can be identified with an element of . We stress that this is the standard construction of the Banach space adjoint of an unbounded, densely defined operator. By [Goldberg:2006, Theorem II.2.6, Theorem II.4.4], the operator is also closed and continuously invertible, because is.
We even obtain the following compactness property:
Lemma 2.1 ([Schiela:2010, Lemma 2.15]).
Consider a sequence that converges weakly-$*$ in to . Then the sequence converges strongly in to .
Control operator
Next, consider a compact set such that there exists a continuous trace or embedding operator . Here is defined with respect to an appropriate positive and bounded measure on ; e.g., with the Lebesgue measure for distributed control, and with the boundary measure for boundary control. Technically, we will require in the following that for any open subset such that is non-empty. This guarantees applicability of the weak-$*$ sequential density theorem in the appendix.
We introduce the linear and continuous restriction operator
[TABLE]
which coincides with the above mentioned restriction operator on , this space being dense in both and .
Its adjoint can be interpreted (via the Riesz representation theorem) as a mapping
[TABLE]
acting as the extension by zero of a measure on to a measure on . On it coincides with the operator . Moreover, by the weak-$*$ sequential density theorem in the appendix, the space is weakly-$*$ sequentially dense in .
Observation operator
For the operator , which will be defined on reflexive spaces, it is most convenient to start with the primal operator. Let , equipped with a suitable measure, and assume that there exists a closed (possibly unbounded) operator
[TABLE]
where is dense in . By this assumption, the restriction of to , i.e.,
[TABLE]
is defined on all of . It is readily verified that is closed as well. Thus, by the closed graph theorem (see, e.g., [Goldberg:2006, II.1.9]), is even a continuous operator.
In many cases is continuous for suitable , and holds, but there are also important cases where lacks continuity. Typical examples (e.g., embedding or trace operators) are discussed in detail below.
By reflexivity, we can define its adjoint as a closed operator
[TABLE]
since in this case . Like all adjoints of closed operators in reflexive spaces, has a dense domain; see, e.g., [Goldberg:2006, Theorem II.2.14]. Comparison with yields that for every for which the latter is defined, i.e., for . Thus, the continuous operator can be considered as the unique continuous extension of after the co-domain space has been extended from to (and renormed).
Control-to-observation mapping
Finally, we define
[TABLE]
where is dense in by our above assumptions. This mapping is well-defined, since is a continuous operator, defined on all of . Since the adjoint of a densely defined (unbounded) linear operator is closed, see, e.g., [Goldberg:2006, Theorem II.2.6], is a closed operator
[TABLE]
Since may be unbounded, the following assertion is not obvious.
Lemma 2.2.
It holds that
[TABLE]
and . Furthermore, is weakly-$*$ closed, i.e., if in and in with , then .
Proof 2.3.
By purely algebraic arguments we have for that since then both sides of the equality are well-defined. Thus, we have to prove the equality of their domains, using the definition of in (19). By continuity of we conclude
[TABLE]
By definition of domains of adjoints, iff , and iff . By (20), , and hence the domains coincide.
The last inclusion in (19) follows from the fact that for , we have . This in turn is a consequence of , so that coincides with the variational solution of the state equation.
By Lemma 2.1, weak-$*$ convergence of implies strong convergence of in . Since is closed, it is also weakly closed (since its graph is a convex closed set, thus weakly closed). Hence, and with imply .
We remark for later reference that by definition of adjoints, we have that
[TABLE]
where here and in the following, we have omitted the domains from the spaces appearing in duality pairings if they are clear from the context. Also, by definition of , for there exists a bounded sequence in such that .
Finally, we remark that is weak-$*$ sequentially dense in . This follows via , using the weak-$*$ sequential density theorem in the appendix, which states that is weakly-$*$ sequentially dense in . In particular, for all implies for all and thus as an element of .
Using and , we complement the measure-space operators and by their “standard” counterparts, i.e., the continuous mappings
[TABLE]
The operator is a restriction of and coincides with it on . In contrast, is an extension of and is defined on all of and not only on . This is possible because has a larger co-domain .
3 Existence of minimizers
Using the control-to-observation operator, we can state Problem (1) in reduced form as
[TABLE]
where denotes the indicator function of the positive cone in , i.e.,
[TABLE]
We now address existence of minimizers to (P), which requires an assumption on the control-to-observation operator which we call a pre-dual Slater condition. Since this operator is defined via duality, it will be seen that it is natural to formulate this assumption in terms of the pre-adjoint .
Assumption 3.1 (Pre-dual Slater condition).
There exists a function such that is strictly positive, i.e., there is such that
[TABLE]
Since , Assumption 3.1 claims the existence of a function such that the solution of the equation is a continuous function and satisfies . We are thus looking for solutions of elliptic equations that are strictly positive (on parts of the domain).
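The role of the boundary conditions in this positivity requirement can be illustrated on a one-dimensional model. The following is a minimal sketch under assumptions that are not part of the text: the equation $-y'' + y = 1$ on $(0,1)$, a standard finite-difference discretization, and the hypothetical helper names `solve_tridiagonal` and `solve_model_problem`. With homogeneous Neumann conditions the solution is bounded away from zero on the closed interval, whereas with homogeneous Dirichlet conditions it vanishes at the boundary, so no uniform lower bound on the whole domain exists.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_model_problem(num_cells, boundary):
    """Finite-difference solution of -y'' + y = 1 on (0, 1) (assumed model problem).

    boundary == "neumann":   y'(0) = y'(1) = 0, imposed via ghost points
    boundary == "dirichlet": y(0) = y(1) = 0
    """
    h = 1.0 / num_cells
    n = num_cells + 1
    lower = [-1.0 / h**2] * n
    diag = [2.0 / h**2 + 1.0] * n
    upper = [-1.0 / h**2] * n
    rhs = [1.0] * n
    if boundary == "neumann":
        # Ghost point y_{-1} = y_1 doubles the off-diagonal entry.
        upper[0] = -2.0 / h**2
        lower[-1] = -2.0 / h**2
    elif boundary == "dirichlet":
        # Replace the boundary rows by y_0 = 0 and y_N = 0.
        diag[0] = diag[-1] = 1.0
        upper[0] = lower[-1] = 0.0
        rhs[0] = rhs[-1] = 0.0
    else:
        raise ValueError("boundary must be 'neumann' or 'dirichlet'")
    return solve_tridiagonal(lower, diag, upper, rhs)


y_neumann = solve_model_problem(100, "neumann")
y_dirichlet = solve_model_problem(100, "dirichlet")
print("min of Neumann solution:  ", min(y_neumann))
print("min of Dirichlet solution:", min(y_dirichlet))
```

In this sketch the Neumann solution is the constant one, so any lower bound below one works, while the Dirichlet solution attains its minimum zero on the boundary.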
Using this assumption, we can show that a minimizing sequence is bounded in a sufficiently strong topology.
Lemma 3.2.
If Assumption 3.1 holds, then any minimizing sequence for (P) is bounded in with bounded in .
Proof 3.3.
First, note that the non-negativity constraint and coercivity of the tracking term imply, respectively, that for all and that is bounded in (and in particular, that ). Using Assumption 3.1 and identifying with the constant function , we thus deduce from the definition of the total variation norm of a non-negative measure that
[TABLE]
and hence the claimed boundedness follows.
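The chain of inequalities behind this bound can be sketched as follows. The notation is assumed for illustration: $K$ denotes the control-to-observation mapping with pre-adjoint $K_*$, $\varphi$ the function from Assumption 3.1 with $K_*\varphi \ge \varepsilon > 0$ on the control domain $\omega_c$, $\mathcal{M}(\omega_c)$ the space of Radon measures, and $L^2(\omega_o)$ the observation space.

```latex
\begin{aligned}
\|u_n\|_{\mathcal{M}(\omega_c)}
  &= \int_{\omega_c} 1 \, \mathrm{d}u_n
   \le \frac{1}{\varepsilon} \int_{\omega_c} K_*\varphi \, \mathrm{d}u_n
   && (u_n \ge 0,\ K_*\varphi \ge \varepsilon) \\
  &= \frac{1}{\varepsilon} \langle K u_n, \varphi \rangle
   \le \frac{1}{\varepsilon} \, \|K u_n\|_{L^2(\omega_o)} \, \|\varphi\|_{L^2(\omega_o)} .
   && \text{(definition of the adjoint, Cauchy--Schwarz)}
\end{aligned}
```

The first equality uses that the total variation of a non-negative measure equals its integral against the constant function one; the right-hand side is bounded because the tracking term is bounded along a minimizing sequence.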
With this, we obtain existence of a minimizer by Tonelli’s direct method.
Theorem 3.4.
Under the above assumptions, there exists a minimizer of (P) such that . If is injective, is unique.
Proof 3.5.
Let be a minimizing sequence for (P), which is bounded in by Lemma 3.2. Since is separable, the Banach–Alaoglu theorem yields existence of a subsequence converging weakly-$*$ to some . By boundedness of , we may then extract another subsequence such that converges weakly to some . By Lemma 2.2 we obtain . From weak-$*$ sequential closedness of the non-negative cone in , we deduce that is feasible and thus a minimizer of (P). Finally, strict convexity of the tracking term implies that any pair of minimizers satisfies and hence, if is injective, .
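A finite-dimensional analogue may help to see how a tracking term and a sign constraint alone can single out a minimizer. The sketch below is an illustration under assumptions: the matrix `S` (a stand-in for a discretized control-to-observation mapping), the target `z`, and the helper names are hypothetical, and a plain projected gradient iteration is used in place of the semismooth Newton method mentioned in the abstract.

```python
def matvec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def projected_gradient(S, z, step, iterations=2000):
    """Minimize 0.5 * |S u - z|^2 subject to u >= 0 (componentwise)."""
    S_transposed = transpose(S)
    u = [0.0] * len(S[0])
    for _ in range(iterations):
        residual = [yi - zi for yi, zi in zip(matvec(S, u), z)]
        gradient = matvec(S_transposed, residual)
        # Gradient step followed by projection onto the non-negative cone.
        u = [max(0.0, ui - step * gi) for ui, gi in zip(u, gradient)]
    return u


# Hypothetical data: the unconstrained least-squares solution is (2, -1),
# which violates the sign constraint in its second component.
S = [[1.0, 1.0], [0.0, 1.0]]
z = [1.0, -1.0]
u_opt = projected_gradient(S, z, step=0.3)

# Optimality check: u >= 0, gradient >= 0, and u_i * gradient_i = 0.
residual_opt = [yi - zi for yi, zi in zip(matvec(S, u_opt), z)]
gradient_opt = matvec(transpose(S), residual_opt)
print("u =", u_opt, "gradient =", gradient_opt)
```

The final check mirrors the complementarity structure one expects from sign-constrained problems: the gradient vanishes where the control is positive and is non-negative where the control is zero.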
3.1 Verification of the pre-dual Slater condition
We now discuss situations in which Assumption 3.1 can be verified. Recall that we have to show for some the existence of a solution to the equation
[TABLE]
such that is strictly positive on . Although it is well-known that elliptic PDEs have non-negative solutions for non-negative right-hand sides and boundary data, existence of a strictly positive solution is not a trivial matter and of course not satisfied in general (consider the homogeneous Dirichlet problem and ). Moreover, the literature – although quite exhaustive for the Dirichlet problem – is much scarcer in the case of Neumann, Robin or even mixed boundary conditions.
We first remark that under the stated assumptions, given by (2) is uniformly elliptic and hence defines a positive operator, i.e., for all ,
[TABLE]
This already implies strict positivity on compact subsets of .
Lemma 3.6.
Let be a domain. Assume that satisfies and
[TABLE]
If is compact, there is a such that on , and in particular, on .
Note the discrepancy between and ; we choose this setting because it matches the setting in [GilTru1977, Chapter 8], from which we cite a crucial result: the Harnack inequality. Unfortunately, a Harnack inequality for the setting (covering Robin, Neumann, or mixed boundary conditions explicitly) is hard to find in the literature.
Proof 3.7.
The result is a consequence of the weak Harnack inequality (cf. [GilTru1977, Theorem 8.18]), which holds for non-negative supersolutions of . Let be given and denote by a ball around of radius . If , then there exists a such that
[TABLE]
With this result, we will show that either or on for any supersolution . Since is a domain, and thus open and connected, we merely have to assert that is open and closed, because then either (i.e., ) or (i.e., ). Indeed, by continuity of , is (relatively) closed in and by (29), every is contained in a ball as long as . Hence, is open. Thus, if on , we have and so on .
Finally, if is compact, then has a minimizer on , i.e., for all .
In what follows we denote , where the first factor is equipped with the Lebesgue measure, and the second with the boundary measure; we denote the corresponding product measure by . If is any subset of , the space is taken relatively to .
Lemma 3.6 already yields a first result. In the following, denotes the characteristic function of , which is identically one on and zero on .
Corollary 3.8.
If is a compact subset of and has positive measure (i.e., ), then Assumption 3.1 is satisfied.
Proof 3.9.
Set in (24). Since , we have and thus . Hence, Lemma 3.6 can be applied and yields the desired result.
Next, we want to cover the general case .
Lemma 3.10.
Assume that satisfies as well as
[TABLE]
and assume moreover that there is such that for it holds that
[TABLE]
Then .
Proof 3.11.
We insert , which is in , into (2) and show that and thus . Observe that implies and that implies and for . With this we compute:
[TABLE]
and obtain
[TABLE]
Since implies that , the last two integrals vanish by our assumption on and . Moreover, since and , the first two integrals are non-positive (recall that ). It follows that , implying .
From this we can deduce the following sufficient criterion for the pre-dual Slater condition.
Proposition 3.12.
If on , then Assumption 3.1 is fulfilled for any compact .
Proof 3.13.
We show that the solution of (30) is strictly positive. By Lemma 3.6, we already know that on . For , let . Note that as since on .
Define like but with replaced by , and as the solution of
[TABLE]
Then and
[TABLE]
Hence, , and thus implies that and thus . Hence, Lemma 3.10 yields (after choosing ) that .
Furthermore,
[TABLE]
and for any ,
[TABLE]
so that by [Stampacchia:1965a, Théorème 4.1], there exists a such that for any ,
[TABLE]
Since for , we can choose sufficiently small such that for adequately chosen , we have
[TABLE]
Hence, we can estimate
[TABLE]
i.e., . We conclude that , and therefore
[TABLE]
as claimed.
3.2 Examples
To illuminate our abstract framework further, let us discuss in the following a couple of examples. All of them have in common the generic definition of
[TABLE]
where is chosen appropriately as stated in the beginning of Section 2. However, the examples will cover different definitions of and and the corresponding spaces, i.e., different types of control and observation.
Distributed control for a Neumann problem
As a first example, consider a homogeneous Neumann problem with distributed control (i.e., and ), such that
[TABLE]
is the control operator with pre-adjoint .
Let us first consider boundary observation, i.e., . We start with recalling that there exists a continuous trace operator
[TABLE]
for suitably chosen depending on and the spatial dimension of . In particular, for we may always choose . In the general case, we may define
[TABLE]
(which implies if ), and then
[TABLE]
as the restriction of to . Since the norm of the co-domain space has been strengthened, is in general not continuous anymore. It is, however, a closed operator: Assume that in and in . By continuity of , we conclude that in ; but from in we deduce that and thus and .
We summarize that satisfies all our assumptions, and note that for we may choose sufficiently close to such that is well-defined as a continuous operator. However, the same is impossible for , so that we have to work with unbounded in this case.
For the case of observation on the whole domain (i.e., ) and , we may simply define as the Sobolev embedding which exists for suitably chosen . In the “exotic” case , a similar effect as for boundary control with appears, and has to be defined as an unbounded operator.
By Proposition 3.12 and by our assumption , we see that we can choose arbitrarily as long as it has positive measure with respect to the measure on .
Robin or Neumann boundary control
In this case, our control operator is defined as the extension by zero
[TABLE]
i.e., denotes the trace operator from to . Again, we take as the identity. To verify the pre-dual Slater condition, we then need to find , such that the solution of the problem
[TABLE]
has a strictly positive boundary trace, i.e., . According to Proposition 3.12 this can be achieved for Neumann boundary conditions if is arbitrary (of non-zero measure), and for Robin boundary conditions if .
Distributed control for a Dirichlet problem
We close this section with a simple example for which Assumption 3.1 is violated. Consider the problem
[TABLE]
Due to the homogeneous Dirichlet boundary conditions and by continuity, there cannot be any solutions of the pre-dual problem which are larger than some on the whole domain, which coincides with the control domain. So Assumption 3.1 is clearly violated.
To show that also the conclusions of Theorem 3.4 do not hold, let us take for the sequence of measures , which is contained in but unbounded.
Lemma 3.14.
The weak solution of is given by
[TABLE]
Proof 3.15.
We have to find such that for all and . By the Lax–Milgram theorem, we know that this solution is unique; moreover, the special form of the right-hand side leads us to the ansatz on and on . Using the homogeneous boundary conditions, we find that on and on . Since has to be continuous at , we conclude that .
Then, we can obtain using the weak formulation and the fundamental theorem of calculus that
[TABLE]
which implies that . Solving these two equations for and yields our claim.
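The computation can be made concrete on an assumed model. The data below, namely the domain $(0,1)$, the equation $-y'' = c\,\delta_{x_0}$ with $c>0$ and $x_0\in(0,1)$, homogeneous Dirichlet conditions, and the coefficient names $a$, $b$, are illustrative assumptions chosen to match the piecewise linear ansatz of the proof; they are not taken from the statement of the lemma.

```latex
\begin{aligned}
&\text{Ansatz:} && y(x) = a\,x \ \text{on } (0,x_0), \qquad y(x) = b\,(1-x) \ \text{on } (x_0,1), \\
&\text{continuity at } x_0\text{:} && a\,x_0 = b\,(1-x_0), \\
&\text{weak form with test function } v\text{:} &&
  \int_0^{x_0} a\,v'\,\mathrm{d}x - \int_{x_0}^1 b\,v'\,\mathrm{d}x = (a+b)\,v(x_0) = c\,v(x_0), \\
&\text{hence:} && a = c\,(1-x_0), \qquad b = c\,x_0 .
\end{aligned}
```

For the particular choice $c = n$ and $x_0 = 1/n$ this gives $y_n(x) = (n-1)\,x$ on $(0,1/n)$ and $y_n(x) = 1-x$ on $(1/n,1)$, so the states stay bounded although the measure norm $n$ of the right-hand side grows without bound.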
Proposition 3.16.
Problem (49) does not possess an optimal solution in .
Proof 3.17.
From Lemma 3.14 we conclude that in . Hence, is a minimizing sequence, since each pair is feasible and for all . However, the limit cannot be attained, because the only possible candidate does not satisfy the boundary conditions.
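The mechanism of this non-existence argument can be checked numerically on an assumed model. The sketch below takes the hypothetical data $-y'' = n\,\delta_{1/n}$ on $(0,1)$ with homogeneous Dirichlet conditions and the target $1-x$; these choices and the function names are assumptions for illustration. For this model the squared tracking error equals $1/(3n)$, while the measure norm of the control equals $n$.

```python
def state(n, x):
    """Piecewise linear solution of -y'' = n * delta_{1/n}, y(0) = y(1) = 0 (assumed model)."""
    x0 = 1.0 / n
    return (n - 1.0) * x if x <= x0 else 1.0 - x


def tracking_error_squared(n, cells=2000):
    """Squared L2(0,1) distance between y_n and the target 1 - x.

    The two functions coincide on [1/n, 1], so only [0, 1/n] is integrated,
    using the composite midpoint rule.
    """
    x0 = 1.0 / n
    h = x0 / cells
    total = 0.0
    for k in range(cells):
        x = (k + 0.5) * h
        difference = state(n, x) - (1.0 - x)
        total += difference * difference * h
    return total


for n in (10, 100, 1000):
    control_norm = float(n)  # total variation of n * delta_{1/n}
    print(n, control_norm, tracking_error_squared(n), 1.0 / (3.0 * n))
```

The tracking error tends to zero while the control norm diverges, and the limit state $1-x$ violates the boundary condition at $x=0$, which is the obstruction described in the proof.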
If we instead consider
[TABLE]
for some , then the control domain is a compact subset of . So by Lemma 3.6 we can verify Assumption 3.1 and thus apply Theorem 3.4 to assert existence of an optimal control in . This reasoning works in general for distributed control on a compact subset of the domain .
4 Optimality conditions
We apply Fenchel duality to derive optimality conditions for minimizers of (P). For the reader’s convenience, we recall duality theory, e.g., from [Ekeland:1999a, Chapter II.4]. For a functional defined on a Banach space , let denote the Fenchel conjugate of given for by
[TABLE]
Furthermore, let
[TABLE]
denote the subdifferential of the convex function at , which reduces to the Gâteaux-derivative if it exists. These definitions immediately yield the Fenchel–Young inequality
[TABLE]
where equality holds if and only if .
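As a worked example of these definitions, consider a quadratic tracking functional on a Hilbert space $H$ identified with its dual. The symbols $F$, $H$, $v$, $q$ and $z$ are assumed here for illustration.

```latex
\begin{aligned}
F(v) &= \tfrac12 \|v - z\|_H^2, \\
F^*(q) &= \sup_{v \in H} \Big( \langle q, v \rangle - \tfrac12 \|v - z\|_H^2 \Big)
        = \langle q, z \rangle + \tfrac12 \|q\|_H^2
        \qquad \text{(supremum attained at } v = z + q\text{)}, \\
\partial F(v) &= \{\, v - z \,\}, \\
F(v) + F^*(q) - \langle q, v \rangle &= \tfrac12 \|v - z - q\|_H^2 \;\ge\; 0 .
\end{aligned}
```

The last line is the Fenchel–Young inequality for this functional, and it shows directly that equality holds exactly when $q = v - z$, that is, when $q \in \partial F(v)$.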
The Fenchel duality theorem states that if and are proper, convex, and lower semicontinuous functionals on the Banach spaces and , is a continuous linear operator, and there exists a such that , , and is continuous at (a generalized Slater condition), then
[TABLE]
and the right-hand side of (56) – the dual problem – has at least one solution. Furthermore, the equality in (56) is attained at if and only if
[TABLE]
holds; see, e.g., [Ekeland:1999a, Remark III.4.2].
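In the notation commonly used for this result (assumed here: functionals $F$ on $V$ and $G$ on $Y$, a continuous linear operator $\Lambda : V \to Y$ with adjoint $\Lambda^*$, and a dual variable $q \in Y^*$), the duality relation and the extremality conditions take the following standard form.

```latex
\begin{aligned}
\inf_{v \in V} \; F(v) + G(\Lambda v)
  &= \sup_{q \in Y^*} \; -F^*(\Lambda^* q) - G^*(-q), \\[0.5ex]
\Lambda^* \bar q \in \partial F(\bar v), &\qquad -\bar q \in \partial G(\Lambda \bar v) .
\end{aligned}
```

Here $\bar v$ and $\bar q$ denote a primal and a dual solution, respectively; the two inclusions are equivalent to equality in the Fenchel–Young inequality for the pairs $(\bar v, \Lambda^*\bar q)$ and $(\Lambda\bar v, -\bar q)$.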
We wish to apply the Fenchel duality theorem to (P), where would take the role of the control-to-observation mapping . Since is non-reflexive, the dual problem would be posed in , which is difficult to characterize. We therefore follow a pre-dual approach as in [Clason:2010a, Clason:2011a], where we introduce the optimization problem
[TABLE]
(obtained by formal application of Fenchel duality) and show that its Fenchel dual coincides with problem (P).
Remark 4.1.
Before delving into a deeper analysis, let us point out that the pre-dual problem (P) is essentially a state-constrained optimal control problem with control and state , i.e.,
[TABLE]
However, it has the slightly unusual characteristics that the state does not appear in the objective and that the inequality constraint is imposed on a subdomain.
A further complication arises if is a proper subset of . This case corresponds to a state-constrained problem where the control-to-state mapping does not map into the space of continuous functions. Such problems have been analysed in [Schiela:2009]. The analysis performed in this section may offer an alternative approach to this class of problems.
Problem (P) is strictly convex and admits a feasible point by Assumption 3.1 and thus is non-trivial, i.e., admits a finite infimum. If is not closed, we cannot expect (P) to have a minimizer. However, any minimizing sequence is bounded in and thus has a weak cluster point . In fact, by strict convexity of the term , any minimizing sequence converges even strongly to the unique limit . While is possibly not contained in – and hence is not defined – we can express the limit using a suitable extension of which we will define below.
Although the Fenchel duality theorem is not directly applicable since may be an unbounded operator, a modification of the arguments in [Ekeland:1999a] shows that the statement still holds. In our argumentation, we can make use of the fact that we have already established existence of solutions of the dual problem in Theorem 3.4. For the sake of completeness, we give here the full proof, where we closely follow [Ekeland:1999a, Chapter II.4]. Let us define for problem (P) the perturbation function by
[TABLE]
Clearly, is convex but – by the last term – not lower semicontinuous with respect to unless . Furthermore, coincides with (P) and hence is finite.
Consider now the Fenchel conjugate of with respect to .
Lemma 4.2.
The dual problem
[TABLE]
coincides with problem (P). Furthermore, if Assumption 3.1 is satisfied, the supremum is attained at .
Proof 4.3.
By definition, the Fenchel conjugate at is given by
[TABLE]
Using that is dense in and introducing for the function then yields for the case that :
[TABLE]
If, in contrast, , there exists a sequence , bounded in , such that . Hence the first term in the first line is unbounded, while the others are bounded, and thus . We therefore assume that and maximize separately with respect to and . Considering the first term, we have that for some implies that . Otherwise, the supremum is attained at and is zero. For the second term, we use that the functional is differentiable with respect to to deduce that the supremum is attained at . Together, we obtain
[TABLE]
Writing , we see that the dual problem (60) is precisely our original problem (P), which by Theorem 3.4 has a solution .
To derive optimality conditions, we first show that the duality gap between (P) and (P) is zero.
Proposition 4.4.
We have that
[TABLE]
Proof 4.5.
The claim follows from [Ekeland:1999a, Proposition III.2.1] if Problem (P) is normal, i.e., the mapping is lower semicontinuous at zero. To verify this, it suffices to show that for each feasible point , we can find a nearby feasible point with close to . This can be achieved by adding a small multiple of the function from Assumption 3.1, since is strictly positive and the perturbations are measured in the -norm.
Thus, for given we can find such that with , is feasible for the original problem, as long as is feasible for the perturbed problem. Moreover, it is easy to see that with as . Taking infima, this implies that
[TABLE]
which in turn yields the desired lower semicontinuity and thus (64).
To derive optimality conditions from the equality (64), we continue as in [Ekeland:1999a, § III, equation (4.22)]. We first derive a limiting form of the optimality conditions.
Proposition 4.6.
Let be a minimizing sequence for Problem (P) with , and let be the solution to Problem (60). Then,
[TABLE]
Proof 4.7.
By definition of , Proposition 4.4 implies that if is a minimizing sequence of and is a minimizer of , we have
[TABLE]
We now use continuity of with respect to (recall that this limit exists due to the strict convexity of the first term in (P)), which yields
[TABLE]
Next, we observe that, since and thus , we have the convergence
[TABLE]
Hence, continuing our last computation, we obtain
[TABLE]
We now argue that both brackets are non-negative. For the first bracket, we use the fact that the third term is the Fenchel conjugate of the sum of the first two terms to apply the Fenchel–Young inequality (55). For the second bracket, feasibility of elements of a minimizing sequence (after passing to a subsequence if necessary) implies that and and hence that the first two terms vanish. By definition of non-negativity of measures, positivity of and implies that for all and hence that the third term is non-negative as well. Therefore, each bracket has to vanish separately. The first one immediately yields equality in (55) and hence that
[TABLE]
i.e., the first relation of (66). From the second bracket, we directly obtain the remaining relations (i.e., the second line) of (66).
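The role of the Fenchel–Young inequality in this proof can be checked numerically in a finite-dimensional setting. The following is a minimal sketch, assuming a quadratic tracking functional F(y) = 1/2 |y - z|^2 on vectors (the function names and the test data are hypothetical and not taken from the paper). Its conjugate is F*(p) = 1/2 |p|^2 + <p, z>, so the gap F(y) + F*(p) - <p, y> equals 1/2 |y - z - p|^2, which is non-negative and vanishes exactly when p = y - z.

```python
def tracking(y, z):
    """F(y) = 1/2 * |y - z|^2 for vectors given as lists (assumed tracking term)."""
    return 0.5 * sum((yi - zi) ** 2 for yi, zi in zip(y, z))


def tracking_conjugate(p, z):
    """F*(p) = 1/2 * |p|^2 + <p, z>, the Fenchel conjugate of F."""
    return 0.5 * sum(pi * pi for pi in p) + sum(pi * zi for pi, zi in zip(p, z))


def fenchel_young_gap(y, p, z):
    """F(y) + F*(p) - <p, y>; non-negative, and zero if and only if p = y - z."""
    pairing = sum(pi * yi for pi, yi in zip(p, y))
    return tracking(y, z) + tracking_conjugate(p, z) - pairing


z = [1.0, -2.0]   # hypothetical target
y = [0.5, 3.0]    # hypothetical observation
p_opt = [yi - zi for yi, zi in zip(y, z)]   # the element realizing equality
print(fenchel_young_gap(y, p_opt, z))       # vanishes up to rounding
print(fenchel_young_gap(y, [0.0, 0.0], z))  # strictly positive
```

In the proof, a vanishing gap in the limit is what forces equality in the Fenchel–Young inequality and hence the first relation of the optimality conditions.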
We now wish to pass to the limit in (66), which is impeded by the fact that the operators and are defined in the non-standard setting needed for measure-valued control. Recall that – which appears in – is a restriction of its classical counterpart . Hence, while may not be well-defined, is well-defined since . Moreover, from we can deduce not only that but also that .
We thus make use of to define a new bilinear form
[TABLE]
that can be used as a replacement of the term in (66) but is well-defined also for the limit . Let and with such that , then set
[TABLE]
With this definition, we obtain the following first-order necessary optimality conditions.
Theorem 4.8**.**
Let be a minimizer of Problem (1). Then there exist , and satisfying
[TABLE]
Proof 4.9**.**
First, we note that is well-defined because implies , and because . We now argue that this bilinear form can indeed be used in (66). For , we have and thus
[TABLE]
Furthermore, if and the sequence converges to in , then
[TABLE]
Thus, the limit in (66) can be replaced by as claimed.
Introducing the state , an adjoint state and a Lagrangian multiplier now yields (OS).
If is continuous, we can directly pass to the limit in the second relation of (66) and obtain a Lagrange multiplier .
Corollary 4.10**.**
Assume that is continuous, and let be a minimizer of Problem (1). Then there exist , , and satisfying
[TABLE]
In this case, the optimality conditions can also be obtained by direct application of the Fenchel duality theorem to problem (P), where the last three relations of (76) are the complementarity conditions of the second relation of (57), which here read .
5 Connection to problems with control costs
In this section, we show that problem (P) can be interpreted as the limit problem for vanishing or measure-space control costs.
5.1 control costs
We first connect the measure-space problem (P) with the classical control-constrained linear quadratic problem
[TABLE]
which for every is known to admit a minimizer ; see, e.g., [TroBook, Theorem 2.14]. Arguing as in the proof of Theorem 3.4, it can be shown that converges weakly- to some in as (up to a subsequence if is not injective). It is, however, not obvious that the limit coincides with the global minimizer from Theorem 3.4. The validity of this assertion hinges on the question of whether there is a sequence such that and in , i.e., whether optimal control and optimal observation can be approximated simultaneously by a sequence of positive functions.
Due to LABEL:thm:wsd, this is certainly the case if is continuous, since then implies by Lemma 2.1.
Theorem 5.1**.**
Assume that is continuous, is injective, and is equipped with a measure such that for every open set , such that is non-empty. Then
[TABLE]
Proof 5.2**.**
By LABEL:thm:wsd, there exists a sequence such that . Since is continuous, this implies via Lemma 2.1 that strongly and thus that . Denoting by the functional in (Pα) and by the functional in (P), we conclude that for each there are and such that
[TABLE]
Hence, is a minimizing sequence for , which satisfies – like any minimizing sequence – the properties stated in the proof of Theorem 3.4. This yields our assertions.
On the other hand, if and thus is unbounded, the graph norm on , defined by , is strictly stronger than . Thus, there may be sequences in that converge weakly- in but are unbounded in and thus cannot converge weakly- with respect to this norm. Hence if is unbounded, the weak- sequential closure of may be a proper subset of , and thus we cannot expect in general that our global minimizer can be approximated by a minimizing sequence in .
Although the necessary optimality conditions for Problem (Pα) are standard (see, e.g., [TroBook, Theorem 2.22]), it is instructive to derive them using the convex analysis framework employed for (P). Since Problem (Pα) is posed in the Hilbert space and we have assumed to be continuous, we can apply the Fenchel duality theorem directly, where we denote by the tracking term and by the two remaining terms in (Pα). To derive an explicit characterization of the second relation of (57), we set and use the fact that due to the Hilbert space setting, coincides with the Moreau envelope of , i.e.,
[TABLE]
see, e.g., [Bauschke, Proposition 13.12]. Hence, coincides with the Yosida regularization of , i.e.,
[TABLE]
since the proximal mapping of an indicator function of a convex set is given by the metric projection onto ; see, e.g., [Bauschke, Proposition 12.29]. After some algebraic manipulations, we thus obtain the optimality system
[TABLE]
where is to be understood pointwise almost everywhere in . Note that the system (OSα) coincides with the well-known projection formulation of the optimality condition for the control-constrained linear-quadratic problem (Pα); see, e.g., [TroBook, Theorem 2.28].
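The convex-analysis facts used in this derivation can be illustrated in finite dimensions. The sketch below assumes that the constraint set is the cone of componentwise non-negative vectors (a discrete stand-in for the non-negative controls). The names and the parameter `gamma` are hypothetical. For the indicator function of this cone, the proximal mapping is the projection v -> max(0, v), the Moreau envelope is the scaled squared distance to the cone, and the Yosida regularization (v - prox(v)) / gamma reduces to min(0, v) / gamma.

```python
def project_nonneg(v):
    """Metric projection onto {u : u >= 0}; this is the proximal mapping of its indicator."""
    return [max(0.0, vi) for vi in v]


def moreau_envelope(v, gamma):
    """Moreau envelope of the indicator: dist(v, cone)^2 / (2 * gamma)."""
    proj = project_nonneg(v)
    return sum((vi - pi) ** 2 for vi, pi in zip(v, proj)) / (2.0 * gamma)


def yosida_regularization(v, gamma):
    """(v - prox(v)) / gamma, which here equals min(0, v) / gamma componentwise."""
    proj = project_nonneg(v)
    return [(vi - pi) / gamma for vi, pi in zip(v, proj)]


v = [-0.7, 0.3, 0.0]   # hypothetical data
gamma = 0.5
print(project_nonneg(v))
print(moreau_envelope(v, gamma))
print(yosida_regularization(v, gamma))
```

The Yosida regularization is the gradient of the Moreau envelope, which is why the regularized optimality condition can be written with an explicit pointwise projection.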
5.2 Measure-space control costs
We now connect problem (P) with the non-negative “sparse control problem”
[TABLE]
considered in [Clason:2011a]. Existence of an optimal control can be shown as in Theorem 3.4, using the fact that a minimizing sequence is necessarily bounded in by virtue of the additional (weak- lower semicontinuous) term. Similarly, by the minimizing property of , the family is bounded in and hence converges weakly- to in as (up to a subsequence if is not injective) if 3.1 holds and is continuous. If on the other hand is unbounded, the discussion in Section 5.1 shows that is in general not weakly- closed, and we cannot expect weak- convergence of to a minimizer .
Optimality conditions for (Pβ) with a bounded control-to-observation mapping can be derived by application of the Fenchel duality theorem, making use of the fact that the Fenchel conjugate of
[TABLE]
is given by
[TABLE]
see [Clason:2011a, Remark 2.5]. (Recall that by (56) the dual problem involves .) Fenchel duality now leads to the necessary optimality conditions
[TABLE]
see again [Clason:2011a, Remark 2.5], where the last relation was equivalently expressed as a variational inequality. Setting , we recover (66).
The optimality conditions (83) are frequently used as a justification for calling a sparse control: From the last relations, we see that must be zero on all subsets of where is strictly greater than . Hence, the support of is contained in the set , which in many situations (e.g., if is harmonic) can be argued to be a set of zero Lebesgue measure. Furthermore, increasing will decrease the size of this set. The same argument is possible for (66): the optimal control must be zero on all subsets with , and hence the support of is contained in (which has Lebesgue measure zero in similar situations as in the case ). This implies that optimal measure-space controls have an inherent sparsity independent of the sparsity-promoting control cost, whose role is solely to control the size of the support.
We can also apply our framework from Section 4 to derive optimality conditions for unbounded observation operators (which cannot be treated using the standard approach as in, e.g., [Clason:2011a]). Proceeding exactly as before with replaced by and replaced by , we obtain the modified optimality conditions
[TABLE]
Again setting , we recover (OS). However, since the last relation can no longer be interpreted pointwise, a sparsity property of does not follow directly.
6 Numerical solution
The numerical solution is based on the conforming discretization of introduced in [Clason:2012], which we briefly recall. The starting point is to replace by its finite element semidiscretization , where is a finite-dimensional space spanned by the usual continuous piecewise linear nodal basis (“hat”) functions attached to the vertices of a triangulation of . We then consider the semidiscrete optimal control problem
[TABLE]
Existence of an optimal control can be shown as in Section 3. Although the optimal state is unique, this is no longer the case for the control due to the finite number of observations. However, there is a unique with that can be represented as a linear combination of Dirac measures concentrated on the vertices contained in ; see [Clason:2012, Theorem 3.2]. We can thus restrict the minimization in (Ph) over the set of such linear combinations. In this sense, this approach is related to a discretization method introduced in [Winther:1978] for unconstrained linear-quadratic problems and also to the variational discretization of control-constrained problems of [Hinze2005].
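To illustrate why controls that are linear combinations of Dirac measures at the vertices fit naturally with piecewise linear finite elements, the following minimal sketch uses a one-dimensional model problem, -y'' = u on (0, 1) with homogeneous Dirichlet conditions. This is an assumption made for brevity; the paper's examples are posed on the unit square. Since each hat function equals one at its own node and zero at all others, pairing the measure with the j-th hat function simply returns the j-th coefficient, so the load vector is the coefficient vector itself and no mass matrix appears. All function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def state_from_dirac_coefficients(u):
    """Nodal values of the P1 solution of -y'' = sum_i u[i] * delta_{x_i}, y(0) = y(1) = 0.

    u[i] is the weight of the Dirac measure at the interior node x_i = (i + 1) * h.
    """
    n = len(u)
    h = 1.0 / (n + 1)
    # P1 stiffness matrix of -d^2/dx^2 on a uniform grid: tridiag(-1, 2, -1) / h.
    lower = [-1.0 / h] * n
    diag = [2.0 / h] * n
    upper = [-1.0 / h] * n
    # Load vector: <sum_i u_i delta_{x_i}, e_j> = u_j, so the coefficients are used directly.
    return solve_tridiagonal(lower, diag, upper, list(u))


u = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # a single unit Dirac at x = 3/8
print(state_from_dirac_coefficients(u))
```

In this one-dimensional setting, the computed nodal values coincide with the Green's function of the continuous problem, which gives a simple sanity check for the discretization.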
This allows expressing Problem (Ph) purely in terms of the expansion coefficients of and of . Using that if and only if componentwise and applying the Fenchel duality theorem as in Corollary 4.10 (all finite-dimensional operators being bounded) yields the fully discrete optimality conditions
[TABLE]
where denotes the stiffness matrix corresponding to the differential operator , the restricted mass matrix on the observation domain , and the discrete restriction operator to the components of corresponding to vertices contained in . (Note the lack of mass matrix for the discrete state equation.) Since is a Hilbert space, we can reformulate the last relation in (OSh) using resolvent calculus similarly as in Section 5 as
[TABLE]
for any ; see also [Kunisch:2008a, Theorem 4.41]. (Comparing this relation with the last relation in (OSα), we remark that the only difference is the presence of on the right-hand side.) In particular, for we obtain
[TABLE]
where the is to be understood componentwise.
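The equivalence between a complementarity system and a componentwise max equation can be checked directly. The sketch below assumes the sign convention u >= 0, w >= 0, u * w = 0, which is equivalent to u = max(0, u - c * w) for every c > 0. The paper's multiplier may carry a different sign, and the names here are hypothetical.

```python
def max_residual(u, w, c):
    """Componentwise residual u - max(0, u - c * w) of the reformulated condition."""
    return [ui - max(0.0, ui - c * wi) for ui, wi in zip(u, w)]


def is_complementary(u, w, tol=1e-12):
    """Check u >= 0, w >= 0 and u * w = 0 componentwise (up to a tolerance)."""
    return all(ui >= -tol and wi >= -tol and abs(ui * wi) <= tol
               for ui, wi in zip(u, w))


# Hypothetical test pairs: the first two satisfy complementarity, the others do not.
pairs = [(2.0, 0.0), (0.0, 3.0), (1.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
for c in (0.5, 1.0, 10.0):
    for ui, wi in pairs:
        residual_is_zero = max_residual([ui], [wi], c) == [0.0]
        print(c, (ui, wi), residual_is_zero, is_complementary([ui], [wi]))
```

The choice of c does not change the solution set, which is why a convenient value can be fixed before applying a Newton-type method.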
It is well known that the operator is semismooth on , with Newton derivative at in direction given componentwise by
[TABLE]
and that system (OSh) therefore can be solved by a superlinearly convergent semismooth Newton method; see [Kunisch:2008a, Ulbrich:2002a]. To account for the local convergence of Newton methods, we compute a starting point by solving a sequence of discrete regularized problems analogous to Section 5. Specifically, we add for the penalty and proceed as in Section 5 to obtain
[TABLE]
Since the last relation is explicit, we can eliminate and apply a semismooth Newton method to the reduced system, starting with and successively reducing , taking for each the previous solution as starting point.
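The combination of a semismooth Newton method with continuation in the regularization parameter can be sketched on a small dense model problem. This is an illustration under stated assumptions, not the authors' implementation: it treats the problem of minimizing 1/2 |K u - z|^2 + alpha/2 |u|^2 over u >= 0, whose optimality condition is written here in the scaled form alpha * u = max(0, K^T (z - K u)). The matrix `K`, the data `z`, the parameter sequence, and all function names are hypothetical.

```python
def solve_dense(a, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def semismooth_newton(h, b, u, alpha, tol=1e-12, max_iter=50):
    """Solve F(u) = alpha * u - max(0, b - H u) = 0 by a semismooth Newton iteration."""
    n = len(u)
    for iteration in range(max_iter):
        hu = [sum(h[i][j] * u[j] for j in range(n)) for i in range(n)]
        f = [alpha * u[i] - max(0.0, b[i] - hu[i]) for i in range(n)]
        if max(abs(fi) for fi in f) <= tol:
            return u, iteration
        # Newton derivative: alpha * I + D * H, where D_ii = 1 if (b - H u)_i > 0, else 0.
        jac = [[(alpha if i == j else 0.0) + (h[i][j] if b[i] - hu[i] > 0 else 0.0)
                for j in range(n)] for i in range(n)]
        delta = solve_dense(jac, [-fi for fi in f])
        u = [ui + di for ui, di in zip(u, delta)]
    raise RuntimeError("semismooth Newton iteration did not converge")


def continuation(k, z, alphas):
    """Reduce alpha successively, warm-starting each solve with the previous solution."""
    rows, n = len(k), len(k[0])
    h = [[sum(k[r][i] * k[r][j] for r in range(rows)) for j in range(n)] for i in range(n)]
    b = [sum(k[r][j] * z[r] for r in range(rows)) for j in range(n)]
    u = [0.0] * n
    for alpha in alphas:
        u, _ = semismooth_newton(h, b, u, alpha)
    return u


K = [[1.0, 0.5], [0.5, 1.0]]   # hypothetical control-to-observation matrix
z = [1.0, -1.0]                # hypothetical target
print(continuation(K, z, [1.0, 1e-2, 1e-4, 1e-6, 1e-8]))
```

For this data the unconstrained least-squares solution has a negative second component, so the constraint is active there and the iterates approach (0.4, 0) as alpha decreases. Warm starting matters because the Newton iteration is only locally convergent, and for very small alpha the Newton matrix may become ill-conditioned when K^T K is nearly singular on the inactive set.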
7 Numerical examples
We illustrate the nature of the generalized measure-space controls with numerical examples for the Laplace equation on the unit square with homogeneous Dirichlet conditions, i.e., we take and . The domain is discretized using the standard uniform triangulation arising from equidistributed nodes. The optimal controls for the discretized problem are computed using a MATLAB implementation of the approach described in Section 6, which can be downloaded from https://github.com/clason/positivecontrol.
