No-gap second-order conditions under $n$-polyhedric constraints and finitely many nonlinear constraints

Gerd Wachsmuth

arXiv:1902.07750·math.OC·February 22, 2019


TL;DR

This paper establishes second-order optimality conditions for constrained optimization problems using the concept of n-polyhedricity, under weak regularity assumptions, and minimizes the gap between necessary and sufficient conditions.

Contribution

It introduces second-order optimality conditions under weak regularity assumptions using n-polyhedricity, reducing the gap between necessary and sufficient conditions.

Findings

01

Derived necessary first and second order optimality conditions.

02

Established sufficient optimality conditions with minimal gap.

03

Applied the concept of n-polyhedricity to nonlinear constraints.

Abstract

We consider an optimization problem subject to an abstract constraint and finitely many nonlinear constraints. Using the recently introduced concept of $n$ -polyhedricity, we are able to provide second-order optimality conditions under weak regularity assumptions. In particular, we prove necessary optimality conditions of first and second order under the constraint qualification of Robinson, Zowe and Kurcyusz. Similarly, sufficient optimality conditions are stated. The gap between both conditions is as small as possible.


Equations

f(x)

x \in C

g_{i}(x) = 0, \quad i = 1,\ldots,m_{1},

g_{i}(x) \leq 0, \quad i = m_{1}+1,\ldots,m.

X = L^{2}(\Omega), \qquad C = \set{x\in L^{2}(\Omega)\given x_{a}\leq x\leq x_{b}},

\LL^{\prime}(\bar{x},\lambda,\mu) = f^{\prime}(\bar{x}) + \lambda + \sum_{i=1}^{m}\mu_{i}\,g_{i}^{\prime}(\bar{x})

0\leq\mu_{i}, \quad \mu_{i}\,g_{i}(\bar{x})

\sup_{(\lambda,\mu)\in\Lambda(\bar{x})}\LL^{\prime\prime}(\bar{x},\lambda,\mu)\,h^{2}\geq 0\qquad\forall h\in\TT_{C}(\bar{x}),\;f^{\prime}(\bar{x})\,h=0

\sup_{(\lambda,\mu)\in\Lambda(\bar{x})}\LL^{\prime\prime}(\bar{x},\lambda,\mu)\,h^{2}\geq\alpha\,\norm{h}_{X}^{2}\qquad\forall h\in\TT_{C}(\bar{x}),\;f^{\prime}(\bar{x})\,h=0

\RR_{D}(v)

\TT_{D}(v)

\nu\anni:=\set{y\in Y\given\dual{\nu}{y}=0}.

B_{\varepsilon}(y):=\set[\big]{\hat{y}\in Y\given\norm{y-\hat{y}}_{Y}\leq\varepsilon}.

K:=\set[\big]{z\in\R^{m}\given z_{i}=0,\;i=1,\ldots,m_{1},\;z_{i}\leq 0,\;i=m_{1}+1,\ldots,m}.

I_{0}(x)

\TT_{K}(g(x))=\RR_{K}(g(x))=\set[\big]{z\in\R^{m}\given z_{i}=0\;\forall i=1,\ldots,m_{1},\;z_{i}\leq 0\;\forall i\in I_{0}(x)\setminus\{1,\ldots,m_{1}\}}.

\KK(\bar{x})=\set[\big]{h\in\TT_{C}(\bar{x})\given g^{\prime}(\bar{x})\,h\in\TT_{K}(g(\bar{x})),\;f^{\prime}(\bar{x})\,h\leq 0}.

\cl\paren[\big]{\RR_{C}(x)\cap\mu\anni}=\TT_{C}(x)\cap\mu\anni\qquad\forall\mu\in\NN_{C}(x).

\cl\paren[\big]{\RR_{C}(x)\cap\mu\anni}=\TT_{C}(x)\cap\mu\anni\qquad\forall\mu\in X\dualspace.

\TT_{C}(x)\cap\bigcap_{i=1}^{n}\mu_{i}\anni=\cl\paren[\Big]{\RR_{C}(x)\cap\bigcap_{i=1}^{n}\mu_{i}\anni}\qquad\forall\mu_{1},\ldots,\mu_{n}\in X\dualspace

\set[\Big]{h\in\RR_{C}(\bar{x})\given\dual{\nu_{i}}{h}=0,\;i=1,\ldots,n_{1},\;\dual{\nu_{i}}{h}\leq 0,\;i=n_{1}+1,\ldots,n}

\set[\Big]{h\in\TT_{C}(\bar{x})\given\dual{\nu_{i}}{h}=0,\;i=1,\ldots,n_{1},\;\dual{\nu_{i}}{h}\leq 0,\;i=n_{1}+1,\ldots,n}.

\Omega_{\varepsilon}:=\set{\omega\in\Omega\given x_{a}(\omega)+\varepsilon\leq\bar{x}(\omega)\leq x_{b}(\omega)-\varepsilon}.

\exists\varepsilon>0,\;\{h_{j}\}_{j\in I_{0}(\bar{x})}\subset L^{\infty}(\Omega_{\varepsilon}):\quad\forall i,j\in I_{0}(\bar{x}):\; g_{i}^{\prime}(\bar{x})\,h_{j}=\delta_{ij}.

L^{\infty}(\Omega_{\varepsilon})=\set{h\in L^{\infty}(\Omega)\given h=0\text{ a.e.\ in }\Omega\setminus\Omega_{\varepsilon}}.

\mu_{j}

\mu_{j}

\lambda

f^{\prime}(\bar{x})+\lambda+\sum_{j=1}^{m}\mu_{j}\,g_{j}^{\prime}(\bar{x})

f^{\prime\prime}(\bar{x})\,h^{2}+\sum_{j=1}^{m}\mu_{j}\,g_{j}^{\prime\prime}(\bar{x})\,h^{2}\geq 0\qquad\forall h\in\KK(\bar{x})\cap L^{\infty}(\Omega).

0\in\interior\paren[\Big]{g^{\prime}(\bar{x})\,\bracks{(C-\bar{x})\cap\lambda\anni}-\bracks{(K-g(\bar{x}))\cap\mu\anni}}

\R^{m}=g^{\prime}(\bar{x})\,\bracks{\RR_{C}(\bar{x})\cap\lambda\anni}-\bracks{\RR_{K}(g(\bar{x}))\cap\mu\anni}.

f^{\prime\prime}(\bar{x})\,h^{2}+\sum_{j=1}^{m}\mu_{j}\,g_{j}^{\prime\prime}(\bar{x})\,h^{2}\geq 0\qquad\forall h\in\KK(\bar{x}).


Taxonomy

Topics: Optimization and Variational Analysis

Full text


Keywords: second order optimality condition, critical cone, polyhedricity, Legendre form, non-unique multiplier

MSC: 49K27

1 Introduction

In this work, we are interested in problems of type

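\begin{equation}
	\tag{P}
	\begin{aligned}
		\text{Minimize} \quad & f(x) \\
		\text{such that} \quad & x\in C, \\
		& g_{i}(x)=0, \quad i=1,\ldots,m_{1}, \\
		& g_{i}(x)\leq 0, \quad i=m_{1}+1,\ldots,m.
	\end{aligned}
\end{equation}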

Our goal is the derivation of second-order necessary and sufficient optimality conditions with minimal gap. Here, $f\colon X\to\R$ and $g_{i}\colon X\to\R$, $i=1,\ldots,m$, are twice Fréchet differentiable, $X$ is a Banach space and $C\subset X$ is assumed to be closed and convex. In particular, we are interested in the situation that

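\begin{equation}
	\tag{1}
	X=L^{2}(\Omega), \qquad C=\set{x\in L^{2}(\Omega)\given x_{a}\leq x\leq x_{b}},
\end{equation}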

where $(\Omega,\Sigma,\nu)$ is a finite measure space and $x_{a},x_{b}\colon\Omega\to\R\cup\{\pm\infty\}$ are measurable functions such that $C$ becomes non-empty.

It is clear that first-order necessary optimality conditions for (P) can be obtained by using the constraint qualification of Robinson-Zowe-Kurcyusz (RZKCQ), see \cite{ZoweKurcyusz1979,Robinson1976:1}. Several papers consider second-order conditions for problems of type (P); as examples, we mention \cite{BonnansZidani1999,CasasTroeltzsch2002}. However, in these papers, the authors have to impose additional regularity assumptions to arrive at second-order necessary conditions. These regularity conditions are rather strong and, in particular, they imply uniqueness of Lagrange multipliers. Thus, these conditions cannot be satisfied in situations in which the Lagrange multipliers associated with a stationary point are not unique. The case of infinite-dimensional equality constraints is considered in \cite{Ioffe1979}.

In this paper, we utilize the recently introduced notion of $n$-polyhedricity, see \cite[Definition 4.3]{Wachsmuth2016:2} and \cref{subsec:n_polyhedricity} below, to derive second-order necessary conditions. Note that the set $C$ from (1) is $n$-polyhedric for all non-negative integers $n$, see \cite[Example 4.21(1)]{Wachsmuth2016:2}. Let $\bar{x}$ be a local optimizer of (P). Under the RZKCQ, there exist multipliers $\lambda\in\NN_{C}(\bar{x})$, $\mu\in\R^{m}$ such that

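\begin{equation*}
	\LL^{\prime}(\bar{x},\lambda,\mu)=f^{\prime}(\bar{x})+\lambda+\sum_{i=1}^{m}\mu_{i}\,g_{i}^{\prime}(\bar{x})=0,
	\qquad
	0\leq\mu_{i},\quad\mu_{i}\,g_{i}(\bar{x})=0,\quad i=m_{1}+1,\ldots,m.
\end{equation*}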

Here, $\LL$ denotes the Lagrangian (see (14) below) and $\NN_{C}(\bar{x})$ is the normal cone of $C$ at the point $\bar{x}$ . The set of all multipliers satisfying the above conditions is denoted by $\Lambda(\bar{x})$ . Our main contributions are the following. We shall show that the condition

[TABLE]

is necessary for local optimality if RZKCQ is satisfied and if the set $C$ is $n$ -polyhedric, where $n$ is bigger than the number of active constraints. If, additionally, a quadratic growth condition is satisfied at $\bar{x}$ , we can show

[TABLE]

for some $\alpha>0$ . Under a slight additional assumption, this last condition is also sufficient for the local optimality of $\bar{x}$ .

The paper is organized as follows. In \cref{sec:not} we introduce the necessary notation and review some known results. The well-known first-order optimality conditions are given in \cref{sec:foc}. The main results of this paper concerning the second-order conditions for (P) are given in \cref{sec:soc}. In \cref{sec:examples} we present two examples which indicate that the results in this paper are sharp. The first example shows that the supremum in (2) is really necessary if the Lagrange multipliers are not unique. The second example demonstrates that the $n$-polyhedricity assumption on $C$ is crucial and cannot be replaced by requiring polyhedricity only.

2 Notation, preliminaries and known results

2.1 Notation

We use the definitions $\N:=\{1,\ldots\}$ and $\N_{0}:=\{0,1,\ldots\}$ .

For a convex subset $D\subset Y$ of a Banach space $Y$ and $v\in D$ , we define the radial cone, the tangent cone, the normal cone and the polar cone via

[TABLE]

respectively. The annihilator of a functional $\nu\in Y\dualspace$ is defined as

\begin{equation*}
	\nu\anni:=\set{y\in Y\given\dual{\nu}{y}=0}.
\end{equation*}

For $\varepsilon>0$ and $y\in Y$ , we define the closed ball

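\begin{equation*}
	B_{\varepsilon}(y):=\set[\big]{\hat{y}\in Y\given\norm{y-\hat{y}}_{Y}\leq\varepsilon}.
\end{equation*}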

In order to discuss (P), it will be convenient to define $K\subset\R^{m}$ via

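\begin{equation*}
	K:=\set[\big]{z\in\R^{m}\given z_{i}=0,\;i=1,\ldots,m_{1},\;z_{i}\leq 0,\;i=m_{1}+1,\ldots,m}.
\end{equation*}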

Moreover, we consider $g=(g_{1},\ldots,g_{m})$ as a function from $X$ to $\R^{m}$ . Let a point $x\in X$ with $g(x)\in K$ be given. Using the active and inactive sets of indices, defined via

[TABLE]

respectively, it is easy to check that

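\begin{equation}
	\tag{3}
	\TT_{K}(g(x))=\RR_{K}(g(x))=\set[\big]{z\in\R^{m}\given z_{i}=0\;\forall i=1,\ldots,m_{1},\;z_{i}\leq 0\;\forall i\in I_{0}(x)\setminus\{1,\ldots,m_{1}\}}.
\end{equation}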

Moreover, for a feasible point $\bar{x}$ of (P) we define the critical cone $\KK(\bar{x})$ via

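\begin{equation}
	\tag{4}
	\KK(\bar{x})=\set[\big]{h\in\TT_{C}(\bar{x})\given g^{\prime}(\bar{x})\,h\in\TT_{K}(g(\bar{x})),\;f^{\prime}(\bar{x})\,h\leq 0}.
\end{equation}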
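For orientation, the cones that enter the critical cone can be written out for the box-constrained set $C=\set{x\in L^{2}(\Omega)\given x_{a}\leq x\leq x_{b}}$. The following is the standard pointwise characterization, given as an illustration and not quoted from the paper. It assumes $x_{a}<x_{b}$ almost everywhere, and the names $\Omega_{a}$, $\Omega_{b}$ for the active sets of the bounds are introduced only for this sketch.

```latex
% Illustration (assumption: x_a < x_b a.e., so the two active sets are disjoint
% up to a null set). \Omega_a and \Omega_b are hypothetical names.
\begin{align*}
	\Omega_{a} &:= \set{\omega\in\Omega\given\bar{x}(\omega)=x_{a}(\omega)},
	\qquad
	\Omega_{b} := \set{\omega\in\Omega\given\bar{x}(\omega)=x_{b}(\omega)},
	\\
	% tangent directions may only point into the box on the active sets
	\TT_{C}(\bar{x}) &= \set{h\in L^{2}(\Omega)\given h\geq 0\text{ a.e.\ on }\Omega_{a},\;h\leq 0\text{ a.e.\ on }\Omega_{b}},
	\\
	% normal directions vanish wherever no bound is active
	\NN_{C}(\bar{x}) &= \set{\lambda\in L^{2}(\Omega)\given\lambda\leq 0\text{ a.e.\ on }\Omega_{a},\;\lambda\geq 0\text{ a.e.\ on }\Omega_{b},\;\lambda=0\text{ a.e.\ on }\Omega\setminus(\Omega_{a}\cup\Omega_{b})}.
\end{align*}
```

A direction $h\in\TT_{C}(\bar{x})$ is then critical if, in addition, $g^{\prime}(\bar{x})\,h\in\TT_{K}(g(\bar{x}))$ and $f^{\prime}(\bar{x})\,h\leq 0$.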

2.2 On $n$ -polyhedricity

As mentioned in the introduction, we are going to employ the concept of $n$-polyhedricity to derive second-order conditions for (P). The notion of $n$-polyhedricity was recently introduced in \cite{Wachsmuth2016:2} and generalizes the well-known notion of polyhedricity due to \cite{Mignot1976,Haraux1977}.

We recall that a closed convex set $C\subset X$ is called polyhedric at $x\in C$ if

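\begin{equation*}
	\cl\paren[\big]{\RR_{C}(x)\cap\mu\anni}=\TT_{C}(x)\cap\mu\anni\qquad\forall\mu\in\NN_{C}(x).
\end{equation*}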

It was shown in \cite[Lemma 4.1]{Wachsmuth2016:2} that this condition is equivalent to

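\begin{equation*}
	\cl\paren[\big]{\RR_{C}(x)\cap\mu\anni}=\TT_{C}(x)\cap\mu\anni\qquad\forall\mu\in X\dualspace.
\end{equation*}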

The latter condition is amenable to the following generalization. We say that $C\subset X$ is $n$ -polyhedric at $x\in C$ for some $n\in\N_{0}$ , if

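\begin{equation*}
	\TT_{C}(x)\cap\bigcap_{i=1}^{n}\mu_{i}\anni=\cl\paren[\Big]{\RR_{C}(x)\cap\bigcap_{i=1}^{n}\mu_{i}\anni}\qquad\forall\mu_{1},\ldots,\mu_{n}\in X\dualspace
\end{equation*}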

holds, see \cite[Definition 4.3]{Wachsmuth2016:2}. Many sets which were known to be polyhedric are even $n$-polyhedric for all $n\in\N_{0}$, see, e.g., \cite[Example 4.21]{Wachsmuth2016:2}. In particular, this applies to the set of interest $C$ from (1).
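As a quick sanity check of the definition, consider the finite-dimensional polyhedral case. The argument below is only a sketch under the assumption that $X=\R^{d}$ and that $C$ is a polyhedron, i.e., a finite intersection of closed half-spaces; the dimension $d$ is introduced for this illustration.

```latex
% Sketch: a polyhedron C in R^d is n-polyhedric at every x in C, for every n.
% Step 1: for a polyhedron, the radial cone is already closed, hence
\begin{equation*}
	\RR_{C}(x)=\TT_{C}(x).
\end{equation*}
% Step 2: intersect both sides with finitely many annihilators.
\begin{equation*}
	\RR_{C}(x)\cap\bigcap_{i=1}^{n}\mu_{i}\anni
	=
	\TT_{C}(x)\cap\bigcap_{i=1}^{n}\mu_{i}\anni .
\end{equation*}
% Step 3: the right-hand side is closed, so taking the closure of the
% left-hand side changes nothing, which is the defining identity.
```

For $n=0$ the intersection over the annihilators is empty, and the identity reduces to $\TT_{C}(x)=\cl\RR_{C}(x)$, which holds for every closed convex set. The infinite-dimensional examples in the text are interesting precisely because $\RR_{C}(x)$ is not closed there.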

We provide a lemma, which follows from a simple calculation, see also \cite[Lemma 4.4]{Wachsmuth2016:2}.

Lemma 2.1.

Assume that the set $C\subset X$ is $N$ -polyhedric for some $N\in\N_{0}$ at $\bar{x}\in C$ . Further, let $n,n_{1}\in\N_{0}$ , $\nu_{i}\in X\dualspace$ for $1\leq i\leq n$ be given such that $N\geq n\geq n_{1}$ . Then, the set

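\begin{equation*}
	\set[\Big]{h\in\RR_{C}(\bar{x})\given\dual{\nu_{i}}{h}=0,\;i=1,\ldots,n_{1},\;\dual{\nu_{i}}{h}\leq 0,\;i=n_{1}+1,\ldots,n}
\end{equation*}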

is dense in

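\begin{equation*}
	\set[\Big]{h\in\TT_{C}(\bar{x})\given\dual{\nu_{i}}{h}=0,\;i=1,\ldots,n_{1},\;\dual{\nu_{i}}{h}\leq 0,\;i=n_{1}+1,\ldots,n}.
\end{equation*}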
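The simple calculation behind this lemma can be sketched as follows. This is one possible argument and not necessarily the one in the cited reference; the index set $J$ and the sequence $\{h_{k}\}$ are notation introduced for this sketch.

```latex
% Sketch: fix h in T_C(\bar{x}) with <nu_i,h> = 0 for i <= n_1
% and <nu_i,h> <= 0 for n_1 < i <= n.
% Step 1: collect the indices at which h satisfies the constraint with equality.
\begin{equation*}
	J:=\set{i\in\{1,\ldots,n\}\given\dual{\nu_{i}}{h}=0}\supset\{1,\ldots,n_{1}\}.
\end{equation*}
% Step 2: J has at most n <= N elements. Padding with zero functionals
% (whose annihilator is all of X), N-polyhedricity gives a sequence
\begin{equation*}
	h_{k}\in\RR_{C}(\bar{x})\cap\bigcap_{i\in J}\nu_{i}\anni,
	\qquad
	h_{k}\to h\text{ in }X.
\end{equation*}
% Step 3: the remaining inequalities are strict at h, so they survive
% small perturbations.
\begin{equation*}
	\dual{\nu_{i}}{h_{k}}\to\dual{\nu_{i}}{h}<0
	\qquad\forall i\in\{1,\ldots,n\}\setminus J.
\end{equation*}
% Hence h_k belongs to the set built from R_C for all sufficiently large k.
```

Since $h$ was an arbitrary element of the set built from $\TT_{C}(\bar{x})$, this gives the claimed density.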

2.3 Review of known results

We start by reviewing the results of \cite{CasasTroeltzsch2002}, see also \cite{CasasTroeltzsch1999}. In that paper, the authors studied a problem very similar to (P) with (1). However, they considered the situation in which the underlying space $X$ is a Lebesgue space $L^{\infty}(\Omega)$ and their analysis incorporates the important phenomenon of two-norm discrepancy. In the situation in which all functions are already differentiable in $X=L^{2}(\Omega)$, the problem of \cite{CasasTroeltzsch2002} coincides with (P). The main assumption for deriving second-order necessary conditions is a regularity assumption on the solution $\bar{x}$. For $\varepsilon>0$, the $\varepsilon$-inactive set is defined via

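\begin{equation*}
	\Omega_{\varepsilon}:=\set{\omega\in\Omega\given x_{a}(\omega)+\varepsilon\leq\bar{x}(\omega)\leq x_{b}(\omega)-\varepsilon}.
\end{equation*}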

With this notation, the regularity condition is given by

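\begin{equation}
	\tag{5}
	\exists\varepsilon>0,\;\{h_{j}\}_{j\in I_{0}(\bar{x})}\subset L^{\infty}(\Omega_{\varepsilon}):\quad\forall i,j\in I_{0}(\bar{x}):\; g_{i}^{\prime}(\bar{x})\,h_{j}=\delta_{ij}.
\end{equation}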

Here, we used the notation

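\begin{equation*}
	L^{\infty}(\Omega_{\varepsilon})=\set{h\in L^{\infty}(\Omega)\given h=0\text{ a.e.\ in }\Omega\setminus\Omega_{\varepsilon}}.
\end{equation*}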

Under the regularity assumption (5), \cite{CasasTroeltzsch2002} prove the existence of unique multipliers $\mu\in\R^{m}$, $\lambda\in L^{2}(\Omega)$ such that

[TABLE]

Moreover, they prove the second-order necessary condition

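\begin{equation*}
	f^{\prime\prime}(\bar{x})\,h^{2}+\sum_{j=1}^{m}\mu_{j}\,g_{j}^{\prime\prime}(\bar{x})\,h^{2}\geq 0\qquad\forall h\in\KK(\bar{x})\cap L^{\infty}(\Omega).
\end{equation*}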

The appearance of $L^{\infty}(\Omega)$ in this formula comes through the general setting of \cite{CasasTroeltzsch2002}, which includes the two-norm discrepancy. We mention that sufficient second-order conditions are derived as well.

Next, we review the results of \cite{BonnansZidani1999}. In this work, a problem slightly more general than (P) is considered. In fact, the nonlinear constraints are replaced by $G(x)\in K_{Y}$, where $G$ is twice Fréchet differentiable and $K_{Y}$ is a closed convex set in the Banach space $Y$. However, the strongest results are obtained in the case that $K_{Y}$ is a polyhedron, i.e., a finite intersection of closed half-spaces, and this is very similar to (P). To facilitate the comparison with our results, we apply their results to our problem (P). In this case, they use the regularity condition

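\begin{equation}
	\tag{8}
	0\in\interior\paren[\Big]{g^{\prime}(\bar{x})\,\bracks{(C-\bar{x})\cap\lambda\anni}-\bracks{(K-g(\bar{x}))\cap\mu\anni}}
\end{equation}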

for a given KKT multiplier $(\lambda,\mu)$. Via the generalized open mapping theorem from \cite{ZoweKurcyusz1979}, this condition is equivalent to

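\begin{equation*}
	\R^{m}=g^{\prime}(\bar{x})\,\bracks{\RR_{C}(\bar{x})\cap\lambda\anni}-\bracks{\RR_{K}(g(\bar{x}))\cap\mu\anni}.
\end{equation*}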

In the literature, this condition is often called “strict qualification condition”. To our knowledge, this condition appears first in \cite[Theorem 3.3]{MaurerZowe1979}. Moreover, it is known that this condition implies the uniqueness of the multipliers $(\lambda,\mu)$, see \cite{Shapiro1997}. It is also straightforward to check that (5) is strictly stronger than (8). Under condition (8), \cite[Theorem 2.7(iii)]{BonnansZidani1999} gives the second-order necessary condition

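\begin{equation*}
	f^{\prime\prime}(\bar{x})\,h^{2}+\sum_{j=1}^{m}\mu_{j}\,g_{j}^{\prime\prime}(\bar{x})\,h^{2}\geq 0\qquad\forall h\in\KK(\bar{x}).
\end{equation*}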

Under the additional assumption that the second derivative of the Lagrangian is a Legendre form, they also derive sufficient conditions. Consequently, the gap between necessary and sufficient conditions of second order is as small as possible.

Using the inheritance property \cite[Lemma 3.3]{Wachsmuth2016:2} of polyhedric sets, it is possible to generalize the results of \cite{BonnansZidani1999} in the following way. Instead of (P), we consider the much more general problem

[TABLE]

Here, $X$, $Y$ are Banach spaces, $f\colon X\to\R$ and $G\colon X\to Y$ are twice Fréchet differentiable and $C\subset X$, $D\subset Y$ are closed, convex and polyhedric sets. Given multipliers $(\lambda,\mu)\in\NN_{C}(\bar{x})\times\NN_{D}(G(\bar{x}))$, the condition (8) becomes

[TABLE]

Under this condition, we can apply \cite[Theorem 5.4]{Wachsmuth2016:2} and obtain the second-order necessary condition

[TABLE]

Note that one has to rewrite the constraints as $\hat{G}(x):=(x,G(x))\in C\times D=:\hat{K}$ to apply this theorem. Thus, if this strong regularity condition (10) is satisfied, we can replace the assumption of $K$ being polyhedral in \cite{BonnansZidani1999} by the much weaker assumption of polyhedricity. We note that also necessary conditions of second order can be found in \cite[Theorems 5.6, 5.7]{Wachsmuth2016:2}.

3 First-order optimality conditions and constraint qualifications

In this section, we briefly recall first-order optimality conditions for the problem (P) and the constraint qualifications which are required for the derivation. In order to put our problem into the framework of \cite{ZoweKurcyusz1979}, we recall

[TABLE]

and $g=(g_{1},\ldots,g_{m})$ . Now, our problem (P) reads

[TABLE]

An application of \cite[Theorem 3.1]{ZoweKurcyusz1979} implies the following first-order necessary conditions.

Theorem 3.1.

Assume that $\bar{x}\in X$ is a local minimizer of (P) such that

[TABLE]

is satisfied. Then, there exist $\lambda\in\NN_{C}(\bar{x})$ , $\mu\in\NN_{K}(g(\bar{x}))$ such that

[TABLE]

It is clear that $\mu\in\NN_{K}(g(\bar{x}))$ is equivalent to

[TABLE]

Further, condition (13) can be written concisely as

[TABLE]

where the Lagrangian $\LL\colon X\times X\dualspace\times\R^{m}\to\R$ is defined via

[TABLE]

and a prime denotes partial differentiation w.r.t. $x$ . For convenience, we recall the expressions for the first and second derivative of the Lagrangian $\LL$ w.r.t. $x$

[TABLE]

for $h\in X$ . Here, we used the common abbreviation $h^{2}$ for the action of a bilinear form on the tuple $[h,h]$ .
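For readers who want the formulas at hand, the derivatives can be written out under the assumption that the Lagrangian has the standard form below. This form is an assumption of this sketch, chosen to be consistent with the stationarity expression $f^{\prime}(\bar{x})+\lambda+\sum_{i}\mu_{i}\,g_{i}^{\prime}(\bar{x})$ used elsewhere in the paper.

```latex
% Assumed standard form of the Lagrangian on X x X^* x R^m (not quoted from the paper).
\begin{align*}
	\LL(x,\lambda,\mu) &= f(x)+\dual{\lambda}{x}+\sum_{i=1}^{m}\mu_{i}\,g_{i}(x),
	\\
	% first derivative w.r.t. x in direction h
	\LL^{\prime}(x,\lambda,\mu)\,h &= f^{\prime}(x)\,h+\dual{\lambda}{h}+\sum_{i=1}^{m}\mu_{i}\,g_{i}^{\prime}(x)\,h,
	\\
	% second derivative w.r.t. x: the term <lambda, x> is linear and drops out
	\LL^{\prime\prime}(x,\lambda,\mu)\,h^{2} &= f^{\prime\prime}(x)\,h^{2}+\sum_{i=1}^{m}\mu_{i}\,g_{i}^{\prime\prime}(x)\,h^{2}.
\end{align*}
```

In particular, the second derivative depends on the multipliers only through $\mu$, which is why the second-order conditions of the reviewed papers contain no $\lambda$.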

For an arbitrary feasible point $x$ , we define the set of Lagrange multipliers via

[TABLE]

We also recall from \cite[Theorem 4.1]{ZoweKurcyusz1979} that (RZKCQ) implies the boundedness of $\Lambda(\bar{x})$. We mention that the boundedness of $\Lambda(\bar{x})$ can be shown under the slightly weaker condition

[TABLE]

by a suitable modification of the proof of \cite[Theorem 4.1]{ZoweKurcyusz1979}. Note, however, that $\Lambda(\bar{x})$ might be empty if only (16) is satisfied.

4 No-gap second-order optimality conditions

In this section, we consider second-order optimality conditions for problem (P).

We begin with the derivation of necessary optimality conditions. In order to apply the results from \cite[Section 3.2.3]{BonnansShapiro2000}, we introduce

[TABLE]

Now, (P) reads

[TABLE]

Let $\bar{x}$ be a feasible point of (P). From \cite[(3.20) and (3.122)]{BonnansShapiro2000}, we recall the definition of the critical cone

[TABLE]

which matches our definition (4), and of the set of radial critical directions

[TABLE]

Note that we have

[TABLE]

due to (3).

From \cite[Proposition 3.53]{BonnansShapiro2000} we get the following result.

Lemma 4.1.

Assume that $\bar{x}$ is a local minimizer of (P) such that (RZKCQ) is satisfied. Further suppose that $\KK_{R}(\bar{x})$ is dense in $\KK(\bar{x})$ . Then,

[TABLE]

The density assumption in this result can be shown under an additional condition on the constraint set $C$ .

Theorem 4.2.

Assume that $\bar{x}$ is a local minimizer of (P) such that (RZKCQ) is satisfied. We denote by $\hat{m}$ the number of active constraints in $\bar{x}$ , i.e., the number of indices $i=1,\ldots,m$ with $g_{i}(\bar{x})=0$ . Under the assumption that $C$ is $(\hat{m}+1)$ -polyhedric, we have

[TABLE]

Proof 4.3.

We recall the formula

[TABLE]

for the tangent cone of $K$ , where $I_{0}(\bar{x})$ denotes the set of active indices. For brevity, we set $\hat{I}_{0}(\bar{x}):=I_{0}(\bar{x})\setminus\{1,\ldots,m_{1}\}$ . Thus,

[TABLE]

In these sets, we have $\hat{m}+1$ scalar equalities and inequalities. Due to the assumption that $C$ is $(\hat{m}+1)$-polyhedric, we can invoke \cref{lem:n_polyhedricity}. This implies that $\KK_{R}(\bar{x})$ is dense in $\KK(\bar{x})$. Thus, the assertion follows from \cref{lem:necessary_condition}.

Note that the supremum in the above inequality is attained, since the set of multipliers is weak-$\star$ compact, see \cite[Theorem 3.9]{BonnansShapiro2000}, and the second derivative of the Lagrangian is weak-$\star$ continuous w.r.t. the multipliers. Hence, (18) can be rephrased as follows. For every critical direction $h\in\KK(\bar{x})$, there exist multipliers $(\lambda,\mu)\in\Lambda(\bar{x})$ such that $\LL^{\prime\prime}(\bar{x},\lambda,\mu)\,h^{2}\geq 0$.

If a quadratic growth condition is satisfied at $\bar{x}$ , we get a better inequality.

Corollary 4.4.

In addition to the assumptions of \cref{thm:SNC}, we assume that the growth condition

[TABLE]

is satisfied for some $\alpha,\varepsilon>0$ at $\bar{x}$ , where $F=\set{x\in C\given g(x)\in K}$ is the feasible set of (P). Then,

[TABLE]

Proof 4.5.

Under (19), $\bar{x}$ is a local minimizer of $\hat{f}(x):=f(x)-\frac{\alpha}{2}\,\norm{x-\bar{x}}_{X}^{2}$ on $F$. Note that $\hat{f}$ is twice Fréchet differentiable if $X$ is a Hilbert space. In this case, a direct application of \cref{thm:SNC} yields the claim. If $X$ is not a Hilbert space, we can still reproduce \cite[Lemma 3.44]{BonnansShapiro2000}, which is enough to prove \cite[Prop. 3.53]{BonnansShapiro2000} and, consequently, \cref{thm:SNC}. To this end, we set $\tilde{f}(x):=\frac{1}{2}\,\norm{x-\bar{x}}_{X}^{2}$ and check that a second-order Taylor expansion similar to \cite[(3.100)]{BonnansShapiro2000} holds. Let $h,w\in X$ and $r\colon(0,\infty)\to X$ be given such that $r(t)=\oo(t^{2})$. We define the path $x(t):=\bar{x}+t\,h+\frac{1}{2}\,t^{2}\,w+r(t)$. Then,

[TABLE]

as $t\searrow 0$ . Hence, the modified function $\tilde{f}$ satisfies the required second-order Taylor expansion.
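The expansion of the squared norm along the path can be verified by hand. The following lines are a sketch of that computation, using only the triangle inequality; the auxiliary function $e(t)$ is notation introduced here.

```latex
% Sketch: expansion of \tilde f(x) = 1/2 ||x - \bar x||^2 along
% x(t) = \bar x + t h + t^2 w / 2 + r(t), with r(t) = o(t^2).
\begin{align*}
	% factor t out of the increment x(t) - \bar x
	\tilde{f}(x(t)) &= \frac{t^{2}}{2}\,\norm{h+e(t)}_{X}^{2},
	\qquad
	e(t):=\tfrac{1}{2}\,t\,w+t^{-1}\,r(t)\to 0\text{ as }t\searrow 0,
	\\
	% reverse triangle inequality: the norm of h + e(t) converges to the norm of h
	\bigl|\norm{h+e(t)}_{X}-\norm{h}_{X}\bigr| &\leq \norm{e(t)}_{X}\to 0,
	\\
	% consequently the squared norm has the expected second-order behaviour
	\tilde{f}(x(t)) &= \frac{t^{2}}{2}\,\norm{h}_{X}^{2}+\oo(t^{2}).
\end{align*}
```

Thus, along such paths, $\tilde{f}$ behaves like a twice differentiable function with vanishing first-order term and second-order term $\norm{h}_{X}^{2}$, even though the norm itself need not be twice Fréchet differentiable.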

As usual, second-order sufficient conditions can be derived by a contradiction argument.

Theorem 4.6.

Assume that $\bar{x}$ is a stationary point of (P), i.e., there exist $(\lambda,\mu)\in\Lambda(\bar{x})$. Further, we suppose that the CQ (16), which is slightly weaker than Robinson’s CQ, is satisfied. We assume that

[TABLE]

holds for some $\alpha,\eta>0$ , where the extended critical cone $\KK_{\eta}(\bar{x})$ is given by

[TABLE]

Then, for all $\tilde{\alpha}\in(0,\alpha)$ , there is $\varepsilon>0$ such that

[TABLE]

where $F=\set{x\in C\given g(x)\in K}$ is the feasible set of (P).

Proof 4.7.

We fix $\tilde{\alpha}\in(0,\alpha)$ and proceed by contradiction. This yields a sequence $x_{n}\in F\setminus\{\bar{x}\}$ with $x_{n}\to\bar{x}$ and $f(x_{n})<f(\bar{x})+\frac{\tilde{\alpha}}{2}\,\norm{x_{n}-\bar{x}}_{X}^{2}$ .

Using the Fréchet differentiability of $g$ , we have

[TABLE]

Owing to the CQ and the generalized open mapping theorem \cite[Theorem 2.1]{ZoweKurcyusz1979}, we find sequences $\{h_{n}\}\subset\TT_{C}(\bar{x})$, $\{v_{n}\}\subset\TT_{K}(g(\bar{x}))$ with

[TABLE]

and $h_{n}=\OO(\norm{r_{n}}_{X})=\oo(\norm{x_{n}-\bar{x}}_{X})$ . In particular, $x_{n}-\bar{x}+h_{n}\in\TT_{C}(\bar{x})$ and

[TABLE]

Further,

[TABLE]

yields $f^{\prime}(\bar{x})\,(x_{n}-\bar{x}+h_{n})=\oo(\norm{x_{n}-\bar{x}}_{X})=\oo(\norm{x_{n}-\bar{x}+h_{n}}_{X})$ . Hence, $x_{n}-\bar{x}+h_{n}\in\KK_{\eta}(\bar{x})$ for $n$ large enough.

Now, for large $n$ , we choose $(\lambda_{n},\mu_{n})\in\Lambda(\bar{x})$ , such that

[TABLE]

This is possible since $x_{n}-\bar{x}+h_{n}\neq 0$ for $n$ large enough.

For $n$ large enough we have

[TABLE]

Next, we are going to use a Taylor expansion of the Lagrangian. Since $f$ and $g$ are twice Fréchet differentiable, we have the Taylor expansion

[TABLE]

and analogously for $g$ . Now, we utilize that the CQ (16) implies the boundedness of the multipliers $\Lambda(\bar{x})$ . This yields that we can use a Taylor expansion for $\LL(\cdot,\lambda_{n},\mu_{n})$ at $\bar{x}$ and the remainder term is uniform w.r.t. the multipliers $(\lambda_{n},\mu_{n})\in\Lambda(\bar{x})$ . Thus, we can continue with

[TABLE]

In order to deal with the second and third addend, we use again the boundedness of $\Lambda(\bar{x})$ . Together with $\norm{h_{n}}_{X}=\oo(\norm{x_{n}-\bar{x}}_{X})$ , both addends belong to $\oo(\norm{x_{n}-\bar{x}}_{X}^{2})$ as $n\to\infty$ . Thus, we can continue via

[TABLE]

Dividing by $\norm{x_{n}-\bar{x}}_{X}^{2}$ and passing to the limit $n\to\infty$ yields the contradiction $\tilde{\alpha}/2\geq(\alpha+\tilde{\alpha})/4$ .
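For completeness, the elementary arithmetic behind this contradiction is spelled out below.

```latex
% Why the limiting inequality is impossible for \tilde\alpha in (0, \alpha):
\begin{equation*}
	\frac{\tilde{\alpha}}{2}\geq\frac{\alpha+\tilde{\alpha}}{4}
	\quad\Longleftrightarrow\quad
	2\,\tilde{\alpha}\geq\alpha+\tilde{\alpha}
	\quad\Longleftrightarrow\quad
	\tilde{\alpha}\geq\alpha,
\end{equation*}
% which contradicts the choice \tilde\alpha < \alpha.
```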

Using the notion of Legendre forms, it is possible to weaken the assumed inequality (20). We recall from \cite[Section 6.2]{IoffeTichomirov1979:2} that a continuous bilinear form $a:H\times H\to\R$ on a Hilbert space $H$ is called a Legendre form, if $x\mapsto a(x,x)$ is sequentially weakly lower semicontinuous and if

[TABLE]

Clearly, this definition can also be used if $H$ is not a Hilbert space, but only a Banach space. However, it was shown recently in \cite{Harder2018} that a reflexive Banach space permits a Legendre form only if it possesses an equivalent Hilbert space norm. The notion of Legendre forms was generalized to non-quadratic forms in \cite[Definition 3.73]{BonnansShapiro2000}. Therein, a function $q:X\to\R$ is called an extended Legendre form, if it is weakly lower semicontinuous, positively homogeneous of degree $2$ and if

[TABLE]

is satisfied. We are interested in the case that

[TABLE]

is the maximized Hessian of the Lagrangian. Under the assumption that the set of multipliers $\Lambda(\bar{x})$ is bounded, which holds, e.g., under (16), and non-empty, the function $q$ is finite, i.e., it maps $X$ to $\R$. The next result states sufficient conditions which ensure that the sum of two functions is an extended Legendre form. It is inspired by \cite[Proposition 3.76 (ii)]{BonnansShapiro2000}.

Lemma 4.8.

Suppose that $q_{1}:X\to\R$ is an extended Legendre form and that $q_{2}:X\to\R$ is positively homogeneous of degree $2$ and weakly lower semicontinuous. Then, $q:=q_{1}+q_{2}$ is an extended Legendre form.

Proof 4.9.

It is clear that $q$ is positively homogeneous of degree $2$ and weakly lower semicontinuous. Now, suppose that $x_{n}\weakly x$ and $q(x_{n})\to q(x)$ . From

[TABLE]

we infer $q_{1}(x_{n})\to q_{1}(x)$ . Since $q_{1}$ is an extended Legendre form, $x_{n}\to x$ follows. This shows that $q$ is an extended Legendre form.
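One way to obtain the convergence $q_{1}(x_{n})\to q_{1}(x)$ is the following chain of inequalities. It is a sketch that uses only the weak lower semicontinuity of $q_{1}$ and $q_{2}$ together with $q(x_{n})\to q(x)$, and it is not claimed to be the exact display of the original proof.

```latex
% Sketch: sandwich q_1(x_n) between two expressions that both equal q_1(x).
\begin{align*}
	q_{1}(x)
	&\leq \liminf_{n\to\infty} q_{1}(x_{n})
	\leq \limsup_{n\to\infty} q_{1}(x_{n})
	= \limsup_{n\to\infty}\bigl(q(x_{n})-q_{2}(x_{n})\bigr)
	\\
	% q(x_n) converges, so it can be pulled out of the limsup
	&= q(x)-\liminf_{n\to\infty} q_{2}(x_{n})
	\leq q(x)-q_{2}(x)
	= q_{1}(x).
\end{align*}
```

All inequalities are therefore equalities, so the limit of $q_{1}(x_{n})$ exists and equals $q_{1}(x)$.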

The next result is an adaptation of \cite[Proposition 3.77]{BonnansShapiro2000} to the situation at hand.

Lemma 4.10.

Let $\bar{x}$ be a feasible point such that $\Lambda(\bar{x})$ is not empty and bounded. Further, we assume that $f^{\prime\prime}(\bar{x})$ is a Legendre form and that

• $h\mapsto g_{i}^{\prime\prime}(\bar{x})\,h^{2}$ is weakly continuous for all $i=1,\ldots,m_{1}$ and

• $h\mapsto g_{i}^{\prime\prime}(\bar{x})\,h^{2}$ is weakly lower semicontinuous for all $i\in I_{0}(\bar{x})\setminus\{1,\ldots,m_{1}\}$.

Then, the function $q$ defined in (21) is an extended Legendre form.

Proof 4.11.

For every $(\lambda,\mu)\in\Lambda(\bar{x})$, the function $h\mapsto q_{2}^{(\mu)}(h):=\sum_{i=1}^{m}\mu_{i}\,g_{i}^{\prime\prime}(\bar{x})\,h^{2}$ is weakly lower semicontinuous: the terms with $i\leq m_{1}$ are weakly continuous regardless of the sign of $\mu_{i}$, we have $\mu_{i}\geq 0$ for $i\in I_{0}(\bar{x})\setminus\{1,\ldots,m_{1}\}$, and $\mu_{i}=0$ for all inactive indices. Moreover, these functions are positively $2$-homogeneous. As the supremum of weakly lower semicontinuous functions, the function

[TABLE]

is weakly lower semicontinuous. Now, an application of \cref{lem:ex_leg_form} yields the assertion.

From \cite[Lemma 3.75]{BonnansShapiro2000}, we obtain the following result.

Lemma 4.12.

Let $X$ be a reflexive Banach space. Suppose that (16) is satisfied at the feasible point $\bar{x}$ and that $\Lambda(\bar{x})$ is not empty. We further assume that

[TABLE]

is an extended Legendre form. Then, the condition (20) is equivalent to

[TABLE]

In this case, we have a minimal gap between the necessary and sufficient conditions of \cref{thm:SNC,thm:sufficient_condition}.

5 Examples

In this section, we provide two examples. These examples illustrate two crucial ingredients of \cref{thm:SNC}.

The first example is constructed in such a way that the assumptions of \cref{thm:SNC} are satisfied and, hence, the necessary conditions (18) hold. However, the set of multipliers $\Lambda(\bar{x})$ is not a singleton and the condition

[TABLE]

is violated for all $(\lambda,\mu)\in\Lambda(\bar{x})$ . Hence, it is crucial to take the supremum over all multipliers in (18).

In the other example, we demonstrate that the assumption that $C$ is $(\hat{m}+1)$ -polyhedric is crucial. To this end, we have to use a polyhedric set which is not $2$ -polyhedric.

5.1 Non-unique multipliers

This example is heavily inspired by \cite[Counterexample 1.2]{CrouzeixMartinezLegazSeeger1995}. We repeat this counterexample, since it will be important in the sequel. We define the matrices

[TABLE]

These matrices have the property that

[TABLE]

Indeed, this can be shown, e.g., by a distinction of the cases $x_{1}\geq x_{2}$ and $x_{2}\geq x_{1}$ . However, for every $\lambda\in[0,1]$ , the convex combination

[TABLE]

is not coercive on non-negative vectors, since at least one of the numbers

[TABLE]

will be negative.

We are going to construct a problem of the form

[TABLE]

Here, $f,g\colon L^{2}(0,1)\to\R$ are (continuous) quadratic functions to be defined below and

[TABLE]

Our point of interest will be $\bar{x}\in C$ defined via

[TABLE]

It is clear that

[TABLE]

The function $g$ will satisfy

[TABLE]

The first condition renders $\bar{x}$ feasible for (24). Due to

[TABLE]

it is easy to check that

[TABLE]

thus, (RZKCQ) is satisfied. Next, we require

[TABLE]

and we compute the set of Lagrange multipliers $\Lambda(\bar{x})$. This amounts to finding all $\mu\in\R$ such that the corresponding $\lambda$ satisfies

[TABLE]

By using the formula for the normal cone, we see that this is equivalent to $\mu\in[0,1]$ . Thus $\bar{x}$ is a stationary point and

[TABLE]

Note that the critical cone $\KK(\bar{x})=\TT_{C}(\bar{x})\cap f^{\prime}(\bar{x})\anni$ is given by

[TABLE]

Next, we define the second derivatives of $f$ and $g$ at $\bar{x}$ . To this end, we use the notation

[TABLE]

for the average of a function $x$ over an interval $(a,b)$ and for the difference of the function with this average. With this notation, we introduce

[TABLE]

Note that the quadratic functions $f$ and $g$ are uniquely determined via the first and second derivatives in $\bar{x}$ and the requirement $f(\bar{x})=g(\bar{x})=0$ . Let us check that the second-order sufficient condition (20) is satisfied. For $h\in\TT_{C}(\bar{x})$ we set $\hat{h}:=(\fint_{0}^{1/3}h_{2}\,\dt,-\fint_{3/4}^{1}h_{2}\,\dt)\geq 0$ . By utilizing (23), we have

[TABLE]

Hence, \cref{thm:sufficient_condition} implies that $\bar{x}$ is a local minimizer.

It remains to check that the condition

[TABLE]

is violated for all $(\lambda,\mu)\in\Lambda(\bar{x})$ . To this end, we take

[TABLE]

and observe $h_{1},h_{2}\in\KK(\bar{x})$ . It is easy to check that

[TABLE]

for $i=1,2$. Thus, for every $\mu\in[0,1]$ and the associated $\lambda$, we have

[TABLE]

and these two terms cannot be simultaneously non-negative. Hence, there does not exist any multiplier $(\lambda,\mu)\in\Lambda(\bar{x})$ such that (25) holds. This means that (25) fails to be a necessary optimality condition.

5.2 Constraint set which is not $2$-polyhedric

Next, we give a counterexample to demonstrate that the assumption of $C$ being $(\hat{m}+1)$-polyhedric in \cref{thm:SNC} is crucial. To this end, we need a set which is polyhedric (i.e., $1$-polyhedric), but not $2$-polyhedric. To the best of our knowledge, the set given in \cite[Example 4.24]{Wachsmuth2016:2} is the only known set with this property. In order to state our counterexample, we need to adapt the construction from \cite[Example 4.24]{Wachsmuth2016:2}. In $\R^{3}$ we consider the points

[TABLE]

where $\gamma=(1+\sqrt{3})/2$. We set

[TABLE]

Since the sequences $\{P_{n}\}$ and $\{Q_{n}\}$ converge towards $O$, the set $C$ is closed. In what follows, we check that $C$ is polyhedric. By arguing as in \cite[Example 4.24]{Wachsmuth2016:2}, we find that $C$ is polyhedric at $O$. Next, it is somewhat tedious to check that $C$ is the intersection of the half-spaces defined by the following inequalities, and that the points listed on the right-hand side are exactly those among $O$, $P_{n}$, $Q_{n}$ which lie on the boundary of the respective half-space:

[TABLE]

where $k\in\N$. In the last two lines, we have used the coefficients

[TABLE]

From this representation of $C$, we learn two things. First, all $P_{k}$, $Q_{k}$ are extreme points of $C$ and, thus, $C$ is not polyhedral. Second, the intersection $C\cap\{x\in\R^{3}\mid x^{\top}(1,1,1)\geq\varepsilon\}$ is a polyhedron for all $\varepsilon>0$, since it can be written as a finite intersection of half-spaces. Thus, $\RR_{C}(x)$ is closed for all $x\in C\setminus\{O\}$. Consequently, $C$ is polyhedric at all $x\in C\setminus\{O\}$.
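The last implication can be spelled out as follows. This is only a sketch: it assumes the usual definition of polyhedricity in terms of the radial cone $\RR_{C}(x)$ and the tangent cone $\TT_{C}(x)=\operatorname{cl}\RR_{C}(x)$, where $\operatorname{cl}$ (closure) is notation introduced here for illustration. If $\RR_{C}(x)$ is closed, then $\TT_{C}(x)=\RR_{C}(x)$, and for every functional $\mu$ entering the definition we obtain
\begin{equation*}
	\TT_{C}(x)\cap\mu\anni
	=
	\RR_{C}(x)\cap\mu\anni
	=
	\operatorname{cl}\bigl(\RR_{C}(x)\cap\mu\anni\bigr),
\end{equation*}
since the intersection of a closed cone with a closed hyperplane is closed. This is the identity required for polyhedricity at $x$.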

Altogether, we have shown that $C$ is polyhedric, but not polyhedral. As in \cite[Example 4.24]{Wachsmuth2016:2}, we can also check that $C$ is not $2$-polyhedric.

Next, we compute the intersection of $C$ with the hyperplane $x^{\top}(1,0,0)=0$. To this end, let $R_{k,n}$ be the intersection of this hyperplane with the line segment joining $P_{k}$ and $Q_{n}$, i.e.,

[TABLE]

One can check that

[TABLE]

We define

[TABLE]

and claim that all points $R_{k,n}$ belong to the convex set

[TABLE]

and that the points $R_{n,n}$ belong to the relative boundary of this set. Indeed, after a straightforward manipulation, this claim is equivalent to the inequality

[TABLE]

and that we have equality for $k=n$. This latter equality is clear. Moreover, one can check that for $n\geq k$ the derivative w.r.t. $n$ and for $k\geq n$ the derivative w.r.t. $k$ of the left-hand side is non-negative, both by using the definition of $\gamma$.
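For orientation, the definition $\gamma=(1+\sqrt{3})/2$ can be recast as a quadratic relation. The following is a direct computation and is not taken from the original argument; whether the monotonicity check uses $\gamma$ in exactly this form is an assumption.
\begin{equation*}
	\gamma^{2}
	=
	\frac{1+2\sqrt{3}+3}{4}
	=
	1+\frac{\sqrt{3}}{2}
	=
	\gamma+\frac{1}{2},
	\qquad\text{i.e.,}\qquad
	2\gamma^{2}-2\gamma-1=0.
\end{equation*}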

Now, we consider the optimization problem

[TABLE]

In order to cast this problem in the form (12), we set $g(x)=x_{1}$ and $K=\{0\}$. The feasible set of this problem is $C\cap(1,0,0)\anni$ and this set is contained in $M$. Hence,

[TABLE]

shows that $\bar{x}=(0,0,0)$ is a local minimizer of the above problem. Since $P_{1}\in C$ has a positive $x_{1}$-coordinate and since $Q_{1}\in C$ has a negative $x_{1}$-coordinate, it is easy to check that (RZKCQ), i.e.,

[TABLE]

is satisfied. Hence, there exist $\lambda\in\NN_{C}(\bar{x})$, $\mu\in\R$ such that the necessary condition from \cref{thm:fonc}, i.e.,

[TABLE]

is satisfied. Finally, we check that the necessary optimality condition of second order (18) does not hold. Since the constraint $g$ is linear, its second derivative vanishes and the precise value of the multiplier $\mu$ is irrelevant. Next, we construct an element of the tangent cone $\TT_{C}(\bar{x})$. From

[TABLE]

we find that $h:=(0,1,0)\in\TT_{C}(\bar{x})$. Moreover, $f^{\prime}(\bar{x})\,h=0$ and $g^{\prime}(\bar{x})\,h=0$ are clear. Thus, $h$ belongs to the critical cone $\KK(\bar{x})$, cf. (17). However,

[TABLE]

is negative. Hence, (18) is violated.
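Two of the steps above can be made explicit. The following is a sketch under stated assumptions: we assume that the Lagrangian has the form $\LL(x,\lambda,\mu)=f(x)+\langle\lambda,x\rangle+\mu\,g(x)$, which is consistent with the expression for $\LL^{\prime}$ used in the first-order condition, and that (18) is formulated in terms of $\LL^{\prime\prime}(\bar{x},\lambda,\mu)\,h^{2}$. Concerning (RZKCQ), recall $g^{\prime}(\bar{x})\,h=h^{\top}(1,0,0)$ and $\bar{x}=(0,0,0)$, so that $t\,P_{1},\,t\,Q_{1}\in\RR_{C}(\bar{x})$ for all $t\geq 0$. Since the first coordinates of $P_{1}$ and $Q_{1}$ have opposite signs,
\begin{equation*}
	g^{\prime}(\bar{x})\,\RR_{C}(\bar{x})
	\supset
	\bigl\{ t\,P_{1}^{\top}(1,0,0) \mid t\geq 0 \bigr\}
	\cup
	\bigl\{ t\,Q_{1}^{\top}(1,0,0) \mid t\geq 0 \bigr\}
	=
	\R.
\end{equation*}
Concerning the second-order term, $g(x)=x_{1}$ is linear, hence $g^{\prime\prime}(\bar{x})=0$ and
\begin{equation*}
	\LL^{\prime\prime}(\bar{x},\lambda,\mu)\,h^{2}
	=
	f^{\prime\prime}(\bar{x})\,h^{2}+\mu\,g^{\prime\prime}(\bar{x})\,h^{2}
	=
	f^{\prime\prime}(\bar{x})\,h^{2}
\end{equation*}
for every multiplier $(\lambda,\mu)$. In particular, the sign of the second-order term at $h=(0,1,0)$ does not depend on the choice of the multiplier.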

We mention that the only assumption of \cref{thm:SNC} which does not hold is the assumption that $C$ is $2$-polyhedric. Hence, this assumption is essential. On the other hand, in the context of \cite{BonnansZidani1999}, the only assumption which might not hold is the regularity condition (8). We check that this condition indeed fails. To this end, we start by computing the set of multipliers. It is clear that $(\lambda,\mu)$ are multipliers at $\bar{x}$ if and only if the two conditions

[TABLE]

hold. Due to the construction of the set $C$ and due to $\bar{x}=(0,0,0)$, we have

[TABLE]

Hence, $\mu\in\R$ has to satisfy the inequalities

[TABLE]

Consequently, $\mu=0$ and $\lambda=(0,0,1)^{\top}$ are the unique Lagrange multipliers for $\bar{x}$. Finally, the regularity condition (8) is violated, since

[TABLE]

This example also shows that assuming (8) in \cite{BonnansZidani1999} cannot be replaced by the assumption of unique multipliers.

6 Conclusions

We have investigated problem (P) featuring an abstract constraint $x\in C$ and finitely many nonlinear constraints $g(x)\in K$. Previously, second-order necessary optimality conditions have been obtained under the rather strong regularity condition (8). We propose to use the concept of $n$-polyhedricity of $C$ as a novel approach for deriving second-order necessary conditions. In fact, “almost all” sets which are known to be polyhedric are even $n$-polyhedric, see, e.g., \cite[Example 4.21]{Wachsmuth2016:2}. This allows us to prove second-order necessary conditions under the assumption of the CQ of Robinson, Zowe and Kurcyusz. Second-order sufficient conditions can be obtained by the usual contradiction argument. By means of two counterexamples, we have seen that the assumptions and the formulation of \cref{thm:SNC} are sharp. The inclusion of the phenomenon of two-norms discrepancy is subject to future research. It would also be interesting to replace the finite-dimensional polyhedral cone $K$ by a set involving curvature, e.g., the cone of semi-definite matrices.

\printbibliography
