Criticality of Lagrange Multipliers in Extended Nonlinear Optimization

Hong Do; Boris Mordukhovich; M. Ebrahim Sarabi

arXiv:1901.01469·math.OC·January 8, 2019

TL;DR

This paper investigates the criticality of Lagrange multipliers in extended nonlinear programming, providing a systematic variational analysis framework that links multiplier criticality with stability concepts using advanced second-order tools.

Contribution

It introduces the first comprehensive study of multiplier criticality in ENLP, offering verifiable characterizations and relationships with stability notions in variational analysis.

Findings

01. Characterization of critical and noncritical multipliers in ENLP

02. Relationships between multiplier criticality and stability concepts

03. Application of second-order variational analysis tools

Abstract

The paper is devoted to the study and applications of criticality of Lagrange multipliers in variational systems, which are associated with the class of problems in composite optimization known as extended nonlinear programming (ENLP). The importance of both ENLP and the concept of multiplier criticality in variational systems has been recognized in theoretical and numerical aspects of optimization and variational analysis, while the criticality notion has never been investigated in the ENLP framework. We present here a systematic study of critical and noncritical multipliers in a general variational setting that covers, in particular, KKT systems in ENLP with establishing their verifiable characterizations as well as relationships between noncriticality and other stability notions in variational analysis. Our approach is mainly based on advanced tools of second-order variational…

Equations

\textrm{minimize }\;\varphi(x):=\varphi_{0}(x)+\theta\big(\Phi(x)\big),\quad x\in\mathbb{R}^{n},

\theta(u)=\theta_{Y,B}(u):=\sup_{y\in Y}\Big\{\langle y,u\rangle-\frac{1}{2}\langle y,By\rangle\Big\}

\Psi(x,\lambda):=f(x)+\nabla\Phi(x)^{*}\lambda=0,\;\lambda\in\partial\theta\big(\Phi(x)\big)\;\mbox{ with }\;\theta=\theta_{Y,B},

f(x)+\nabla\Phi(x)^{*}\lambda=0,\;\lambda\in N_{\Theta}\big(\Phi(x)\big),

T_{\Omega}(\bar{z}):=\Big\{w\in\mathbb{R}^{d}\Big|\;\exists\,z_{k}\xrightarrow{\Omega}\bar{z},\;\exists\,\alpha_{k}\geq 0\;\textrm{ with }\;\alpha_{k}(z_{k}-\bar{z})\rightarrow w\;\textrm{ as }\;k\rightarrow\infty\Big\},

{\rm dom}\,F:=\big\{x\in\mathbb{R}^{n}\big|\;F(x)\neq\emptyset\big\}\;\mbox{ and }\;\mathrm{gph}\,F:=\big\{(x,y)\in\mathbb{R}^{n}\times\mathbb{R}^{p}\big|\;y\in F(x)\big\}.

DF(\bar{x},\bar{y})(u):=\big\{v\in\mathbb{R}^{p}\big|\;(u,v)\in T_{\mathrm{gph}\,F}(\bar{x},\bar{y})\big\},\quad u\in\mathbb{R}^{n}.

{\mathrm{d}}^{2}\varphi(\bar{x},\bar{y})(\bar{w}):=\liminf_{t\downarrow 0,\;w\to\bar{w}}\frac{\varphi(\bar{x}+tw)-\varphi(\bar{x})-t\langle\bar{y},w\rangle}{\frac{1}{2}t^{2}}.

\partial\varphi(\bar{x}):=\big\{v\in\mathbb{R}^{n}\big|\;\langle v,x-\bar{x}\rangle\leq\varphi(x)-\varphi(\bar{x})\;\mbox{ for all }\;x\in\mathbb{R}^{n}\big\}.

N_{\Omega}(\bar{x}):=\big\{v\in\mathbb{R}^{n}\big|\;\langle v,x-\bar{x}\rangle\leq 0\;\mbox{ for all }\;x\in\Omega\big\}.

K_{\Omega}(\bar{x},\bar{v}):=T_{\Omega}(\bar{x})\cap\{\bar{v}\}^{\perp}

D\partial\varphi(\bar{x},\bar{v})(u):=D\big(\partial\varphi\big)(\bar{x},\bar{v})(u),\quad u\in\mathbb{R}^{n}.

Y^{\infty}:=\big\{y\in\mathbb{R}^{m}\big|\;\exists\,y_{k}\in Y,\;\exists\,\lambda_{k}\downarrow 0\;\mbox{ with }\;\lambda_{k}y_{k}\to y\big\}.

{\rm dom}\,\theta_{Y,B}=\big(Y^{\infty}\cap\ker B\big)^{*}.

\partial\theta_{Y,B}(u)=\operatornamewithlimits{arg\,max}_{y\in Y}\big\{\langle y,u\rangle-\frac{1}{2}\langle y,By\rangle\big\}=(N_{Y}+B)^{-1}(u),\quad u\in\mathbb{R}^{m}.

{\mathrm{d}}^{2}\theta_{Y,B}(\bar{z},\bar{\lambda})(u)=2\theta_{{\cal K},B}(u):=\sup_{w\in{\cal K}}\big\{2\langle w,u\rangle-\langle w,Bw\rangle\big\},\quad u\in\mathbb{R}^{m},

D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})(u)=\partial\theta_{{\cal K},B}(u),\quad u\in\mathbb{R}^{m}.

\Lambda(\bar{x}):=\big\{\lambda\in\mathbb{R}^{m}\big|\;\Psi(\bar{x},\lambda)=0,\;\lambda\in\partial\theta_{Y,B}\big(\Phi(\bar{x})\big)\big\}.

0\in f(\bar{x})+\partial\big(\theta_{Y,B}\circ\Phi\big)(\bar{x}).

0\in\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi+\nabla\Phi(\bar{x})^{*}D\partial\theta_{Y,B}\big(\Phi(\bar{x}),\bar{\lambda}\big)\big(\nabla\Phi(\bar{x})\xi\big).

\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi+\nabla\Phi(\bar{x})^{*}\eta=0,\;\;\langle\nabla\Phi(\bar{x})\xi-B\eta,\eta\rangle=0,\;\;\nabla\Phi(\bar{x})\xi-B\eta\in{\cal K}^{*},\;\mbox{ and }\;\eta\in{\cal K}

D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})\big(\nabla\Phi(\bar{x})\xi\big)=\partial\theta_{{\cal K},B}\big(\nabla\Phi(\bar{x})\xi\big).

\partial\theta_{{\cal K},B}\big(\nabla\Phi(\bar{x})\xi\big)=\big(N_{{\cal K}}+B\big)^{-1}\big(\nabla\Phi(\bar{x})\xi\big).

D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})\big(\nabla\Phi(\bar{x})\xi\big)=\big(N_{{\cal K}}+B\big)^{-1}\big(\nabla\Phi(\bar{x})\xi\big).

\langle\nabla\Phi(\bar{x})\xi-B\eta,\eta\rangle=0,\;\;\nabla\Phi(\bar{x})\xi-B\eta\in{\cal K}^{*},\;\;\eta\in{\cal K}.

Y=\mathbb{R}^{m}_{+}:=\big\{y=(y_{1},\ldots,y_{m})\in\mathbb{R}^{m}\big|\;y_{i}\geq 0\;\mbox{ for all }\;i=1,\ldots,m\big\}.

\theta_{\mathbb{R}^{m}_{+},I}(u)=\sup_{y\in\mathbb{R}^{m}_{+}}\Big\{\langle y,u\rangle-\frac{1}{2}\langle y,y\rangle\Big\},\quad u\in\mathbb{R}^{m}.

\begin{cases}\lambda+\widehat{\lambda}=\bar{z},\\ \langle\lambda,\widehat{\lambda}\rangle=0,\\ \lambda\in\mathbb{R}^{m}_{+},\;\widehat{\lambda}\in\mathbb{R}^{m}_{-}.\end{cases}

{\cal K}=T_{\mathbb{R}^{m}_{+}}(0)\cap\{\bar{z}\}^{\perp}=\mathbb{R}^{m}_{+}\;\mbox{ and }\;{\cal K}^{*}=\mathrm{span}\,\{\bar{z}\}+N_{\mathbb{R}^{m}_{+}}(0)=\mathbb{R}^{m}_{-}.

\begin{cases}\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi+\nabla\Phi(\bar{x})^{*}\eta=0,\\ \langle\nabla\Phi(\bar{x})\xi-\eta,\eta\rangle=0,\\ \nabla\Phi(\bar{x})\xi-\eta\in\mathbb{R}^{m}_{-},\;\eta\in\mathbb{R}^{m}_{+}.\end{cases}

Taxonomy

Topics: Optimization and Variational Analysis · Advanced Optimization Algorithms Research · Fractional Differential Equations Solutions

Full text

**CRITICALITY OF LAGRANGE MULTIPLIERS IN EXTENDED NONLINEAR OPTIMIZATION**

HONG DO¹, BORIS S. MORDUKHOVICH², and M. EBRAHIM SARABI³

¹ Department of Mathematics, Wayne State University, Detroit, Michigan, 48202, USA ([email protected]). Research of this author was partly supported by the USA National Science Foundation under grants DMS-1512846 and DMS-1808978, and by the USA Air Force Office of Scientific Research grant #15RT04.

² Department of Mathematics, Wayne State University, Detroit, Michigan, 48202, USA ([email protected]). Research of this author was partly supported by the USA National Science Foundation under grants DMS-1512846 and DMS-1808978, and by the USA Air Force Office of Scientific Research grant #15RT04.

³ Department of Mathematics, Miami University, Oxford, Ohio, 45056, USA ([email protected]).

**Abstract.** The paper is devoted to the study and applications of criticality of Lagrange multipliers in variational systems, which are associated with the class of problems in composite optimization known as extended nonlinear programming (ENLP). The importance of both ENLP and the concept of multiplier criticality in variational systems has been recognized in theoretical and numerical aspects of optimization and variational analysis, while the criticality notion has never been investigated in the ENLP framework. We present here a systematic study of critical and noncritical multipliers in a general variational setting that covers, in particular, KKT systems in ENLP. We establish verifiable characterizations of such multipliers as well as relationships between noncriticality and other stability notions in variational analysis. Our approach is mainly based on advanced tools of second-order variational analysis and generalized differentiation.

Keywords Variational analysis, composite optimization, extended nonlinear programming, critical and noncritical multipliers, generalized differentiation, stability of variational systems

Mathematical Subject Classification (2000) 90C31, 49J52, 49J53

1 Introduction

One of the major goals of this paper is to study a remarkable class of optimization problems given in the following, formally unconstrained, composite format:

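$$\textrm{minimize }\;\varphi(x):=\varphi_{0}(x)+\theta\big(\Phi(x)\big),\quad x\in\mathbb{R}^{n},\tag{1.1}$$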

where $\varphi_{0}\colon\mathbb{R}^{n}\to\mathbb{R}$ is an original cost function and $\Phi\colon\mathbb{R}^{n}\to\mathbb{R}^{m}$ is a constraint mapping, both are twice differentiable at the reference points unless otherwise stated, and where $\theta\colon\mathbb{R}^{m}\to\overline{\mathbb{R}}:=(-\infty,\infty]$ is an extended-real-valued function defined for all $u\in\mathbb{R}^{m}$ by the formula

$$\theta(u)=\theta_{Y,B}(u):=\sup_{y\in Y}\Big\{\langle y,u\rangle-\frac{1}{2}\langle y,By\rangle\Big\}\tag{1.2}$$

via a convex polyhedral set $Y:=\{y\in\mathbb{R}^{m}\,|\;\langle b_{i},y\rangle\leq\alpha_{i},\;i=1,\ldots,p\}$ as well as an $m\times m$ positive-semidefinite and symmetric matrix $B$.

Note that the unconstrained composite format (1.1) gives us a convenient representation of the constrained optimization problem to minimize the cost function $\varphi_{0}(x)$ subject to the inclusion constraint $\Phi(x)\in\Theta:=\{u\in\mathbb{R}^{m}|\;\theta(u)<\infty\}$. In particular, conventional nonlinear programs (NLPs) with $s$ inequality constraints and $m-s$ equality constraints described by ${\cal C}^{2}$-smooth functions can be written in the composite format (1.1), where $\theta:=\delta_{\Theta}$ is the indicator function of the polyhedron $\Theta:=\mathbb{R}^{s}_{-}\times\{0\}^{m-s}$ that is equal to $0$ on $\Theta$ and to $\infty$ otherwise.
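As a quick check of this claim, the following computation sketches how the NLP indicator function arises from the penalty (1.2). The choice $Y:=\mathbb{R}^{s}_{+}\times\mathbb{R}^{m-s}$ with $B:=0$ is our illustrative assumption, not a statement taken from the text; with it the supremum splits over the coordinates:

$$\theta_{Y,0}(u)=\sup_{y\in Y}\langle y,u\rangle=\sum_{i=1}^{s}\sup_{y_{i}\geq 0}y_{i}u_{i}+\sum_{i=s+1}^{m}\sup_{y_{i}\in\mathbb{R}}y_{i}u_{i}=\begin{cases}0&\mbox{if }\;u_{i}\leq 0\;(i\leq s)\;\mbox{ and }\;u_{i}=0\;(i>s),\\ \infty&\mbox{otherwise.}\end{cases}$$

Under this assumption $\theta_{Y,0}=\delta_{\Theta}$ with $\Theta=\mathbb{R}^{s}_{-}\times\{0\}^{m-s}$, and the set $Y$ is polyhedral because it is described by the linear inequalities $-y_{i}\leq 0$ for $i=1,\ldots,s$.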

Problems of the ENLP type (1.1) with $\theta$ given by (1.2) were introduced by Rockafellar [17] under the name of extended nonlinear programs (ENLPs). It has been realized over the years that ENLPs in this form provide a suitable framework for developing both theoretical and computational aspects of optimization in broad classes of constrained problems that include stochastic programming, robust optimization, etc. The special expression (1.2) for the extended-real-valued function $\theta$ , known as the dualizing representation or the piecewise linear-quadratic penalty, is significant for the theory and applications of Lagrange multipliers in the Karush-Kuhn-Tucker (KKT) systems associated with the ENLPs under consideration.

It is not hard to check (see more details in Section 6) that KKT systems associated with local optimal solutions to ENLPs are included in the following more general class of variational systems of the subdifferential type

$$\Psi(x,\lambda):=f(x)+\nabla\Phi(x)^{*}\lambda=0,\;\lambda\in\partial\theta\big(\Phi(x)\big)\;\mbox{ with }\;\theta=\theta_{Y,B},\tag{1.3}$$

where $f\colon\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ is a differentiable mapping while $\Phi\colon\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$ is a twice differentiable mapping in the classical sense [18, Definition 13.1(i)], where $\theta_{Y,B}$ is taken from (1.2), where $^{*}$ indicates the matrix transposition/adjoint operator, and where $\partial$ stands for the subdifferential of convex analysis.

The main attention of this paper is paid to a systematic study of the multiplier criticality concept (i.e., the notions of critical and noncritical Lagrange multipliers) for variational systems of type (1.3) with applications to KKT systems in ENLPs.

The notions of critical and noncritical multipliers were first introduced by Izmailov [4] for the classical KKT systems corresponding to NLPs with equality constraints described by ${\cal C}^{2}$ -smooth functions. It has been realized from the very beginning that the presence of critical multipliers plays a negative role in numerical optimization and is largely responsible for primal slow convergence in primal-dual algorithms of the Newtonian type. Further strong developments in this direction for NLPs and related variational inequalities have been done over the years, mainly by Izmailov, Solodov, and their collaborators; see, e.g., the book [5] and the survey paper [6], which is entirely devoted to critical multipliers. The criticality definitions in the above publications are heavily based on the specific structures of NLPs and related variational inequalities.

In [15], Mordukhovich and Sarabi suggested new definitions of critical and noncritical multipliers for a general class of subdifferential variational systems of type (1.3), where $\theta$ may be even a nonconvex extended-real-valued function. The given definitions in [15] are expressed via second-order generalized differential constructions of variational analysis while reduced to those from [4, 5] for the classical KKT systems corresponding to NLPs. Furthermore, for extended-real-valued convex piecewise linear (CPWL) functions $\theta$ in (1.3), which include (1.2) when $B=0$ , the definitions of critical and noncritical multipliers are expressed in [15] entirely in terms of the problem data with the subsequent characterizations of criticality and various applications to optimization and stability problems for such systems.

The quite recent paper of the same authors [16] contains counterparts of some major results from [15] with developing also novel issues on criticality for variational systems described by

$$f(x)+\nabla\Phi(x)^{*}\lambda=0,\;\lambda\in N_{\Theta}\big(\Phi(x)\big),\tag{1.4}$$

where $f$ and $\Phi$ are the same as in (1.3), and where $N_{\Theta}$ is the normal cone to a ${\cal C}^{2}$ -cone reducible set $\Theta\subset\mathbb{R}^{m}$ . This framework covers, in particular, KKT systems associated with general problems of (nonpolyhedral) conic programming; see, e.g., [1].

The main results of the current paper extend those from [15], obtained for CPWL functions $\theta$, to the case of functions $\theta_{Y,B}$ defined in (1.2), which form a major class of extended-real-valued convex piecewise linear-quadratic functions in variational analysis; see [18] and Section 2 below. At the same time, the new results obtained here are completely independent of those derived for the variational system (1.4) in [16] in the case of nonpolyhedral sets $\Theta$ therein.

The basic tools of first-order and second-order generalized differentiation employed in this paper are tangentially generated, except the classical subdifferential of convex analysis. We mostly rely on the generalized differential theory in primal spaces developed by Rockafellar; see [18] and the references therein. Using these tools allows us to establish verifiable characterizations of noncritical multipliers in the general setting of (1.3), to characterize the uniqueness of Lagrange multipliers in (1.3), to ensure noncriticality for ENLPs via a new second-order optimality condition, which is employed in turn to verify the important stability property of solutions to KKT systems that is known as robust isolated calmness and is related to noncriticality. We also reveal a relationship between the isolated calmness and Lipschitz-like properties of solution maps for canonically perturbed variational systems with the piecewise linear-quadratic term (1.2).

As mentioned above, the existence of critical multipliers is a negative factor in convergence analysis, since it seems to prevent primal superlinear convergence of major primal-dual algorithms. Thus it is crucial to find verifiable conditions, expressed entirely in terms of the problem data in question, which ensure that critical multipliers corresponding to a given local minimizer do not arise. It is conjectured in [10], based on preliminary results for NLPs, that full stability of local minimizers in the sense of [7] rules out the appearance of critical multipliers. This conjecture was verified in [15] for polyhedral problems of type (1.1) with convex piecewise linear functions $\theta$. Now we justify this conjecture in the general case of ENLPs with piecewise linear-quadratic functions $\theta_{Y,B}$ in form (1.2).

The rest of the paper is organized as follows. In Section 2 we present some definitions and facts from variational analysis and generalized differentiation that are broadly employed throughout the whole paper. Other variational constructions and results are recalled in those places of the subsequent sections where they are actually used.

Section 3 contains basic definitions of critical and noncritical multipliers for variational systems (1.3) involving piecewise linear-quadratic functions of type (1.2), together with equivalent descriptions, examples, and discussions. In Section 4 we obtain new results on the relationship between the well-recognized calmness and isolated calmness properties of multiplier maps associated with the variational systems (1.3) with the piecewise linear-quadratic term (1.2) and the uniqueness of Lagrange multipliers in such systems. This is of independent interest, while the developed approach and results can also be viewed as preparation for the subsequent characterizations of noncritical multipliers in the variational systems under consideration.

Section 5 plays a central role in the paper. It establishes major characterizations of noncritical multipliers for systems (1.3) with $\theta_{Y,B}$ taken from (1.2) via a novel semi-isolated calmness property for solution maps to canonical perturbations of (1.3) and also via two new error bounds that are specific for the variational systems (1.3) with the piecewise linear-quadratic term (1.2).

Section 6 is devoted to noncritical multipliers in KKT systems associated with ENLPs, to which the results of the previous sections apply automatically when $\Psi$ in (1.3) is specified as the $x$-partial gradient of the appropriate Lagrangian. The main new result here, which is characteristic of the optimization framework, is a novel second-order sufficient condition for strict local minimizers, which also ensures that all the corresponding multipliers are noncritical.

In Section 7 we justify, for the case of ENLPs from (1.1) and (1.2), the aforementioned conjecture on excluding critical multipliers corresponding to a fully stable local minimizer for the given ENLP. The proof of this result is based on characterizations of noncriticality via semi-isolated calmness obtained in Section 5.

The last Section 8 provides applications of the developed characterizations of noncritical multipliers for the variational systems under consideration to the study of an important stability property of solution maps to KKT systems associated with ENLPs. This property of set-valued mappings has been recently recognized as robust isolated calmness. The results obtained above allow us to characterize robust isolated calmness via the noncriticality and uniqueness of Lagrange multipliers on one side and via the new second-order optimality condition for ENLPs on the other. Finally, we characterize the Lipschitz-like/Aubin property of solution maps to perturbed variational systems and establish its relationship with isolated calmness.

2 Preliminaries from Variational Analysis

In this section we review, based on the book [18], some basic notions of generalized differentiation in variational analysis and then recall important facts broadly used in what follows. Throughout the paper we use the standard notation of variational analysis; see [11, 18].

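Given a nonempty subset $\Omega\subset\mathbb{R}^{d}$ and a point $\bar{z}\in\Omega$, the (Bouligand-Severi) tangent/contingent cone $T_{\Omega}(\bar{z})$ to $\Omega$ at $\bar{z}$ is defined by

$$T_{\Omega}(\bar{z}):=\Big\{w\in\mathbb{R}^{d}\Big|\;\exists\,z_{k}\xrightarrow{\Omega}\bar{z},\;\exists\,\alpha_{k}\geq 0\;\textrm{ with }\;\alpha_{k}(z_{k}-\bar{z})\rightarrow w\;\textrm{ as }\;k\rightarrow\infty\Big\},\tag{2.1}$$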

where the symbol $z\stackrel{{\scriptstyle\Omega}}{{\to}}\bar{z}$ indicates that $z\to\bar{z}$ with $z\in\Omega$ .

For a set-valued mapping $F\colon\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{p}$ , define its domain and graph by, respectively,

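$${\rm dom}\,F:=\big\{x\in\mathbb{R}^{n}\big|\;F(x)\neq\emptyset\big\}\;\mbox{ and }\;\mathrm{gph}\,F:=\big\{(x,y)\in\mathbb{R}^{n}\times\mathbb{R}^{p}\big|\;y\in F(x)\big\}.$$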

The graphical derivative of $F$ at $(\bar{x},\bar{y})\in\mathrm{gph}\,F$ is given by

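$$DF(\bar{x},\bar{y})(u):=\big\{v\in\mathbb{R}^{p}\big|\;(u,v)\in T_{\mathrm{gph}\,F}(\bar{x},\bar{y})\big\},\quad u\in\mathbb{R}^{n}.\tag{2.2}$$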

Next we consider an extended-real-valued function $\varphi\colon\mathbb{R}^{n}\to\overline{\mathbb{R}}:=(-\infty,\infty]$ with $\bar{x}\in{\rm dom}\,\varphi:=\{x\in\mathbb{R}^{n}|\;\varphi(x)<\infty\}$ . Given $\bar{y}\in\mathbb{R}^{n}$ , the second subderivative of $\varphi$ at $(\bar{x},\bar{y})$ in the direction $\bar{w}$ is defined by

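$${\mathrm{d}}^{2}\varphi(\bar{x},\bar{y})(\bar{w}):=\liminf_{t\downarrow 0,\;w\to\bar{w}}\frac{\varphi(\bar{x}+tw)-\varphi(\bar{x})-t\langle\bar{y},w\rangle}{\frac{1}{2}t^{2}}.\tag{2.3}$$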

When $\varphi$ is convex and proper (i.e., ${\rm dom}\,\varphi\neq\emptyset$ ), we use its subdifferential (i.e., the collection of subgradients) at $\bar{x}\in{\rm dom}\,\varphi$ given by

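$$\partial\varphi(\bar{x}):=\big\{v\in\mathbb{R}^{n}\big|\;\langle v,x-\bar{x}\rangle\leq\varphi(x)-\varphi(\bar{x})\;\mbox{ for all }\;x\in\mathbb{R}^{n}\big\}.\tag{2.4}$$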

If $\Omega\subset\mathbb{R}^{n}$ is a nonempty convex set, then the normal cone to $\Omega$ at $\bar{x}\in\Omega$ is the subdifferential (2.4) of its indicator function and thus is defined by

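$$N_{\Omega}(\bar{x}):=\big\{v\in\mathbb{R}^{n}\big|\;\langle v,x-\bar{x}\rangle\leq 0\;\mbox{ for all }\;x\in\Omega\big\}.\tag{2.5}$$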

The critical cone to $\Omega$ at $\bar{x}$ for $\bar{v}\in N_{\Omega}(\bar{x})$ is expressed via the tangent cone (2.1) as

$$K_{\Omega}(\bar{x},\bar{v}):=T_{\Omega}(\bar{x})\cap\{\bar{v}\}^{\perp}\tag{2.6}$$

with the notation $\{\bar{v}\}^{\perp}:=\big\{w\in\mathbb{R}^{n}\big|\;\langle w,\bar{v}\rangle=0\big\}$.

Along with (2.3), we employ in this paper yet another second-order generalized derivative of an extended-real-valued convex function $\varphi\colon\mathbb{R}^{n}\to\overline{\mathbb{R}}$ at $\bar{x}\in{\rm dom}\,\varphi$ for $\bar{v}\in\partial\varphi(\bar{x})$ that is defined via the graphical derivative (2.2) of the subgradient mapping $\partial\varphi\colon\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{n}$ under the name of the subgradient graphical derivative by

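$$D\partial\varphi(\bar{x},\bar{v})(u):=D\big(\partial\varphi\big)(\bar{x},\bar{v})(u),\quad u\in\mathbb{R}^{n}.\tag{2.7}$$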

Invoking the constructions above, we now formulate the basic facts about the functions $\theta_{Y,B}$ taken from (1.2) that are systematically exploited in the paper. The proofs of these facts can be found in [18, Examples 11.18, 13.23 and Theorem 13.40]. Recall that the horizon cone of a nonempty set $Y\subset\mathbb{R}^{m}$ used below is defined by

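$$Y^{\infty}:=\big\{y\in\mathbb{R}^{m}\big|\;\exists\,y_{k}\in Y,\;\exists\,\lambda_{k}\downarrow 0\;\mbox{ with }\;\lambda_{k}y_{k}\to y\big\}.$$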

Recall also [18, Definition 10.20] that a function $\varphi\colon\mathbb{R}^{n}\to\overline{\mathbb{R}}$ is piecewise linear-quadratic if its domain ${\rm dom}\,\varphi$ can be represented as the union of finitely many convex polyhedral sets, relative to each of which $\varphi(x)$ is given by an expression of the form $\frac{1}{2}\langle x,Ax\rangle+\langle a,x\rangle+\alpha$ for some scalar $\alpha\in\mathbb{R}$ , vector $a\in\mathbb{R}^{n}$ , and $n\times n$ symmetric matrix $A$ .

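**Theorem 2.1** (properties of piecewise linear-quadratic penalties). *Let $\theta_{Y,B}$ be defined by (1.2). Then the following properties hold:*

(i) The function $\theta_{Y,B}$ is proper, convex, and piecewise linear-quadratic with the domain

$${\rm dom}\,\theta_{Y,B}=\big(Y^{\infty}\cap\ker B\big)^{*}.$$

(ii) The subdifferential (2.4) of $\theta_{Y,B}$ is calculated by

$$\partial\theta_{Y,B}(u)=\operatornamewithlimits{arg\,max}_{y\in Y}\big\{\langle y,u\rangle-\frac{1}{2}\langle y,By\rangle\big\}=(N_{Y}+B)^{-1}(u),\quad u\in\mathbb{R}^{m}.\tag{2.8}$$

(iii) Given any $(\bar{z},\bar{\lambda})\in\mathrm{gph}\,\partial\theta_{Y,B}$, the second subderivative (2.3) is calculated by

$${\mathrm{d}}^{2}\theta_{Y,B}(\bar{z},\bar{\lambda})(u)=2\theta_{{\cal K},B}(u):=\sup_{w\in{\cal K}}\big\{2\langle w,u\rangle-\langle w,Bw\rangle\big\},\quad u\in\mathbb{R}^{m},\tag{2.9}$$

in the same form $\theta_{{\cal K},B}(u)$ as in (1.2) with the replacement of $Y$ by the critical cone ${\cal K}:=K_{Y}(\bar{\lambda},\bar{z}-B\bar{\lambda})$ defined via (2.6). Furthermore, the subgradient graphical derivative (2.7) of $\theta_{Y,B}$ at $\bar{z}$ for $\bar{\lambda}$ is represented as

$$D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})(u)=\partial\theta_{{\cal K},B}(u),\quad u\in\mathbb{R}^{m}.\tag{2.10}$$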

3 Multiplier Criticality in Piecewise Linear-Quadratic Settings

In this section we formulate the definitions of critical and noncritical multipliers corresponding to stationary points of the variational system (1.3) with the piecewise linear-quadratic term (1.2), establish an equivalent description of criticality entirely via the given data of (1.3), and then present two examples illustrating the calculation of critical and noncritical multipliers for this setting.

Given a point $\bar{x}\in\mathbb{R}^{n}$ , define the set of Lagrange multipliers associated with $\bar{x}$ by

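$$\Lambda(\bar{x}):=\big\{\lambda\in\mathbb{R}^{m}\big|\;\Psi(\bar{x},\lambda)=0,\;\lambda\in\partial\theta_{Y,B}\big(\Phi(\bar{x})\big)\big\}.\tag{3.1}$$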

If $(\bar{x},\bar{\lambda})$ is a solution to the variational system (1.3), we clearly get $\bar{\lambda}\in\Lambda(\bar{x})$ . Furthermore, it is not hard to check that the inclusion $\bar{\lambda}\in\Lambda(\bar{x})$ ensures that $\bar{x}$ is a stationary point of (1.3) in the sense that it satisfies the condition

$$0\in f(\bar{x})+\partial\big(\theta_{Y,B}\circ\Phi\big)(\bar{x}).\tag{3.2}$$

Suppose from now on that $\Lambda(\bar{x})\neq\emptyset$, which is ensured, e.g., by any constraint qualification condition in problems of constrained optimization. The following definitions of critical and noncritical multipliers for (1.3) are just specifications of those from [15], given there for general variational systems with the subsequent implementation for the case of a convex piecewise linear function $\theta$. It is worth noticing that the function $\theta$ from (1.2) with $B=0$ is convex piecewise linear, i.e., its epigraph is a convex polyhedral set, and so it is covered by the results already established in [15]. However, when $B\neq 0$, it is a convex piecewise linear-quadratic function and requires different techniques to achieve similar results.

**Definition 3.1** (critical and noncritical multipliers in variational systems). *Let $(\bar{x},\bar{\lambda})$ be a solution to the variational system (1.3). We say that $\bar{\lambda}\in\Lambda(\bar{x})$ is a critical Lagrange multiplier for (1.3) corresponding to $\bar{x}$ if there exists a nonzero vector $\xi\in\mathbb{R}^{n}$ such that*

$$0\in\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi+\nabla\Phi(\bar{x})^{*}D\partial\theta_{Y,B}\big(\Phi(\bar{x}),\bar{\lambda}\big)\big(\nabla\Phi(\bar{x})\xi\big).\tag{3.3}$$

A given multiplier $\bar{\lambda}\in\Lambda(\bar{x})$ is noncritical for (1.3) corresponding to $\bar{x}$ if the generalized equation (3.3) admits only the trivial solution $\xi=0$ .

Applying the representations of Theorem 2.1 for the graphical derivative in (3.3) gives us an equivalent description of critical and noncritical multipliers from Definition 3.1, expressed entirely in terms of the initial data of (1.3).

**Theorem 3.2** (equivalent description of criticality via piecewise linear-quadratic penalties). *Let $(\bar{x},\bar{\lambda})$ be a solution to the variational system (1.3) with the term $\theta_{Y,B}$ taken from (1.2). Denoting $\bar{z}:=\Phi(\bar{x})$ and ${\cal K}:=K_{Y}(\bar{\lambda},\bar{z}-B\bar{\lambda})$ via the critical cone (2.6), we have that the multiplier $\bar{\lambda}$ corresponding to $\bar{x}$ is critical for (1.3) if and only if the system of relationships*

$$\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi+\nabla\Phi(\bar{x})^{*}\eta=0,\;\;\langle\nabla\Phi(\bar{x})\xi-B\eta,\eta\rangle=0,\;\;\nabla\Phi(\bar{x})\xi-B\eta\in{\cal K}^{*},\;\mbox{ and }\;\eta\in{\cal K}\tag{3.4}$$

admits a solution $(\xi,\eta)\in\mathbb{R}^{n}\times\mathbb{R}^{m}$ with $\xi\neq 0$ . Accordingly, $\bar{\lambda}$ is a noncritical multiplier in this setting if and only if we have $\xi=0$ for any solution $(\xi,\eta)$ to (3.4).
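To see how the system (3.4) relates to the classical notion of criticality for equality-constrained problems mentioned in Section 1, consider the following sketch. The choice $Y:=\mathbb{R}^{m}$ and $B:=0$, which gives $\theta_{Y,B}=\delta_{\{0\}}$, is our illustrative assumption. Then $\bar{z}-B\bar{\lambda}=\bar{z}\in N_{\mathbb{R}^{m}}(\bar{\lambda})=\{0\}$, so $\bar{z}=\Phi(\bar{x})=0$ and

$${\cal K}=T_{\mathbb{R}^{m}}(\bar{\lambda})\cap\{0\}^{\perp}=\mathbb{R}^{m},\qquad{\cal K}^{*}=\{0\}.$$

With these cones the conditions of Theorem 3.2 collapse to

$$\nabla\Phi(\bar{x})\xi=0\quad\mbox{and}\quad\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi=-\nabla\Phi(\bar{x})^{*}\eta\;\mbox{ for some }\;\eta\in\mathbb{R}^{m},$$

that is, $\xi\in\ker\nabla\Phi(\bar{x})$ and $\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi\in{\rm rge}\,\nabla\Phi(\bar{x})^{*}$. The orthogonality condition in (3.4) holds automatically here, since $\nabla\Phi(\bar{x})\xi=0$. As far as we can tell, this is the form in which criticality is usually stated for equality-constrained NLPs.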

Proof.

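To verify the claimed equivalences, we need to calculate the graphical derivative $D\partial\theta_{Y,B}$ in (3.3) for the function $\theta_{Y,B}$ given in (1.2). First we use formula (2.10) from Theorem 2.1(iii), which yields

$$D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})\big(\nabla\Phi(\bar{x})\xi\big)=\partial\theta_{{\cal K},B}\big(\nabla\Phi(\bar{x})\xi\big).$$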

On the other hand, the second expression of $\partial\theta_{{\cal K},B}$ in (2.8) of Theorem 2.1(ii) shows that

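$$\partial\theta_{{\cal K},B}\big(\nabla\Phi(\bar{x})\xi\big)=\big(N_{{\cal K}}+B\big)^{-1}\big(\nabla\Phi(\bar{x})\xi\big).$$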

Putting these representations together, we arrive at

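$$D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})\big(\nabla\Phi(\bar{x})\xi\big)=\big(N_{{\cal K}}+B\big)^{-1}\big(\nabla\Phi(\bar{x})\xi\big).\tag{3.5}$$

Picking further any vector $\eta$ from the set on the left-hand side of (3.5) therefore gives us $\eta\in\big(N_{{\cal K}}+B\big)^{-1}\big(\nabla\Phi(\bar{x})\xi\big)$ and so $\nabla\Phi(\bar{x})\xi-B\eta\in N_{{\cal K}}(\eta)$. Since ${\cal K}$ is a convex cone, the latter inclusion is equivalent to the conditions

$$\langle\nabla\Phi(\bar{x})\xi-B\eta,\eta\rangle=0,\;\;\nabla\Phi(\bar{x})\xi-B\eta\in{\cal K}^{*},\;\;\eta\in{\cal K}.$$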

Finally, we substitute the obtained descriptions of $\eta\in D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})(\nabla\Phi(\bar{x})\xi)$ into (3.3) and thus clearly verify both assertions of the theorem. $\square$
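The cone step used in the proof can be spelled out as follows. This is a short derivation under the assumption that ${\cal K}^{*}$ denotes the polar cone $\{v\,|\;\langle v,w\rangle\leq 0\;\mbox{ for all }\;w\in{\cal K}\}$; the shorthand $v:=\nabla\Phi(\bar{x})\xi-B\eta$ is ours. For $\eta\in{\cal K}$, the normal cone definition (2.5) gives

$$\begin{aligned}
v\in N_{{\cal K}}(\eta)\;&\Longleftrightarrow\;\langle v,w-\eta\rangle\leq 0\;\mbox{ for all }\;w\in{\cal K}\\
&\Longrightarrow\;\langle v,\eta\rangle\geq 0\;\;(\mbox{take }w=0)\;\mbox{ and }\;\langle v,\eta\rangle\leq 0\;\;(\mbox{take }w=2\eta)\\
&\Longrightarrow\;\langle v,\eta\rangle=0\;\mbox{ and }\;\langle v,w\rangle\leq 0\;\mbox{ for all }\;w\in{\cal K},\;\mbox{ i.e., }\;v\in{\cal K}^{*}.
\end{aligned}$$

Conversely, if $\eta\in{\cal K}$, $v\in{\cal K}^{*}$, and $\langle v,\eta\rangle=0$, then $\langle v,w-\eta\rangle=\langle v,w\rangle\leq 0$ for all $w\in{\cal K}$, so $v\in N_{{\cal K}}(\eta)$. Both test points $w=0$ and $w=2\eta$ belong to ${\cal K}$ because ${\cal K}$ is a cone.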

Next we present two examples, which demonstrate how to use the descriptions of Theorem 3.2 to explicitly determine critical and noncritical multipliers and illustrate in this way some characteristic features of multiplier criticality.

**Example 3.3** (calculating critical and noncritical multipliers). Consider the multidimensional case of (1.3) with $\theta_{Y,B}$ from (1.2), where $B=I_{m}=:I$ is the $m\times m$ identity matrix, and where the convex polyhedral set $Y$ is the nonnegative orthant in $\mathbb{R}^{m}$, i.e.,

$$Y=\mathbb{R}^{m}_{+}:=\big\{y=(y_{1},\ldots,y_{m})\in\mathbb{R}^{m}\big|\;y_{i}\geq 0\;\mbox{ for all }\;i=1,\ldots,m\big\}.$$

Thus the function $\theta_{Y,B}$ from (1.2) reduces in this case to

[TABLE]

For any $\bar{x}\in\mathbb{R}^{n}$ and $\bar{z}:=\Phi(\bar{x})$ , by Theorem 2.1(ii) we have that $\lambda\in\partial\theta_{\mathbb{R}^{m}_{+},I}(\bar{z})$ if and only if $\bar{z}-B\lambda\in N_{\mathbb{R}^{m}_{+}}(\lambda)=\mathbb{R}^{m}_{-}\cap\lambda^{\perp}$ . Denoting $\bar{z}-\lambda$ by $\widehat{\lambda}$ , the latter inclusion is equivalent to the following system of equations and inclusions:

[TABLE]

It is not hard to see that for each fixed $\bar{x}$ and $\bar{z}=\Phi(\bar{x})$ this system has only one solution, which implies that the set of Lagrange multipliers has at most one element.

We now give two specific examples of mappings $f$ and $\Phi$ , where one has a noncritical multiplier and the other has a critical multiplier. First, let $f(x):=x$ and $\Phi(x):=(x_{1},0,\ldots,0)\in\mathbb{R}^{m}$ for all $x=(x_{1},\ldots,x_{n})\in\mathbb{R}^{n}$ , and let $\bar{x}:=0\in\mathbb{R}^{n}$ . Combining (3.6) with the fact that $\Psi(\bar{x},\lambda)=(\lambda_{1},0,\ldots,0)\in\mathbb{R}^{n}$ implies that the unique Lagrange multiplier is $\bar{\lambda}=0$ . Then we calculate the critical cone ${\cal K}=K_{Y}(0,\bar{z})$ in Theorem 3.2 with $\bar{z}=\Phi(\bar{x})=0$ and its dual cone ${\cal K}^{*}$ by, respectively,

[TABLE]

It follows from Theorem 3.2 that the unique Lagrange multiplier $\bar{\lambda}=0$ is noncritical if and only if the system of equations and inclusions

[TABLE]

admits only solution pairs $(\xi,\eta)\in\mathbb{R}^{n}\times\mathbb{R}^{m}$ with $\xi=0$ . Denoting $\zeta:=\nabla\Phi(\bar{x})\xi-\eta$ , the above system can be equivalently rewritten as

[TABLE]

Since $\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi=\xi$ , $\nabla\Phi(\bar{x})\xi=(\xi_{1},0,\ldots,0)\in\mathbb{R}^{m}$ , and $\nabla\Phi(\bar{x})^{*}\eta=(\eta_{1},0,\ldots,0)\in\mathbb{R}^{n}$ for any $\eta=(\eta_{1},\ldots,\eta_{m})\in\mathbb{R}^{m}$ , it can be easily checked that the latter system has the unique solution pair $(\xi,\eta)=(0,0)$ . This tells us that $\bar{\lambda}=0$ is a noncritical multiplier.

Next we consider the case where $\Phi(x):=(x_{1},0,\ldots,0)\in\mathbb{R}^{m}$ as before while $f(x):=(x_{1},\ldots,x_{n-1},0)\in\mathbb{R}^{n}$ for all $x=(x_{1},\ldots,x_{n})\in\mathbb{R}^{n}$ . Proceeding similarly to the previous case shows that $\bar{\lambda}=0$ is the unique Lagrange multiplier with the same critical cone $\mathcal{K}$ . In this setting we have $\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi=(\xi_{1},\ldots,\xi_{n-1},0)\in\mathbb{R}^{n}$ , and therefore system (3.7) reduces to

[TABLE]

It shows that all the pairs $(\xi,\eta)$ with $\eta=0$ and $\xi=(0,\ldots,0,\xi_{n})$ for $\xi_{n}\in\mathbb{R}$ are solutions to the above system. Thus the multiplier $\bar{\lambda}=0$ is critical.

In Section 6 we revisit this example in the optimization framework; see Example 6.2.**
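The two cases of Example 3.3 can also be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $n=m=3$, uses ${\cal K}=\mathbb{R}^{m}_{+}$, ${\cal K}^{*}=\mathbb{R}^{m}_{-}$, and $B=I$ as in the example, and assumes that the system of Theorem 3.2 consists of the equation $\nabla_{x}\Psi(\bar{x},\bar{\lambda})\xi+\nabla\Phi(\bar{x})^{*}\eta=0$ together with $\eta\in{\cal K}$, $\zeta:=\nabla\Phi(\bar{x})\xi-B\eta\in{\cal K}^{*}$, and $\langle\eta,\zeta\rangle=0$. The helper name `solves_system` is hypothetical.

```python
def solves_system(grad_psi, grad_phi, xi, eta, tol=1e-12):
    """Return True if (xi, eta) satisfies the assumed system for K = R^m_+, B = I."""
    n, m = len(xi), len(eta)
    # Stationarity: grad_psi * xi + grad_phi^T * eta = 0 in R^n.
    stat = [
        sum(grad_psi[i][j] * xi[j] for j in range(n))
        + sum(grad_phi[k][i] * eta[k] for k in range(m))
        for i in range(n)
    ]
    # zeta = grad_phi * xi - B * eta with B = I.
    zeta = [
        sum(grad_phi[k][j] * xi[j] for j in range(n)) - eta[k]
        for k in range(m)
    ]
    return (
        all(abs(s) <= tol for s in stat)
        and all(e >= -tol for e in eta)      # eta in K = R^m_+
        and all(z <= tol for z in zeta)      # zeta in K^* = R^m_-
        and abs(sum(e * z for e, z in zip(eta, zeta))) <= tol
    )


# Phi(x) = (x_1, 0, 0): the Jacobian has a single nonzero entry.
GRAD_PHI = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# Case 1: f(x) = x, so grad_x Psi = I (Phi is linear and lambda-bar = 0).
GRAD_PSI_1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Case 2: f(x) = (x_1, x_2, 0), so the last diagonal entry vanishes.
GRAD_PSI_2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

xi = [0.0, 0.0, 1.0]    # nonzero candidate direction along the last coordinate
eta = [0.0, 0.0, 0.0]

case1 = solves_system(GRAD_PSI_1, GRAD_PHI, xi, eta)  # first (noncritical) case
case2 = solves_system(GRAD_PSI_2, GRAD_PHI, xi, eta)  # second (critical) case
```

Here `case1` is `False` and `case2` is `True`: the direction $\xi=(0,0,1)$ with $\eta=0$ is rejected in the first case and accepted in the second, in agreement with the example. Rejecting one candidate does not by itself prove noncriticality; that still requires the argument given in the text.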

The next two-dimensional example presents a simple linear-quadratic variational system of type (1.3) with $\theta_{Y,B}$ from (1.2) such that a stationary point therein is associated with both critical and noncritical Lagrange multipliers.

Example 3.4

(variational systems with both critical and noncritical multipliers corresponding to a given stationary point).* *Specify the data of (1.2) and (1.3) as follows:

[TABLE]

Thus we have in (1.3) that $\Psi(x,\lambda)=f(x)+\nabla\Phi(x)^{*}\lambda=-x+2x\lambda_{2}$ for any $x\in\mathbb{R}$ and $\lambda=(\lambda_{1},\lambda_{2})\in\mathbb{R}^{2}$ . By Theorem 2.1(i), we obtain ${\rm dom}\,\theta_{Y,B}=\mathbb{R}\times\mathbb{R}_{-}$ . Since $\partial\theta_{Y,B}(u)=(N_{Y}+B)^{-1}(u)$ by Theorem 2.1(ii), it is not hard to see that $\partial\theta_{Y,B}(0)=\{0\}\times\mathbb{R}_{+}$ , and so $\Lambda(\bar{x})=\{0\}\times\mathbb{R}_{+}$ with $\bar{x}:=0$ . Then for any $\lambda=(\lambda_{1},\lambda_{2})\in\Lambda(\bar{x})$ we get $\lambda_{1}=0$ and $\lambda_{2}\geq 0$ . On the other hand, conditions (3.4) from Theorem 3.2 now read as

[TABLE]

This tells us that if $\lambda_{2}\neq\frac{1}{2}$ , the latter system admits only the solution $\xi=0$ , and thus the obtained Lagrange multiplier $\lambda$ is noncritical. In the case where $\lambda_{2}=\frac{1}{2}$ , this system admits nontrivial solutions $\xi$ , and so the Lagrange multiplier $\lambda=(0,\frac{1}{2})$ is critical.**

4 Uniqueness of Lagrange Multipliers and Isolated Calmness

This section is devoted to the study of uniqueness of Lagrange multipliers corresponding to given stationary points of the variational systems (1.3) with piecewise linear-quadratic penalties (1.2). This issue is of independent interest and may seem to be unrelated to multiplier criticality. However, the methods we develop for the uniqueness study and the conditions obtained for it turn out to be closely related to the subsequent characterizations of noncritical multipliers as well as to their deeper understanding and specification.

First we recall some “at-point” (vs. “around/neighborhood”) stability properties of set-valued mappings that have been recognized in variational analysis; see, e.g., [3, 11, 18] with the references and commentaries therein.

It is said that a mapping $F\colon\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{m}$ is calm at $(\bar{x},\bar{y})\in\mathrm{gph}\,F$ if there exist a constant $\ell\geq 0$ and neighborhoods $U$ of $\bar{x}$ and $V$ of $\bar{y}$ such that

$$F(x)\cap V\subset F(\bar{x})+\ell\|x-\bar{x}\|\mathbb{B}\quad\textrm{for all }\;x\in U,\tag{4.1}$$

where $\mathbb{B}$ stands for the closed unit ball of the space in question. If (4.1) is replaced by

$$F(x)\cap V\subset\{\bar{y}\}+\ell\|x-\bar{x}\|\mathbb{B}\quad\textrm{for all }\;x\in U,\tag{4.2}$$

then the corresponding property is known as isolated calmness of $F$ at $(\bar{x},\bar{y})$ . If $\mathrm{gph}\,F$ is locally closed at $(\bar{x},\bar{y})$ , the latter property admits the graphical derivative characterization

$$DF(\bar{x},\bar{y})(0)=\{0\},\tag{4.3}$$

known as the Levy-Rockafellar criterion; see the commentaries to [3, Theorem 4E.1].

Finally, $F$ enjoys the robust isolated calmness property at $(\bar{x},\bar{y})$ if in addition to (4.2) we have $F(x)\cap V\neq\emptyset$ for all $x\in U$ . This name was coined quite recently in [2], while the property itself has actually been used in optimization over the years; see the discussions in [2, 15].

In this section we employ the calmness and isolated calmness properties for characterizations of uniqueness of Lagrange multipliers in (1.3) with the piecewise linear-quadratic term (1.2). Robust isolated calmness is used in the last section of the paper.
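To make the difference between these at-point properties more tangible, the following sketch estimates the best constant $\ell$ in the isolated calmness inclusion $F(x)\cap V\subset\{\bar{y}\}+\ell\|x-\bar{x}\|\mathbb{B}$ for two scalar toy mappings. Both mappings and the helper `calmness_ratio` are hypothetical illustrations that do not come from the paper, and sampling finitely many points can only suggest, not prove, that the property holds or fails.

```python
import math


def calmness_ratio(F, xs, xbar, ybar, radius):
    """Largest |y - ybar| / |x - xbar| over sampled x and y in F(x) near ybar."""
    worst = 0.0
    for x in xs:
        if x == xbar:
            continue
        for y in F(x):
            if abs(y - ybar) <= radius:      # restrict to the neighborhood V
                worst = max(worst, abs(y - ybar) / abs(x - xbar))
    return worst


def two_branches(x):
    """F(x) = {x, -x}: set-valued, yet isolatedly calm at (0, 0) with l = 1."""
    return [x, -x]


def square_root(x):
    """F(x) = {sqrt(|x|)}: single-valued but not calm at (0, 0)."""
    return [math.sqrt(abs(x))]


samples = [10.0 ** (-k) for k in range(1, 9)]
ratio_calm = calmness_ratio(two_branches, samples, 0.0, 0.0, 1.0)
ratio_not_calm = calmness_ratio(square_root, samples, 0.0, 0.0, 1.0)
```

For the first mapping the ratio stays equal to $1$ at every sample, while for the second it grows like $|x|^{-1/2}$ as the samples approach $\bar{x}=0$, so no finite $\ell$ can work.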

Using the data of (1.3), consider the set-valued mapping $G\colon\mathbb{R}^{n}\times\mathbb{R}^{m}\rightrightarrows\mathbb{R}^{n}\times\mathbb{R}^{m}$ given by

[TABLE]

Then fix a point $\bar{x}\in\mathbb{R}^{n}$ and define the parameterized multiplier map $M_{\bar{x}}\colon\mathbb{R}^{n}\times\mathbb{R}^{m}\rightrightarrows\mathbb{R}^{m}$ associated with $\bar{x}$ by

[TABLE]

We have $M_{\bar{x}}(0,0)=\Lambda(\bar{x})$ for the Lagrange multiplier set (3.1) of the unperturbed system (1.3).

The next theorem characterizes uniqueness of Lagrange multipliers in variational systems (1.3) with the term $\theta_{Y,B}$ from (1.2) via both calmness and isolated calmness properties of the multiplier map (4.5), which are equivalent to each other in this case and are characterized in turn by a novel dual qualification condition.

Theorem 4.1

(characterizations of uniqueness of Lagrange multipliers in variational systems).* Let $(\bar{x},\bar{\lambda})$ be a solution to the variational system (1.3) with $\theta_{Y,B}$ taken from (1.2). Then the following properties are equivalent:*

(i)

$\Lambda(\bar{x})=\{\bar{\lambda}\}$ .

(ii)

$M_{\bar{x}}$ * is calm at $\big{(}(0,0),\bar{\lambda}\big{)}$ and $\Lambda(\bar{x})=\{\bar{\lambda}\}$ .*

(iii)

$M_{\bar{x}}$ * is isolatedly calm at $\big{(}(0,0),\bar{\lambda}\big{)}$ .*

(iv)

We have the dual qualification condition

$$D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})(0)\cap\ker\nabla\Phi(\bar{x})^{*}=\{0\},\tag{4.6}$$

where $D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})$ is calculated by (3.5).

Proof.

Denoting $\bar{z}:=\Phi(\bar{x})$ as above, we begin with proving the equivalence (iii) $\Longleftrightarrow$ (iv). To proceed, observe that the graph of $M_{\bar{x}}$ is closed and deduce from (4.3) that $M_{\bar{x}}$ is isolatedly calm at $((0,0),\bar{\lambda})$ if and only if $DM_{\bar{x}}\big{(}(0,0),\bar{\lambda}\big{)}(0,0)=\{0\}$ . It is not hard to check that $\eta\in DM_{\bar{x}}\big{(}(0,0),\bar{\lambda}\big{)}(0,0)$ amounts to saying that $\eta$ is a solution to the system

[TABLE]

This tells us that $\eta$ is a solution to the above system if and only if

[TABLE]

Combining these facts verifies the equivalence between conditions (iii) and (iv).

Next we show that (i) $\Longrightarrow$ (iv). Assume on the contrary that the dual qualification condition (4.6) fails while (i) holds, and so find an element

[TABLE]

Since $\Psi(\bar{x},\bar{\lambda}+t\eta)=0$ for any $t>0$ , we get from $\eta\in D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})(0)$ and (2.10) that $\eta\in\partial\theta_{{\cal K},B}(0)$ , and hence $-B\eta\in N_{{\cal K}}(\eta)$ by Theorem 2.1(ii). Choosing $t$ to be sufficiently small and employing the Reduction Lemma from [3, Lemma 2E.4] ensure the existence of a neighborhood $U$ of $(0,0)\in\mathbb{R}^{m}\times\mathbb{R}^{m}$ such that

[TABLE]

This in turn results in $\bar{z}-B\bar{\lambda}-tB\eta\in N_{Y}(\bar{\lambda}+t\eta)$ , which yields by (2.8) the inclusion $\bar{\lambda}+t\eta\in\partial\theta_{Y,B}(\bar{z})$ . Combining the latter with $\Psi(\bar{x},\bar{\lambda}+t\eta)=0$ results in $\bar{\lambda}+t\eta\in\Lambda(\bar{x})$ . However, we have $\eta\neq 0$ and thus $\bar{\lambda}+t\eta\neq\bar{\lambda}$ for any $t>0$ , which contradicts (i) and so verifies the claimed implication (i) $\Longrightarrow$ (iv).

To show further that the isolated calmness of $M_{\bar{x}}$ at $\big{(}(0,0),\bar{\lambda}\big{)}$ imposed in (iii) yields (ii), it suffices to check that $\Lambda(\bar{x})=\{\bar{\lambda}\}$ . Indeed, the assumed isolated calmness allows us to find a neighborhood $O$ of $\bar{\lambda}$ such that $M_{\bar{x}}(0,0)\cap O=\{\bar{\lambda}\}$ , which tells us by the convex-valuedness of $M_{\bar{x}}$ that $M_{\bar{x}}(0,0)=\{\bar{\lambda}\}$ . Combining the latter with $M_{\bar{x}}(0,0)=\Lambda(\bar{x})$ verifies (ii). Since (ii) obviously implies (i), we complete the proof of the theorem. $\square$

The next example reveals that the dual qualification condition (4.6) is essential for the uniqueness of Lagrange multipliers in Theorem 4.1.

Example 4.2

(nonuniqueness of Lagrange multipliers under failure of the dual qualification condition).* *Consider the variational system (1.3) with term (1.2), where $Y$ and $B$ are taken from (3.8), while $\Phi\colon\mathbb{R}^{2}\to\mathbb{R}^{2}$ is defined by $\Phi(x_{1},x_{2}):=(x_{1},0)$ and $f\colon\mathbb{R}^{2}\to\mathbb{R}^{2}$ is defined by $f(x)=0$ for all $x\in\mathbb{R}^{2}$ . It is shown in Example 3.4 that ${\rm dom}\,\theta_{Y,B}=\mathbb{R}\times\mathbb{R}_{-}$ . Letting $\bar{x}:=(0,0)$ , we get by the direct calculation that

[TABLE]

and so $\Lambda(\bar{x})=\{0\}\times\mathbb{R}_{+}$ , which is not a singleton.

Let us now show that the dual qualification condition fails in this setting. Having $\ker\nabla\Phi(\bar{x})^{*}=\{0\}\times\mathbb{R}$ and choosing $\bar{\lambda}:=(0,0)$ give us the critical cone

[TABLE]

and so $\partial\theta_{{\cal K},B}(0,0)=\{0\}\times\mathbb{R}_{+}$ . Combining it with (2.10), we arrive at

[TABLE]

which demonstrates the failure of the dual qualification condition (4.6).**

5 Characterizations of Noncritical Multipliers

In this section we derive major characterizations of noncritical multipliers for the piecewise linear-quadratic variational systems (1.3) in terms of semi-isolated calmness and error bounds.

Using the mapping $G$ from (4.4), define the solution map $S\colon\mathbb{R}^{n}\times\mathbb{R}^{m}\rightrightarrows\mathbb{R}^{n}\times\mathbb{R}^{m}$ for the canonical perturbation of system (1.3) by

[TABLE]

The property of semi-isolated calmness used in (5.3) was introduced in [15] for solution maps to general variational systems with a product structure of values as in (5.1). The reader can see that for such mappings the semi-isolated calmness of the variational systems of type (1.3) occupies an intermediate position between the calmness and isolated calmness.

In what follows we use the notation ${\rm dist}(x;\Omega)$ for the distance between a point $x\in\mathbb{R}^{n}$ and a set $\Omega\subset\mathbb{R}^{n}$ , $\mathbb{B}_{\varepsilon}(x)$ for the closed ball centered at $x\in\mathbb{R}^{n}$ with radius $\varepsilon>0$ , and

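$$P\varphi(x):=\operatornamewithlimits{arg\,min}_{w\in\mathbb{R}^{n}}\Big\{\varphi(w)+\frac{1}{2}\|w-x\|^{2}\Big\}\tag{5.2}$$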

for the proximal mapping $P\varphi\colon\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{n}$ associated with a function $\varphi\colon\mathbb{R}^{n}\to\overline{\mathbb{R}}$ .
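As a small illustration of the proximal mapping, the sketch below works with the data of Example 3.3 ($Y=\mathbb{R}^{m}_{+}$ and $B=I$). It assumes that (1.2) has the usual ENLP form $\theta_{Y,B}(z)=\sup_{y\in Y}\{\langle y,z\rangle-\frac{1}{2}\langle y,By\rangle\}$, which gives $\theta(z)=\frac{1}{2}\sum_{i}\max\{z_{i},0\}^{2}$, and that the proximal mapping is taken with unit parameter, i.e., $P\varphi(x)$ minimizes $\varphi(w)+\frac{1}{2}\|w-x\|^{2}$ over $w$. The function names are hypothetical. The last helper uses the identity $P\theta=(I+\partial\theta)^{-1}$, valid for convex $\theta$, to test the inclusion $\lambda\in\partial\theta(z)$ through the fixed-point equation $P\theta(z+\lambda)=z$.

```python
def theta(z):
    """theta(z) = sup_{y >= 0} { <y, z> - 0.5 * |y|^2 } = 0.5 * sum(max(z_i, 0)^2)."""
    return 0.5 * sum(max(zi, 0.0) ** 2 for zi in z)


def subgradient(z):
    """theta is differentiable here; its gradient is the projection onto R^m_+."""
    return [max(zi, 0.0) for zi in z]


def prox(x):
    """Proximal mapping with unit parameter: solve w + max(w, 0) = x componentwise."""
    return [xi / 2.0 if xi >= 0.0 else xi for xi in x]


def is_subgradient_via_prox(z, lam, tol=1e-12):
    """lam lies in the subdifferential of theta at z iff prox(z + lam) equals z."""
    p = prox([zi + li for zi, li in zip(z, lam)])
    return all(abs(pi - zi) <= tol for pi, zi in zip(p, z))


z = [1.0, -2.0]
good = is_subgradient_via_prox(z, subgradient(z))   # lam = (1, 0)
bad = is_subgradient_via_prox(z, [0.0, 0.0])        # lam = 0 is not a subgradient at z
```

With these inputs `good` is `True` and `bad` is `False`. Residuals of the form $\|z-P\theta(z+\lambda)\|$ therefore measure how far a pair $(z,\lambda)$ is from satisfying the subdifferential inclusion.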

Theorem 5.1

(major characterizations of noncritical multipliers in variational systems).* Let $(\bar{x},\bar{\lambda})$ be a solution to the variational system (1.3) with the piecewise linear-quadratic term (1.2). Then the following conditions are equivalent:*

(i)

The Lagrange multiplier $\bar{\lambda}$ is noncritical for (1.3) corresponding to $\bar{x}$ .

(ii)

There exist numbers $\varepsilon>0$ , $\ell\geq 0$ and neighborhoods $U$ of $0\in\mathbb{R}^{n}$ and $W$ of $0\in\mathbb{R}^{m}$ such that for any $(p_{1},p_{2})\in U\times W$ the following inclusion holds:

[TABLE]

(iii)

There exist numbers $\varepsilon>0$ and $\ell\geq 0$ such that the error bound estimate

[TABLE]

holds for any $(x,\lambda)\in\mathbb{B}_{\varepsilon}(\bar{x},\bar{\lambda})$ in terms of the inverse subdifferential of $\theta_{Y,B}$ .

(iv)

There are numbers $\varepsilon>0$ and $\ell\geq 0$ such that the error bound estimate

[TABLE]

holds for any $(x,\lambda)\in\mathbb{B}_{\varepsilon}(\bar{x},\bar{\lambda})$ in terms of the proximal mapping $P\theta_{Y,B}$ from (5.2).

Proof.

Let us first verify that (ii) implies (i). Theorem 3.2 reduces it to proving that the semi-isolated calmness property in (ii) ensures that for any solution $(\xi,\eta)\in\mathbb{R}^{n}\times\mathbb{R}^{m}$ to the system (3.4) we have $\xi=0$ . Define $(x_{t},\lambda_{t}):=(\bar{x}+t\xi,\bar{\lambda}+t\eta)$ for all $t>0$ and observe that

[TABLE]

whenever $t$ is sufficiently small. Letting $p_{1t}:=\Psi(x_{t},\lambda_{t})$ and using $\Psi(\bar{x},\bar{\lambda})=0$ , we deduce from the last equality above that $p_{1t}=o(t)$ . It follows in a similar way that

[TABLE]

Denoting further $z_{t}:=\Phi(\bar{x})+t\nabla\Phi(\bar{x})\xi$ implies that

[TABLE]

and therefore we get $p_{2t}=o(t)$ for $p_{2t}:=z_{t}-\Phi(x_{t})$ .

Let us now prove that $(x_{t},\lambda_{t})\in S(p_{1t},p_{2t})$ for $t>0$ sufficiently small. Since $p_{1t}=\Psi(x_{t},\lambda_{t})$ , we only need to verify by Theorem 2.1(ii) that

[TABLE]

To proceed with checking (5.5), deduce from (3.4) that

[TABLE]

Denoting $\lambda_{t}:=\bar{\lambda}+t\eta$ and remembering that $Y$ is a convex polyhedral set, we conclude that $\lambda_{t}\in Y$ for all $t>0$ sufficiently small. Furthermore, it follows from (3.4) that

[TABLE]

Thus there exist $\alpha\in\mathbb{R}$ and $w\in N_{Y}(\bar{\lambda})$ such that $\nabla\Phi(\bar{x})\xi-B\eta=\alpha(\bar{z}-B\bar{\lambda})+w$ . Using this together with (3.4) gives us the equalities

[TABLE]

Recall that $N_{Y}(\bar{\lambda})=\{\sum_{i\in I(\bar{\lambda})}\beta_{i}b_{i}|\;\beta_{i}\geq 0\}$ , where $I(\bar{\lambda})$ stands for the set of active constraints in $Y$ at $\bar{\lambda}$ . This allows us to deduce from the inclusion $w\in N_{Y}(\bar{\lambda})$ that there are numbers $\beta_{i}\geq 0$ for $i\in I(\bar{\lambda})$ such that $w=\sum_{i\in I(\bar{\lambda})}\beta_{i}b_{i}$ , and therefore

[TABLE]

Observe furthermore the relationships

[TABLE]

where $1+t\alpha>0$ for small $t>0$ . Since both $\bar{z}-B\bar{\lambda}$ and $w$ belong to $N_{Y}(\bar{\lambda})$ , it follows that $(1+t\alpha)(\bar{z}-B\bar{\lambda})+tw\in N_{Y}(\bar{\lambda})$ , and thus there is $\tau_{it}\geq 0$ for $i\in I(\bar{\lambda})$ such that $z_{t}-B\lambda_{t}=\sum_{i\in I(\bar{\lambda})}\tau_{it}b_{i}$ . Noting that $\langle z_{t}-B\lambda_{t},\eta\rangle=0$ and $\langle b_{i},\eta\rangle\leq 0$ for all $i\in I(\bar{\lambda})$ , we deduce that

[TABLE]

Let us now show that

[TABLE]

Suppose on the contrary that there is an index $i_{0}\in I(\bar{\lambda})\setminus I(\lambda_{t})$ for which $\tau_{i_{0}t}>0$ . This means that $\langle b_{i_{0}},\bar{\lambda}\rangle=\alpha_{i_{0}}$ and $\langle b_{i_{0}},\lambda_{t}\rangle<\alpha_{i_{0}}$ . Therefore

[TABLE]

which in turn yields $\langle b_{i_{0}},\eta\rangle<0$ , a contradiction with (5.6). Thus for all $i\in I(\bar{\lambda})\setminus I(\lambda_{t})$ we get $\tau_{it}=0$ and hence arrive at

[TABLE]

This verifies (5.5) and thus implies that $(x_{t},\lambda_{t})\in S(p_{1t},p_{2t})$ . It now follows from the assumed semi-isolated calmness (5.3) in (ii) that

[TABLE]

which results in $\xi=0$ by letting $t\downarrow 0$ . This tells us that $\bar{\lambda}$ is noncritical and hence justifies the implication (ii) $\implies$ (i) of the theorem.

Next we prove the opposite implication (i) $\Longrightarrow$ (ii). Assuming that the multiplier noncriticality in (i) holds, let us first verify the following statement.

Claim: There exist numbers $\varepsilon>0$ and $\ell\geq 0$ and neighborhoods $U$ of $0\in\mathbb{R}^{n}$ and $W$ of $0\in\mathbb{R}^{m}$ such that for any $(p_{1},p_{2})\in U\times W$ and $(x_{p_{1}p_{2}},\lambda_{p_{1}p_{2}})\in S(p_{1},p_{2})\cap\mathbb{B}_{\varepsilon}(\bar{x},\bar{\lambda})$ we have

[TABLE]

To justify this claim, suppose on the contrary that (5.7) fails and thus for any $k\in\mathbb{N}$ find $(p_{1k},p_{2k})\in\mathbb{B}_{1/k}(0)\times\mathbb{B}_{1/k}(0)$ and $(x_{k},\lambda_{k})\in S(p_{1k},p_{2k})\cap\mathbb{B}_{1/k}(\bar{x},\bar{\lambda})$ such that

[TABLE]

Denote $t_{k}:=\|x_{k}-\bar{x}\|$ and deduce from the convergence above that $p_{1k}=o(t_{k})$ and $p_{2k}=o(t_{k})$ . Since $\theta_{Y,B}$ is a convex piecewise linear-quadratic function, it follows from the proof of [18, Theorem 11.14(b)] that $\mathrm{gph}\,\partial\theta_{Y,B}$ is a union of finitely many convex polyhedral sets. This together with [3, Theorem 3D.1] and $\bar{z}:=\Phi(\bar{x})\in{\rm dom}\,\partial\theta_{Y,B}$ ensures the existence of a number $\ell^{\prime}\geq 0$ and a neighborhood $O$ of $\bar{z}$ such that for all $z\in O\cap{\rm dom}\,\partial\theta_{Y,B}$ we have

[TABLE]

Suppose without loss of generality that $z_{k}:=p_{2k}+\Phi(x_{k})\in O$ for all $k\in\mathbb{N}$ . Since $\lambda_{k}\in\partial\theta_{Y,B}(z_{k})$ , there exist $\lambda\in\partial\theta_{Y,B}(\bar{z})$ and $b\in\mathbb{B}$ such that $\lambda_{k}=\lambda+\ell^{\prime}\left\lVert z_{k}-\bar{z}\right\rVert b$ . Using this along with the classical Hoffman lemma, we find a number $M\geq 0$ such that

[TABLE]

where $\rho$ is a common calmness constant for the mappings $f$ , $\Phi$ , and $\nabla\Phi$ at $\bar{x}$ . Since $\Lambda(\bar{x})$ is closed and convex, for each $k\in\mathbb{N}$ there exists a vector $\mu_{k}\in\Lambda(\bar{x})$ for which

[TABLE]

Thus we can assume without loss of generality that

[TABLE]

By passing to a subsequence if necessary, it follows that

[TABLE]

Due to $\mu_{k}\in\Lambda(\bar{x})$ and the discussions above we get the equalities

[TABLE]

which lead us as $k\to\infty$ to the limiting condition

[TABLE]

It further follows from $(x_{k},\lambda_{k})\in S(p_{1k},p_{2k})$ that $\lambda_{k}\in\partial\theta_{Y,B}(z_{k})$ , which is equivalent to the inclusion $z_{k}-B\lambda_{k}\in N_{Y}(\lambda_{k})$ for each $k\in\mathbb{N}$ by Theorem 2.1(ii). Since $Y$ is a convex polyhedral set, the Reduction Lemma from [3, Lemma 2E.4] tells us that

[TABLE]

for all $k\in\mathbb{N}$ sufficiently large, where ${\cal K}$ is the critical cone to $Y$ at $\bar{\lambda}$ for $\bar{z}-B\bar{\lambda}$ taken from Theorem 2.1(iii). This along with Theorem 2.1(iii) brings us to the conclusions

[TABLE]

which imply in turn that $\displaystyle\frac{z_{k}-\bar{z}}{t_{k}}\in{\rm dom}\,\partial\theta_{{\cal K},B}$ . Since ${\cal K}$ is a convex polyhedral set, it follows from Theorem 2.1(i) that $\theta_{{\cal K},B}$ is a convex piecewise linear-quadratic function. Thus [18, Proposition 10.21] tells us that ${\rm dom}\,\partial\theta_{{\cal K},B}={\rm dom}\,\theta_{{\cal K},B}$ . Employing Theorem 2.1(i) ensures that ${\rm dom}\,\theta_{{\cal K},B}$ is a closed set. Combining it with the convergence $\displaystyle\frac{z_{k}-\bar{z}}{t_{k}}\rightarrow\nabla\Phi(\bar{x})\xi$ as $k\rightarrow\infty$ yields

[TABLE]

Since $\mu_{k}\in\Lambda(\bar{x})$ , we get $\mu_{k}\in\partial\theta_{Y,B}(\bar{z})$ and, proceeding similarly to the proof of (5.14), arrive at

[TABLE]

Furthermore, it follows from $\bar{\lambda}\in\Lambda(\bar{x})$ and $\mu_{k}\in\Lambda(\bar{x})$ that $\bar{\lambda}-\mu_{k}\in\ker\nabla\Phi(\bar{x})^{*}$ . Using (5.15) and arguing as in the proof of (5.8), we find $\ell^{\prime}\geq 0$ and a neighborhood $O$ of $\nabla\Phi(\bar{x})\xi$ such that

[TABLE]

for all $u\in O\cap{\rm dom}\,\partial\theta_{{\cal K},B}$ . Employing the latter together with (5.14) leads us to the relationships

[TABLE]

This allows us to find, for all $k\in\mathbb{N}$ sufficiently large, a $b_{k}\in\mathbb{B}$ such that

[TABLE]

We can see that the left-hand side of inclusion (5.16) converges as $k\to\infty$ to the vector $\widetilde{\eta}$ . On the other hand, the right-hand side of this inclusion is the sum of two convex polyhedral sets, and so is closed. This shows that $\widetilde{\eta}$ satisfies

[TABLE]

Thus we get vectors $\eta\in\partial\theta_{{\cal K},B}(\nabla\Phi(\bar{x})\xi)$ and $\eta^{\prime}\in\ker\nabla\Phi(\bar{x})^{*}\cap\partial\theta_{{\cal K},B}(0)$ , which provide the representation $\widetilde{\eta}=\eta-\eta^{\prime}$ . It follows from the relationship (2.10) in Theorem 2.1(iii) that $\eta\in D\partial\theta_{Y,B}(\bar{z},\bar{\lambda})(\nabla\Phi(\bar{x})\xi)$ . Furthermore, employing (5.13) tells us that

[TABLE]

which contradicts the noncriticality of $\bar{\lambda}$ due to $\xi\neq 0$ and thus completes the proof of the claim.

To finalize verifying implication (i) $\Longrightarrow$ (ii) in the theorem, take the neighborhoods $U$ and $W$ from the above claim and shrink them if necessary for the subsequent procedure. Using the claim and arguing similarly to the proof of the conditions in (5.12) give us a constant $\ell^{\prime}\geq 0$ such that for any $(p_{1},p_{2})\in U\times W$ and any $(x_{p_{1}p_{2}},\lambda_{p_{1}p_{2}})\in S(p_{1},p_{2})\cap\mathbb{B}_{\varepsilon}(\bar{x},\bar{\lambda})$ we have

[TABLE]

Combining it with (5.7) allows us to find $\ell\geq 0$ such that for any $(p_{1},p_{2})\in U\times W$ we have

[TABLE]

whenever $(x_{p_{1}p_{2}},\lambda_{p_{1}p_{2}})\in S(p_{1},p_{2})\cap\mathbb{B}_{\varepsilon}(\bar{x},\bar{\lambda})$ . This clearly justifies the semi-isolated calmness property (5.3) and thus finishes the proof of implication (i) $\implies$ (ii).

The equivalence between (ii) and (iii) can be verified similarly to the corresponding arguments in the proof of [15, Theorem 4.1], and so we omit them here. Thus it remains to establish the equivalence between assertions (ii) and (iv) of the theorem to complete its proof.

Let us start with checking implication (iv) $\Longrightarrow$ (ii). Picking $(p_{1},p_{2})\in\mathbb{B}_{\varepsilon}(0,0)$ and $(x,\lambda)\in S(p_{1},p_{2})\cap\mathbb{B}_{\varepsilon}(\bar{x},\bar{\lambda})$ with $\varepsilon$ and $\ell$ taken from (iv), we get from the definition of $S$ that

[TABLE]

It follows from [18, Proposition 12.19] due to the convexity of $\theta_{Y,B}$ that $P\theta_{Y,B}=(I+\partial\theta_{Y,B})^{-1}$ , and hence the second inclusion in (5.19) is equivalent to the equality $P\theta_{Y,B}(\lambda+\Phi(x)+p_{2})=\Phi(x)+p_{2}$ . Appealing now to (5.4) brings us to the estimates

[TABLE]

which readily justify the assertion in (ii).

Finally, we verify the converse implication (ii) $\Longrightarrow$ (iv). To proceed, pick $(x,\lambda)\in\mathbb{B}_{\varepsilon/2}(\bar{x},\bar{\lambda})$ , where $\varepsilon$ is taken from (ii). Define the vectors

[TABLE]

Since $\Phi$ and $\nabla\Phi$ are continuous at $\bar{x}$ and since $P\theta_{Y,B}$ is Lipschitz continuous, we assume without loss of generality that $(p_{1},p_{2})\in\mathbb{B}_{\varepsilon/2}(0,0)$ and $\mathbb{B}_{\varepsilon/2}(0,0)\subset U\times W$ , where $U$ and $W$ come from (ii). It follows from (5.20) that $(x,\lambda-p_{2})\in S(p_{1},p_{2})\cap\mathbb{B}_{\varepsilon}(\bar{x},\bar{\lambda})$ . Since $\nabla\Phi$ is continuous at $\bar{x}$ , we can assume without loss of generality that for some $\rho>0$ we have $\|\nabla\Phi(x)\|\leq\rho$ for all $x\in\mathbb{B}_{\varepsilon}(\bar{x})$ . So we deduce from (5.3) that

[TABLE]

Recall that the distance function ${\rm dist}\big{(}\cdot;\Lambda(\bar{x})\big{)}$ is Lipschitz continuous; so we have

[TABLE]

which in combination with the obtained inequalities leads us to

[TABLE]

This verifies (iv) and completes the proof of the theorem. $\square$

To conclude this section, let us mention a connection between the obtained characterizations of noncritical multipliers for variational systems (1.3) and the uniqueness of Lagrange multipliers therein, which is not assumed in Theorem 5.1. Indeed, looking more closely at the proof of the theorem reveals that the second term in (5.17) is actually undesired, since it complicates the proof. But, as follows from Theorem 4.1, this term disappears (reduces to $\{0\}$ ) if the set of Lagrange multipliers $\Lambda(\bar{x})$ is a singleton. This phenomenon has been recently observed in [16] for the case of constrained optimization problems.

6 Noncriticality in Extended Nonlinear Programming

Here we concentrate on problems of composite optimization given by (1.1), where $\theta=\theta_{Y,B}$ is taken from (1.2). It means that we are dealing with the class of ENLPs discussed in Section 1. Starting with this section we assume that $\varphi_{0}$ and $\Phi$ are not just twice differentiable, but belong to the class of ${\cal C}^{2}$ -smooth mappings around the points in question.

Define the Lagrangian of (1.1) by

[TABLE]

and observe that the KKT system for (1.1) is written as

[TABLE]

Thus (6.2) is a particular case of (1.3) with $\Psi:=\nabla_{x}L$ . Denoting

[TABLE]

the corresponding set of Lagrange multipliers, we have Definition 3.1 of multiplier criticality as well as all the above results being specified for the KKT system (6.2).

On the other hand, there are some phenomena concerning critical and noncritical Lagrange multipliers that distinguish KKT systems in optimization from general variational systems of type (1.3). We consider them in this and two subsequent sections.

The following theorem provides a certain second-order sufficient condition ensuring simultaneously the strict minimality of a feasible solution to ENLP (1.1) and the noncriticality of the corresponding Lagrange multiplier. In its formulation we use the critical cone ${\cal K}$ defined in Theorem 2.1(iii) as well as the notation ${\rm rge\,}A$ for the range of a linear operator $A$ . Note that the existence of Lagrange multipliers corresponding to $\bar{x}$ in (1.1), which is assumed below, is ensured by the first-order qualification condition (7.3) from Lemma 7.1.

Theorem 6.1

(second-order sufficient condition for strict local minimizers and noncritical multipliers in ENLPs).* Let $(\bar{x},\bar{\lambda})$ be a solution to KKT system (6.2). Assume further that the second-order sufficient condition*

[TABLE]

holds. Then there exist numbers $\varepsilon>0$ and $\ell\geq 0$ such that the quadratic lower estimate

[TABLE]

holds for the function $\varphi$ taken from (1.1). Furthermore, the Lagrange multiplier $\bar{\lambda}$ satisfying (6.4) is noncritical for the KKT system (6.2) corresponding to $\bar{x}$ .

Proof.

Define the family of second-order difference quotients for $\varphi$ at $\bar{x}$ for $\bar{y}\in\mathbb{R}^{n}$ by

[TABLE]

Set $\bar{y}:=0\in\mathbb{R}^{n}$ and deduce from $\bar{\lambda}\in\Lambda_{\mathrm{com}}(\bar{x})$ that $\bar{y}=\nabla\varphi_{0}(\bar{x})+\nabla\Phi(\bar{x})^{*}\bar{\lambda}$ . Then for any $w\in\mathbb{R}^{n}$ we get the equalities

[TABLE]

where $w_{t}:=\nabla\Phi(\bar{x})w+\frac{t}{2}\langle\nabla^{2}\Phi(\bar{x})w,w\rangle+\frac{o(t^{2})}{t}$ . It implies together with (2.3) and (2.9) that

[TABLE]

Theorem 2.1(i) tells us that ${\rm dom}\,\theta_{{\cal K},B}=({\cal K}\cap\ker B)^{*}={\cal K}^{*}+{\rm rge\,}B$ . This means that the inclusion $\nabla\Phi(\bar{x})w\in{\cal K}^{*}+{\rm rge\,}B$ amounts to $\nabla\Phi(\bar{x})w\in{\rm dom}\,\theta_{{\cal K},B}$ . Employing the second-order sufficient condition (6.4) together with (6.10) ensures that ${\mathrm{d}}^{2}\varphi(\bar{x},0)(w)>0$ for all such vectors $w\in\mathbb{R}^{n}\setminus\{0\}$ . Otherwise, we have $\nabla\Phi(\bar{x})w\notin{\rm dom}\,\theta_{{\cal K},B}$ , and hence $\theta_{{\cal K},B}(\nabla\Phi(\bar{x})w)=\infty$ . This along with (6.10) results in

[TABLE]

Combining all the above brings us to

[TABLE]

Appealing now to [18, Theorem 13.24] guarantees the existence of numbers $\varepsilon>0$ and $\ell\geq 0$ for which the quadratic estimate (6.5) holds and so ensures that $\bar{x}$ is a strict local minimizer for $\varphi$ .

Finally, we verify that a multiplier $\bar{\lambda}$ satisfying the second-order condition (6.4) is noncritical for (6.2). To see it, pick $(\xi,\eta)\in\mathbb{R}^{n}\times\mathbb{R}^{m}$ fulfilling (3.4) with $\Psi=\nabla_{x}L$ , i.e., so that

[TABLE]

It follows from $\nabla\Phi(\bar{x})\xi-B\eta\in{\cal K}^{*}$ and the discussion above that $\nabla\Phi(\bar{x})\xi\in{\rm dom}\,\theta_{{\cal K},B}$ and that $\eta\in\partial\theta_{{\cal K},B}(\nabla\Phi(\bar{x})\xi)$ . Employing the subdifferential expression in (2.8) gives us

[TABLE]

In this way we arrive at the equalities

[TABLE]

which yield $\xi=0$ due to (6.4) as well as to $\nabla\Phi(\bar{x})\xi\in{\rm dom}\,\theta_{{\cal K},B}={\cal K}^{*}+{\rm rge\,}B$ . This shows that $\bar{\lambda}$ is a noncritical multiplier of (6.2) corresponding to $\bar{x}$ and thus completes the proof. $\square$

The next example, which revisits Example 3.3 in the ENLP framework, illustrates the possibility to use the second-order sufficient condition (6.4) to justify the strict optimality of a feasible solution to (1.1) and the noncriticality of the corresponding Lagrange multiplier.

Example 6.2

(multiplier noncriticality via the second-order sufficient condition).* *Consider the ENLP from (1.1), where $m=n$ , $\varphi_{0}(x):=x_{1}^{2}+\ldots+x_{n}^{2}$ and $\Phi(x):=x$ , and where $Y$ and $B$ are taken from Example 3.3. Then we have

[TABLE]

Let us check that condition (6.4) holds when $\bar{x}=0$ and $\bar{\lambda}=0$ , which confirms by Theorem 6.1 that $\bar{x}$ is a strict minimizer for this ENLP and $\bar{\lambda}$ is the corresponding noncritical multiplier. Indeed, it follows from Example 3.3 that $\bar{\lambda}\in\partial\theta(\bar{z})$ , where $\bar{z}:=\Phi(\bar{x})=0$ . By the structure of $L(x,\lambda)$ we have the expressions

[TABLE]

Then $\nabla_{x}L(\bar{x},\bar{\lambda})=0$ and hence $\bar{\lambda}\in\Lambda_{\mathrm{com}}(\bar{x})$ . Since ${\rm rge\,}B=\mathbb{R}^{n}$ , it follows that $\{w|\;\nabla\Phi(\bar{x})w\in\mathcal{K}^{*}+{\rm rge\,}B\}=\mathbb{R}^{n}$ , and therefore the sufficient condition in Theorem 6.1 reads as

[TABLE]

which is equivalently presented by

[TABLE]

Furthermore, Example 3.3 tells us that $\mathcal{K}=\mathbb{R}^{n}_{+}\cap\{\bar{z}\}^{\perp}$ and so $\mathcal{K}=\mathbb{R}^{n}_{+}=Y$ . Combining this with (6.11), the sufficient condition (6.4) now becomes

[TABLE]

Since $\theta_{Y,B}$ from (6.11) is always nonnegative, condition (6.13) holds, and thus it confirms the strict minimality of $\bar{x}$ and the noncriticality of $\bar{\lambda}$ .**
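The positivity claimed in Example 6.2 can be spot-checked numerically. The sketch below is a hedged illustration: it assumes that the left-hand side of the second-order condition has the form $\langle\nabla^{2}_{xx}L(\bar{x},\bar{\lambda})w,w\rangle+2\theta_{{\cal K},B}(\nabla\Phi(\bar{x})w)$, uses $\nabla^{2}_{xx}L(\bar{x},\bar{\lambda})=2I$ and $\nabla\Phi(\bar{x})=I$ for the data of the example, and takes $\theta_{{\cal K},B}(u)=\frac{1}{2}\sum_{i}\max\{u_{i},0\}^{2}$ for ${\cal K}=\mathbb{R}^{n}_{+}$ and $B=I$. The function names and the test directions are hypothetical.

```python
def theta_K(u):
    """theta_{K,B}(u) for K = R^n_+ and B = I, i.e. 0.5 * sum(max(u_i, 0)^2)."""
    return 0.5 * sum(max(ui, 0.0) ** 2 for ui in u)


def second_order_form(w):
    """Assumed left-hand side of the second-order condition for Example 6.2.

    The Hessian of L in x is 2*I (phi_0 = |x|^2, Phi linear) and grad Phi = I.
    """
    return 2.0 * sum(wi * wi for wi in w) + 2.0 * theta_K(w)


directions = [[1.0, 0.0], [-1.0, 0.0], [0.5, -0.5], [-3.0, -4.0]]
values = [second_order_form(w) for w in directions]
```

Every entry of `values` is positive, and in general the form is bounded below by $2\|w\|^{2}$ because $\theta_{{\cal K},B}$ is nonnegative, which is the argument used in the example.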

7 Critical Multipliers and Full Stability of Minimizers in ENLPs

This section also deals with constrained minimization problems of the ENLP type and delivers an important message for both theoretical and numerical aspects of optimization. As discussed in Section 1, critical multipliers are particularly responsible for slow convergence of major primal-dual algorithms of optimization and are desired to be excluded for a given local minimizer. It is natural to suppose that seeking not arbitrary local minimizers but only “nice” ones, stable in some sense, allows us to rule out the appearance of critical multipliers associated with such local optimal solutions. It is conjectured in [10] that fully stable local minimizers in the sense of [7] are appropriate candidates for excluding critical multipliers. This conjecture is affirmatively verified in [14] for problems (1.1) with $\theta=\theta_{Y,B}$ where $B=0$ . Now we are able to extend this result to the general case of (1.2) with an arbitrary symmetric positive-semidefinite matrix $B$ .

To proceed, we first specify the definition of fully stable local minimizers from [7] for problems (1.1) with term (1.2). Consider their canonically perturbed version described by

[TABLE]

with parameter pairs $(p_{1},p_{2})\in\mathbb{R}^{n}\times\mathbb{R}^{m}$. Fix $\gamma>0$ and $(\bar{x},\bar{p}_{1},\bar{p}_{2})$ with $\Phi(\bar{x})+\bar{p}_{2}\in{\rm dom}\,\theta$ and then define the parameter-dependent optimal value function for (7.1) by

[TABLE]

together with the parameterized set of optimal solutions to (7.1) given by

[TABLE]

with the convention that $\operatornamewithlimits{arg\,min}:=\emptyset$ when the expression under minimization in (7.2) is $\infty$ . We say that $\bar{x}$ is a fully stable local optimal solution to problem (1.1) if there exist a number $\gamma>0$ and neighborhoods $U$ of $\bar{p}_{1}$ and $W$ of $\bar{p}_{2}$ such that the mapping $(p_{1},p_{2})\mapsto M_{\gamma}(p_{1},p_{2})$ is single-valued and Lipschitz continuous with $M_{\gamma}(\bar{p}_{1},\bar{p}_{2})=\{\bar{x}\}$ and that the function $(p_{1},p_{2})\mapsto m_{\gamma}(p_{1},p_{2})$ is likewise Lipschitz continuous on $U\times W$ .
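To make the definition concrete, here is a small sketch on toy data that does not come from the paper. It assumes that the canonical perturbation (7.1) has the tilt-and-shift form of [7], namely minimizing $\varphi_{0}(x)+\theta(\Phi(x)+p_{2})-\langle p_{1},x\rangle$, and it takes the hypothetical one-dimensional data $\varphi_{0}(x)=\frac{1}{2}x^{2}$, $\Phi(x)=x$, and $\theta(z)=\frac{1}{2}\max\{z,0\}^{2}$. The perturbed objective is then strongly convex, so the localization by $\gamma$ plays no role and the minimizer is available in closed form. The second function only samples the Lipschitz ratio of the solution map; it is an empirical check, not a proof.

```python
import math
import random


def minimizer(p1, p2):
    """Unique minimizer of 0.5*x**2 + theta(x + p2) - p1*x over the real line,
    where theta(z) = 0.5 * max(z, 0)**2 (toy data, not from the paper).

    The objective is strongly convex and differentiable, so the minimizer
    solves x + max(x + p2, 0) - p1 = 0, which splits into two linear cases.
    """
    if p1 + p2 <= 0.0:
        return p1               # theta is inactive at the solution
    return 0.5 * (p1 - p2)      # theta is active at the solution


def max_lipschitz_ratio(samples=2000, radius=1.0, seed=0):
    """Largest observed |M(p) - M(q)| / ||p - q|| over random parameter pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        p = (rng.uniform(-radius, radius), rng.uniform(-radius, radius))
        q = (rng.uniform(-radius, radius), rng.uniform(-radius, radius))
        dist = math.hypot(p[0] - q[0], p[1] - q[1])
        if dist > 0.0:
            worst = max(worst, abs(minimizer(*p) - minimizer(*q)) / dist)
    return worst
```

In this toy case the solution map is piecewise linear and continuous with gradients $(1,0)$ and $(\frac{1}{2},-\frac{1}{2})$ on the two pieces, so the sampled ratio should not exceed $1$ in the Euclidean norm, which is consistent with $\bar{x}=0$ being fully stable at $(\bar{p}_{1},\bar{p}_{2})=(0,0)$.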

Note that [7, Proposition 3.5] deduces the local Lipschitz continuity of $m_{\gamma}$ from the basic constraint qualification (7.3) formulated in the following lemma, which is obtained in [18, Exercise 13.26]. The second-order necessary condition presented below can be viewed as a “no-gap” version of the second-order sufficient one used in Theorem 6.1 with the notation therein.

Lemma 7.1

(second-order necessary optimality condition for composite optimization problems). Let $\bar{x}$ be a local optimal solution to problem (1.1) with $\theta=\theta_{Y,B}$ taken from (1.2), and let the basic constraint qualification

$$N_{{\rm dom}\,\theta_{Y,B}}\big(\Phi(\bar{x})\big)\cap\ker\nabla\Phi(\bar{x})^{*}=\{0\}\tag{7.3}$$

be satisfied, and so $\Lambda_{\mathrm{com}}(\bar{x})\neq\emptyset$. Then we have the second-order necessary optimality condition

[TABLE]

valid for all $w\in\mathbb{R}^{n}$ with $\nabla\Phi(\bar{x})w\in{\cal K}^{*}+{\rm rge\,}B$ .

Now we are ready to establish the aforementioned result in the general ENLP setting.

Theorem 7.2

(excluding critical multipliers by full stability of local minimizers). Let $\bar{x}$ be a fully stable local optimal solution to problem (1.1), and let $\theta$ be taken from (1.2). Then the Lagrange multiplier set $\Lambda_{\mathrm{com}}(\bar{x})$ in (6.3) is nonempty and does not include critical multipliers.

Proof.

First we show that the full stability of $\bar{x}$ ensures the validity of the qualification condition (7.3). Indeed, pick any $\eta\in N_{\scriptsize{{\rm dom}\,\theta_{Y,B}}}(\Phi(\bar{x}))\cap\ker\nabla\Phi(\bar{x})^{*}$ . Select $p_{1}=\bar{p}_{1}:=0$ and $p_{2}:=t\eta$ as $t\downarrow 0$ . It follows from the full stability of $\bar{x}$ that there exist a Lipschitz constant $\ell\geq 0$ and the unique solution $x_{p_{1}p_{2}}$ to problem (7.1) such that

[TABLE]

Since $\Phi(x_{p_{1}p_{2}})+p_{2}\in{\rm dom}\,\theta_{Y,B}$ and $\eta\in N_{\scriptsize{{\rm dom}\,\theta_{Y,B}}}(\Phi(\bar{x}))$ , we get $\langle\eta,\Phi(x_{p_{1}p_{2}})+p_{2}-\Phi(\bar{x})\rangle\leq 0$ . This gives us the relationships

[TABLE]

Using estimate (7.5) and letting $t\downarrow 0$ lead to $\eta=0$ . Thus the basic constraint qualification (7.3) is satisfied, which ensures that $\Lambda_{\mathrm{com}}(\bar{x})\neq\emptyset$ .

Next we pick any $\bar{\lambda}\in\Lambda_{\mathrm{com}}(\bar{x})$ and show that it is noncritical for the unperturbed KKT system (6.2) corresponding to $\bar{x}$ . Consider the KKT system for the perturbed problem (7.1) that can be written as

[TABLE]

Let $S_{KKT}\colon\mathbb{R}^{n}\times\mathbb{R}^{m}\rightrightarrows\mathbb{R}^{n}\times\mathbb{R}^{m}$ be the solution map to (7.6) given by

[TABLE]

Employing Theorem 5.1, we only need to prove that there exist numbers $\varepsilon>0$ and $\ell\geq 0$ as well as neighborhoods $U$ of $0\in\mathbb{R}^{n}$ and $W$ of $0\in\mathbb{R}^{m}$ such that for any $(p_{1},p_{2})\in U\times W$ and any $(x_{p_{1}p_{2}},\lambda_{p_{1}p_{2}})\in S_{KKT}(p_{1},p_{2})\cap(\mathbb{B}_{\varepsilon}(\bar{x})\times\mathbb{B}_{\varepsilon}(\bar{\lambda}))$, estimate (5.3) holds with $\Lambda(\bar{x})$ replaced by the set of Lagrange multipliers $\Lambda_{\mathrm{com}}(\bar{x})$ taken from (6.3).

To this end we deduce from the full stability of $\bar{x}$ in (7.1) with $(\bar{p}_{1},\bar{p}_{2})=(0,0)$ due to the result of [14, Proposition 6.1] that there exist neighborhoods $\widetilde{U}\times\widetilde{W}$ of $(0,0)$ and $\widetilde{V}$ of $\bar{x}$ for which the set-valued mapping

[TABLE]

admits a Lipschitzian single-valued graphical localization on $\widetilde{U}\times\widetilde{W}\times\widetilde{V}$ . This means that there exists a Lipschitzian single-valued mapping $g\colon\widetilde{U}\times\widetilde{W}\mapsto\widetilde{V}$ such that $(\mathrm{gph}\,Q)\cap(\widetilde{U}\times\widetilde{W}\times\widetilde{V})=\mathrm{gph}\,g$ . Denote $U:=\widetilde{U}$ , $W:=\widetilde{W}$ and take $\varepsilon>0$ so small that $\mathbb{B}_{\varepsilon}(\bar{x})\subset\widetilde{V}$ . The Lipschitzian single-valued graphical localization property of $Q$ allows us to find a constant $\ell\geq 0$ such that for any $(p_{1},p_{2})\in U\times W$ and any $(x_{p_{1}p_{2}},\lambda_{p_{1}p_{2}})\in S_{KKT}(p_{1},p_{2})\cap\big{(}\mathbb{B}_{\varepsilon}(\bar{x})\times\mathbb{B}_{\varepsilon}(\bar{\lambda})\big{)}$ we have the inclusion $x_{p_{1}p_{2}}\in Q(p_{1},p_{2})$ , and hence

[TABLE]

Using now the error bound estimate (5.18) from the proof of Theorem 5.1 with $\Lambda(\bar{x})$ replaced by $\Lambda_{\mathrm{com}}(\bar{x})$ and adjusting $\varepsilon$ if necessary, we obtain the semi-isolated calmness property (5.3), which is equivalent to the noncriticality of $\bar{\lambda}$, chosen arbitrarily from the Lagrange multiplier set $\Lambda_{\mathrm{com}}(\bar{x})$. This completes the proof of the theorem. $\square$

The result of Theorem 7.2 calls for deriving verifiable conditions for full stability of local minimizers to (1.1) expressed entirely via the problem data and the given minimizer. Such conditions allow us to efficiently exclude slow convergence of primal-dual algorithms when seeking fully stable minimizers based on the initial data. Some characterizations of full stability of local minimizers for ENLPs of type (1.1) are obtained in [14, Theorem 7.3] under rather strong assumptions. Relaxing these assumptions is a challenging goal of our future research.

8 Noncriticality and Lipschitzian Stability of Solutions to ENLPs

In this section we use the machinery developed above to investigate other notions of Lipschitzian stability, which turn out to be related to noncriticality of multipliers for ENLPs. The following theorem provides characterizations of both isolated calmness and robust isolated calmness properties of the KKT solution map (7.7) associated with ENLP (1.1) in terms of the second-order sufficient condition (6.4) as well as noncriticality and uniqueness of Lagrange multipliers.

Theorem 8.1

(characterizations of robust isolated calmness of solution maps). Let $\bar{x}$ be a feasible solution to ENLP (1.1) with $\theta$ taken from (1.2), and let $\bar{\lambda}\in\Lambda_{\mathrm{com}}(\bar{x})$ be a corresponding Lagrange multiplier from (6.3). The following assertions are equivalent:

(i)

The solution map $S_{KKT}$ from (7.7) is robustly isolatedly calm at the point $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}\in\mathbb{R}^{n+m}\times\mathbb{R}^{n+m}$ , and $\bar{x}$ is a local optimal solution to (1.1).

(ii)

The second-order sufficient condition (6.4) holds, and $\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ .

(iii)

$\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ , $\bar{x}$ is a local optimal solution to (1.1), and $\bar{\lambda}$ is a noncritical multiplier for (1.3) with $\Psi=\nabla_{x}L$ that is associated with the optimal solution $\bar{x}$ .

(iv)

$S_{KKT}$ is isolatedly calm at $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$, and $\bar{x}$ is a local optimal solution to (1.1).

Proof.

The outline of the proof is as follows. We sequentially verify implications (ii) $\Longrightarrow$ (iii), (iii) $\Longrightarrow$ (iv), (iv) $\Longrightarrow$ (iii), (iii) $\Longrightarrow$ (ii), and (i) $\iff$ (iv).

To prove (ii) $\Longrightarrow$ (iii), assume the validity of (6.4) and that $\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ . Then Theorem 6.1 tells us that $\bar{x}$ is a strict local minimizer of (1.1) and that $\bar{\lambda}$ is a noncritical multiplier of (1.3) with $\Psi=\nabla_{x}L$ corresponding to $\bar{x}$ , and thus (iii) is satisfied.

Suppose next that all the conditions in (iii) hold. Since $\bar{\lambda}$ is noncritical, we derive the semi-isolated calmness of $S_{KKT}$ at $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$ . This together with $\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ results in the existence of a number $\ell\geq 0$ as well as neighborhoods $U$ of $(0,0)$ and $V$ of $(\bar{x},\bar{\lambda})$ such that

[TABLE]

Thus $S_{KKT}$ enjoys the isolated calmness property at $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$, and we arrive at (iv).

To verify the opposite implication (iv) $\Longrightarrow$ (iii), let us show that the isolated calmness of $S_{KKT}$ at $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$ in (iv) yields $\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ . Indeed, suppose on the contrary that $\Lambda_{\mathrm{com}}(\bar{x})$ is not a singleton. Then there exists $\widehat{\lambda}\in\Lambda_{\mathrm{com}}(\bar{x})$ with $\widehat{\lambda}\neq\bar{\lambda}$ . Since the set $\Lambda_{\mathrm{com}}(\bar{x})$ is convex, every point of the line segment connecting $\bar{\lambda}$ and $\widehat{\lambda}$ belongs to $\Lambda_{\mathrm{com}}(\bar{x})$ . The isolated calmness of $S_{KKT}$ at $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$ amounts to (8.1), and hence we can find $\lambda^{\prime}\neq\bar{\lambda}$ with $\lambda^{\prime}\in\Lambda_{\mathrm{com}}(\bar{x})$ and such that $\lambda^{\prime}$ is sufficiently close to $\bar{\lambda}$ , i.e., $(\bar{x},\lambda^{\prime})\in V$ . Then it follows from (8.1) that

[TABLE]

which yields $\lambda^{\prime}=\bar{\lambda}$ , a contradiction ensuring that $\Lambda_{\mathrm{com}}(\bar{x})$ is a singleton. Theorem 5.1 tells us that $\bar{\lambda}$ is a noncritical multiplier of (1.3) corresponding to $\bar{x}$ , and thus (iii) holds.
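The choice of $\lambda^{\prime}$ in the argument above can be made explicit. The following sketch assumes that (8.1) has the standard isolated-calmness form $S_{KKT}(p_{1},p_{2})\cap V\subset\{(\bar{x},\bar{\lambda})\}+\ell\|(p_{1},p_{2})\|\mathbb{B}$ for all $(p_{1},p_{2})\in U$; the symbol $\lambda_{t}$ is introduced here only for illustration.

$$
\begin{aligned}
&\lambda_{t}:=(1-t)\bar{\lambda}+t\widehat{\lambda}\in\Lambda_{\mathrm{com}}(\bar{x}),\qquad \|\lambda_{t}-\bar{\lambda}\|=t\,\|\widehat{\lambda}-\bar{\lambda}\|>0\quad\text{for all }t\in(0,1],\\
&(\bar{x},\lambda_{t})\in S_{KKT}(0,0)\cap V\ \text{ for all small }t>0\ \Longrightarrow\ \|\lambda_{t}-\bar{\lambda}\|\leq\ell\,\|(0,0)\|=0.
\end{aligned}
$$

The two lines are incompatible, so taking $\lambda^{\prime}:=\lambda_{t}$ with $t>0$ sufficiently small produces the stated contradiction.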

Next we verify implication (iii) $\Longrightarrow$ (ii). Let us first deduce from $\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ in (iii) that the qualification condition (7.3) in (ii) is satisfied. Supposing the contrary, find a normal $v\in N_{{\rm dom}\,\theta_{Y,B}}(\Phi(\bar{x}))$ with $v\neq 0$ such that $\nabla\Phi(\bar{x})^{*}v=0$ . Letting $\lambda^{\prime}:=\bar{\lambda}+v$ , we get $\lambda^{\prime}\neq\bar{\lambda}$ and $\nabla_{x}L(\bar{x},\lambda^{\prime})=0$ for the Lagrangian function (6.1). By the choice of $v$ and the normal cone definition (2.5) we get from the above that

[TABLE]

which shows that $\lambda^{\prime}\in\partial\theta_{Y,B}(\Phi(\bar{x}))$ and hence $\lambda^{\prime}\in\Lambda_{\mathrm{com}}(\bar{x})$ due to $\nabla_{x}L(\bar{x},\lambda^{\prime})=0$. Since $\lambda^{\prime}\neq\bar{\lambda}$, this contradicts the assumption $\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ in (iii) and thus justifies the validity of the qualification condition (7.3). Lemma 7.1 now tells us that the second-order necessary optimality condition (7.4) is satisfied.

To finish the verification of (iii) $\Longrightarrow$ (ii), we need to prove that the second-order sufficient optimality condition (6.4) holds under the assumptions in (iii). Supposing the contrary gives us a nonzero element $\xi_{0}\in\{w|\;\nabla\Phi(\bar{x})w\in{\cal K}^{*}+{\rm rge\,}B\}$ such that

[TABLE]

Since $\Lambda_{\mathrm{com}}(\bar{x})=\{\bar{\lambda}\}$ , it is easy to see that the second-order necessary condition (7.4) can be equivalently written as

[TABLE]

Furthermore, employing the equalities

[TABLE]

allows us to deduce from the equivalent form of the second-order necessary condition that

[TABLE]

This in turn implies that the vector $\xi_{0}$ is an optimal solution to the problem

[TABLE]

Applying the subdifferential Fermat rule to the latter problem and then using the elementary sum rule for convex subgradients together with the chain rule from [18, Exercise 10.22(b)] yield

[TABLE]

where the last equality comes from (2.10). Since $\xi_{0}\neq 0$ , it shows by Definition 3.1 that $\bar{\lambda}$ is a critical multiplier. This contradicts the assumption in (iii) that $\bar{\lambda}$ is a noncritical multiplier and therefore verifies the validity of (6.4) and the entire implication (iii) $\Longrightarrow$ (ii).

Our next step is to prove implication (i) $\Longrightarrow$ (iv), which clearly holds. To complete the proof of the theorem, it remains to verify implication (iv) $\Longrightarrow$ (i). To achieve this implication, we only need to show that there are neighborhoods $U$ of $(0,0)$ and $V$ of $(\bar{x},\bar{\lambda})$ such that $S_{KKT}(p_{1},p_{2})\cap V\neq\emptyset$ for all $(p_{1},p_{2})\in U$ . To this end, define the set-valued mapping $Q\colon\mathbb{R}^{m}\rightrightarrows\mathbb{R}^{n}$ by

[TABLE]

Having already proved (iv) and (iii) are equivalent, we have the qualification condition (7.3) because of the assumptions in (iii). As proved above, (iii) and (ii) are equivalent. Thus the second-order sufficient condition (6.4) is satisfied and implies by Theorem 6.1 that $\bar{x}$ is a strict local minimizer for (1.1). This gives a neighborhood $O$ of $\bar{x}$ for which we have

[TABLE]

Applying [9, Theorem 4.37(ii)] to the mapping $Q$ with the initial point $(0,\bar{x})$ gives us numbers $r>0$ and $\ell\geq 0$ such that

[TABLE]

where $r$ can be chosen such that $\mathbb{B}_{r}(\bar{x})\subset O$ . Consider now the optimization problem

[TABLE]

It is clear that this problem admits an optimal solution $x_{p_{1}p_{2}}$ for any pair $(p_{1},p_{2})\in\mathbb{R}^{n}\times\mathbb{B}_{r}(0)$ since the cost function therein is lower semicontinuous while the constraint set is obviously compact. Let us now show that there is a number $\varepsilon>0$ with $\mathbb{B}_{\varepsilon}(0,0)$ such that

[TABLE]

Suppose the contrary and then find sequences $(p_{1k},p_{2k})\rightarrow(0,0)$ and $x_{p_{1k}p_{2k}}$ for which $\|{x_{p_{1k}p_{2k}}}-\bar{x}\|=r$ . We get without loss of generality that $x_{p_{1k}p_{2k}}\rightarrow x_{0}$ as $k\to\infty$ and so $\|{x_{0}}-\bar{x}\|=r$ . This yields $x_{0}\neq\bar{x}$ . Since $x_{p_{1k}p_{2k}}$ is an optimal solution to (8.4), it follows that

[TABLE]

for all $x\in\mathbb{B}_{r}(\bar{x})\cap Q(p_{2k})$ . Pick any $x\in\mathbb{B}_{\frac{r}{2}}(\bar{x})\cap Q(0)$ and $k\in\mathbb{N}$ so large that $p_{2k}\in\alpha\mathbb{B}$ with $\alpha<\min\{\frac{r}{2\ell},r\}$ . It follows from (8.3) that there exist $x^{\prime}\in Q(p_{2k})$ and $b\in\mathbb{B}$ satisfying

[TABLE]

Thus $x^{\prime}\in\mathbb{B}_{r}(\bar{x})\cap Q(p_{2k})$ , and it follows from (8.6) that

[TABLE]

Passing to the limit at the latter inequality as $k\rightarrow\infty$ gives us the estimate

[TABLE]

which holds for all $x\in\mathbb{B}_{\frac{r}{2}}(\bar{x})\cap Q(0)$ . In particular, we have

[TABLE]

which contradicts (8.2) since $x_{0}\neq\bar{x}$ and $x_{0}\in\mathbb{B}_{r}(\bar{x})\subset O$ , and thus we arrive at (8.5).

At the last step of the proof, denote by ${\Lambda}_{\mathrm{com}}(x_{p_{1}p_{2}})$ the set of Lagrange multipliers associated with the optimal solution $x_{p_{1}p_{2}}$ to problem (8.4). It follows from the validity of the qualification condition (7.3) and its robustness with respect to perturbations of the initial point that this qualification condition is also satisfied for the perturbed problem (8.4). This implies in turn that ${\Lambda}_{\mathrm{com}}(x_{p_{1}p_{2}})\neq\emptyset$ for all $(p_{1},p_{2})$ sufficiently close to $(0,0)\in\mathbb{R}^{n}\times\mathbb{R}^{m}$. Assume without loss of generality that ${\Lambda}_{\mathrm{com}}(x_{p_{1}p_{2}})\neq\emptyset$ for all $(p_{1},p_{2})\in\mathbb{B}_{\varepsilon}(0,0)$, where $\varepsilon$ is taken from (8.5). Using an argument similar to that in (5.12) and (5.18) via the Hoffman lemma gives us a constant $\ell^{\prime}\geq 0$ such that for any $(p_{1},p_{2})\in\mathbb{B}_{\varepsilon}(0,0)$ and any $\lambda_{p_{1}p_{2}}\in{\Lambda}_{\mathrm{com}}(x_{p_{1}p_{2}})$ we have

[TABLE]

This clearly proves the existence of a neighborhood $V$ of $(\bar{x},\bar{\lambda})$ such that $S_{KKT}(p_{1},p_{2})\cap V\neq\emptyset$ for all $(p_{1},p_{2})\in\mathbb{B}_{\varepsilon}(0,0)$ and so finishes the proof of implication (iv) $\Longrightarrow$ (i). $\square$

The final piece of this paper concerns yet another well-recognized Lipschitzian type property, which seems to be the most natural extension of robust Lipschitzian behavior to set-valued mappings. For this reason we label it as the Lipschitz-like property [9], while it is also known as the pseudo-Lipschitz or Aubin property. It is said that a set-valued mapping/multifunction $F\colon\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{m}$ is Lipschitz-like around $(\bar{x},\bar{y})\in\mathrm{gph}\,F$ if there exists a constant $\ell\geq 0$ together with neighborhoods $U$ of $\bar{x}$ and $V$ of $\bar{y}$ such that we have the inclusion

$$F(x)\cap V\subset F(u)+\ell\|x-u\|\mathbb{B}\quad\text{for all }\;x,u\in U.\tag{8.8}$$

To formulate a convenient characterization of property (8.8), we recall first the notion of the normal cone to a set $\Omega\subset\mathbb{R}^{n}$ at a point $\bar{x}\in\Omega$ defined by

[TABLE]

The coderivative of a set-valued mapping $F\colon\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{m}$ at $(\bar{x},\bar{y})\in\mathrm{gph}\,F$ is given by

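$$D^{*}F(\bar{x},\bar{y})(v):=\big\{u\in\mathbb{R}^{n}\big|\;(u,-v)\in N_{\mathrm{gph}\,F}(\bar{x},\bar{y})\big\},\quad v\in\mathbb{R}^{m}.$$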

The following characterization of the Lipschitz-like property for any closed-graph mapping $F\colon\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{m}$ around $(\bar{x},\bar{y})\in\mathrm{gph}\,F$ is known as the Mordukhovich criterion from [18, Theorem 9.40], where the proof is different from the original one; see [8, Theorem 5.7] as well as its infinite-dimensional extension given in [9, Theorem 4.10]:

$$D^{*}F(\bar{x},\bar{y})(0)=\{0\}.\tag{8.9}$$

Note that the results obtained therein also provide a precise computation of the exact bound/infimum of Lipschitzian moduli $\{\ell\}$ in (8.8) via the coderivative norm at $(\bar{x},\bar{y})$.
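As an illustration of criterion (8.9) on data not taken from the paper, consider a hypothetical matrix $A\in\mathbb{R}^{m\times n}$ and the solution map $F\colon\mathbb{R}^{m}\rightrightarrows\mathbb{R}^{n}$ of the linear system $Ax=y$. The sketch below assumes the standard definitions, namely that the coderivative is generated by the normal cone to the graph via $(u,-v)\in N_{\mathrm{gph}\,F}$ and that (8.9) reads $D^{*}F(\bar{y},\bar{x})(0)=\{0\}$. Since $\mathrm{gph}\,F$ is a subspace, its normal cone at any point is the orthogonal complement.

$$
\begin{aligned}
F(y)&:=\{x\in\mathbb{R}^{n}|\;Ax=y\},\qquad \mathrm{gph}\,F=\{(Ax,x)|\;x\in\mathbb{R}^{n}\},\\
N_{\mathrm{gph}\,F}(\bar{y},\bar{x})&=\{(u,w)|\;\langle u,Ax\rangle+\langle w,x\rangle=0\ \text{for all }x\in\mathbb{R}^{n}\}=\{(u,-A^{*}u)|\;u\in\mathbb{R}^{m}\},\\
D^{*}F(\bar{y},\bar{x})(v)&=\{u|\;(u,-v)\in N_{\mathrm{gph}\,F}(\bar{y},\bar{x})\}=\{u|\;A^{*}u=v\},\\
D^{*}F(\bar{y},\bar{x})(0)&=\ker A^{*}.
\end{aligned}
$$

Under these assumptions the criterion holds exactly when $\ker A^{*}=\{0\}$, that is, when $A$ is surjective. This matches the classical fact that the solution set of $Ax=y$ depends on $y$ in a Lipschitzian way precisely in the full-row-rank case.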

The full calculus developed for coderivatives, which is based on variational/extremal principles of variational analysis and can be found in [11, 9, 18], allows us to apply the general characterization (8.9) to specific multifunctions given in some structural forms. The next theorem employs (8.9) and coderivative calculus to characterize the Lipschitz-like property of the solution map (7.7) to the canonically perturbed KKT system (7.6).

Theorem 8.2

(Lipschitz-like property of solution maps). Let $(\bar{x},\bar{\lambda})\in S_{KKT}(0,0)$ for the solution map $S_{KKT}$ defined in (7.7) with $\theta$ taken from (1.2). Then $S_{KKT}$ is Lipschitz-like around $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$ if and only if we have the implication

[TABLE]

Proof.

Consider the mapping $G$ from (4.4) with $\Psi=\nabla_{x}L$ . We easily deduce from the coderivative definition and the form of $S$ that

[TABLE]

for all $(\xi,\eta)\in\mathbb{R}^{n}\times\mathbb{R}^{m}$ and $(w_{1},w_{2})\in\mathbb{R}^{n}\times\mathbb{R}^{m}$ . Using the structure of $G$ and employing the coderivative sum rule in the equality form from [11, Theorem 3.9] yield

[TABLE]

It follows from (8.11) and the coderivative criterion (8.9) that $S_{KKT}$ is Lipschitz-like around $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$ if and only if we have the implication

[TABLE]

which, together with the coderivative representation for $G$ in (8.12), leads us to characterization (8.10) of the Lipschitz-like property of the solution map $S_{KKT}$. $\square$

Combining finally the obtained characterization of the Lipschitz-like property in Theorem 8.2 with some known facts of variational analysis allows us to reveal a relationship between the latter property of the solution map $S_{KKT}$ and its isolated calmness at the same point.

Theorem 8.3

(Lipschitz-like property of solution maps implies their isolated calmness). Let $S_{KKT}$ be the solution map (7.7) of the canonically perturbed KKT system (7.6) with the piecewise linear-quadratic term (1.2), and let $(\bar{x},\bar{\lambda})\in S_{KKT}(0,0)$. If $S_{KKT}$ is Lipschitz-like around $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$, then it enjoys the isolated calmness property at this point.

Proof.

Assuming that $S_{KKT}$ has the Lipschitz-like property around $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$ , we get implication (8.10) by Theorem 8.2. On the other hand, we proceed similarly to the proof of Theorem 8.2 and get counterparts of the equalities in (8.11) and (8.12) with replacing the coderivative by the graphical derivative therein. The latter one is due to the easily checkable sum rule for graphical derivatives of summations with one smooth term as in (4.4). Having this, we apply the Levy-Rockafellar criterion of isolated calmness (4.3) to the solution map (7.7) and thus conclude that the isolated calmness of $S_{KKT}$ at $\big{(}(0,0),(\bar{x},\bar{\lambda})\big{)}$ is equivalent to

[TABLE]

Comparing (8.10) and (8.13), we see that the only difference is in terms involving $(D^{*}\partial\theta_{Y,B})(\Phi(\bar{x}),\bar{\lambda})$ and $(D\partial\theta_{Y,B})(\Phi(\bar{x}),\bar{\lambda})$. To this end we use the derivative-coderivative relationship from [18, Theorem 13.57], which tells us that the inclusion

[TABLE]

holds under the assumptions that are automatically satisfied for the piecewise linear-quadratic function $\theta_{Y,B}$ from (1.2). This therefore completes the proof of the theorem. $\square$

Bibliography

[1] J. F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, Springer, New York, 2000.
[2] C. Ding, D. Sun and L. Zhang, Characterization of the robust isolated calmness for a class of conic programming problems, SIAM J. Optim. 27 (2017), 67–90.
[3] A. L. Dontchev and R. T. Rockafellar, Implicit Functions and Solution Mappings: A View from Variational Analysis, 2nd edition, Springer, New York, 2014.
[4] A. F. Izmailov, On the analytical and numerical stability of critical Lagrange multipliers, Comput. Math. Math. Phys. 45 (2005), 930–946.
[5] A. F. Izmailov and M. V. Solodov, Newton-Type Methods for Optimization and Variational Problems, Springer, New York, 2014.
[6] A. F. Izmailov and M. V. Solodov, Critical Lagrange multipliers: what we currently know about them, how they spoil our lives, and what we can do about it, TOP 23 (2015), 1–26.
[7] A. B. Levy, R. A. Poliquin and R. T. Rockafellar, Stability of locally optimal solutions, SIAM J. Optim. 10 (2000), 580–604.
[8] B. S. Mordukhovich, Complete characterizations of openness, metric regularity, and Lipschitzian properties of multifunctions, Trans. Amer. Math. Soc. 340 (1993), 1–35.