Strong Metric Subregularity of Mappings in Variational Analysis and Optimization

Radek Cibulka; Asen Dontchev; Alexander Kruger

arXiv:1701.02078·math.OC·May 15, 2018

TL;DR

This paper explores the properties, stability, and applications of strong metric subregularity in variational analysis, extending classical theorems and providing conditions for superlinear convergence of Newton-type methods.

Contribution

It demonstrates the stability of strong metric subregularity under perturbations, extends criteria involving graphical derivatives, and analyzes convergence of Newton's methods in this context.

Findings

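01

Strong metric subregularity is stable under perturbation by a function with a small calmness constant.

02

Rockafellar's criterion for strong metric subregularity, based on injectivity of the graphical derivative, is extended to infinite-dimensional spaces.

03

Several versions of Newton's method, including inexact and semismooth ones, converge superlinearly under strong metric subregularity.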

Equations

\chi(\mathcal{A})=\inf\Big\{r>0\,\Big|\,\mathcal{A}\subset\bigcup\big\{{I\kern-3.15005ptB}_{r}(A)\,\big|\,A\in\mathcal{B}\big\},\ \mathcal{B}\subset\mathcal{A}\ \text{finite}\Big\}.

d\big(x,F^{-1}(y)\big)\leq\kappa\,d\big(y,F(x)\big)\quad\text{for every }(x,y)\in U\times V.

\rho(x,\bar{x})\leq\kappa\,d\big(\bar{y},F(x)\cap V\big)\quad\text{for all }x\in U.

\rho(x,\bar{x})\leq\kappa\,d\big(\bar{y},F(x)\big)\quad\text{for all }x\in U.

F^{-1}(y)\cap U\subset{I\kern-3.15005ptB}_{\mu\rho(y,\bar{y})}(\bar{x})\quad\text{for all }y\in V.

d\big(x,F^{-1}(\bar{y})\big)\leq\kappa\,d\big(\bar{y},F(x)\big)\quad\text{for every }x\in U.

g(x)\geq g(\bar{x})+\langle\bar{x}^{*},x-\bar{x}\rangle+\beta\|x-\bar{x}\|^{2}\quad\text{whenever }x\in{I\kern-3.15005ptB}_{\delta}(\bar{x}),

\rho(g(x),g(\bar{x}))\leq\mu\,\rho(x,\bar{x})\quad\text{for every }x\in U\cap\mathop{\rm dom}\nolimits g.

\mathop{\rm subreg}\nolimits(g+G;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y}+g(\bar{x}))\leq\frac{\kappa}{1-\kappa\mu}.

\rho(x,\bar{x})\leq\kappa\,d(\bar{y},G(x))\quad\text{and}\quad\rho(g(x),g(\bar{x}))\leq\mu\,\rho(x,\bar{x})\quad\text{for all }x\in{I\kern-3.15005ptB}_{a}(\bar{x})\cap\mathop{\rm dom}\nolimits g.

\rho(x,\bar{x})

\rho(x,\bar{x})\leq\frac{\kappa}{1-\kappa\mu}\,d\big(\bar{y}+g(\bar{x}),(g+G)(x)\big).

\mathop{\rm subreg}\nolimits(h+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt h(\bar{x})+\bar{y})\cdot\mathop{\rm clm}\nolimits(f-h;\bar{x})<1.

\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt f(\bar{x})+\bar{y})\leq\frac{\mathop{\rm subreg}\nolimits(h+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt h(\bar{x})+\bar{y})}{1-\mathop{\rm subreg}\nolimits(h+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt h(\bar{x})+\bar{y})\cdot\mathop{\rm clm}\nolimits(f-h;\bar{x})}.

\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt f(\bar{x})+\bar{y})=\mathop{\rm subreg}\nolimits(h+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt h(\bar{x})+\bar{y}).

h+F\quad\text{for}\quad h=f(\bar{p},\bar{x})+D_{x}f(\bar{p},\bar{x})(\cdot-\bar{x})

\mathop{\rm clm}\nolimits(S;\bar{p}\hskip 0.9pt|\hskip 0.9pt\bar{x})\leq\mathop{\rm subreg}\nolimits(h+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt 0)\cdot\|D_{p}f(\bar{p},\bar{x})\|.

\Psi:(x,y)\mapsto\big\{p\,\big|\,f(p,x)-h(x)+y=0\big\}\quad\text{for }(x,y)\in X\times Y.

\liminf_{x\to 0,\ x\neq 0}\frac{\|Ax\|}{\|x\|}=\inf_{\|h\|=1}\|Ah\|>0.

\|x\|\leq\kappa\|Ax\|\quad\text{for any }x\in X.

A(\{x_{k}\})=\{k^{-1}x_{k}\}_{k=1}^{+\infty},\quad\{x_{k}\}\in\ell_{\infty}.

\|\{x_{k}\}\|_{\infty}\leq\kappa\,\|A(\{x_{k}\})\|_{2}\quad\text{for all }\{x_{k}\}\in a{I\kern-3.15005ptB}_{\ell_{\infty}}.

a\leq\kappa\,\frac{a}{n}<a,

\|f(u)-f(\bar{x})-A(u-\bar{x})\|\leq c\,\|u-\bar{x}\|;

X\ni x\mapsto H_{A}(x):=f(\bar{x})+A(x-\bar{x})+F(x)

(c+\chi(\mathcal{A}))\cdot m<1,

m:=\sup_{A\in\mathcal{A}}\mathop{\rm subreg}\nolimits(H_{A};\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y}).

\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})\leq\frac{m}{1-(c+\chi(\mathcal{A}))\cdot m}.

(c+\chi(\mathcal{A})+\gamma)\kappa<1.

\|x-\bar{x}\|\leq\frac{\kappa}{1-\kappa(\chi(\mathcal{A})+\gamma)}\,d\big(\bar{y},H_{A}(x)\big)\quad\text{whenever}\quad x\in{I\kern-3.15005ptB}_{a}(\bar{x})\quad\text{and}\quad A\in\mathcal{A}.

Full text

Strong Metric Subregularity of Mappings

in Variational Analysis and Optimization

R. Cibulka$^{1}$, A. L. Dontchev$^{2}$ and A. Y. Kruger$^{3}$

Dedicated to the memory of Jonathan M. Borwein

Abstract

Although the property of strong metric subregularity of set-valued mappings has been present in the literature under various names and with various (equivalent) definitions for more than two decades, it has attracted much less attention than its older “siblings”, the metric regularity and the strong (metric) regularity. The purpose of this paper is to show that the strong metric subregularity shares the main features of these two most popular regularity properties and is not less instrumental in applications. We show that the strong metric subregularity of a mapping $F$ acting between metric spaces is stable under perturbations of the form $f+F$ , where $f$ is a function with a small calmness constant. This result is parallel to the Lyusternik-Graves theorem for metric regularity and to the Robinson theorem for strong regularity, where the perturbations are represented by a function $f$ with a small Lipschitz constant. Then we study perturbation stability of the same kind for mappings acting between Banach spaces, where $f$ is not necessarily differentiable but admits a set-valued derivative-like approximation. Strong metric $q$ -subregularity is also considered, where $q$ is a positive real constant appearing as exponent in the definition. Rockafellar’s criterion for strong metric subregularity involving injectivity of the graphical derivative is extended to mappings acting in infinite-dimensional spaces. A sufficient condition for strong metric subregularity is established in terms of surjectivity of the Fréchet coderivative, and it is shown by a counterexample that surjectivity of the limiting coderivative is not a sufficient condition for this property, in general. Then various versions of Newton’s method for solving generalized equations are considered including inexact and semismooth methods, for which superlinear convergence is shown under strong metric subregularity. 
As applications to optimization, a characterization of the strong metric subregularity of the KKT mapping is obtained, as well as a radius theorem for the optimality mapping of a nonlinear programming problem. Finally, an error estimate is derived for a discrete approximation in optimal control under strong metric subregularity of the mapping involved in the Pontryagin principle.

Key Words. strong metric subregularity, perturbations and approximations, generalized derivatives, Newton’s method, nonlinear programming, optimal control.

AMS Subject Classification (2010) 49J53, 49K40, 90C31.

1 Department of Mathematics, Faculty of Applied Sciences, University of West Bohemia, Univerzitní 22, 306 14 Pilsen, Czech Republic, [email protected]. Supported by the project GA15-00735S.

2 Mathematical Reviews, 416 Fourth Street, Ann Arbor, MI 48107-8604, USA, [email protected]; Institute of Statistics and Mathematical Methods in Economics, Vienna University of Technology, Wiedner Hauptstrasse 8, A-1040, Austria. Supported by NSF, grant 1562209, the Austrian Science Foundation (FWF), grant P26640-N25, and the Australian Research Council, project DP160100854.

3 Centre for Informatics and Applied Optimization, Federation University Australia, POB 663, Ballarat, VIC 3350, Australia, [email protected]. Supported by the Australian Research Council, project DP160100854.

1 Introduction

There are three basic properties of linear mappings in analysis and topology: surjectivity, injectivity and invertibility. Specifically, a linear and bounded mapping $A$ acting from a Banach space $X$ to a Banach space $Y$ is said to be surjective when for every $y\in Y$ there exists $x\in X$ such that $Ax=y$ ; it is said to be injective when $Ax=0$ implies $x=0$ ; it is said to be invertible when for every $y\in Y$ there exists a unique $x\in X$ such that $Ax=y$ . The combination of surjectivity and injectivity implies invertibility and in this case the inverse mapping $A^{-1}$ is linear and bounded. When $X=Y=\mathbb{R}^{n}$ all three properties are equivalent. An extension of surjectivity to nonlinear/set-valued mappings which goes back to the Banach open mapping principle is the well-known property of metric regularity, a name coined by Borwein in [3]. An extension of invertibility, which is particularly useful in optimization, is known as strong metric regularity, a property introduced by Robinson in [32]. In this paper we focus on an extension of injectivity to nonlinear/set-valued mappings called strong metric subregularity, for which in this paper we also use the name “strong subregularity” for short. Although this property has been present in the literature under various names and with various (mostly equivalent) definitions for more than two decades, it has attracted much less attention than its older “siblings”, the metric regularity and the strong (metric) regularity. The purpose of this paper is to demonstrate that the strong subregularity shares the main features of the other two regularity properties and is not less instrumental in applications.

To set the stage, let us first fix the notation and terminology. Throughout, $X$ and $Y$ are metric spaces in general and any metric is denoted by $\rho(\cdot,\cdot)$. The space $Y$ also appears as a linear metric space with shift invariant metric, that is, a metric with the property that $\rho(y_{1}+y,y_{2}+y)=\rho(y_{1},y_{2})$ for all $y_{1},y_{2},y\in Y$. Both $X$ and $Y$ could also be Banach spaces and this is always explicitly stated or clear from the context. A norm is generally denoted by $\|\cdot\|$, sometimes with a subscript indicating a specific space. The $n$-dimensional Euclidean space is denoted by $\mathbb{R}^{n}$ and the set of nonnegative integers is denoted by ${\bf N}$. The distance from a point $x$ to a set $A$ in a metric space is ${d(x,A)}=\inf_{y\in A}\rho(x,y)$; the distance to the empty set is always $+\infty$. The closed ball centered at $x$ with radius $r$ is denoted by ${I\kern-3.15005ptB}_{r}(x)$ and the closed unit ball is ${I\kern-3.15005ptB}$. A set $U$ is said to be a neighborhood of a point $x$ when there exists a real $r>0$ such that ${I\kern-3.15005ptB}_{r}(x)\subset U$.

A set-valued mapping $F$ acting from $X$ to the subsets of $Y$, denoted $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$, is associated with its graph $\,\mathop{\rm gph}\nolimits F=\big{\{}\,(x,y)\in X\times Y\,\big{|}\,y\in F(x)\big{\}}$, its domain $\,\mathop{\rm dom}\nolimits F=\big{\{}\,x\in X\,\big{|}\,F(x)\neq\emptyset\big{\}}$ and its range $\,\mathop{\rm rge}\nolimits F=\big{\{}\,y\in Y\,\big{|}\,\exists\,x\in X\text{ with }y\in F(x)\big{\}}$. The inverse of $F$ is defined as $y\mapsto F^{-1}(y)=\big{\{}\,{x\in X}\,\big{|}\,y\in F(x)\big{\}}$. The space of all linear bounded (single-valued) mappings acting between Banach spaces $X$ and $Y$ and equipped with the standard operator norm is denoted by ${\mathcal{L}}(X,Y)$. A mapping $H$ acting between Banach spaces $X$ and $Y$ is said to be positively homogeneous when its graph is a cone. For a positively homogeneous mapping $H:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ the expression $\sup_{\|x\|\leq 1}d(0,H({x}))$ is said to be the inner norm of $H$ and denoted by $\|H\|^{-}$, while the expression $\sup_{\|x\|\leq 1}\sup_{y\in H(x)}\|y\|$ is the outer norm of $H$ and denoted by $\|H\|^{+}$. Also, recall that the measure of non-compactness [1] of a set $\mathcal{A}$ is defined as

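$$\chi(\mathcal{A})=\inf\Big\{r>0\,\Big|\,\mathcal{A}\subset\bigcup\big\{{I\kern-3.15005ptB}_{r}(A)\,\big|\,A\in\mathcal{B}\big\},\ \mathcal{B}\subset\mathcal{A}\ \text{finite}\Big\}.$$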

Given a (set-valued) mapping $F$ acting from a metric space $X$ to (the subsets of) a metric space $Y$, a point $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ and neighborhoods $U$ of $\bar{x}$ and $V$ of $\bar{y}$, the submapping $U\ni x\mapsto F(x)\cap V$ is said to be a graphical localization at $\bar{x}$ for $\bar{y}$. Local invertibility of $F$ at $(\bar{x},\bar{y})$ is identified with $F^{-1}$ having a localization at $\bar{y}$ for $\bar{x}$ which is single-valued (a function). The best-known manifestation of invertibility of a (nonlinear) function is the classical inverse function theorem: for a function $f:X\to Y$ between Banach spaces which is strictly differentiable at $\bar{x}$, the inverse $f^{-1}$ has a single-valued localization at $f(\bar{x})$ for $\bar{x}$ which is strictly differentiable at $f(\bar{x})$ if and only if the strict derivative $Df(\bar{x})$ is invertible. For a general mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$, the property that $F^{-1}$ has a Lipschitz continuous single-valued localization at $\bar{y}$ for $\bar{x}$ is known as strong metric regularity of $F$ at $\bar{x}$ for $\bar{y}$. In this paper we also use the shorter name strong regularity as in Robinson’s original definition in [32] which, strictly speaking, is somewhat different but is based on the same idea.

A mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ is said to be metrically regular at $\bar{x}$ for $\bar{y}$ when $\bar{y}\in F(\bar{x})$ , $\mathop{\rm gph}\nolimits F$ is locally closed at $(\bar{x},\bar{y})$ , meaning that there exists a neighborhood $W$ of $(\bar{x},\bar{y})$ such that the set $\mathop{\rm gph}\nolimits F\cap W$ is closed in $X\times Y$ , and there is a constant $\kappa\geq 0$ along with neighborhoods $U$ of $\bar{x}$ and $V$ of $\bar{y}$ such that

$$d\big(x,F^{-1}(y)\big)\leq\kappa\,d\big(y,F(x)\big)\quad\text{for every }(x,y)\in U\times V. \tag{1}$$

The infimum of $\kappa\geq 0$ for which there exist neighborhoods $U$ and $V$ such that (1) holds is called the regularity modulus of $F$ and denoted $\mathop{\rm reg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ . We use the convention that $\mathop{\rm reg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})<+\infty$ if and only if $F$ is metrically regular at $\bar{x}$ for $\bar{y}$ . A mapping $A\in{\mathcal{L}}(X,Y)$ is metrically regular at any point if and only if it is surjective in which case $\mathop{\rm reg}\nolimits A=\|A^{-1}\|^{-}$ ; this comes from the Banach open mapping principle. A mapping $F$ is strongly regular at $\bar{x}$ for $\bar{y}$ if and only if $F$ is metrically regular at $\bar{x}$ for $\bar{y}$ and the inverse $F^{-1}$ has a graphical localization at $\bar{y}$ for $\bar{x}$ which is nowhere multivalued; in this case for every $\ell>\mathop{\rm reg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ there exists a neighborhood of $\bar{y}$ where the localization is Lipschitz continuous with a Lipschitz constant $\ell$ .

A generally set-valued mapping $F$ acting from a metric space $X$ to the subsets of a metric space $Y$ is said to be strongly metrically subregular at $\bar{x}$ for $\bar{y}$ when $\bar{y}\in F(\bar{x})$ and there is a constant $\kappa\geq 0$ along with neighborhoods $U$ of $\bar{x}$ and $V$ of $\bar{y}$ such that

$$\rho(x,\bar{x})\leq\kappa\,d\big(\bar{y},F(x)\cap V\big)\quad\text{for all }x\in U. \tag{2}$$

This property can be equivalently defined with just one neighborhood $U$ by adjusting its size (see [15, Section 3I, p. 194]), as follows: there is a constant $\kappa\geq 0$ along with a neighborhood $U$ of $\bar{x}$ such that

$$\rho(x,\bar{x})\leq\kappa\,d\big(\bar{y},F(x)\big)\quad\text{for all }x\in U. \tag{3}$$

Either definition yields that $\bar{x}$ is the only point in $U$ such that $\bar{y}\in F(\bar{x})$ ; that is, $\bar{x}$ is an isolated point of $F^{-1}(\bar{y})$ . The infimum of $\kappa\geq 0$ over neighborhoods $U$ and $V$ such that (2) holds (or over $U$ such that (3) holds) is called the subregularity modulus of $F$ and denoted by $\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ . We adopt the convention that $\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})=+\infty$ whenever $F$ is not strongly subregular at $\bar{x}$ for $\bar{y}$ . Note that we do not assume that the graph of $F$ is locally closed at the reference point in the definition of strong subregularity. A mapping $A\in{\cal L}(X,Y)$ whose range is closed is strongly subregular everywhere if and only if it is injective; in this case $\mathop{\rm subreg}\nolimits A=\|A^{-1}\|^{+}$ ; note that in finite dimensions the range of a linear bounded mapping is always closed.

There is a close connection between strong metric subregularity and the properties of the distance function $(x,y)\mapsto d(y,F(x))$, see [15, Theorem 3I.5]. Directly from the definition it follows that a set-valued mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ is strongly subregular at $\bar{x}$ for $\bar{y}$ if and only if $\bar{x}$ is a local sharp minimizer of the function $x\mapsto d(\bar{y},F(x))$. Recall that a point $\bar{x}\in\mathop{\rm dom}\nolimits\varphi$ is called a local sharp minimizer of a function $\varphi:X\to\mathbb{R}\cup\{+\infty\}$ whenever there is a neighborhood $U$ of $\bar{x}$ and a constant $\beta>0$ such that $\varphi(x)\geq\varphi(\bar{x})+\beta\rho(x,\bar{x})\;\text{for all }x\in U$.

A mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ is strongly subregular at $\bar{x}$ for $\bar{y}$ if and only if its inverse $F^{-1}$ has the so-called isolated calmness property at $\bar{y}$ for $\bar{x}$ . Specifically, whenever $F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ there exist a constant $\mu\geq 0$ and neighborhoods $U$ of $\bar{x}$ and $V$ of $\bar{y}$ such that

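$$F^{-1}(y)\cap U\subset{I\kern-3.15005ptB}_{\mu\rho(y,\bar{y})}(\bar{x})\quad\text{for all }y\in V. \tag{4}$$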

Moreover, the infimum of all $\mu$ such that this inclusion holds for some neighborhoods $U$ and $V$ , which we denote as $\mathop{\rm clm}\nolimits(F^{-1};\bar{y}\hskip 0.9pt|\hskip 0.9pt\bar{x})$ , equals $\,\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ . The proof of this statement is straightforward, see e.g. [15, Theorem 3I.3] where it is stated in finite dimensions but can be easily translated into the language of metric spaces.

Strong subregularity and isolated calmness have been considered in various contexts and under various names in the literature. Isolated calmness was formally introduced by the second author in [9] under the name “local upper Lipschitz continuity at a point”; in the same paper the perturbation stability of this property was first proved. The equivalent property of strong subregularity was considered earlier, without giving it a name, by Rockafellar [33]. The name “strong metric subregularity” was first used in [14] where its equivalence with the isolated calmness was proved.

In finite dimensions there is a class of strongly subregular mappings with a particularly simple description. The following theorem is based on an important result by Robinson [31]:

Theorem 1.1.

Consider a mapping $F:\mathbb{R}^{n}\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;\mathbb{R}^{m}$ whose graph is the union of finitely many polyhedral convex sets. Then $F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ if and only if $\bar{x}$ is an isolated point of $F^{-1}(\bar{y})$ .

The strong subregularity obeys the paradigm of the inverse function theorem, by which we mean that the property is stable (persistent) under addition of a function whose calmness constant is smaller than the reciprocal of the subregularity modulus. The metric regularity and the strong regularity also obey this paradigm but when the function added to the mapping has a Lipschitz constant smaller than the reciprocal of the regularity modulus. In the case when the mapping is represented by a strictly differentiable function this yields that all three properties are preserved under linearization.

If we fix $y=\bar{y}$ in the definition of metric regularity (1) we obtain the property of metric subregularity:

$$d\big(x,F^{-1}(\bar{y})\big)\leq\kappa\,d\big(\bar{y},F(x)\big)\quad\text{for every }x\in U. \tag{5}$$

In contrast to metric regularity, the property (5) does not obey the paradigm of the inverse function theorem, as explained in [15, Section 3.8]. Indeed, from Theorem 1.1 every linear mapping between $\mathbb{R}^{n}$ and $\mathbb{R}^{m}$ is metrically subregular, but not every smooth function has this property. Nevertheless, for some special kinds of mappings one may expect stability criteria in terms of infinitesimal approximations, see [21].

The following proposition puts together the strong regularity, the metric regularity, and the strong subregularity of a function $f$ at $\bar{x}$ against the invertibility, surjectivity and injectivity of its strict derivative $Df(\bar{x})$ . With some abuse of notation, for a function $f$ we say that $f$ is (strongly) metrically (sub)regular at $\bar{x}$ and write (sub) $\mathop{\rm reg}\nolimits(f;\bar{x})$ instead of (sub) $\mathop{\rm reg}\nolimits(f;\bar{x}\hskip 0.9pt|\hskip 0.9ptf(\bar{x}))$ .

Proposition 1.2.

Let $X$ and $Y$ be Banach spaces and let $f:X\to Y$ be strictly differentiable at $\bar{x}$. Then

(i) $f$ is strongly regular at $\bar{x}$ if and only if $Df(\bar{x})$ is invertible, in which case $\mathop{\rm reg}\nolimits(f;\bar{x})=\|Df(\bar{x})^{-1}\|$;

(ii) $f$ is metrically regular at $\bar{x}$ if and only if $Df(\bar{x})$ is surjective, in which case $\mathop{\rm reg}\nolimits(f;\bar{x})=\|Df(\bar{x})^{-1}\|^{-}$;

(iii) Suppose that $\mathop{\rm rge}\nolimits Df(\bar{x})$ is closed. Then $f$ is strongly subregular at $\bar{x}$ if and only if $Df(\bar{x})$ is injective, in which case $\mathop{\rm subreg}\nolimits(f;\bar{x})=\|Df(\bar{x})^{-1}\|^{+}$. Moreover, in this case it is sufficient to assume that $f$ is Fréchet differentiable at $\bar{x}$.

The first statement is a version of the classical inverse function theorem. The second statement follows from the Lyusternik-Graves theorem. We will present a general version of the third statement in Section 2 where we also show that in infinite dimensions the assumption regarding the closedness of the range of the derivative mapping cannot be removed.
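As a finite-dimensional illustration of item (iii), the following sketch checks numerically that, for an injective linear map between Euclidean spaces, the outer norm $\|A^{-1}\|^{+}=\sup\{\|x\|:\|Ax\|\leq 1\}$ is the reciprocal of the smallest singular value of $A$. The matrix `A`, the helper names and the random sampling are assumptions made for this example only; they do not come from the paper.

```python
import math
import random

def smallest_singular_value(a):
    """Smallest singular value of a matrix with exactly two columns.

    Uses the closed-form eigenvalues of the 2x2 Gram matrix A^T A.
    """
    g11 = sum(row[0] * row[0] for row in a)
    g12 = sum(row[0] * row[1] for row in a)
    g22 = sum(row[1] * row[1] for row in a)
    mean = 0.5 * (g11 + g22)
    radius = math.hypot(0.5 * (g11 - g22), g12)
    return math.sqrt(max(mean - radius, 0.0))

def apply(a, x):
    """Matrix-vector product for a matrix with two columns."""
    return [row[0] * x[0] + row[1] * x[1] for row in a]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))

# Hypothetical injective linear map from R^2 to R^3 (illustration only).
A = [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
sigma_min = smallest_singular_value(A)
kappa = 1.0 / sigma_min  # candidate value of the subregularity modulus

# Sample points and record the largest observed ratio ||x|| / ||A x||.
rng = random.Random(0)
worst_ratio = 0.0
for _ in range(10000):
    x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    if norm(x) > 1e-12:
        worst_ratio = max(worst_ratio, norm(x) / norm(apply(A, x)))

print(sigma_min, kappa)  # 1.0 1.0
```

No sampled ratio exceeds `kappa`, in agreement with $\|x\|\leq\kappa\|Ax\|$ for all $x$. Sampling can only bound the modulus from below, so this is a sanity check rather than a proof.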

From Proposition 1.2 we obtain that if a smooth function is both strongly subregular and metrically regular at $\bar{x}$, then it is strongly regular at $\bar{x}$. This is not true, however, for set-valued mappings even if we require strong subregularity around the reference point. As a counterexample, take $F(x)=\{-x,x\}$, $x\in\mathbb{R}$, which is both strongly subregular and metrically regular at $0$ for $0$, strongly regular at every point in its graph different from the origin, and not strongly regular at $0$ for $0$.
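The claims about $F(x)=\{-x,x\}$ at the origin can be checked on a grid. The sketch below is only a numerical sanity check under the assumption that the constant $\kappa=1$ works; the grid and the helper names are illustrative choices, not taken from the paper.

```python
def F(x):
    """The mapping F(x) = {-x, x} from the counterexample."""
    return {-x, x}

def F_inv(y):
    """Inverse mapping: y in F(x) iff x = y or x = -y."""
    return {-y, y}

def dist(point, finite_set):
    """Distance from a real number to a finite set of reals."""
    return min(abs(point - s) for s in finite_set)

kappa = 1.0
grid = [k / 100.0 for k in range(-50, 51)]

# Metric regularity near (0, 0): d(x, F^{-1}(y)) <= kappa * d(y, F(x)).
regular = all(
    dist(x, F_inv(y)) <= kappa * dist(y, F(x)) for x in grid for y in grid
)

# Strong subregularity at 0 for 0: |x - 0| <= kappa * d(0, F(x)).
subregular = all(abs(x) <= kappa * dist(0.0, F(x)) for x in grid)

# F^{-1}(y) has two points for every y != 0, and both tend to 0 with y,
# so no localization of F^{-1} around (0, 0) is single-valued.
two_valued = all(len(F_inv(y)) == 2 for y in grid if y != 0.0)

print(regular, subregular, two_valued)  # True True True
```

Both inequalities hold with equality here, since $d(x,F^{-1}(y))=\min\{|x-y|,|x+y|\}=d(y,F(x))$ and $d(0,F(x))=|x|$.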

In this paper we present a collection of new results regarding strong metric subregularity; we also give extended versions of known results, which are clearly indicated in the text. The paper has two main parts. The first part presents theoretical results mostly related to stability of strong subregularity with respect to (derivative-type) approximations. First we focus on showing perturbation stability in general metric spaces and some consequences for differentiable functions and polyhedral mappings in finite dimensions. Then we deal with mappings of the form $f+F$ where $f$ is a not necessarily differentiable function and $F$ is a set-valued mapping. Section 4 shows extensions to the so-called strong $q$-subregularity. In Section 5 a partial extension of Rockafellar’s criterion for strong subregularity is obtained for mappings acting in infinite-dimensional spaces. A sufficient condition for strong subregularity is established in terms of surjectivity of the Fréchet coderivative, and it is shown by a counterexample that surjectivity of the limiting coderivative cannot serve as a sufficient condition for this property to hold.

The second part of the paper is devoted to applications that are the main motivation of this study. We consider first various versions of Newton’s method including inexact and semismooth methods, for which a specific mode of convergence is shown under strong subregularity. For a standard nonlinear programming problem, a characterization of the strong subregularity of the optimality mapping is obtained in terms of a strong form of the Mangasarian-Fromovitz constraint qualification and a quadratic growth condition for the objective function. A related result is obtained in [2] for a proper lower semicontinuous convex function $g:X\to\mathbb{R}\cup\{+\infty\}$ defined on a Banach space $X$ , whose dual is denoted by $X^{*}$ . Namely, it is shown that the subdifferential mapping $\partial g:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;X^{*}$ , understood in the sense of convex analysis, is strongly subregular at a point $(\bar{x},\bar{x}^{*})\in\mathop{\rm gph}\nolimits\partial g$ if and only if there exist positive constants $\beta$ and $\delta$ such that

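$$g(x)\geq g(\bar{x})+\langle\bar{x}^{*},x-\bar{x}\rangle+\beta\|x-\bar{x}\|^{2}\quad\text{whenever }x\in{I\kern-3.15005ptB}_{\delta}(\bar{x}),$$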

where $\langle\cdot,\cdot\rangle:X^{*}\times X\to\mathbb{R}$ denotes the duality pairing. Generalizations of the above result to a non-convex function $g$ by using the limiting subdifferential and under appropriate additional assumptions can be found in [18, Corollaries 3.3 and 3.5], see also [35]. If $X=\mathbb{R}^{n}$, a relation between strong subregularity of the limiting subdifferential and quadratic growth of a semi-algebraic function $g$ can be found in [17, Theorem 3.1].
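The equivalence between strong subregularity of $\partial g$ and quadratic growth can be seen in the simplest one-dimensional setting. The following worked example is not taken from the paper; the functions $g_{1}$ and $g_{2}$ are chosen here for illustration, with $X=\mathbb{R}$, $\bar{x}=0$ and $\bar{x}^{*}=0$.

```latex
% Illustrative example: g_1 and g_2 are assumptions, not from the paper.
\begin{align*}
g_{1}(x)=x^{2}:\quad
 & \partial g_{1}(x)=\{2x\},\qquad
   |x-0|=\tfrac{1}{2}\,d\big(0,\partial g_{1}(x)\big)
   \quad\text{for all }x\in\mathbb{R},\\
 & g_{1}(x)\geq g_{1}(0)+\langle 0,x-0\rangle+\beta|x-0|^{2}
   \quad\text{holds with }\beta=1.\\[4pt]
g_{2}(x)=x^{4}:\quad
 & \partial g_{2}(x)=\{4x^{3}\},\qquad
   |x|\leq\kappa\cdot 4|x|^{3}
   \ \text{ fails whenever }\ 0<|x|<\tfrac{1}{2\sqrt{\kappa}},\\
 & x^{4}\geq\beta x^{2}
   \ \text{ fails whenever }\ 0<|x|<\sqrt{\beta}.
\end{align*}
```

Thus $\partial g_{1}$ is strongly subregular at $0$ for $0$ with constant $\kappa=1/2$ and $g_{1}$ has quadratic growth at $0$, whereas for $g_{2}$ both properties fail for every choice of positive constants, as the equivalence predicts.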

As another application, a radius theorem for the optimality mapping for a nonlinear programming problem is proven, giving an expression for the minimal perturbation of the objective function by a quadratic form for which the second-order sufficient optimality condition is violated. Finally, an error estimate is derived for a discrete approximation in optimal control under strong subregularity of the mapping involved in the Pontryagin principle.

2 Perturbed strong subregularity

Recall [15, Section 1.3] that a function $g$ acting between metric spaces $X$ and $Y$ is said to be calm at $\bar{x}$ when $\bar{x}\in\mathop{\rm dom}\nolimits g$ and there exist a neighborhood $U$ of $\bar{x}$ and a constant $\mu\geq 0$ such that

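$$\rho(g(x),g(\bar{x}))\leq\mu\,\rho(x,\bar{x})\quad\text{for every }x\in U\cap\mathop{\rm dom}\nolimits g. \tag{6}$$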

The infimum of $\mu\geq 0$ such that (6) holds for some neighborhood $U$ of $\bar{x}$ is the calmness modulus of $g$ at $\bar{x}$ and is denoted by $\mathop{\rm clm}\nolimits(g;\bar{x})$ . Note that $\bar{x}$ does not have to be an interior point of $\mathop{\rm dom}\nolimits g$ .

The following theorem shows that the strong subregularity obeys the paradigm of the inverse function theorem: the property is preserved under perturbations by a function with a small calmness modulus. A version of it appeared first in [9, Theorem 3.2] and was echoed later in other publications. More recently, [15, Theorem 3I.7] uses an equivalent definition of strong subregularity and is given in finite dimensions, while the proof in [34, Theorem 3.2] uses the notion of the steepest displacement rate of a set-valued mapping. The proof given here is just an application of the definitions; we present it for completeness.

Theorem 2.1.

Suppose that $X$ is a metric space and $Y$ is a linear metric space with shift invariant metric. Let $a$ , $\kappa$ , and $\mu$ be positive constants such that $\kappa\mu<1$ . Consider a mapping $G:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ which is strongly subregular at $\bar{x}$ for $\bar{y}$ with a constant $\kappa$ and a neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ , and a function $g:X\to Y$ which is calm at $\bar{x}$ with a constant $\mu$ and a neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ . Then $g+G$ is strongly subregular at $\bar{x}$ for $\bar{y}+g(\bar{x})$ with the constant $(\kappa^{-1}-\mu)^{-1}$ and the neighborhood ${{I\kern-3.15005ptB}_{a}(\bar{x})}$ ; in particular

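$$\mathop{\rm subreg}\nolimits(g+G;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y}+g(\bar{x}))\leq\frac{\kappa}{1-\kappa\mu}.$$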

Proof.

By assumption, we have

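$$\rho(x,\bar{x})\leq\kappa\,d(\bar{y},G(x))\quad\text{and}\quad\rho(g(x),g(\bar{x}))\leq\mu\,\rho(x,\bar{x})\quad\text{for all }x\in{I\kern-3.15005ptB}_{a}(\bar{x})\cap\mathop{\rm dom}\nolimits g. \tag{7}$$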

Observe that $\mathop{\rm dom}\nolimits(g+G)=\mathop{\rm dom}\nolimits g\cap\mathop{\rm dom}\nolimits G$ . Take any $x\in{{I\kern-3.15005ptB}_{a}(\bar{x})}\cap\mathop{\rm dom}\nolimits g$ and any $z\in g(x)+G(x)$ (if there is no such $z$ we have $d(\bar{y}+g(\bar{x}),g(x)+G(x))={+}\infty$ and there is nothing to prove). Then there exists $y\in G(x)$ such that $y=z-g(x)$ and from (7) we get

[TABLE]

Taking into account that $\kappa\mu<1$ and $z$ is an arbitrary point in $g(x)+G(x)$ , we obtain

[TABLE]

The proof is complete.
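As a numerical sanity check of the constant in Theorem 2.1, the following Python sketch takes $X=Y=\mathbb{R}$ and $\bar{x}=\bar{y}=0$. The mapping $G(x)=2x$ and the perturbation $g(x)=x\sin(1/x)$ with $g(0)=0$ are illustrative choices that do not come from the paper, as is the sampling grid. Here $G$ is strongly subregular with $\kappa=1/2$ and $g$ is calm with $\mu=1$, so $\kappa\mu<1$ and the theorem predicts $|x|\leq(\kappa^{-1}-\mu)^{-1}|g(x)+G(x)|=|g(x)+G(x)|$.

```python
import math

KAPPA = 0.5   # subregularity constant of G(x) = 2x at 0 for 0
MU = 1.0      # calmness constant of g at 0

def G(x):
    return 2.0 * x

def g(x):
    # Illustrative calm perturbation: |g(x) - g(0)| <= |x|.
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0

def perturbed_constant(kappa, mu):
    # Constant (1/kappa - mu)^(-1) from Theorem 2.1; requires kappa * mu < 1.
    assert kappa * mu < 1.0
    return 1.0 / (1.0 / kappa - mu)

def worst_ratio(samples):
    # Largest observed value of |x| / |g(x) + G(x)| over the samples.
    return max(abs(x) / abs(g(x) + G(x)) for x in samples)

# Sample points approaching 0 from both sides.
samples = [s * 10.0 ** (-k) for k in range(1, 9) for s in (-1.0, 1.0, 0.37, -0.73)]
bound = perturbed_constant(KAPPA, MU)
print(bound, worst_ratio(samples))
```

The observed ratio stays below the predicted constant because $|2x+x\sin(1/x)|\geq|x|$ for every $x$.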

The above statement fails when the perturbation $g$ is represented by a (calm) set-valued mapping even for $X=Y=\mathbb{R}$ . Indeed, the mapping $G(x)=\{1+x^{2},2x\}$ is strongly subregular at $0$ for $0$ . Let $g(x)=\{-1,-x\}$ ; clearly $g$ has the isolated calmness property at $0$ for $0$ . However, as is easily seen, the sum $g(x)+G(x)=\{x^{2},1-x+x^{2},2x-1,x\}$ is not strongly subregular at $0$ for $0$ .
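The failure in this counterexample can be observed numerically. The sketch below uses the reference point $0$ for $0$ and evaluates the quotient $|x|/d(0,\cdot)$ for $G$ and for $g+G$ along a sequence $x\to 0$. The quotient stays bounded exactly when the strong subregularity estimate holds. The helper names and the sample points are illustrative.

```python
def dist_to_zero(values):
    # Distance from 0 to a finite set of reals.
    return min(abs(v) for v in values)

def G(x):
    return [1.0 + x * x, 2.0 * x]

def g(x):
    return [-1.0, -x]

def g_plus_G(x):
    # Minkowski sum of the two finite sets g(x) and G(x).
    return [u + v for u in g(x) for v in G(x)]

def ratio(mapping, x):
    # |x - 0| / d(0, mapping(x)); a bounded ratio near 0 means the
    # strong subregularity estimate at 0 for 0 holds.
    d = dist_to_zero(mapping(x))
    return abs(x) / d if d > 0 else float("inf")

xs = [10.0 ** (-k) for k in range(1, 7)]
ratios_G = [ratio(G, x) for x in xs]           # stays at 1/2
ratios_sum = [ratio(g_plus_G, x) for x in xs]  # grows like 1/|x|
print(ratios_G)
print(ratios_sum)
```

The element $x^{2}$ of the sum is responsible: $d(0,g(x)+G(x))\leq x^{2}$, so no constant $\kappa$ can satisfy $|x|\leq\kappa\,d(0,g(x)+G(x))$ near $0$.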

The following corollary specifies the result in Theorem 2.1 for the case when the (single-valued) function is approximated by another function.

Corollary 2.2.

Suppose that $X$ is a metric space and $Y$ is a linear metric space with shift invariant metric. Consider $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ , a point $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ and two functions $f:X\to Y$ and $h:X\to Y$ with $\bar{x}\in\mathop{\rm dom}\nolimits f\cap\mathop{\rm dom}\nolimits h$ . Suppose that $h+F$ is strongly subregular at $\bar{x}$ for $h(\bar{x})+\bar{y}$ , the difference $f-h$ is calm at $\bar{x}$ , and

[TABLE]

Then the mapping $f+F$ is strongly subregular at $\bar{x}$ for $f(\bar{x})+\bar{y}$ and

[TABLE]

In particular, if $\mathop{\rm clm}\nolimits(f-h;\bar{x})=0$ , then the mapping $f+F$ is strongly subregular at $\bar{x}$ for $f(\bar{x})+\bar{y}$ if and only if $h+F$ is strongly subregular at $\bar{x}$ for $h(\bar{x})+\bar{y}$ , in which case

[TABLE]

Proof.

To show the first statement, fix any $\kappa>\mathop{\rm subreg}\nolimits(h+F;\bar{x}{\hskip 0.9pt|\hskip 0.9pth(\bar{x})+\bar{y}})$ and $\mu>\mathop{\rm clm}\nolimits(f-h;\bar{x})$ such that $\kappa\mu<1$ . Clearly, there is $a>0$ such that the assumptions of Theorem 2.1 hold for $G=h+F$ and $g=f-h$ . Hence $f+F=g+G$ is strongly subregular at $\bar{x}$ for $f(\bar{x})+\bar{y}$ with modulus not greater than $\kappa/(1-\kappa\mu)$ . The second statement follows from the first one and the fact that $f$ and $h$ can be interchanged.

Remark 2.3.

When $X$ and $Y$ are Banach spaces and $f:X\to Y$ is Fréchet differentiable at $\bar{x}\in X$ then the function $x\mapsto h(x):=f(\bar{x})+Df(\bar{x})(x-\bar{x})$ satisfies the conditions in the second part of Corollary 2.2. Taking $F\equiv 0$ we arrive at Proposition 1.2 (iii). But we can consider the much larger class of semidifferentiable functions. Recall that a function $f:X\to Y$ is called semidifferentiable at $\bar{x}$ , if there is a (unique) continuous and positively homogeneous function $\varphi:X\to Y$ such that the function $h:=f(\bar{x})+\varphi(\cdot-\bar{x})$ is the first-order approximation to $f$ at $\bar{x}$ , that is, $\mathop{\rm clm}\nolimits(f-h;\bar{x})=0$ . Every piecewise smooth function $f:\mathbb{R}^{n}\to\mathbb{R}^{m}$ is semidifferentiable at any interior point of its domain [15, Proposition 2D.8]. Also if $f:\mathbb{R}^{n}\to\mathbb{R}^{m}$ is locally Lipschitz at $\bar{x}$ , then $f$ is semidifferentiable at $\bar{x}$ if and only if $f$ is directionally differentiable at $\bar{x}$ [15, Proposition 2D.1].

Remark 2.4.

Let $f:X\to Y$ , with $X$ and $Y$ being normed spaces, and $\bar{x}\in X$ be such that there is a positively homogeneous function $\varphi:X\to Y$ which is continuous at $0$ and $\mathop{\rm clm}\nolimits(f-\varphi(\cdot-\bar{x});\bar{x})<\varepsilon$ for some positive $\varepsilon$ (such a function $\varphi$ is called the first-order $\varepsilon$ -approximation of $f$ at $\bar{x}$ in [34]). Taking $F\equiv 0$ and observing that $h:=f(\bar{x})+\varphi(\cdot-\bar{x})$ is strongly subregular at $\bar{x}$ if and only if so is $\varphi$ at $0$ , we get [34, Theorem 4.1]: If $\varphi$ is strongly subregular at $0$ and $\varepsilon\mathop{\rm subreg}\nolimits(\varphi;0)<1$ , then $f$ is strongly subregular at $\bar{x}$ with modulus not greater than $\mathop{\rm subreg}\nolimits(\varphi;0)/(1-\varepsilon\mathop{\rm subreg}\nolimits(\varphi;0))$ .

We present next a theorem regarding perturbation stability of strong subregularity in an implicit function form. It is an infinite-dimensional version of [15, Theorem 3I.14] whose proof also works in this case with a few minor adjustments and therefore will not be reproduced here.

Theorem 2.5.

Let $X$ , $P$ and $Y$ be Banach spaces and let $f:P\times X\to Y$ and $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ . Consider the generalized equation $f(p,x)+F(x)\ni 0$ , its solution mapping $S:p\mapsto\big{\{}\,x\,\big{|}\,f(p,x)+F(x)\ni 0\big{\}},$ and a pair $(\bar{p},\bar{x})\in\mathop{\rm gph}\nolimits S$ , and suppose that $f$ is continuously Fréchet differentiable on a neighborhood of $(\bar{p},\bar{x})\in\mathop{\rm int}\nolimits\mathop{\rm dom}\nolimits f$ . If the mapping

[TABLE]

is strongly subregular at $\bar{x}$ for $0$ , then $S$ has the isolated calmness property at $\bar{p}$ for $\bar{x}$ with

[TABLE]

Furthermore, when $P$ and $Y$ are Hilbert spaces and $D_{p}f(\bar{p},\bar{x})$ is surjective, then the converse implication holds as well: the mapping $h+F$ is strongly subregular at $\bar{x}$ for $0$ provided that $S$ has the isolated calmness property at $\bar{p}$ for $\bar{x}$ .

Proof.

The proof of the first part of the theorem, which gives the estimate (8), is identical to the proof of [15, Theorem 3I.13] with general Banach space norms replacing the Euclidean ones. Consider the mapping

[TABLE]

Let $A=D_{p}f(\bar{p},\bar{x}):P\to Y$ . Since $P$ and $Y$ are Hilbert spaces and $A$ is surjective, the mapping $AA^{*}:Y\to Y$ , where $A^{*}$ is the adjoint of $A$ , has a bounded linear inverse. Let $c=\|A^{*}(AA^{*})^{-1}\|$ . The rest of the proof is identical to the proof of [15, Lemma 2C.1]. To finish, use the argument in the proof of [15, Proposition 3I.15], replacing the Euclidean norms by the norms of the spaces $X$ , $Y$ and $P$ , respectively.

Generalizations of the first part of the above statement for parametric generalized equations with a nonsmooth single-valued part can be found in [34, Section 5] (cf. Theorem 3.7 in the next section). Combining Corollary 2.2 and Theorem 1.1 we obtain the following result:

Theorem 2.6.

Let $X$ and $Y$ be Banach spaces. Consider a function $f:X\to Y$ which is Fréchet differentiable at a point $\bar{x}\in X$ and a set-valued mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ with $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ . Then the mapping $f+F$ is strongly subregular at $\bar{x}$ for $f(\bar{x})+\bar{y}$ if and only if the mapping $H:=f(\bar{x})+Df(\bar{x})(\cdot-\bar{x})+F$ has the same property. In the case when $X=\mathbb{R}^{n}$ , $Y=\mathbb{R}^{m}$ and the graph of $F$ is the union of finitely many polyhedral convex sets, the mapping $H$ , and hence $f+F$ , is strongly subregular at $\bar{x}$ for $f(\bar{x})+\bar{y}$ if and only if $\bar{x}$ is an isolated point of $H^{-1}(f(\bar{x})+\bar{y})$ .

Theorem 2.6 yields statement (iii) in Proposition 1.2, but note that the latter imposes the additional condition that the range of $Df(\bar{x})$ is closed. Indeed, $f$ is strongly subregular at $\bar{x}$ if and only if the linearization $f(\bar{x})+Df(\bar{x})(\cdot-\bar{x})$ has the same property. The difficulty is that an injective bounded linear mapping is not necessarily strongly subregular. Let us take a closer look.

By linearity, $A\in{\cal L}(X,Y)$ is strongly subregular everywhere if and only if $A$ is strongly subregular at $0$ for $0$ . From (3) we obtain that $A$ is strongly subregular at $0$ for $0$ if and only if

[TABLE]

If the dimension of $X$ is finite, then (9) holds if and only if $A^{-1}(0)=\{0\}$ , that is, $A$ is injective. This is not true in general as Example 2.7 shows. However, if an operator $A\in{\cal L}(X,Y)$ has a closed range then the Banach open mapping theorem yields that there is a constant $\kappa>0$ such that for any $y\in\mathop{\rm rge}\nolimits A$ there is $x\in X$ such that $y=Ax$ and $\|x\|\leq\kappa\|y\|$ . Then the injectivity of $A$ implies that such a point $x$ is unique and therefore

[TABLE]

Consequently, any bounded linear operator which is injective and has a closed range is strongly subregular at $0$ for $0$ , and hence strongly subregular everywhere.

Example 2.7.

Let $X=\ell_{\infty}$ , the space of (infinite) sequences $\{x_{k}\}$ in $\mathbb{R}$ equipped with the norm $\|\{x_{k}\}\|_{\infty}=\sup_{k\in{\bf N}}|x_{k}|$ , and $Y=\ell_{2}$ , the space of (infinite) sequences $\{x_{k}\}$ in $\mathbb{R}$ equipped with the norm $\|\{x_{k}\}\|_{2}=\sqrt{\sum\limits_{k=1}^{+\infty}(x_{k})^{2}}$ . Define the operator $A$ by

$$A(\{x_{k}\}):=\{x_{k}/k\},\quad\{x_{k}\}\in\ell_{\infty}.$$

Then $A\in\mathcal{L}(\ell_{\infty},\ell_{2})$ with $\|A\|=\frac{\pi}{\sqrt{6}}$ . Indeed, letting $x_{k}:=1$ , $k\in{\bf N}$ , we get $\|\{x_{k}\}\|_{\infty}=1$ and $\|A(\{x_{k}\})\|_{2}^{2}=\|\{1/k\}\|_{2}^{2}=\sum\limits_{k=1}^{+\infty}(1/k)^{2}=\pi^{2}/6$ . On the other hand, for any $\|\{x_{k}\}\|_{\infty}\leq 1$ and $\{y_{k}\}:=A(\{x_{k}\})$ we have $0\leq(y_{k})^{2}=k^{-2}(x_{k})^{2}\leq k^{-2}$ for any $k\in{\bf N}$ , which means that $\|\{y_{k}\}\|_{2}^{2}\leq\pi^{2}/6$ . The mapping $A$ is injective, but not strongly subregular at $0$ for $0$ . Indeed, suppose on the contrary that there are $\kappa>0$ and $a>0$ such that

$$\|x\|_{\infty}\leq\kappa\|Ax\|_{2}\quad\mbox{for all}\quad x\in{I\kern-3.15005ptB}_{a}(0).$$

Pick any $n\in{\bf N}$ such that $n>\kappa$ and then set $x_{k}=a$ if $k=n$ and $x_{k}=0$ otherwise. Then $\|\{x_{k}\}\|_{\infty}=a$ and $\|A(\{x_{k}\})\|_{2}=a/n$ . Thus

$$a=\|\{x_{k}\}\|_{\infty}\leq\kappa\|A(\{x_{k}\})\|_{2}=\kappa a/n<a,$$

a contradiction. Given $n\in{\bf N}$ , let $x_{k,n}=1$ if $k=n$ and $x_{k,n}=0$ otherwise. Then $x_{n}:=\{x_{k,n}\}\in\ell_{\infty}$ is such that $\|x_{n}\|_{\infty}=1$ and $\|Ax_{n}\|_{2}=1/n$ . Hence $\inf_{\|\{x_{k}\}\|_{\infty}=1}\|A(\{x_{k}\})\|_{2}=0$ , that is, (9) fails. The range of $A$ is not closed. Indeed, given $n\in{\bf N}$ , let $y_{k,n}=k^{-2/3}$ if $k\leq n$ and $y_{k,n}=0$ otherwise; then $y_{n}:=\{y_{k,n}\}\in\ell_{2}$ . For each $n\in{\bf N}$ , if we set $x_{k,n}=k^{1/3}$ if $k\leq n$ and $x_{k,n}=0$ otherwise, then $x_{n}:=\{x_{k,n}\}\in\ell_{\infty}$ and $Ax_{n}=y_{n}$ . Then $y:=\lim_{n\to+\infty}y_{n}=\{k^{-2/3}\}\in\ell_{2}$ but $x=A^{-1}y=\{k^{1/3}\}\notin\ell_{\infty}$ .
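The computations in Example 2.7 can be reproduced on finite truncations. The sketch below assumes, as the norms computed in the example indicate, that $A$ divides the $k$-th coordinate by $k$. The truncation length `N` and the helper names are illustrative. The script prints the quotient $\|e_{n}\|_{\infty}/\|Ae_{n}\|_{2}$ for unit sequences $e_{n}$, which equals $n$ and therefore admits no uniform bound $\kappa$, together with a partial sum approximating $\|A\|^{2}=\pi^{2}/6$.

```python
import math

def apply_A(x):
    # Assumed form of the operator: (A x)_k = x_k / k, with k starting at 1.
    return [xk / k for k, xk in enumerate(x, start=1)]

def sup_norm(x):
    return max(abs(v) for v in x)

def two_norm(x):
    return math.sqrt(sum(v * v for v in x))

def unit_vector(n, length):
    # Truncated e_n: 1 in position n (1-based), 0 elsewhere.
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

N = 1000  # truncation length (illustrative)
indices = (1, 10, 100, 1000)
ratios = []
for n in indices:
    e_n = unit_vector(n, N)
    ratios.append(sup_norm(e_n) / two_norm(apply_A(e_n)))

# Partial sum of 1/k^2, the squared norm of A applied to (1, 1, ..., 1).
ones = [1.0] * N
norm_sq_partial = two_norm(apply_A(ones)) ** 2
print(ratios, norm_sq_partial, math.pi ** 2 / 6)
```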

3 Set-valued derivative-type approximations

In this section we continue the analysis started in the preceding section of mappings of the form $f+F$ , where now $f$ is a function which is calm at the reference point but not necessarily differentiable there, and $F$ is a set-valued mapping. We will now approximate the possibly nonsmooth function $f$ around the reference point by a set $\mathcal{A}$ in ${\mathcal{L}}(X,Y)$ . This approach goes back to [23] and the concept of a prederivative which is generated by a set of linear operators.

Theorem 3.1.

Let $X$ and $Y$ be Banach spaces and consider a function $f:X\to Y$ , a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ and a point $(\bar{x},\bar{y})\in X\times Y$ such that $\bar{y}\in f(\bar{x})+F(\bar{x})$ . Suppose that there exist a subset $\mathcal{A}$ of ${\mathcal{L}}(X,Y)$ and a constant $c>0$ such that: (i) there is a constant $r>0$ such that for every $u\in{I\kern-3.15005ptB}_{r}(\bar{x})$ one can find $A\in\mathcal{A}$ satisfying

[TABLE]

(ii) for every $A\in\mathcal{A}$ the mapping

[TABLE]

is strongly subregular at $\bar{x}$ for $\bar{y}$ and

[TABLE]

where

[TABLE]

Then $f+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ ; moreover

[TABLE]

Proof.

Note that from (10) we have $\bar{x}\in\mathop{\rm int}\nolimits\mathop{\rm dom}\nolimits f$ and also (12) yields that $m<{+}\infty$ . Choose $\kappa>m$ and $\gamma>0$ such that

[TABLE]

Let $r$ be as in condition (i). We will show first that there exists $a\in(0,r]$ such that

[TABLE]

By the definition of $\chi(\mathcal{A})$ , there is a finite set ${\mathcal{B}}\subset\mathcal{A}$ such that

[TABLE]

Pick any $\tilde{A}\in{\mathcal{B}}$ . Then there exists $\alpha_{\tilde{A}}>0$ such that

[TABLE]

Let $A^{\prime}\in(\chi(\mathcal{A})+\gamma){I\kern-3.15005ptB}$ . Since ${H}_{\tilde{A}+A^{\prime}}={H}_{\tilde{A}}+A^{\prime}(\cdot-\bar{x})$ , Theorem 2.1 implies that

[TABLE]

Thus, for any $\tilde{A}\in{\mathcal{B}}$ there is $\alpha_{\tilde{A}}>0$ such that for each $A^{\prime}\in(\chi(\mathcal{A})+\gamma){I\kern-3.15005ptB}$ the above inequality holds. Let $a=\min\left\{r,\min_{\tilde{A}\in{\mathcal{B}}}\alpha_{\tilde{A}}\right\}$ . Taking into account (15), we obtain (14).

Choose any $x\in{I\kern-3.15005ptB}_{a}(\bar{x})$ , then use (i) to find $A\in\mathcal{A}$ such that (10) is satisfied. Then (10) along with (14) gives us

[TABLE]

Since $(c+\chi(\mathcal{A})+\gamma)\kappa<1$ , we obtain

[TABLE]

Thus, $f+F$ is strongly subregular at $\bar{x}$ for ${\bar{y}}$ . Since $\kappa>m$ and $\gamma>0$ can be arbitrarily close to $m$ and $0$ , respectively, this yields (13).

Let $f:\mathbb{R}^{n}\to{\mathbb{R}^{m}}$ be Lipschitz continuous around $\bar{x}$ . Bouligand’s limiting Jacobian, denoted by $\partial_{B}f(\bar{x})$ , is defined as the set of all matrices obtained as limits of the usual Jacobians $\nabla f(x_{k})$ for sequences $x_{k}\to\bar{x}$ such that $f$ is differentiable at $x_{k}$ . The convex hull of $\partial_{B}f(\bar{x})$ is Clarke’s generalized Jacobian of $f$ at $\bar{x}$ , denoted by $\partial_{C}f(\bar{x})$ . If in Theorem 3.1 we choose $X=\mathbb{R}^{n}$ , $Y=\mathbb{R}^{m}$ , and $\mathcal{A}:=\partial_{C}f(\bar{x})$ , then, as is well known (see [15, Proposition 6F.3]), for every $c>0$ there exists $r>0$ such that (10) is satisfied; that is, assumption (i) holds with an arbitrarily small $c>0$ . In that case we also have $\chi(\partial_{C}f(\bar{x}))=0$ , since $\partial_{C}f(\bar{x})$ is compact, and then Theorem 3.1 gives us the following:

Corollary 3.2.

Let $(\bar{x},\bar{y})\in\mathbb{R}^{n}\times{\mathbb{R}^{m}}$ , $f:\mathbb{R}^{n}\to{\mathbb{R}^{m}}$ and $F:\mathbb{R}^{n}{\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;}{\mathbb{R}^{m}}$ be such that $\bar{y}\in f(\bar{x})+F(\bar{x})$ . Suppose that $f$ is Lipschitz continuous around $\bar{x}$ and for every $A\in\partial_{C}f(\bar{x})$ the mapping $H_{A}$ defined in (11) is strongly subregular at $\bar{x}$ for $\bar{y}$ . Then $f+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ ; moreover,

[TABLE]

As an application of the above corollary, consider the inequality

[TABLE]

where $f:\mathbb{R}^{n}\to\mathbb{R}^{m}$ is a Lipschitz continuous function around some $\bar{x}\in\mathbb{R}^{n}$ . Inequalities in $\mathbb{R}^{m}$ are understood componentwise. Then, by combining Corollary 3.2 with Theorem 1.1, we obtain

Corollary 3.3.

In the context of the inequality system (16), suppose that for every $A\in\partial_{C}f(\bar{x})$ , the point $\bar{x}$ is the only solution of the inequality

[TABLE]

Then the mapping $f+\mathbb{R}_{+}^{m}$ is strongly subregular at $\bar{x}$ for $0$ .

When $F$ is the zero mapping, from Corollary 3.2 we obtain an analogue of Clarke’s inverse function theorem, which seems to be new:

Theorem 3.4.

Consider a function $f:\mathbb{R}^{n}\to\mathbb{R}^{m}$ which is Lipschitz continuous around $\bar{x}\in\mathbb{R}^{n}$ . If all matrices in the generalized Jacobian $\partial_{C}f(\bar{x})$ have rank $n$ (which is only possible if $n\leq m$ ), then $f$ is strongly subregular at $\bar{x}$ .
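A small example may help to see what Theorem 3.4 asserts. The function $f(x)=(|x|,x)$ from $\mathbb{R}$ to $\mathbb{R}^{2}$ is an illustrative choice that does not come from the paper. Its Jacobians away from $0$ are $(\pm 1,1)^{T}$ , so $\partial_{C}f(0)=\{(t,1)^{T}\mid t\in[-1,1]\}$ and every element has rank $1=n$ . The Python sketch below samples this set and checks the estimate $|x|\leq\kappa\|f(x)-f(0)\|$ on a grid. The grid and the function names are assumptions made for the illustration.

```python
import math

def f(x):
    # Illustrative Lipschitz, nonsmooth map from R to R^2.
    return (abs(x), x)

def clarke_jacobian_samples(num=21):
    # At 0 the limiting Jacobians of f are (1, 1)^T and (-1, 1)^T; their
    # convex hull {(t, 1)^T : -1 <= t <= 1} is sampled on a uniform grid.
    return [(-1.0 + 2.0 * i / (num - 1), 1.0) for i in range(num)]

def has_full_column_rank(column):
    # A 2x1 matrix has rank 1 exactly when it is not the zero column.
    return any(entry != 0.0 for entry in column)

def worst_ratio(samples):
    # Largest |x - 0| / ||f(x) - f(0)|| over the samples (f(0) = (0, 0)).
    return max(abs(x) / math.hypot(*f(x)) for x in samples)

samples = [s * 10.0 ** (-k) for k in range(0, 8) for s in (-1.0, 1.0)]
all_rank_one = all(has_full_column_rank(A) for A in clarke_jacobian_samples())
kappa_observed = worst_ratio(samples)
print(all_rank_one, kappa_observed)
```

In this example the estimate holds with $\kappa=1/\sqrt{2}$ because $\|f(x)\|=\sqrt{2}\,|x|$ . By contrast, $f(x)=|x|$ viewed as a map from $\mathbb{R}$ to $\mathbb{R}$ has $0\in\partial_{C}f(0)=[-1,1]$ , so the rank condition fails even though that function is strongly subregular at $0$ ; the condition in Theorem 3.4 is sufficient, not necessary.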

In a different direction, Theorem 3.1 may be extended in the following way:

Theorem 3.5.

Let $X$ and $Y$ be Banach spaces and consider a function $f:X\to Y$ , a set-valued mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ and a point $(\bar{x},\bar{y})\in X\times Y$ such that $\bar{y}\in f(\bar{x})+F(\bar{x})$ . Suppose that there exist a mapping $\mathcal{H}:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;\mathcal{L}(X,Y)$ and a constant $c>0$ such that (i) there is a constant $r>0$ along with a selection $h$ for $\mathcal{H}$ such that

[TABLE]

(ii) the assumption (ii) in Theorem 3.1 holds with $\mathcal{A}$ replaced by $\mathcal{H}(\bar{x})$ ; (iii) for any $\varepsilon>0$ there exists $\delta>0$ such that $\mathcal{H}(x)\subset\mathcal{H}(\bar{x})+\varepsilon{I\kern-3.15005ptB}$ whenever $x\in{I\kern-3.15005ptB}_{\delta}(\bar{x})$ .

Then $f+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ with modulus satisfying (13) where $\mathcal{A}$ is replaced by $\mathcal{H}(\bar{x})$ .

Proof.

Let $m$ and $H_{A}$ be as in Theorem 3.1 (ii) with $\mathcal{A}$ replaced by $\mathcal{H}(\bar{x})$ . Then there exists $\gamma>0$ satisfying

[TABLE]

By (iii), we may make $r$ smaller if necessary to have

[TABLE]

From the definition of measure of non-compactness, there is a finite set $\mathcal{B}\subset\mathcal{H}(\bar{x})$ such that

[TABLE]

Hence, from (19), for any $u\in{I\kern-3.15005ptB}_{r}(\bar{x})$ we get

[TABLE]

that is,

[TABLE]

This shows that the measure of non-compactness of the set $\mathcal{A}:=\mathcal{H}({I\kern-3.15005ptB}_{r}(\bar{x}))$ is not greater than $\chi(\mathcal{H}(\bar{x}))+2\gamma$ . Since $h(u)\in\mathcal{H}(u)\subset\mathcal{A}$ for each $u\in{I\kern-3.15005ptB}_{r}(\bar{x})$ the assumption (i) of Theorem 3.1 holds. By (19) we have $\mathcal{A}\subset\mathcal{H}(\bar{x})+\gamma{I\kern-3.15005ptB}$ . We will now prove that

[TABLE]

Choose any $A\in\mathcal{A}$ . Find $\bar{A}\in\mathcal{H}(\bar{x})$ such that $\|A-\bar{A}\|\leq\gamma$ . Note that, by (18), we have $\gamma m<1$ . Inasmuch as $H_{A}=H_{\bar{A}}+(A-\bar{A})(\cdot-\bar{x})$ , Corollary 2.2 implies that $\mathop{\rm subreg}\nolimits(H_{A};\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})\leq m/(1-\gamma m)$ . Since $A\in\mathcal{A}$ was arbitrarily chosen in $\mathcal{A}$ we get (20).

Remembering (18), we have that $(c+\chi(\mathcal{A}))m^{\prime}<1$ ; that is, the assumptions in (ii) of Theorem 3.1 hold with $m$ replaced by $m^{\prime}$ . Then $f+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ with modulus not greater than $m^{\prime}/(1-(c+\chi(\mathcal{A}))m^{\prime})$ . This finishes the proof of (13) with $\mathcal{A}:=\mathcal{H}(\bar{x})$ , because $\gamma>0$ can be chosen arbitrarily close to $0$ , which means that $\chi(\mathcal{A})$ and $m^{\prime}$ can be made arbitrarily close to $\chi(\mathcal{H}(\bar{x}))$ and $m$ , respectively.

Recall that a function $f:\mathbb{R}^{n}\to\mathbb{R}^{m}$ is said to be semismooth at $\bar{x}\in\mathbb{R}^{n}$ when it is Lipschitz continuous around $\bar{x}$ , directionally differentiable in every direction, and for every $c>0$ there exists $r>0$ such that

[TABLE]

If $f$ is semismooth at $\bar{x}$ then for any $c>0$ there is $r>0$ such that inequality (17) is satisfied with $h$ being any selection of $\partial_{B}f$ ; thus Theorem 3.5 is a subregularity version of a statement in [22]. It also yields a version of Corollary 3.2 for Bouligand’s limiting Jacobian which is known to be outer semicontinuous (at any point $\bar{x}\in\mathbb{R}^{n}$ ) [20, Proposition 7.4.11], that is, condition (iii) in Theorem 3.5 holds.

Corollary 3.6.

Let $(\bar{x},\bar{y})\in\mathbb{R}^{n}\times{\mathbb{R}^{m}}$ , $f:\mathbb{R}^{n}\to{\mathbb{R}^{m}}$ and $F:\mathbb{R}^{n}\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;{\mathbb{R}^{m}}$ be such that $\bar{y}\in f(\bar{x})+F(\bar{x})$ . Suppose that $f$ is Lipschitz continuous around $\bar{x}$ and that for every $c>0$ there exists $r>0$ along with a selection $h$ for $\partial_{B}f$ such that

[TABLE]

Assume that, for each $A\in\partial_{B}f(\bar{x})$ , the mapping $H_{A}$ defined in (11) is strongly subregular at $\bar{x}$ for $\bar{y}$ . Then $f+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ ; moreover,

[TABLE]

Finally, we consider a derivative-type approximation of the function $f$ by a positively homogeneous set-valued mapping.

Theorem 3.7.

Let $X$ and $Y$ be Banach spaces and consider a function $f:X\to Y$ , a set-valued mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ and a point $(\bar{x},\bar{y})\in X\times Y$ such that $\bar{y}\in f(\bar{x})+F(\bar{x})$ . Suppose that there exist a positively homogeneous mapping $G:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ and a constant $c>0$ such that (i) there exists a constant $r>0$ such that

[TABLE]

(ii) the mapping $H:=f(\bar{x})+G(\cdot-\bar{x})+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ with $\mathop{\rm subreg}\nolimits(H;\bar{x}\hskip 0.9pt|\hskip 0.9pt{\bar{y}})<1/c$ .

Then $f+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ ; moreover

[TABLE]

Proof.

Let $\kappa>\mathop{\rm subreg}\nolimits(H;\bar{x}\hskip 0.9pt|\hskip 0.9pt{\bar{y}})$ be such that $c\kappa<1$ . Shrink $r$ , if necessary, to have

[TABLE]

Choose any $x\in{I\kern-3.15005ptB}_{r}(\bar{x})$ and then an arbitrary $y\in F(x)$ . By (22) we find $w\in c\|x-\bar{x}\|{I\kern-3.15005ptB}$ such that $f(x)-f(\bar{x})-w\in G(x-\bar{x})$ . Then $f(x)-w+y\in f(\bar{x})+G(x-\bar{x})+F(x)=H(x)$ and we have

[TABLE]

Therefore $(1-c\kappa)\|x-\bar{x}\|\leq\kappa\|(\bar{y}-f(x))-y\|$ for any $y\in F(x)$ . Thus, we have

[TABLE]

Noting that $x$ was arbitrarily chosen in ${I\kern-3.15005ptB}_{r}(\bar{x})$ and $\kappa$ can be chosen arbitrarily close to $\mathop{\rm subreg}\nolimits(H;\bar{x}\hskip 0.9pt|\hskip 0.9pt{\bar{y}})$ , the proof is complete.

Taking $F\equiv 0$ the above proof gives a direct proof of [34, Theorem 4.2]. We show next that Theorem 3.7 implies Theorem 3.1.

Remark 3.8.

Let $f$ , $F$ , $(\bar{x},\bar{y})$ , $\mathcal{A}$ , $c$ , $m$ and $r$ be as in Theorem 3.1. Define $G:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ by $G(u):=\{Au\,\big{|}\,A\in\mathcal{A}\}$ , $u\in X$ . Then the condition (i) in Theorem 3.1 implies (i) in Theorem 3.7. The mapping $H$ from Theorem 3.7 (ii) has $\mathop{\rm subreg}\nolimits(H;\bar{x}\hskip 0.9pt|\hskip 0.9pt{\bar{y}})\leq m/(1-m\chi(\mathcal{A}))=:m^{\prime}$ . Indeed, in the proof of (14) we showed that for any $\kappa>m$ and any $\gamma>0$ sufficiently close to $m$ and $0$ , respectively, there exists $a\in(0,r]$ such that

[TABLE]

Fix any $x\in{I\kern-3.15005ptB}_{a}(\bar{x})$ , and then pick arbitrary $v\in H(x)$ (if any). The very definition of the mapping $H$ implies that there is $A\in\mathcal{A}$ such that $v\in f(\bar{x})+A(x-\bar{x})+F(x)$ . Then

[TABLE]

Taking into account that $v$ is an arbitrary element of $H(x)$ , and the constants $\kappa$ and $\gamma$ can be arbitrarily close to $m$ and $0$ , respectively, we obtain the desired estimate for the subregularity modulus of $H$ . Inequality (12) implies that $m^{\prime}c<1$ . Therefore condition (ii) in Theorem 3.7 holds. Hence $f+F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ and

[TABLE]

A result analogous to Corollary 3.2 for strong regularity was stated in [25]; a complete proof extended to Banach spaces is given in [5]. In a more recent paper [7] a nonsmooth version of the Lyusternik-Graves theorem for metric regularity is obtained. We note that the proofs in [5] and [7] are much more involved than the proofs of Theorems 3.1 and 3.5 and use other conditions, for example, convexity of the set $\mathcal{A}$ of derivative approximations.

4 Strong $q$ -subregularity

We consider in this section an extension of the strong metric subregularity, the so-called strong metric $q$ -subregularity, defined as follows. For a positive scalar $q$ , a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ acting between metric spaces $X$ and $Y$ is said to be strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ when $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ and there exist a constant $\kappa\geq 0$ and a neighborhood $U$ of $\bar{x}$ such that

[TABLE]

The (usual) strong subregularity is obtained for $q=1$ .

Observe that for $q\neq 1$ this property is not stable under linearization, in the sense of Proposition 1.2. As a counterexample take $F(x)=x^{3}$ with $\bar{x}=0$ . However, if we consider perturbations by a function which is calm of order $1/q$ , then a simple modification of the proof of Theorem 2.1 gives us perturbation stability. Given $\gamma>0$ , a function $g:X\to Y$ is said to be $\gamma$ -calm at $\bar{x}\in\mathop{\rm dom}\nolimits g$ with the constant $\mu\geq 0$ provided that there is a neighborhood $U$ of $\bar{x}$ such that

[TABLE]

The precise result is as follows:

Theorem 4.1.

Let $X$ be a metric space and $Y$ be a linear metric space with shift invariant metric. Let $a\in(0,1]$ , $q>0$ , and $\gamma\in[1/q,+\infty)$ , and let $\kappa$ and $\mu$ be positive constants such that $\kappa\mu^{q}<1$ . Suppose that a mapping $G:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ with constant $\kappa$ and neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ . Also, consider a function $g:X\to Y$ which is $\gamma$ -calm at $\bar{x}$ with constant $\mu$ and neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ . Then $g+G$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}+g(\bar{x})$ with constant $\kappa/(1-\kappa^{\frac{1}{q}}\mu)^{q}$ and neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ .

Proof.

The proof repeats that of Theorem 2.1 with some adjustments of the exponents. By assumption, we have

[TABLE]

Observe that $\mathop{\rm dom}\nolimits(g+G)=\mathop{\rm dom}\nolimits g\cap\mathop{\rm dom}\nolimits G$ . Take any $x\in{I\kern-3.15005ptB}_{a}(\bar{x})\cap\mathop{\rm dom}\nolimits g$ . If $G(x)$ is empty we are done. If $G(x)\neq\emptyset$ then

[TABLE]

Since $a\leq 1$ and $\gamma\in[1/q,+\infty)$ we have $\rho(x,\bar{x})^{\gamma}\leq\rho(x,\bar{x})^{\frac{1}{q}}$ . Taking into account that $\kappa^{\frac{1}{q}}\mu<1$ , we obtain

[TABLE]

and the proof is complete.
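The constant in Theorem 4.1 can be checked on a one-dimensional example. The sketch below assumes that strong $q$ -subregularity means $\rho(x,\bar{x})\leq\kappa\,d(\bar{y},F(x))^{q}$ and that $\gamma$ -calmness means $d(g(x),g(\bar{x}))\leq\mu\,\rho(x,\bar{x})^{\gamma}$ ; the displayed definitions are not reproduced here, so these forms are inferred from the proof and from the constant $\kappa/(1-\kappa^{1/q}\mu)^{q}$ . The choices $G(x)=x^{3}$ , $g(x)=-x^{3}/2$ , $q=1/3$ , $\gamma=3$ , $\kappa=1$ , $\mu=1/2$ and $a=1$ are illustrative.

```python
Q = 1.0 / 3.0     # order of subregularity
KAPPA = 1.0       # |x| <= KAPPA * |x**3| ** Q for G(x) = x**3
GAMMA = 3.0       # calmness order of g; Theorem 4.1 needs GAMMA >= 1/Q
MU = 0.5          # |g(x) - g(0)| <= MU * |x| ** GAMMA

def G(x):
    return x ** 3

def g(x):
    return -0.5 * x ** 3

def perturbed_constant(kappa, mu, q):
    # Constant kappa / (1 - kappa**(1/q) * mu)**q from Theorem 4.1.
    assert kappa * mu ** q < 1.0
    return kappa / (1.0 - kappa ** (1.0 / q) * mu) ** q

def worst_ratio(samples, q):
    # Largest |x| / |g(x) + G(x)|**q over the samples (x_bar = 0, y_bar = 0).
    return max(abs(x) / abs(g(x) + G(x)) ** q for x in samples)

# Sample points in the unit ball around 0, excluding 0 itself.
samples = [s * 10.0 ** (-k) for k in range(0, 6) for s in (-1.0, 1.0, 0.4)]
bound = perturbed_constant(KAPPA, MU, Q)
observed = worst_ratio(samples, Q)
print(bound, observed)
```

In this example the bound is attained: $g(x)+G(x)=x^{3}/2$ , so $|x|=2^{1/3}|g(x)+G(x)|^{1/3}$ and the predicted constant is $2^{1/3}$ .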

As in the standard case with $q=1$ , when $X$ and $Y$ are Banach spaces and the perturbation is represented by a Fréchet differentiable function, we can say more about perturbation stability.

Theorem 4.2.

Let $X$ and $Y$ be Banach spaces and let $q\geq 1$ and $(\bar{x},\bar{y})\in X\times Y$ . Consider a function $f:X\to Y$ which is Fréchet differentiable at $\bar{x}$ and a set-valued mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ such that $\bar{y}\in f(\bar{x})+F(\bar{x})$ . Then the mapping $f+F$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ if and only if the mapping $H:=f(\bar{x})+Df(\bar{x})(\cdot-\bar{x})+F$ has the same property.

Assume, in addition, that $f$ is Fréchet differentiable around $\bar{x}$ and $Df$ is continuous at $\bar{x}$ . Then $f+F$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ if and only if there are $\lambda>0$ and $a>0$ such that for any $u\in{I\kern-3.15005ptB}_{a}(\bar{x})$ the mapping $H_{u}:=f(\bar{x})+Df(u)(\cdot-\bar{x})+F$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ with constant $\lambda$ and neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ .

Proof.

The Fréchet differentiability of $f$ means that the function $g:=f(\bar{x})+Df(\bar{x})(\cdot-\bar{x})-f$ has $\mathop{\rm clm}\nolimits(g;\bar{x})=0$ . Let $\kappa>0$ be such that $f+F$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ with constant $\kappa$ . Clearly, there are $\mu>0$ and $a\in(0,1]$ such that $G:=f+F$ and $g$ satisfy the assumptions of Theorem 4.1 with $\gamma=1$ ; hence $H=G+g$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ . To prove the opposite implication, use $H$ and $-g$ as $G$ and $g$ , respectively.

Now suppose that $f$ is continuously differentiable at $\bar{x}$ . Let $\kappa>0$ and $a\in(0,1)$ be such that the mapping $G:=f+F$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ with constant $\kappa$ and neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ . Let $\mu>0$ be such that $\kappa\mu^{q}<1$ . Using standard calculus and making $a$ smaller, if necessary, we have that

[TABLE]

Fix any $u\in{I\kern-3.15005ptB}_{a}(\bar{x})$ . Then $g_{u}:=f(\bar{x})+Df(u)(\cdot-\bar{x})-f$ is calm at $\bar{x}$ with a constant $\mu$ and a neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ ; moreover $g_{u}(\bar{x})=0$ . Applying Theorem 4.1 with $\gamma=1$ , we get that $H_{u}=G+g_{u}$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ with a constant $\lambda:=\kappa/(1-\kappa^{\frac{1}{q}}\mu)^{q}$ , which is independent of $u$ . The opposite direction follows from the first part of the statement.

We end this section with some comments regarding the recent paper [28]. Taking $\gamma=1$ and $q\geq 1$ in Theorem 4.1, one obtains [28, Theorem 4.1] where the authors use the stronger assumption that the single-valued perturbation is Lipschitz continuous around $\bar{x}$ . The first part of Theorem 4.2 slightly improves [28, Corollary 4.2] where strict differentiability of the single-valued part is assumed, while the second echoes [28, Theorem 4.4].

5 Conditions involving generalized derivatives

In this section $X$ and $Y$ are Banach spaces and $X^{*}$ and $Y^{*}$ are their duals, respectively. It follows directly from the definition that a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ with $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ if and only if its steepest displacement rate at $\bar{x}$ for $\bar{y}$ defined as

[TABLE]

is positive (with the convention that the limit in (24) is $+\infty$ when $\bar{x}$ is an isolated point in $\mathop{\rm dom}\nolimits F$ ). This notion was introduced by A. Uderzo in [34]. It is elementary to check (see [34, Proposition 2.1]) that

[TABLE]

where we set $0\cdot(+\infty)=(+\infty)\cdot 0=1$ . Thus, if $F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ with a constant $\kappa>0$ then we have $|F|^{\downarrow}(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})\geq\kappa^{-1}$ . Conversely, if $|F|^{\downarrow}(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})>\kappa^{-1}$ for some $\kappa>0$ then $F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ with the constant $\kappa$ .

When $\bar{x}$ is not an isolated point in $F^{-1}(\bar{y})$ , then $|F|^{\downarrow}(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})=0$ . Otherwise, the steepest displacement rate (24) coincides with the subregularity constant

[TABLE]

extensively used in [26] when characterizing metric subregularity.
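For a concrete feel for the steepest displacement rate, the sketch below assumes that (24) is the lower limit of the quotient $d(\bar{y},F(x))/d(x,\bar{x})$ as $x\to\bar{x}$ , $x\neq\bar{x}$ , and that (25) states that this rate and the subregularity modulus are reciprocal. Both are inferred from the surrounding discussion, since the displayed formulas are not reproduced here. The single-valued map $F(x)=2x+|x|$ on $\mathbb{R}$ , with $\bar{x}=\bar{y}=0$ , and the sampling grid are illustrative.

```python
def F(x):
    # Illustrative single-valued map on R with a kink at 0.
    return 2.0 * x + abs(x)

def displacement_ratio(x, x_bar=0.0, y_bar=0.0):
    # |y_bar - F(x)| / |x - x_bar| for x != x_bar.
    return abs(y_bar - F(x)) / abs(x - x_bar)

def rate_estimate(radius, num=200):
    # Minimum of the ratio over a symmetric grid in the punctured ball
    # of the given radius around x_bar = 0.
    grid = [radius * i / num for i in range(1, num + 1)]
    return min(displacement_ratio(s * t) for t in grid for s in (-1.0, 1.0))

# Estimates on shrinking neighborhoods approximate the lower limit.
estimates = [rate_estimate(10.0 ** (-k)) for k in range(1, 6)]
subreg_estimate = 1.0 / estimates[-1]
print(estimates, subreg_estimate)
```

The quotient equals $3$ for $x>0$ and $1$ for $x<0$ , so the estimated rate is $1$ and the corresponding estimate of the subregularity modulus is also $1$ .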

First, we focus on conditions based on tangential approximation of the graph of the mapping in question. Let $\Omega$ be a set in $X$ and let $\bar{x}\in\Omega$ . The Bouligand-Severi tangent cone to $\Omega$ at $\bar{x}$ , denoted by $T_{\Omega}(\bar{x})$ , is the set of all $w\in X$ such that there are sequences $\{w_{k}\}$ in $X$ and $\{t_{k}\}$ in $(0,+\infty)$ converging to $w$ and $0$ , respectively, such that $\bar{x}+t_{k}w_{k}\in\Omega$ for each $k\in{\bf N}$ . For a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ with $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ , the graphical derivative mapping of $F$ at $(\bar{x},\bar{y})$ is defined as

[TABLE]

The following is a generalization of [15, Theorem 4E.1] which goes back to Rockafellar [33]:

Theorem 5.1.

Consider a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ with $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ . Then

[TABLE]

If, in addition, the dimension of $X$ is finite, then

[TABLE]

that is, $F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ if and only if $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}$ is finite.

Moreover, if both $X$ and $Y$ are finite-dimensional, then (26) holds as equality.

Proof.

For the first part of the claim, note that if the right-hand side of (26) is infinite then we are done. If not, pick $\kappa>\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ and then $a>0$ such that

[TABLE]

Fix an arbitrary $(u,v)\in\mathop{\rm gph}\nolimits DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})=T_{\mathop{\rm gph}\nolimits F}(\bar{x},\bar{y})$ . Then there exist sequences $\{u_{k}\}$ in $X$ and $\{v_{k}\}$ in $Y$ , as well as $\{t_{k}\}$ in $(0,1)$ , converging to $u$ , $v$ , and $0$ , respectively, such that $\bar{y}+t_{k}v_{k}\in F(\bar{x}+t_{k}u_{k})$ for each $k\in{\bf N}$ . For $k$ sufficiently large we have $x_{k}:=\bar{x}+t_{k}u_{k}\in{I\kern-3.15005ptB}_{a}(\bar{x})$ and hence

[TABLE]

Consequently, $\|u\|\leq\kappa\|v\|$ for each $(u,v)\in\mathop{\rm gph}\nolimits DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ . Thus $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}\leq\kappa$ . Letting $\kappa\downarrow\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ we get (26).

Now, let $X$ be finite-dimensional. By [15, Proposition 5A.7] we know that $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}$ is finite if and only if $DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}(0)=\{0\}$ . In view of (26), it is sufficient to prove the $\Longleftarrow$ part in the first equivalence. Let $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}$ be finite. Suppose on the contrary that $F$ is not strongly subregular at $\bar{x}$ for $\bar{y}$ . Then there is a sequence $\{(x_{k},y_{k})\}$ in $\mathop{\rm gph}\nolimits F$ converging to $(\bar{x},\bar{y})$ such that

[TABLE]

Let $t_{k}:=\|x_{k}-\bar{x}\|$ , $u_{k}:=(x_{k}-\bar{x})/t_{k}$ , and $v_{k}:=(y_{k}-\bar{y})/t_{k}$ , $k\in{\bf N}$ . By the above inequality, $t_{k}\downarrow 0$ and $v_{k}\to 0$ as $k\to+\infty$ . Since $X$ is finite-dimensional, we can assume that $\{u_{k}\}$ converges to some $u\in X$ with $\|u\|=1$ . Noting that

[TABLE]

we get that $0\in DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})(u)$ for $u\neq 0$ , that is, $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}=+\infty$ , a contradiction.

Let $Y$ be finite-dimensional as well. Suppose that (26) is strict; then there is a (positive) constant $\kappa$ such that $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}<\kappa<\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ . Find a sequence $\{(x_{k},y_{k})\}$ in $\mathop{\rm gph}\nolimits F$ converging to $(\bar{x},\bar{y})$ such that

[TABLE]

Let $\{t_{k}\}$ , $\{u_{k}\}$ , and $\{v_{k}\}$ be defined as in the previous paragraph. For each $k\in{\bf N}$ , we have $t_{k}>0$ , $\|u_{k}\|=1$ , and $v_{k}\in\kappa^{-1}{I\kern-3.15005ptB}$ . Also $t_{k}\downarrow 0$ as $k\to+\infty$ . Since both $X$ and $Y$ are finite-dimensional, we can assume that $\{u_{k}\}$ converges to some $u\in X$ with $\|u\|=1$ and that $\{v_{k}\}$ converges to some $v\in\kappa^{-1}{I\kern-3.15005ptB}$ . By (27) we conclude that $v\in DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})(u)$ . Dividing (28) by $t_{k}$ and taking the limit as $k\to+\infty$ we get $\|u\|=1\geq\kappa\|v\|$ . Hence $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}\geq\kappa$ , a contradiction.

We will now consider dual space conditions for strong subregularity. Unless clearly indicated otherwise, we equip $X\times Y$ with the product (box) topology. Given a set $\Omega\subset X$ and a point $\bar{x}\in\Omega$ , the Fréchet normal cone to $\Omega$ at $\bar{x}$ , denoted by $\widehat{N}_{\Omega}(\bar{x})$ , is the set of all $x^{*}\in X^{*}$ such that for every $\varepsilon>0$ there exists $\delta>0$ such that

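$$\langle x^{*},x-\bar{x}\rangle\leq\varepsilon\|x-\bar{x}\|\quad\mbox{whenever}\quad x\in\Omega\cap{I\kern-3.15005ptB}_{\delta}(\bar{x}).$$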

For a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ with $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ , the Fréchet coderivative of $F$ at $(\bar{x},\bar{y})$ acts from $Y^{*}$ to the subsets of $X^{*}$ and is defined as

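$$\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})(y^{*}):=\big\{x^{*}\in X^{*}\,\big|\,(x^{*},-y^{*})\in\widehat{N}_{\mathop{\rm gph}\nolimits F}(\bar{x},\bar{y})\big\},\quad y^{*}\in Y^{*}.$$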

We give next coderivative conditions for strong subregularity:

Theorem 5.2.

Consider a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ with $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ . If $X$ is finite-dimensional, then

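$$\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})\leq\|\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{-}. \tag{29}$$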

If, in addition, $\mathop{\rm gph}\nolimits F$ is locally convex at $(\bar{x},\bar{y})$ , meaning that $\mathop{\rm gph}\nolimits F\cap W$ is convex for some neighborhood $W$ of $(\bar{x},\bar{y})$ in $X\times Y$ , then (29) becomes an equality.

Proof.

If either the right-hand side of (29) is infinite or $\bar{x}$ is an isolated point of $\mathop{\rm dom}\nolimits F$ (implying that the left-hand side of (29) is zero) then we are done. Suppose that this is not the case, and fix any $\kappa>\|\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{-}$ .

First, we show that

[TABLE]

To obtain (30), it is sufficient to show that, given $x^{*}\in X^{*}$ with $\|x^{*}\|\leq 1$ , for each $\gamma\in(0,1)$ there is a constant $\delta=\delta(x^{*},\gamma)>0$ such that

[TABLE]

Assume on the contrary that there are $x^{*}\in X^{*}$ with $\|x^{*}\|\leq 1$ and $\gamma\in(0,1)$ along with a sequence $\{x_{k}\}$ converging to $\bar{x}$ such that

[TABLE]

For each $k\in{\bf N}$ , choose a point $y_{k}\in F(x_{k})$ such that

[TABLE]

this means in particular that

[TABLE]

The choice of $\kappa$ implies that there is $y^{*}\in\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}(x^{*})$ with $\|y^{*}\|\leq\kappa$ . Hence, we have $(x^{*},-y^{*})\in\widehat{N}_{\mathop{\rm gph}\nolimits F}(\bar{x},\bar{y})$ . Let

[TABLE]

Observe that (33) implies that $\{y_{k}\}$ converges to $\bar{y}$ and

[TABLE]

For each $k\in{\bf N}$ , using (32), we obtain

[TABLE]

Thus $(x^{*},-y^{*})\notin\widehat{N}_{\mathop{\rm gph}\nolimits F}(\bar{x},\bar{y})$ , a contradiction. We proved that (31) holds, and consequently so does (30).

Second, we show that (30) implies that $|F|^{\downarrow}(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})\geq 1/\kappa$ . Indeed, let $\{x_{k}\}$ be any sequence in $X\setminus\{\bar{x}\}$ converging to $\bar{x}$ such that

[TABLE]

Let $u_{k}:=(x_{k}-\bar{x})/\|x_{k}-\bar{x}\|$ , $k\in{\bf N}$ . By the Hahn-Banach theorem, for each $k\in{\bf N}$ , there is $u^{*}_{k}\in X^{*}$ with $\|u^{*}_{k}\|=1$ such that $\langle u^{*}_{k},u_{k}\rangle=1$ . Going to subsequences, if necessary, we may assume that $\{u_{k}\}$ converges to some $u\in X$ with $\|u\|=1$ and that $\{u_{k}^{*}\}$ converges to some $u^{*}\in X^{*}$ with $\|u^{*}\|=1$ . Then

[TABLE]

Let $x^{*}:=u^{*}/\kappa$ . Then (30) implies that

[TABLE]

By (25), we have $\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})=1/|F|^{\downarrow}(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})\leq\kappa$ . Letting $\kappa\downarrow\|\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{-}$ , we get (29).

Suppose now that $\mathop{\rm gph}\nolimits F$ is locally convex at $(\bar{x},\bar{y})$ . We will show the inequality opposite to (29). Fix an arbitrary $\kappa>\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ (if any). Then there is $\delta>0$ such that $\Omega:=\mathop{\rm gph}\nolimits F\cap({I\kern-3.15005ptB}_{\delta}(\bar{x})\times{I\kern-3.15005ptB}_{\delta}(\bar{y}))$ is convex and

[TABLE]

Clearly, in this case $N_{\Omega}(\bar{x},\bar{y})=\widehat{N}_{\mathop{\rm gph}\nolimits F}(\bar{x},\bar{y})$ , where $N_{\Omega}$ is the usual normal cone to $\Omega$ at $(\bar{x},\bar{y})$ in the sense of convex analysis. For any $x^{*}$ from the dual ball of $X$ , we have

[TABLE]

that is, $(x^{*},0)$ is a subgradient at $(\bar{x},\bar{y})$ of the sum of two convex functions on $\Omega$ : the continuous function $\Omega\ni(x,y)\mapsto\kappa\|y-\bar{y}\|$ and the indicator function of the set $\Omega$ , which is convex but not necessarily closed. Applying the convex sum rule [29, Theorem 3.39], we get

[TABLE]

Hence for any $x^{*}\in X^{*}$ with $\|x^{*}\|\leq 1$ there is $y^{*}\in[\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})]^{-1}(x^{*})$ with $\|y^{*}\|\leq\kappa$ . Thus $\|\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{-}\leq\kappa$ . Letting $\kappa\downarrow\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ we get the desired inequality.

Note that inequality (29) in Theorem 5.2 may be strict rather often. For instance, if the normal cone $\widehat{N}_{\mathop{\rm gph}\nolimits F}(\bar{x},\bar{y})$ is trivial, then $\|\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{-}=+\infty$ . Take, for example, $F:\mathbb{R}\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;\mathbb{R}$ defined by $F(x)=\{x,-x\}$ , $x\in\mathbb{R}$ . Then $\|DF(0\hskip 0.9pt|\hskip 0.9pt0)^{-1}\|^{\scriptscriptstyle+}=\mathop{\rm subreg}\nolimits(F;0\hskip 0.9pt|\hskip 0.9pt0)=1$ while $\|\widehat{D}^{*}F(0\hskip 0.9pt|\hskip 0.9pt0)^{-1}\|^{-}=+\infty$ . This particular example was also mentioned in the introduction to illustrate the differences among the regularity properties for set-valued mappings.
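The value $\mathop{\rm subreg}\nolimits(F;0\hskip 0.9pt|\hskip 0.9pt0)=1$ in this example can be checked numerically. The sketch below samples points near $0$ and records the largest ratio $|x-0|/\mathrm{dist}(0,F(x))$; the helper names `dist_to_set` and `estimate_subreg`, the radius, and the sampling grid are illustrative assumptions and are not taken from the paper.

```python
def F(x):
    """Values of the set-valued mapping F(x) = {x, -x}."""
    return {x, -x}


def dist_to_set(y, values):
    """Distance from the point y to a finite nonempty set of reals."""
    return min(abs(y - v) for v in values)


def estimate_subreg(mapping, x_bar, y_bar, radius=1e-2, samples=200):
    """Largest sampled ratio |x - x_bar| / dist(y_bar, mapping(x)) over x != x_bar.

    This is only a sampled lower estimate of the subregularity modulus.
    """
    worst = 0.0
    for i in range(1, samples + 1):
        for sign in (-1.0, 1.0):
            x = x_bar + sign * radius * i / samples
            d = dist_to_set(y_bar, mapping(x))
            if d == 0.0:
                # A sampled x != x_bar solves y_bar in mapping(x):
                # no finite constant can work on this sample.
                return float("inf")
            worst = max(worst, abs(x - x_bar) / d)
    return worst


kappa = estimate_subreg(F, 0.0, 0.0)
print(kappa)
```

Since $\mathrm{dist}(0,F(x))=|x|$ for this mapping, every sampled ratio equals one, in agreement with the value stated above.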

Suppose that $X$ is finite-dimensional. Combining Theorem 5.2 and Theorem 5.1, we get that for any $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ with $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F,$

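$$\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}\leq\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})\leq\|\widehat{D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{-}.$$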

For any two positively homogeneous mappings $H_{1}$ , $H_{2}:Y\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;X$ such that $\mathop{\rm gph}\nolimits H_{1}\subset\mathop{\rm gph}\nolimits H_{2}$ we have $\|H_{2}\|^{-}\leq\|H_{1}\|^{-}$ . Hence one could expect that, by taking a coderivative of $F$ at $(\bar{x},\bar{y})$ based on a bigger normal cone than the Fréchet one, we can make its inner norm equal to $\|DF(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})^{-1}\|^{\scriptscriptstyle+}$ and, therefore, to $\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ . In finite dimensions, a candidate for that to happen could be the limiting coderivative ${D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y}):\mathbb{R}^{m}\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;\mathbb{R}^{n}$ with values

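$${D}^{*}F(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})(y^{*}):=\big\{x^{*}\in\mathbb{R}^{n}\,\big|\,(x^{*},-y^{*})\in N_{\mathop{\rm gph}\nolimits F}(\bar{x},\bar{y})\big\},\quad y^{*}\in\mathbb{R}^{m},$$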

where the limiting normal cone $N_{\Omega}(\bar{z})$ to $\Omega\subset\mathbb{R}^{d}$ at $\bar{z}\in\Omega$ is the collection of vectors $w\in\mathbb{R}^{d}$ such that there are sequences $\{w_{k}\}$ in $\mathbb{R}^{d}$ and $\{z_{k}\}$ in $\Omega$ converging to $w$ and $\bar{z}$ , respectively, such that $w_{k}\in\widehat{N}_{\Omega}(z_{k})$ for each $k\in{\bf N}$ . However, the limiting coderivative cannot provide a criterion for strong subregularity, in general. As a counterexample, let $F:\mathbb{R}\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;\mathbb{R}$ be defined by $\mathop{\rm gph}\nolimits F=\{(1/k,0):k\in{\bf N}\}\cup\{(0,0)\}$ . Then $F$ is not strongly subregular at $0$ for $0$ and $\|DF(0\hskip 0.9pt|\hskip 0.9pt0)^{-1}\|^{\scriptscriptstyle+}=\|\widehat{D}^{*}F(0\hskip 0.9pt|\hskip 0.9pt0)^{-1}\|^{-}=\mathop{\rm subreg}\nolimits(F;0\hskip 0.9pt|\hskip 0.9pt0)=+\infty$ , but $N_{\mathop{\rm gph}\nolimits F}(0,0)=\mathbb{R}^{2}$ which means that $\|{D}^{*}F(0\hskip 0.9pt|\hskip 0.9pt0)^{-1}\|^{-}$ is finite.
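The failure of strong subregularity in this counterexample comes from the fact that $0$ is not an isolated point of $F^{-1}(0)$. The following sketch makes this concrete using exact rational arithmetic; the helper `preimage_points_near_zero` is a hypothetical name introduced only for illustration.

```python
from fractions import Fraction


def F(x):
    """F(x) = {0} if x = 0 or x = 1/k for a positive integer k, else the empty set."""
    x = Fraction(x)
    if x == 0 or (x > 0 and x.numerator == 1):
        return {Fraction(0)}
    return set()


def preimage_points_near_zero(radius, count=5):
    """Return `count` points x with 0 in F(x) and 0 < x < radius."""
    k = int(1 / Fraction(radius)) + 1  # smallest integer k with 1/k < radius
    return [Fraction(1, k + j) for j in range(count)]


# For every radius there are points x != 0 with 0 in F(x), so the inequality
# |x - 0| <= kappa * dist(0, F(x)) = 0 fails for every finite kappa.
pts = preimage_points_near_zero(Fraction(1, 100))
print(pts)
```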

Given $\varrho>0$ , we consider an equivalent norm in the product space $X\times Y$ defined by

[TABLE]

Now we present a necessary and sufficient condition for strong subregularity similar to the statement by Fabian and Preiss [19] guaranteeing that a set-valued mapping is open with a linear rate at a reference point. Note that this statement was proved independently by Ioffe [24] who showed that it implies openness with a linear rate around the reference point.

Theorem 5.3.

Consider a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ the graph of which is locally closed at $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ . Then $|F|^{\downarrow}(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ equals the supremum of $\tau>0$ for which there exists $\varrho>0$ such that for any $(x,y)\in\mathop{\rm gph}\nolimits F$ with $0<\|x-\bar{x}\|<\varrho$ and $\|y-\bar{y}\|<\varrho$ , one can find a point $(u,v)\in\mathop{\rm gph}\nolimits F\setminus\{(x,y)\}$ satisfying

[TABLE]

Proof.

Denote by $s$ the supremum from the statement and let $\ell:=|F|^{\downarrow}(\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y})$ .

First, we show that $\ell\leq s$ . If $\ell=0$ , the inequality holds trivially. If not then fix any $\tau\in(0,\ell)$ . Find $\varrho\in(0,1/\tau)$ such that

[TABLE]

Fix an arbitrary $(x,y)\in\mathop{\rm gph}\nolimits F$ with $0<\|x-\bar{x}\|<\varrho$ and $\|y-\bar{y}\|<\varrho$ . Then $(u,v):=(\bar{x},\bar{y})$ is distinct from $(x,y)$ and (35) implies that

[TABLE]

Hence $y\neq\bar{y}$ . As $\tau\varrho<1$ , we have $\|y-\bar{y}\|>\tau\varrho\|y-\bar{y}\|=\tau\varrho\|v-y\|$ . Noting that $\|y-\bar{y}\|-\|v-\bar{y}\|=\|y-\bar{y}\|$ , we arrive at (34). Thus $s\geq\tau$ . The claimed inequality follows after letting $\tau\uparrow\ell$ .

To show that $\ell=s$ , assume on the contrary that $\ell<s$ . Choose $\delta\in(0,1)$ such that the set $M:=\mathop{\rm gph}\nolimits F\cap({I\kern-3.15005ptB}_{\delta}(\bar{x})\times{I\kern-3.15005ptB}_{\delta}(\bar{y}))$ is closed in $X\times Y$ . Fix any $\tau\in(\ell,s)$ and then pick $\tau^{\prime}\in(\ell,\tau)$ . Let $\varrho\in(0,\delta/2)$ be arbitrary, and set

[TABLE]

As $\tau^{\prime}>\ell$ , there is $z\in{I\kern-3.15005ptB}_{\eta}(\bar{x})$ different from $\bar{x}$ and $w\in F(z)$ such that

[TABLE]

Consider a function $(u,v)\mapsto\|v-\bar{y}\|$ on a complete metric space $(M,\|\cdot\|_{\varrho})$ . Applying to this function the Ekeland variational principle [4, Theorem 7.1.2] with

[TABLE]

we find a point $(x,y)\in M$ such that

[TABLE]

Using (36), (37), (38) and (39) we have

[TABLE]

Thus we have $0<\|x-\bar{x}\|<\varrho$ and $\|y-\bar{y}\|<\varrho$ , and, as $\varrho<1$ , also that

[TABLE]

Since (38) means that $\varepsilon/\lambda=\tau$ , from (40) we get

[TABLE]

If $(u,v)\in\mathop{\rm gph}\nolimits F\setminus M$ , then, by (41),

[TABLE]

which in combination with (39), (37), and (36) implies that

[TABLE]

Summarizing, we have shown that for every $\tau\in(\ell,s)$ and every $\varrho\in(0,\delta/2)$ there exists $(x,y)\in\mathop{\rm gph}\nolimits F$ with $0<\|x-\bar{x}\|<\varrho$ and $\|y-\bar{y}\|<\varrho$ such that no point $(u,v)\in\mathop{\rm gph}\nolimits F$ can satisfy (34). Hence $s$ cannot be strictly greater than $\ell$ , a contradiction.

We immediately get a statement characterizing strong subregularity via local and nonlocal slopes/rates of descent.

Corollary 5.4.

Consider a mapping $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ the graph of which is locally closed at $(\bar{x},\bar{y})\in\mathop{\rm gph}\nolimits F$ . Then $F$ is strongly subregular at $\bar{x}$ for $\bar{y}$ if and only if

[TABLE]

Moreover, the limit in (42) coincides with $(\mathop{\rm subreg}\nolimits(F;\bar{x}\hskip 0.9pt|\hskip 0.9pt\bar{y}))^{-1}$ .

The limit (42) is taken in the product space $X\times Y$ and involves all points $(x,y)\in\mathop{\rm gph}\nolimits F$ near $(\bar{x},\bar{y})$ excluding those with $x=\bar{x}$ (external points). At every such point a kind of (nonlocal) descent rate is computed for the distance from $y$ to $\bar{y}$ and can be underestimated by the corresponding easier to compute infinitesimal quantities:

[TABLE]

By analogy with the strong slope by De Giorgi, Marino, and Tosques [8], the quantity on the right-hand side of (43) can be interpreted as a kind of slope of $F$ at $(x,y)\in\mathop{\rm gph}\nolimits F$ (cf. [26]). It is easy to check that, when $\mathop{\rm gph}\nolimits F$ is convex, (43) holds as equality.

6 The Newton method

We study the Newton method for solving the generalized equation

$$f(x)+F(x)\ni 0, \tag{44}$$

where both $X$ and $Y$ are Banach spaces, $f:{X}\to{Y}$ is a function, and $F:X\;{\lower 1.0pt\hbox{$ \rightarrow $}}\kern-12.0pt\hbox{\raise 2.8pt\hbox{$ \rightarrow $}}\;Y$ is a set-valued mapping. Provided that $f$ is Fréchet differentiable, the Newton iteration applied to (44) has the form

$$f(x_{k})+Df(x_{k})(x_{k+1}-x_{k})+F(x_{k+1})\ni 0,\quad k=0,1,\dots \tag{45}$$

In [15, Chapter 6] several results are presented regarding the method (45) under (strong) metric (sub)regularity. In the following subsections we extend some of these results and add new ones.

6.1 Convergence

The following theorem reveals the mode of convergence of the iteration (45) under strong subregularity of the mapping in (44). It improves [15, Theorem 6E.2].

Theorem 6.1.

Suppose that the function $f$ is Fréchet differentiable around a solution $\bar{x}$ of (44) and the derivative mapping $Df$ is continuous at $\bar{x}$ . Also suppose that the mapping $f+F$ is strongly subregular at $\bar{x}$ for $0$ . Then there exists a neighborhood $O$ of $\bar{x}$ such that if a sequence $\{x_{k}\}$ is generated by the Newton method (45) and has a tail $\{x_{k}\}_{k\geq k_{0}}$ with $x_{k}\in O$ for all $k\geq k_{0}$ , then $\{x_{k}\}$ is superlinearly convergent to $\bar{x}$ .

Proof.

The continuous differentiability of $f$ implies that for each $\mu>0$ there is $\delta>0$ such that

[TABLE]

By the strong subregularity of $f+F$ , there are positive constants $\kappa$ and $a$ such that

[TABLE]

Let $\delta>0$ be such that (46) holds with $\mu:=1/(3\kappa)$ and set $O={I\kern-3.15005ptB}_{a}(\bar{x})\cap{I\kern-3.15005ptB}_{\delta}(\bar{x})$ . Let $\{x_{k}\}$ be any sequence generated by the Newton method (45) for which there is $k_{0}\in{\bf N}$ such that $x_{k}\in O$ for all $k\geq k_{0}$ . For any $k\geq k_{0}$ we have $f(x_{k+1})-f(x_{k})-Df(x_{k})(x_{k+1}-x_{k})\in f(x_{k+1})+F(x_{k+1})$ and thus

[TABLE]

Therefore $\|x_{k+1}-\bar{x}\|\leq 2^{-1}\|x_{k}-\bar{x}\|$ for each $k\geq k_{0}$ . Hence $x_{k}\to\bar{x}$ as $k\to+\infty$ . To see the rate of convergence, let $\varepsilon>0$ be arbitrary. Find $r>0$ such that ${I\kern-3.15005ptB}_{r}(\bar{x})\subset O$ and (46) holds with $\mu=\varepsilon/(\kappa(1+\varepsilon))$ and $\delta=r$ . Then there is $k_{1}\in{\bf N}$ such that $x_{k}\in{I\kern-3.15005ptB}_{r}(\bar{x})$ whenever $k>k_{1}$ . As above, for such an index $k$ , we get

[TABLE]

Therefore for any $k>k_{1}$ we have $\|x_{k+1}-\bar{x}\|\leq\varepsilon\|x_{k}-\bar{x}\|$ . Hence $x_{k}\to\bar{x}$ superlinearly.

Clearly, the theorem above can be equivalently stated with the assumption that the entire sequence $\{x_{k}\}$ belongs to $O$ ; the statement we choose adds some information which can be meaningful numerically.
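To illustrate the iteration (45) and the superlinear rate, here is a minimal one-dimensional sketch. It assumes that $F$ is the normal cone mapping to $[0,+\infty)$ (so that (44) is a complementarity problem) and uses the test function $f(x)=e^{x}-2$; both choices, as well as the function name `newton_complementarity`, are illustrative assumptions and do not come from the paper. In this setting, when $f'(x_{k})>0$, the linearized inclusion has the unique solution $\max\{0,\,x_{k}-f(x_{k})/f'(x_{k})\}$, and strong subregularity of $f+F$ at $\bar{x}=\ln 2$ for $0$ can be checked directly because $f'(\bar{x})=2\neq 0$ and $\bar{x}>0$.

```python
import math


def newton_complementarity(f, df, x0, iterations=8):
    """Newton iteration for the inclusion 0 in f(x) + N(x), N = normal cone to [0, +inf).

    Each step solves the linearized inclusion
        0 in f(x_k) + f'(x_k) (x - x_k) + N(x),
    whose unique solution, when f'(x_k) > 0, is max(0, x_k - f(x_k) / f'(x_k)).
    """
    xs = [x0]
    for _ in range(iterations):
        x = xs[-1]
        xs.append(max(0.0, x - f(x) / df(x)))
    return xs


def f(x):
    return math.exp(x) - 2.0  # assumed test function


def df(x):
    return math.exp(x)


x_bar = math.log(2.0)  # solution: f(x_bar) = 0 and x_bar > 0

xs = newton_complementarity(f, df, x0=2.0)
errors = [abs(x - x_bar) for x in xs]
# Successive ratios e_{k+1} / e_k; superlinear convergence means they tend to 0.
ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:]) if e0 > 1e-9]
print(ratios)
```

The printed ratios decrease toward zero, which is the behavior described in Theorem 6.1; the example only illustrates the statement and is not a substitute for the proof.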

Our next theorem extends the result just presented to the case of strong $q$ -subregularity.

Theorem 6.2.

Assume that $Df$ is Hölder continuous around $\bar{x}$ with an exponent $\alpha\in(0,1]$ and that $f+F$ is strongly $q$ -subregular at $\bar{x}$ for $\bar{y}$ with $q\geq 1$ . Then there exists a neighborhood $O$ of $\bar{x}$ such that if a sequence $\{x_{k}\}$ is generated by the Newton method (45) and has a tail $\{x_{k}\}_{k\geq k_{0}}$ with $x_{k}\in O$ for all $k\geq k_{0}$ , then $\{x_{k}\}$ is convergent to $\bar{x}$ with convergence rate $q(1+\alpha)$ .

Proof.

The assumptions of Theorem 6.1 are satisfied, hence, for a neighborhood $O$ of $\bar{x}$ , if $\{x_{k}\}$ has a tail in $O$ , then $x_{k}\to\bar{x}$ as $k\to+\infty$ . Using standard calculus, we find $r>0$ and $L>0$ such that

[TABLE]

In view of Theorem 4.2, adjust $r$ , if necessary, and choose a constant $\lambda>0$ such that

[TABLE]

Let $N\subset{\bf N}$ be any infinite set for which $x_{k}\in{I\kern-3.15005ptB}_{r}(\bar{x})$ for all $k\in N$ . Fix $k\in N$ . Using the inclusion

[TABLE]

we obtain

[TABLE]

This gives us the desired convergence rate.

6.2 Inexact quasi-Newton method

In this subsection we consider an inexact version of the Newton method (45) for solving (44) of the form

[TABLE]

where $\{B_{k}\}$ is a sequence in ${\cal L}(X,Y)$ which represents an approximation of the derivative of $f$ provided by, for example, the Broyden update, BFGS, and the like. The sequence of functions $r_{k}:X\to Y$ represents inexactness. The following theorem extends Theorem 6.1 to the iteration (48) and can be regarded as a version of the Dennis-Moré theorem for generalized equations; for related results see [10]:

Theorem 6.3.

Suppose that the function $f$ is Fréchet differentiable at a solution $\bar{x}$ of (44) and the mapping $f+F$ is strongly subregular at $\bar{x}$ for $0$ . Then there exists a neighborhood $O$ of $\bar{x}$ such that if a sequence $\{x_{k}\}$ is generated by the method (48), has a tail in $O$ and also

[TABLE]

then $\{x_{k}\}$ is superlinearly convergent to $\bar{x}$ .

Proof.

By the definition of the Fréchet differentiability of $f$ at $\bar{x}$ , for each $\mu>0$ there is $\delta>0$ such that

[TABLE]

Corollary 2.2 implies that $f+F$ is strongly subregular at $\bar{x}$ for $0$ if and only if so is $H:=f(\bar{x})+Df(\bar{x})(\cdot-\bar{x})+F$ , hence there are positive constants $\kappa$ and $a$ such that

[TABLE]

Let $\delta>0$ be such that (50) holds with $\mu:=1/(4\kappa)$ and set $O={I\kern-3.15005ptB}_{a}(\bar{x})\cap{I\kern-3.15005ptB}_{\delta}(\bar{x})$ . Let $\{x_{k}\}$ be any sequence generated by (48) for which there is $k_{0}\in{\bf N}$ such that $x_{k}\in O$ for all $k\geq k_{0}$ and (49) holds. Make $k_{0}$ bigger, if necessary, to have

[TABLE]

For any $k\geq k_{0}$ we have

[TABLE]

and thus the combination of (50) and (51) implies that

[TABLE]

Therefore $\|x_{k+1}-\bar{x}\|\leq(2/3)\|x_{k}-\bar{x}\|$ for each $k\geq k_{0}$ . Hence $x_{k}\to\bar{x}$ as $k\to+\infty$ . To estimate the rate of convergence, let $\varepsilon>0$ be arbitrary. Find $r>0$ such that ${I\kern-3.15005ptB}_{r}(\bar{x})\subset O$ and (50) holds with $\mu:=\varepsilon/(\kappa(2+\varepsilon))$ and $\delta:=r$ . Then there is $k_{1}\in{\bf N}$ such that $x_{k}\in{I\kern-3.15005ptB}_{r}(\bar{x})$ and

[TABLE]

As in preceding lines, for such an index $k$ we get

[TABLE]

Therefore for any $k>k_{1}$ we have $\|x_{k+1}-\bar{x}\|\leq\varepsilon\|x_{k}-\bar{x}\|$ . Hence $x_{k}\to\bar{x}$ superlinearly.

In the same way, by mimicking Theorem 6.2 one can obtain a statement analogous to Theorem 6.3 for a strongly $q$ -subregular mapping, extending a result in [28].
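A quasi-Newton iteration of this kind can be sketched in one dimension as follows. The sketch drops the inexactness functions $r_{k}$, takes $F$ to be the normal cone mapping to $[0,+\infty)$, and uses secant slopes for $B_{k}$; all three choices, the test function $f(x)=e^{x}-2$, and the name `secant_quasi_newton` are assumptions made here for illustration only. The stopping rules are tailored to this test problem, whose solution lies in the interior of $[0,+\infty)$.

```python
import math


def secant_quasi_newton(f, x_prev, x_curr, max_iter=30, tol=1e-14):
    """Secant-type quasi-Newton iteration for 0 in f(x) + N(x), N = normal cone to [0, +inf).

    B_k is the secant slope through the last two iterates; when B_k > 0 the
    step max(0, x_k - f(x_k) / B_k) solves 0 in f(x_k) + B_k (x - x_k) + N(x).
    """
    xs = [x_prev, x_curr]
    for _ in range(max_iter):
        x_prev, x_curr = xs[-2], xs[-1]
        fx = f(x_curr)
        if abs(fx) <= tol or x_curr == x_prev:
            break  # residual is negligible or the iterates have stalled
        b_k = (fx - f(x_prev)) / (x_curr - x_prev)
        if b_k <= 0.0:
            break  # the closed-form step above assumes B_k > 0
        xs.append(max(0.0, x_curr - fx / b_k))
    return xs


def f(x):
    return math.exp(x) - 2.0  # assumed test function


x_bar = math.log(2.0)
xs = secant_quasi_newton(f, 2.0, 1.5)
errors = [abs(x - x_bar) for x in xs]
print(errors)
```

The errors shrink faster than any fixed linear rate, which is consistent with the superlinear convergence asserted for quasi-Newton sequences whose matrices $B_{k}$ approximate the derivative well enough along the iterates.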

6.3 Semismooth Newton method

We continue our study of the Newton method for solving the generalized equation (44) where $f:\mathbb{R}^{n}\to{\mathbb{R}^{m}}$ is Lipschitz continuous but not necessarily differentiable around a reference solution $\bar{x}$ . To deal with a Newton-type iteration we use the “linearization” of $f+F$ at $\bar{x}$ of the form given by the mapping (11) where the matrix $A$ is an arbitrarily chosen element of Clarke’s generalized Jacobian. We consider the following version of Newton’s iteration: given $x_{k}$ choose $A_{k}\in{\partial_{C}}f(x_{k})$ and then find $x_{k+1}$ which satisfies

$$f(x_{k})+A_{k}(x_{k+1}-x_{k})+F(x_{k+1})\ni 0. \tag{52}$$

When the function $f$ in (44) is semismooth (see the paragraph before Corollary 3.6 for the definition), this method is usually referred to as the semismooth Newton method. Note that in the theorem below we assume that $f$ possesses the semismoothness property but do not use the directional differentiability of $f$ which appears in its definition.

Theorem 6.4.

Consider the method (52) applied to (44) with a solution $\bar{x}$ for a function $f$ which is semismooth at $\bar{x}$ and assume that for each $A\in\partial_{C}f(\bar{x})$ the mapping $H_{A}$ defined in (11) is strongly subregular at $\bar{x}$ for $0$ . Then there exists a neighborhood $O$ of $\bar{x}$ such that if a sequence $\{x_{k}\}$ is generated by (52) and has a tail $\{x_{k}\}_{k\geq k_{0}}$ with $x_{k}\in O$ for all $k\geq k_{0}$ , then $\{x_{k}\}$ is superlinearly convergent to $\bar{x}$ .

Proof.

First we show that there are positive constants $\lambda$ and $a$ such that

[TABLE]

Since the set $\partial_{C}f(\bar{x})$ is compact, there exists a constant $\kappa>\sup_{A\in\partial_{C}f(\bar{x})}\mathop{\rm subreg}\nolimits(H_{A};\bar{x}\hskip 0.9pt|\hskip 0.9pt0)$ (cf. the proof of (14)). Fix any $\gamma\in(0,1/(2\kappa))$ . The mapping $\partial_{C}f$ is outer semicontinuous at $\bar{x}$ , hence there exists $r>0$ such that

[TABLE]

Compactness of the set $\partial_{C}f(\bar{x})$ implies that there is a finite set $\mathcal{A}\subset\partial_{C}f(\bar{x})$ such that $\partial_{C}f(\bar{x})\subset\mathcal{A}+\gamma{I\kern-3.15005ptB}$ . Hence

[TABLE]

Given $A\in\mathcal{A}$ there exists $\alpha_{A}\in(0,r)$ such that the mapping $H_{A}$ is strongly subregular at $\bar{x}$ for $0$ with the constant $\kappa$ and neighborhood ${I\kern-3.15005ptB}_{\alpha_{A}}(\bar{x})$ . Let $a:=\min_{A\in\mathcal{A}}\alpha_{A}$ and $\lambda:=\kappa/(1-2\gamma\kappa)$ . Fix any $x$ and $A$ as in (53). As $a<r$ , using inclusion (54) we find $\bar{A}\in\mathcal{A}$ with $\|A-\bar{A}\|\leq 2\gamma$ . Therefore

[TABLE]

Since $2\gamma\kappa<1$ we get (53).

The semismoothness of $f$ implies that for each $\mu>0$ there is $\delta>0$ such that

[TABLE]

Let $\delta>0$ be such that (55) holds with $\mu=1/(2\lambda)$ and set $O={I\kern-3.15005ptB}_{a}(\bar{x})\cap{I\kern-3.15005ptB}_{\delta}(\bar{x})$ . Let $\{x_{k}\}$ be any sequence generated by (52) such that $x_{k}\in O$ for all $k\in{\bf N}$ . Fix any $k\in{\bf N}$ . As $f(\bar{x})-f(x_{k})+A_{k}(x_{k}-\bar{x})\in H_{A_{k}}(x_{k+1})$ and $A_{k}\in\partial_{C}f(x_{k})$ , using (53) and (55), we get

[TABLE]

Hence $x_{k}\to\bar{x}$ as $k\to+\infty$ . To establish the rate of convergence, let $\varepsilon>0$ be arbitrary. Find $r>0$ such that ${I\kern-3.15005ptB}_{r}(\bar{x})\subset O$ and (55) holds with $\mu=\varepsilon/\lambda$ and $\delta=r$ . Then there is $k_{1}\in{\bf N}$ such that $x_{k}\in{I\kern-3.15005ptB}_{r}(\bar{x})$ whenever $k>k_{1}$ . As above, for such an index $k$ , we get

[TABLE]

Hence $x_{k}\to\bar{x}$ superlinearly.

Remark 6.5.

In view of Corollary 3.2, the assumptions of the above theorem imply that the mapping $f+F$ is strongly subregular at $\bar{x}$ for $0$ .

If one considers (48) instead of (52), by using the above arguments one can obtain a slight generalization of [6, Theorem 3.2 (ii)].
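The semismooth Newton step can be illustrated on a scalar equation with a kink at the solution. The sketch below assumes $F\equiv\{0\}$ and the piecewise smooth test function $f(x)=2x+|x|+x^{2}$, whose zero is $\bar{x}=0$ and whose Clarke generalized Jacobian at $0$ is the interval $[1,3]$; these choices and the helper names are illustrative assumptions. Every $A\in[1,3]$ is nonzero, so each linearization $x\mapsto A(x-\bar{x})$ is strongly subregular at $\bar{x}$ for $0$. The selection used away from the kink is only positive for $x>-1/2$, so the sketch is local.

```python
def f(x):
    """Assumed test function: piecewise smooth with a kink at its zero x = 0."""
    return 2.0 * x + abs(x) + x * x


def clarke_element(x):
    """One element of Clarke's generalized Jacobian of f at x (an interval in 1D)."""
    if x > 0.0:
        return 3.0 + 2.0 * x
    if x < 0.0:
        return 1.0 + 2.0 * x
    return 2.0  # any value in [1, 3] is admissible at the kink


def semismooth_newton(x0, iterations=8):
    """Iterate x_{k+1} = x_k - f(x_k) / A_k with A_k chosen from the Clarke Jacobian at x_k."""
    xs = [x0]
    for _ in range(iterations):
        x = xs[-1]
        xs.append(x - f(x) / clarke_element(x))
    return xs


xs = semismooth_newton(0.5)
print([abs(x) for x in xs])
```

For $x_{k}>0$ the step reduces to $x_{k+1}=x_{k}^{2}/(3+2x_{k})$, so the error is roughly squared at each iteration, in line with the superlinear convergence stated in Theorem 6.4.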

6.4 Strong subregularity of Newton sequences

Denote by $\ell_{\infty}$ the space of (infinite) sequences $\{x_{k}\}$ in $X$ with elements $x_{1}$ , $x_{2}$ , $\dots$ , $x_{k}$ , $\dots$ equipped with the norm $\|\{x_{k}\}\|_{\infty}=\sup_{k\in{\bf N}}\|x_{k}\|.$ Consider the mapping

[TABLE]

that is, ${\cal S}(p,u)$ is the set of all sequences generated by the (perturbed) Newton method starting from the point $u$ . Note that if $(x,p)\in\mathop{\rm gph}\nolimits(f+F)$ , then the constant sequence $\{x\}\in{\cal S}(p,x)$ . In particular, if $\bar{x}$ is a solution of (44), then $\{\bar{x}\}\in{\cal S}(0,\bar{x})$ .

Theorem 6.6.

Suppose that $f$ is Fréchet differentiable around $\bar{x}$ and $Df$ is continuous at $\bar{x}$ . The mapping $f+F$ is strongly subregular at $\bar{x}$ for $0$ if and only if there is $\lambda>0$ such that for any $\gamma\in(0,1)$ there is $a>0$ with the property that for each $\{x_{k}\}\in{I\kern-3.15005ptB}_{a}(\{\bar{x}\})$ and each $(p,u)\in{\cal S}^{-1}(\{x_{k}\})\cap(Y\times{I\kern-3.15005ptB}_{a}(\bar{x}))$ we have

[TABLE]

In this case, the infimum of such constants $\lambda$ is equal to $\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt0)$ .

Proof.

Denote by $i$ the infimum of $\lambda>0$ such that for any $\gamma\in(0,1)$ there is $a>0$ such that inequality (56) holds for each $\{x_{k}\}\in{I\kern-3.15005ptB}_{a}(\{\bar{x}\})$ and each $(p,u)\in{\cal S}^{-1}(\{x_{k}\})\cap(Y\times{I\kern-3.15005ptB}_{a}(\bar{x}))$ .

First, assume that $i<+\infty$ and fix any $\lambda>i$ . Pick any $\gamma\in(0,1)$ . Then there is $a>0$ such that for each $\{x_{k}\}\in{I\kern-3.15005ptB}_{a}(\{\bar{x}\})$ and each $(p,u)\in{\cal S}^{-1}(\{x_{k}\})\cap(Y\times{I\kern-3.15005ptB}_{a}(\bar{x}))$ we have

[TABLE]

Let $x\in{I\kern-3.15005ptB}_{a}(\bar{x})$ be arbitrary. Pick arbitrary $p\in f(x)+F(x)$ (if any). Then the constant sequence $\{x\}\in{\cal S}(p,x)\cap{I\kern-3.15005ptB}_{a}(\{\bar{x}\})$ , hence it satisfies (57), that is

[TABLE]

which yields

[TABLE]

As $p\in f(x)+F(x)$ was arbitrary, we conclude that $f+F$ is strongly subregular at $\bar{x}$ for $0$ with the constant $\lambda/(1-\gamma)$ and neighborhood ${I\kern-3.15005ptB}_{a}(\bar{x})$ . Letting $\gamma\downarrow 0$ we get that $\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt0)\leq\lambda$ , and consequently $\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt0)\leq i$ .

Assume that $f+F$ is strongly subregular at $\bar{x}$ for $0$ . Fix any $\lambda>\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt0)$ and any $\gamma\in(0,1)$ . Without loss of generality assume that $\gamma$ is small enough to have that $\kappa:=\lambda(1-\gamma)/(1+\gamma)>\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt0)$ . Find $a>0$ such that

[TABLE]

Let $\mu:=\gamma/(\kappa(1+\gamma))$ . Continuous differentiability of $f$ implies that we can make $a$ smaller, if necessary, so that

[TABLE]

Fix any sequence $\{x_{k}\}\in{I\kern-3.15005ptB}_{a}(\{\bar{x}\})$ . Pick arbitrary $(p,u)\in\mathcal{S}^{-1}(\{x_{k}\})\cap(Y\times{I\kern-3.15005ptB}_{a}(\bar{x}))$ (if any). Note that

[TABLE]

Fix any index $k\in{\bf N}$ , then (58), (60), and (59) imply that

[TABLE]

Noting that $\gamma(1-\kappa\mu)=\kappa\mu$ and $\kappa\mu(1+\gamma)=\gamma$ , we get

[TABLE]

We claim that

[TABLE]

Indeed, as $x_{0}=u$ , (61) with $k=1$ is (62) for $k=1$ . We proceed by induction: assume that (62) holds for some $k:=k_{0}\in{\bf N}$ . This and (61) with $k=k_{0}+1$ imply that

[TABLE]

which is (62) for $k:=k_{0}+1$ . Inequality (62) is proved. Noting that $\gamma<1$ we have

[TABLE]

As $(p,u)\in{\cal S}^{-1}(\{x_{k}\})\cap(Y\times{I\kern-3.15005ptB}_{a}(\bar{x}))$ was arbitrary, the mapping ${\cal S}^{-1}$ is strongly subregular at $\{\bar{x}\}$ for $(0,\bar{x})$ and (56) holds. Clearly, $i\leq\lambda$ , hence $i\leq\mathop{\rm subreg}\nolimits(f+F;\bar{x}\hskip 0.9pt|\hskip 0.9pt0)$ .

7 Applications to optimization

7.1 Nonlinear programming

In this subsection we study strong subregularity of a mapping which plays a major role in the nonlinear programming problem

[TABLE]

subject to equality and inequality constraints:

[TABLE]

where the functions $g_{i}:\mathbb{R}^{n}\to\mathbb{R}$ , $i=0,1,\dots,m$ are twice continuously differentiable everywhere. Under a constraint qualification condition which will be specified a bit later, the first-order necessary optimality condition is represented by the Karush-Kuhn-Tucker (KKT) system

[TABLE]

where

$$L(x,y)=g_{0}(x)+\sum_{i=1}^{m}y_{i}g_{i}(x)$$

is the Lagrangian associated with the problem (63); here $y=(y_{1},\ldots,y_{m})$ is the vector of Lagrange multipliers. We study the strong subregularity of the following mapping associated with the KKT system (65):

[TABLE]

Let $(\bar{x},\bar{y})$ be a reference solution of (65). Define the index sets

[TABLE]

In what follows we use the following condition:

[TABLE]

This condition implies the well-known Mangasarian-Fromovitz Constraint Qualification (MFCQ) condition, in which the set $I_{2}$ is replaced by $I_{1}\cup I_{2}$ . As well known, the MFCQ yields that the set of Lagrange multipliers for problem (63) satisfying (65) is nonempty, convex and compact. The condition (67) was introduced in [27] under the name Strict Mangasarian-Fromovitz Constraint Qualification. This name however does not reflect the nature of the condition since the latter is a condition on the optimality system while MFCQ is a condition on the constraint mapping; actually, MFCQ is equivalent to the metric regularity of that mapping. Condition (67) implies that the set of Lagrange multipliers consists of a single point; we will give a proof of this claim in the proof of the next theorem.

Denote $A=\nabla^{2}_{xx}L(\bar{x},\bar{y})$ and $B=\nabla^{2}_{xy}L(\bar{x},\bar{y})$; that is, $B$ is the $m\times n$ matrix whose rows are the vectors $\nabla g_{i}(\bar{x})$, $i=1,2,\dots,m$. Define the so-called critical cone

[TABLE]

Recall that the second-order necessary condition for local optimality has the form

[TABLE]

while the second-order sufficient condition is

[TABLE]

Now we are ready to state the main result of this subsection.

Theorem 7.1.

The following are equivalent: (i) The conditions (67) and (69) are both satisfied; (ii) The KKT mapping $T$ defined in (66) is strongly subregular at $(\bar{x},\bar{y})$ for [math] and $\bar{x}$ is a strong local minimizer of (63), meaning that there is a neighborhood $U$ of $\bar{x}$ and a constant $\beta>0$ such that

[TABLE]

where $C:=\{x\in\mathbb{R}^{n}\,\big|\,\mbox{(64) is satisfied}\}$.

Proof.

Linearizing the functions appearing in the mapping (66) at $(\bar{x},\bar{y})$ we obtain the mapping

[TABLE]

where we take into account that $\nabla_{x}L(\bar{x},\bar{y})=0$ and $g_{i}(\bar{x})=0,i\in I_{1}\cup I_{2}$ , and use the notation

[TABLE]

in which $g_{I}$ is a vector with components $g_{i}$, $i\in I$. We can now apply Theorem 2.6, according to which the mapping $T$ in (66) is strongly subregular at $(\bar{x},\bar{y})$ for [math] if and only if the mapping $L$ defined in (70) has the same property. The graph of the mapping $L$ is the union of finitely many polyhedral convex sets; hence the strong subregularity of $T$ is equivalent to the property that the vector $(\bar{x},\bar{y})$ is an isolated point in $L^{-1}(0)$.

Without loss of generality suppose that $I_{1}=\{1,2,\dots,s_{1}\}$ and $I_{2}=\{s_{1}+1,\dots,s_{2}\}$ . Denote by $B_{1}$ and $B_{2}$ the submatrices of $B$ corresponding to the index sets $I_{1}$ and $I_{2}$ , respectively; that is, the rows of $B_{1}$ are the vectors $\nabla g_{i}(\bar{x}),i=1,2,\dots,s_{1}$ , and analogously for $B_{2}$ .

Let (i) hold. We will now show that $(0,0)$ is the unique solution of the variational inequality

[TABLE]

where $y_{I_{2}}$ is the subvector of $y$ whose components have indices in $I_{2}$ and $\mathbb{R}^{I_{2}}_{+}$ is the set of vectors $y_{I_{2}}$ with nonnegative components. Suppose that the mapping $T$ is not strongly subregular at $(\bar{x},\bar{y})$ for [math]. Then there is a nonzero vector $(x,y)$ satisfying (71)–(73). Assume that $x\neq 0$. Multiplying (71) by $x$ and taking into account (72) and (73), we obtain $\langle x,Ax\rangle=0$, which contradicts (69). Hence $x=0$. But then there exists a nonzero $y\in\mathbb{R}^{m}$ such that $B^{T}y=0$ and $0\in N_{\mathbb{R}_{+}^{I_{2}}}(y_{I_{2}})$, hence $y_{I_{2}}\geq 0$. This contradicts (67). Thus the mapping $T$ in (66) is strongly subregular at $(\bar{x},\bar{y})$ for [math]. It is a standard fact that when $(\bar{x},\bar{y})$ satisfies (65) and the second-order sufficient condition (69), then $\bar{x}$ is a strong local solution of problem (63). Hence, (ii) is established.

In the opposite direction, suppose that the conditions in (ii) are satisfied. Then from the analysis at the beginning of the proof we conclude that the vector $(\bar{x},\bar{y})$ is an isolated point in $L^{-1}(0)$. This in turn yields that $(0,0)$ is the unique solution of the variational inequality (71)–(73). But this immediately implies (67). Furthermore, from the assumed optimality of $\bar{x}$ the second-order necessary condition (68) holds:

[TABLE]

We only need to show that this inequality is strict. On the contrary, suppose that there exists a nonzero $x^{\prime}\in K$ such that $Ax^{\prime}=0$ . Then the nonzero vector $(x^{\prime},0)$ is a solution of (71)–(73), a contradiction. Hence the conditions in (i) are satisfied.

Theorem 7.1 partially extends [13, Theorem 2.6] with a new proof; in the latter theorem it is also shown that under the conditions in (i) there exist neighborhoods $U$ of $(\bar{x},\bar{y})$ and $V$ of [math] such that for every $v\in V$ the set $T^{-1}(v)\cap U$ is nonempty.

7.2 A radius theorem

A classical result, sometimes called the Eckart-Young theorem, says that for any nonsingular matrix $A\in\mathbb{R}^{n\times n}$ ,

[TABLE]

A far-reaching generalization of this result was proved in [16], see also [15, Theorem 6A.7], for the property of metric regularity of a set-valued mapping $F$ acting between Euclidean spaces. This result was later extended in [14, Theorem 5.12], see also [15, Theorem 6A.9], to the property of strong subregularity as follows:

Theorem 7.2.

Consider a mapping $F:\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{m}$ which is strongly subregular at $\bar{x}$ for $\bar{y}$. Then

[TABLE]

Moreover, the infimum remains unchanged when either taken with respect to linear mappings of rank 1 or enlarged to all functions $f$ that are calm at $\bar{x}$ , with $\|B\|$ replaced by the calmness modulus $\,\mathop{\rm clm}\nolimits(f;\bar{x})$ of $f$ at $\bar{x}$ .
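For the classical matrix case recalled at the beginning of this subsection, the radius and a rank-one perturbation attaining it can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes NumPy is available and uses the standard fact that the spectral-norm distance from a nonsingular matrix $A$ to the set of singular matrices is its smallest singular value, which equals $1/\|A^{-1}\|$. The function name `singularity_radius` and the test matrix are hypothetical choices made for illustration.

```python
import numpy as np

def singularity_radius(A):
    """Return (radius, B): radius is the smallest singular value of A and
    B is a rank-one matrix with spectral norm `radius` such that A + B is singular."""
    U, s, Vt = np.linalg.svd(A)            # A = U @ diag(s) @ Vt, s sorted descending
    sigma_min = s[-1]
    # Subtract the weakest singular direction of A (a rank-one perturbation).
    B = -sigma_min * np.outer(U[:, -1], Vt[-1, :])
    return sigma_min, B

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])                 # hypothetical nonsingular test matrix
radius, B = singularity_radius(A)

# Quantities that should agree with `radius` up to rounding.
inv_norm_reciprocal = 1.0 / np.linalg.norm(np.linalg.inv(A), 2)
perturbation_norm = np.linalg.norm(B, 2)
# Smallest singular value of the perturbed matrix; zero means A + B is singular.
perturbed_sigma_min = np.linalg.svd(A + B, compute_uv=False)[-1]
print(radius, inv_norm_reciprocal, perturbation_norm, perturbed_sigma_min)
```

The sketch only covers square matrices and the spectral norm; it says nothing about the set-valued setting of Theorem 7.2.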

Note that in Theorem 7.2 the perturbation is represented by an arbitrary linear and bounded mapping $B$ . In a number of cases, however, one should focus on mappings that have special structure. Such a situation arises in particular when one attempts to determine the “radius of good behavior” of an optimization problem. To be specific, consider the problem

[TABLE]

where $C$ is a nonempty polyhedral convex subset of $\mathbb{R}^{n}$ and $g:\mathbb{R}^{n}\to\mathbb{R}$ is twice continuously differentiable everywhere. The first-order necessary optimality condition for problem (74) has the form

[TABLE]

In the sequel the mapping $x\mapsto\nabla g(x)+N_{C}(x)$ is called the optimality mapping. Every solution of the variational inequality (75) is said to be a critical point. The critical cone at $\bar{x}$ for $-\nabla g(\bar{x})$ is defined as

[TABLE]

The second-order sufficient optimality condition for problem (74) has the form

[TABLE]

The following theorem is proved in [15, Theorem 4G.4]:

Theorem 7.3.

Let $\bar{x}$ be a critical point for (74). Then the following are equivalent: (a) the second-order sufficient condition (76) holds at $\bar{x}$; (b) the point $\bar{x}$ is a local minimizer for problem (74) and the optimality mapping $\nabla g+N_{C}$ is strongly subregular at $\bar{x}$ for [math].

In either case, $\bar{x}$ is actually a strong local minimizer.

We now apply this last result to obtain a radius theorem for problem (74). Let $\bar{x}$ be a local minimizer for (74). Along with (74) we consider the perturbed problem

[TABLE]

where $B\in\mathbb{R}^{n\times n}$ is a symmetric matrix which enters the quadratic form representing the perturbation.

Theorem 7.4.

Let $\bar{x}$ be a local minimizer for (74), let $A=\nabla^{2}g(\bar{x})$ and let $K$ be the associated critical cone, and suppose that the second-order sufficient condition (76) holds at $\bar{x}$. Then

[TABLE]

Proof.

From Theorem 7.3 the quantity on the left side of (78) is the same as the quantity

[TABLE]

Since the strong subregularity is stable under linearization, the optimality mapping $x\mapsto\nabla g(x)+B(x-\bar{x})+N_{C}(x)$ for (77) is not strongly subregular at $\bar{x}$ for [math] exactly when the mapping $x\mapsto\nabla g(\bar{x})+(A+B)(x-\bar{x})+N_{C}(x)$ is not strongly subregular at $\bar{x}$ for [math]. Then the quantity in (79) is the same as

[TABLE]

Since the critical cone $K$ remains the same for the perturbed problem (77), by Theorem 7.3 the latter quantity equals

[TABLE]

By assumption, $A$ is symmetric positive definite on the cone $K$ , thus we have

[TABLE]

Let this minimum be attained for some $\tilde{x}$ . The matrix

[TABLE]

is symmetric (and negative definite). We have

[TABLE]

hence $A+B$ is not positive definite on $K$ . Moreover,

[TABLE]

Thus

[TABLE]

To prove the opposite inequality, observe that for any $n\times n$ matrix $B$ and any $x\in K$ , $\|x\|=1$ , we have

[TABLE]

Then

[TABLE]

provided that

[TABLE]

Thus, for any symmetric $B$ such that $\|B\|<\sigma$, the matrix $A+B$ is positive definite on $K$. Hence, $i\geq\sigma$. Putting this together with (81) we obtain $i=\sigma$. This proves that the quantity in (80) equals the right side of (79).

Note that when $C=\mathbb{R}^{n}$ the right side of (79) equals the smallest eigenvalue of $A$, which, as is well known, is equal to the reciprocal of $\|A^{-1}\|$, and we come to the finite-dimensional version of the extension of the Eckart-Young theorem described in [36]: if $A$ is symmetric positive definite, then the smallest norm of a symmetric matrix $B$ such that $A+B$ is singular equals $1/\|A^{-1}\|$. If $C$ is a subspace, then the radius quantity becomes $1/\|(M^{T}AM)^{-1}\|$, where the columns of $M$ form an orthonormal basis of $C$.
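The two special cases just described can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `symmetric_radius` and the test data are hypothetical. It orthonormalizes the columns of a full-column-rank basis matrix by a QR factorization, computes the smallest eigenvalue of the compressed matrix, and builds one symmetric perturbation of that norm for which the quadratic form of $A+B$ vanishes at a unit vector of the subspace. The rank-one perturbation used here is one possible choice and need not coincide with the matrix used in the proof of Theorem 7.4.

```python
import numpy as np

def symmetric_radius(A, M=None):
    """Minimize <x, A x> over unit vectors x in the column space of M
    (the whole space if M is None). Returns (sigma, x, B), where B is a
    symmetric matrix with norm |sigma| and <x, (A + B) x> = 0."""
    n = A.shape[0]
    if M is None:
        Q = np.eye(n)
    else:
        Q, _ = np.linalg.qr(M)              # orthonormal basis of span(M); M has full column rank
    eigvals, eigvecs = np.linalg.eigh(Q.T @ A @ Q)   # eigenvalues in ascending order
    sigma = eigvals[0]
    x = Q @ eigvecs[:, 0]                   # unit minimizer inside the subspace
    B = -sigma * np.outer(x, x)             # symmetric, rank one, spectral norm |sigma|
    return sigma, x, B

A = np.diag([1.0, 4.0, 9.0])                # hypothetical symmetric positive definite matrix

# Whole space: the radius is the smallest eigenvalue, i.e. 1 / ||A^{-1}||.
sigma_full, _, _ = symmetric_radius(A)

# Subspace spanned by two non-orthonormal vectors (here it equals span{e2, e3}).
M = np.array([[0.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
sigma_sub, x_sub, B_sub = symmetric_radius(A, M)
quadratic_form_at_x = x_sub @ (A + B_sub) @ x_sub

# Using the non-orthonormal basis directly gives a different number.
naive_value = np.linalg.eigvalsh(M.T @ A @ M)[0]
print(sigma_full, sigma_sub, quadratic_form_at_x, naive_value)
```

The last quantity illustrates why the basis in the subspace formula should be orthonormal: with a general basis, the smallest eigenvalue of $M^{T}AM$ no longer equals the minimum of the quadratic form over unit vectors of the subspace.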

Finally, we note that various versions of Theorem 7.3 are available in the literature as mentioned in the Introduction. Theorem 7.4 is new.

7.3 Discrete approximations in optimal control

Consider the following optimal control problem with control constraints:

[TABLE]

subject to

[TABLE]

where $\varphi:\mathbb{R}^{n+m}\to\mathbb{R}$, $g:\mathbb{R}^{n+m}\to\mathbb{R}^{n}$, $U$ is a closed convex set in $\mathbb{R}^{m}$ of feasible control values, $\dot{y}$ denotes the derivative of the function $t\mapsto y(t)$ with respect to time $t$, and a.e. means almost every in the sense of Lebesgue measure. The admissible controls $u$ are functions in $L^{\infty}([0,1],\mathbb{R}^{m})$, the space of essentially bounded and measurable functions on $[0,1]$ with values in $\mathbb{R}^{m}$, and the state trajectories $y$ belong to $W^{1,\infty}_{0}([0,1],{\mathbb{R}^{n}})$, the space of Lipschitz continuous functions with weak derivatives in $L^{\infty}([0,1],\mathbb{R}^{n})$ and value zero at $t=0$. In the sequel we sometimes use the shortened notation $L^{\infty}(\mathbb{R}^{n})$ instead of $L^{\infty}([0,1],\mathbb{R}^{n})$, etc. We assume that problem (82) has a solution $(\bar{y},\bar{u})$ and also that there exist a closed set $\Delta\subset\mathbb{R}^{n}\times\mathbb{R}^{m}$ and a $\delta>0$ with $\mathbb{B}_{\delta}(\bar{y}(t),\bar{u}(t))\subset\Delta$ for almost every $t\in[0,1]$ such that the functions $\varphi$ and $g$ are twice continuously differentiable in an open set containing $\Delta$.

It is well known that, under some mild conditions which we will not reproduce here, the first-order necessary condition in normal form for a weak minimum, known as the Pontryagin maximum principle, at a solution $(\bar{y},\bar{u})$ of problem (82) can be expressed in terms of the Hamiltonian $H(y,u,p)=\varphi(y,u)+p^{T}g(y,u)$ in the following way: there exists $\bar{p}\in W^{1,\infty}(\mathbb{R}^{n})$, the so-called adjoint variable, such that $\bar{x}:=(\bar{y},\bar{u},\bar{p})$ is a solution of the following two-point boundary value problem coupled with a variational inequality that holds pointwise in $t$:

[TABLE]

for a.e. $t\in[0,1]$ where, as before, $N_{U}(u)$ is the normal cone to the set $U$ at the point $u$ . Denote $W_{1}^{1,\infty}(\mathbb{R}^{n})=\{p\in W^{1,\infty}(\mathbb{R}^{n})\mid p(1)=0\}$ , and let $X=W_{0}^{1,\infty}(\mathbb{R}^{n})\times W_{1}^{1,\infty}(\mathbb{R}^{n})\times L^{\infty}(\mathbb{R}^{m})$ and $Y=L^{\infty}(\mathbb{R}^{n})\times L^{\infty}(\mathbb{R}^{n})\times L^{\infty}(\mathbb{R}^{m})$ . Further, for $x=(y,u,p)$ let

[TABLE]

The optimality system (83) then takes the form of the generalized equation $0\in f(x)+F(x)$, where $f:X\to Y$ and $F:X\rightrightarrows Y$. Below we show that strong subregularity of the mapping $f+F$ described by (84) for the optimality system (83) provides a basis for obtaining an error estimate for a discrete approximation to this system.

Suppose that the optimality system (83) is solved inexactly by means of a numerical method applied to a discrete approximation provided by the Euler scheme. Specifically, let $N$ be a natural number, let $h=1/N$ be the mesh spacing, and let $t_{i}=ih$, $i\in\{0,1,\dots,N\}$. Denote by $PL^{N}_{0}(\mathbb{R}^{n})$ the space of piecewise linear and continuous functions $y_{N}$ over the grid $\{t_{i}\}$ with values in $\mathbb{R}^{n}$ and such that $y_{N}(0)=0$, by $PL^{N}_{1}(\mathbb{R}^{n})$ the space of piecewise linear and continuous functions $p_{N}$ over the grid $\{t_{i}\}$ with values in $\mathbb{R}^{n}$ and such that $p_{N}(1)=0$, and by $PC^{N}(\mathbb{R}^{m})$ the space of piecewise constant and continuous from the right functions over the grid $\{t_{i}\}$ with values in $\mathbb{R}^{m}$. Clearly, $PL^{N}_{0}(\mathbb{R}^{n})\subset W_{0}^{1,\infty}(\mathbb{R}^{n})$, $PL^{N}_{1}(\mathbb{R}^{n})\subset W_{1}^{1,\infty}(\mathbb{R}^{n})$ and $PC^{N}(\mathbb{R}^{m})\subset L^{\infty}(\mathbb{R}^{m})$. Then introduce the product $X^{N}=PL^{N}_{0}(\mathbb{R}^{n})\times PL^{N}_{1}(\mathbb{R}^{n})\times PC^{N}(\mathbb{R}^{m})$ as an approximation space for the triple $(y,u,p)$. We identify $y\in PL^{N}_{0}(\mathbb{R}^{n})$ with the vector $(y^{0},\ldots,y^{N})$ of its values at the mesh points, and similarly for the adjoint variable $p$, and $u\in PC^{N}(\mathbb{R}^{m})$ is regarded as the vector $(u^{0},\ldots,u^{N-1})$ of the values of $u$ in the mesh subintervals.

Now, suppose that, as a result of the computations, for certain natural $N$ a function $x_{N}=(y_{N},u_{N},p_{N})\in X^{N}$ is found that satisfies the discrete optimality system:

[TABLE]

for $i=0,1,\ldots,N-1$ . The system (85) represents the Euler discretization of the optimality system (83) with step-size $h=1/N$ .

Suppose that the mapping $f+F$, where $f$ and $F$ are described in (84), is strongly subregular at $\bar{x}$ for [math]. Then there exist positive scalars $a$ and $\kappa$ such that if $x_{N}\in\mathbb{B}_{a}(\bar{x})$, then

[TABLE]

where the right side of this inequality is the residual associated with the approximate solution $x_{N}$ . In our specific case, the residual can be estimated by the norm of a function $w_{N}\in Y$ defined for each $i\in\{0,1,\dots,N-1\}$ and $t\in[t_{i},t_{i+1})$ as follows:

[TABLE]

Thus, estimating the residual reduces to finding an estimate for the norm $\|w_{N}\|_{Y}$ . By the definition of the norm in $Y$ we obtain

[TABLE]

Observe that here $y_{N}$ is a piecewise linear function across the grid $\{t_{i}\}$ with uniformly bounded derivative, since both $y_{N}$ and $u_{N}$ are in some $L^{\infty}$ neighborhood of $\bar{y}$ and $\bar{u}$, respectively. Hence, taking into account that the functions $g$, ${\nabla\!}_{y}H$, and ${\nabla\!}_{u}H$ are continuously differentiable, this leads us to an estimate of order $O(1/N)$ for the error of the discretization. Specifically, we obtain the following result:

Theorem 7.5.

Assume that the optimality mapping $f+F$ associated with (83), where $f$ and $F$ are defined in (84), is strongly subregular at $\bar{x}=(\bar{y},\bar{u},\bar{p})$ for [math]. Then there exist $N_{0}\in\mathbb{N}$ and positive reals $a$ and $c$ such that if for an integer $N\geq N_{0}$ a solution $x_{N}=(y_{N},u_{N},p_{N})$ of the discrete optimality system (85) satisfies $\|x_{N}-\bar{x}\|_{X}\leq a$, then

[TABLE]
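The $O(1/N)$ behavior of the residual can be illustrated on the state equation alone. The following is a minimal sketch under stated assumptions, not the scheme analyzed above in full: it models only the explicit Euler recursion $y^{i+1}=y^{i}+h\,g(y^{i},u^{i})$ with $y^{0}=0$, it ignores the adjoint equation and the variational inequality, and it samples the quantity $\sup_{t}|\dot{y}_{N}(t)-g(y_{N}(t),u_{N}(t))|$ for the piecewise linear interpolant $y_{N}$ at finitely many points. The scalar right-hand side `g`, the control `math.cos`, and the function names are hypothetical.

```python
import math

def euler_states(g, u_values, h):
    """Explicit Euler recursion y^{i+1} = y^i + h * g(y^i, u^i) with y^0 = 0."""
    y = [0.0]
    for u_i in u_values:
        y.append(y[-1] + h * g(y[-1], u_i))
    return y

def state_residual(g, control, N, samples=10):
    """Sampled sup-norm of dy_N/dt - g(y_N, u_N), where y_N is the piecewise
    linear interpolant of the Euler states and u_N is piecewise constant."""
    h = 1.0 / N
    u_values = [control(i * h) for i in range(N)]   # u^i = control(t_i)
    y = euler_states(g, u_values, h)
    worst = 0.0
    for i in range(N):
        slope = (y[i + 1] - y[i]) / h               # derivative of y_N on [t_i, t_{i+1})
        for k in range(samples):
            s = k / samples                         # relative position inside the subinterval
            y_t = y[i] + s * h * slope              # value of y_N at t_i + s*h
            worst = max(worst, abs(slope - g(y_t, u_values[i])))
    return worst

def g(y, u):
    # Hypothetical smooth right-hand side, chosen only for illustration.
    return -y + u

r100 = state_residual(g, math.cos, 100)
r200 = state_residual(g, math.cos, 200)
print(r100, r200, r100 / r200)
```

Doubling $N$ roughly halves the sampled residual in this example, which is consistent with a first-order estimate; the sketch does not verify strong subregularity, which is an assumption of Theorem 7.5.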

We should note that the assumption of strong subregularity of the mapping associated with (83) and considered as a mapping from $X=W_{0}^{1,\infty}\times W_{1}^{1,\infty}\times L^{\infty}$ to $Y=L^{\infty}\times L^{\infty}\times L^{\infty}$ is quite strong. For example, it follows from the estimate (86) that if the reference optimal control $\bar{u}$ has a point of discontinuity in $t$, its piecewise constant discrete approximation $u_{N}$ must have a jump at the same point. In the paper [11], see also [12], strong regularity in $L^{\infty}$ is obtained under coercivity of the objective function, an assumption which automatically implies continuity of the optimal control $\bar{u}$ as a function of time $t$. Without coercivity, for example, when the problem is linear in control, one needs metric regularity in larger spaces; for some new results in this direction, see the recent paper [30]. In such spaces, however, it may not be possible to differentiate, and hence to pass to a linearization. Theorem 7.5 should be treated as a first step towards employing strong subregularity to obtain error estimates for discrete approximations in optimal control.

Bibliography

[1] R. R. Akhmerov, M. I. Kamenskii, A. S. Potapova, A. E. Rodkina, and B. N. Sadovskii, Measure of Noncompactness and Condensing Operators. Birkhäuser, Basel, 1992.
[2] F. J. Aragón Artacho, M. H. Geoffroy, Metric subregularity of the convex subdifferential in Banach spaces. J. Nonlinear Convex Anal. 15 (2014), 35–47.
[3] J. M. Borwein, Stability and regular points of inequality systems. J. Optim. Th. and Appl. 48 (1986), 9–52.
[4] J. M. Borwein, A. S. Lewis, Convex analysis and nonlinear optimization: theory and examples. Springer Science & Business Media, 2010.
[5] R. Cibulka, A. L. Dontchev, A nonsmooth Robinson’s inverse function theorem in Banach spaces. Math. Program. 156 (2016), 257–270.
[6] R. Cibulka, A. L. Dontchev, M. H. Geoffroy, Inexact Newton methods and Dennis–Moré theorems for nonsmooth generalized equations. SIAM J. Control Optim. 53 (2015), 1003–1019.
[7] R. Cibulka, A. L. Dontchev, V. M. Veliov, Lyusternik-Graves theorems for the sum of a Lipschitz function and a set-valued mapping. SIAM J. Control Optim., to appear.
[8] E. De Giorgi, A. Marino, and M. Tosques, Problems of evolution in metric spaces and maximal decreasing curve. Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 68 (1980), 180–187, in Italian. English translation in De Giorgi, Selected papers, Springer, Heidelberg 2013, 527–533.
