Sufficient Conditions for Metric Subregularity of Constraint Systems with Applications to Disjunctive and Ortho-Disjunctive Programs

Matúš Benko; Michal Červinka; Tim Hoheisel

arXiv:1906.08337·math.OC·October 26, 2020

TL;DR

This paper investigates metric subregularity constraint qualification for nonconvex optimization problems, introducing directional pseudo- and quasi-normality notions, and applies these to disjunctive and ortho-disjunctive programs to develop verification tools.

Contribution

It introduces new directional normality concepts and applies them to disjunctive and ortho-disjunctive programs, enhancing the understanding and verification of metric subregularity.

Findings

01

Characterization of pseudo-normality via extremal conditions

02

Development of tools to verify MSCQ in complex programs

03

Extension of existing conditions to new classes like ortho-disjunctive programs

Abstract

This paper is devoted to the study of the metric subregularity constraint qualification (MSCQ) for general optimization problems, with the emphasis on the nonconvex setting. We elaborate on notions of directional pseudo- and quasi-normality, recently introduced by Bai et al. (SIAM J. Opt., 2019), which combine the standard approach via pseudo- and quasi-normality with modern tools of directional variational analysis. We focus on applications to disjunctive programs, where (directional) pseudo-normality is characterized via an extremal condition. This, in turn, yields efficient tools to verify pseudo-normality and MSCQ, which include, but are not limited to, Robinson's result on polyhedral multifunctions and Gfrerer's second-order sufficient condition for metric subregularity. Finally, we refine our study by defining the new class of ortho-disjunctive programs which comprises prominent…

Tables

Table 1. (a) $a=0>b$

	$c = 0$	$c < 0$
$d = 0$	Polyn. $4^{\text{th}}$-OSC	Polyn. $4^{\text{th}}$-OSC
$d > 0$	Dir. $4^{\text{th}}$-OSC	Pseudo-normality
$d < 0$	$4^{\text{th}}$-OSC	$4^{\text{th}}$-OSC

Table 3. (b) $a=0=b$

	$c = 0$	$c < 0$
$d = 0$	Robinson SC	Polyn. $2^{\text{nd}}$-OSC
$d > 0$	Def.	Pseudo-normality
$d < 0$	Polyn. $4^{\text{th}}$-OSC	Polyn. $4^{\text{th}}$-OSC

Equations

\min_{x\in\mathbb{R}^{n}} f(x) \quad \text{s.t.} \quad x\in F^{-1}(\Gamma)=:\mathcal{X},

P_{\alpha}:=f+\alpha\,{\rm d}_{\Gamma}\circ F \quad (\alpha>0),

T_{C}(z):=\left\{d\in\mathbb{R}^{n}\,\middle|\,\exists\,\{d^{k}\}\to d,\ \{t_{k}\}\downarrow 0:\ z+t_{k}d^{k}\in C\ (k\in\mathbb{N})\right\}.

\widehat{N}_{C}(z):=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\left\langle z^{*},d\right\rangle\leq 0\ (d\in T_{C}(z))\right\}.

N_{C}(z):=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\exists\,\{\tilde{z}^{k}\}\to z^{*},\ \{z^{k}\}\to z:\ z^{k}\in C,\ \tilde{z}^{k}\in\widehat{N}_{C}(z^{k})\ (k\in\mathbb{N})\right\}.

\widehat{N}_{C}(z)=N_{C}(z)=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\left\langle z^{*},v-z\right\rangle\leq 0\ (v\in C)\right\},

N_{C}(z;d):=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\exists\,\{t_{k}\}\downarrow 0,\ \{d^{k}\}\to d,\ \{\tilde{z}^{k}\}\to z^{*}:\ \tilde{z}^{k}\in\widehat{N}_{C}(z+t_{k}d^{k})\ (k\in\mathbb{N})\right\}.

\hat{\partial}f(\bar{x}):=\big\{\xi\in\mathbb{R}^{n}\;\big|\;(\xi,-1)\in\widehat{N}_{\mathrm{epi}\,f}\big(\bar{x},f(\bar{x})\big)\big\},\quad\partial f(\bar{x}):=\big\{\xi\in\mathbb{R}^{n}\;\big|\;(\xi,-1)\in N_{\mathrm{epi}\,f}\big(\bar{x},f(\bar{x})\big)\big\}

\delta_{C}:x\mapsto\left\{\begin{array}{ll}0&\text{if}\;x\in C,\\ +\infty&\text{else},\end{array}\right.

{\rm d}_{\mathcal{X}}(x)\leq\kappa\,{\rm d}_{\Gamma}(F(x))\quad(x\in U).

{\rm d}_{M^{-1}(y)}(x)\leq\kappa\,{\rm d}_{M(x)}(y)=\kappa\,{\rm d}_{\Gamma}(F(x)-y)\quad((x,y)\in U\times V).

\nabla F(\bar{x})^{T}\bar{\lambda}=0,

\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})\ \text{ and }\ \left\langle\bar{\lambda},F(x^{k})-y^{k}\right\rangle>0\quad(k\in\mathbb{N});

\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})\ \text{ and }\ \bar{\lambda}_{i}(F_{i}(x^{k})-y_{i}^{k})>0\ \text{ if }\ \bar{\lambda}_{i}\neq 0\quad(k\in\mathbb{N});

\nabla F(\bar{x})^{T}\lambda=0,\ \lambda\in N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)\ \Longrightarrow\ \lambda=0;

\nabla F(\bar{x})^{T}\lambda=0,\ \lambda\in N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u),\ u^{T}\nabla^{2}\left\langle\lambda,F\right\rangle(\bar{x})u\geq 0\ \Longrightarrow\ \lambda=0.

a_{k}:(y,\lambda)\mapsto\left(y,\lambda,\left\langle\bar{\lambda},F(x^{k})-y\right\rangle\right)\quad(k\in\mathbb{N}),

\Lambda^{0}(\bar{x};u):=\ker\nabla F(\bar{x})^{T}\cap N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)\quad(u\in\mathbb{R}^{n})

\Lambda^{0}(\bar{x}):=\Lambda^{0}(\bar{x};0)=\ker\nabla F(\bar{x})^{T}\cap N_{\Gamma}(F(\bar{x})),

\Lambda^{0}(\bar{x})=\{0\},

\Lambda^{0}(\bar{x};u)=\{0\}\quad(u\neq 0:\ \nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x}))).

N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)\subset N_{\Gamma}(F(\bar{x}))\quad(u\in\mathbb{R}^{n}).

\frac{x^{k}-\bar{x}}{\|x^{k}-\bar{x}\|}\to u,\quad\frac{y^{k}-F(\bar{x})}{\|x^{k}-\bar{x}\|}\to\nabla F(\bar{x})u\ \ (y^{k}\in P_{\Gamma}(F(x^{k})))\quad\text{and}\quad\nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x})).

\left\|\frac{y^{k}-F(\bar{x})}{\|x^{k}-\bar{x}\|}-\nabla F(\bar{x})u\right\|\leq\frac{\|y^{k}-F(x^{k})\|}{\|x^{k}-\bar{x}\|}+\left\|\frac{F(x^{k})-F(\bar{x})}{\|x^{k}-\bar{x}\|}-\nabla F(\bar{x})u\right\|\quad(k\in\mathbb{N}).

\frac{\|y^{k}-F(x^{k})\|}{\|x^{k}-\bar{x}\|}=\frac{{\rm d}_{\Gamma}(F(x^{k}))}{\|x^{k}-\bar{x}\|}\leq\frac{1}{k}\to 0.

(x^{k}-\bar{x})/\|x^{k}-\bar{x}\|\to u,

\lambda^{k}\in N_{\Gamma}(y^{k}),

\partial({\rm d}_{\Gamma}\circ F)(x)\subset\nabla F(x)^{T}\partial{\rm d}_{\Gamma}(F(x))\quad(x\in\mathbb{R}^{n}),

\partial{\rm d}_{\Gamma}(F(x^{k}))=\frac{F(x^{k})-P_{\Gamma}(F(x^{k}))}{{\rm d}_{\Gamma}(F(x^{k}))}\quad(k\in\mathbb{N}),

\lambda^{k}:=\frac{F(x^{k})-y^{k}}{{\rm d}_{\Gamma}(F(x^{k}))}

\xi^{k}=\nabla F(x^{k})^{T}\lambda^{k}\ \text{ and }\ \|\lambda^{k}\|=1\quad(k\in\mathbb{N}).

Full text

Sufficient conditions for metric subregularity of constraint systems with applications to disjunctive and ortho-disjunctive programs

Matúš Benko†,‡, Michal Červinka♭,♯ and Tim Hoheisel⋄

Abstract.

This paper is devoted to the study of the metric subregularity constraint qualification (MSCQ) for general optimization problems, with the emphasis on the nonconvex setting. We elaborate on notions of directional pseudo- and quasi-normality, recently introduced by Bai et al. (SIAM J. Opt., 2019), which combine the standard approach via pseudo- and quasi-normality with modern tools of directional variational analysis. We focus on applications to disjunctive programs, where (directional) pseudo-normality is characterized via an extremal condition. This, in turn, yields efficient tools to verify pseudo-normality and MSCQ, which include, but are not limited to, Robinson’s result on polyhedral multifunctions and Gfrerer’s second-order sufficient condition for metric subregularity. Finally, we refine our study by defining the new class of ortho-disjunctive programs which comprises prominent optimization problems such as mathematical programs with complementarity, vanishing or switching constraints.

Key words and phrases:

metric subregularity, error bound property, pseudo-/quasi-normality, MPCC, MPVC, disjunctive programs, ortho-disjunctive programs

†Institute of Computational Mathematics, Johannes Kepler University Linz, A-4040 Linz, Austria, e-mail: [email protected]

‡Faculty of Mathematics, University of Vienna, 1090 Vienna, Austria

♭Institute of Economic Studies, Faculty of Social Sciences, Charles University, Opletalova 26, 110 00, Prague 1, Czech Republic, e-mail: [email protected]

♯Institute of Information Theory and Automation, Czech Academy of Sciences, Pod Vodarenskou vezi 4, 180 00 Prague 8, Czech Republic, e-mail: [email protected]

⋄Institute of Mathematics and Statistics, McGill University, 805 Sherbrooke St West, Room 1114 Montréal, Québec, Canada H3A 0B9

1 Introduction

In this paper we study constraint qualifications (CQs) for a general mathematical program (GMP) given by

$$\min_{x\in\mathbb{R}^{n}} f(x)\quad\text{s.t.}\quad x\in F^{-1}(\Gamma)=:\mathcal{X},\tag{1}$$

where $f:\mathbb{R}^{n}\to\mathbb{R}$ and $F:\mathbb{R}^{n}\to\mathbb{R}^{d}$ are continuously differentiable and $\Gamma\subset\mathbb{R}^{d}$ is closed. Constraint qualifications are regularity conditions on the feasible set of an optimization problem and play a crucial role in stationarity and optimality conditions, sensitivity analysis and exact penalization, as well as in the convergence analysis of numerical algorithms.

At the center of our attention is the metric subregularity constraint qualification (MSCQ). Known also under other monikers such as error bound property or calmness constraint qualification (in general, calmness is equivalent to metric subregularity of the inverse mapping), MSCQ is, to the best of our knowledge, the weakest known CQ to ensure the full calculus for (limiting) normal cones and subdifferentials, see [33, 43]. In particular, MSCQ guarantees that local minimizers of (1) are Mordukhovich (M)-stationary [33]. Moreover, MSCQ also yields exactness of the penalty function

$$P_{\alpha}:=f+\alpha\,{\rm d}_{\Gamma}\circ F\quad(\alpha>0),$$

see, e.g., [15, 16, 18, 46], which is an important tool for establishing necessary optimality conditions, as well as for numerical methods [16].
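To make the idea of exact penalization concrete, the following minimal sketch evaluates the penalty function $P_{\alpha}=f+\alpha\,{\rm d}_{\Gamma}\circ F$ on a toy instance. The instance ($f(x)=x$, $F(x)=x$, $\Gamma=[0,\infty)$, so that the constrained minimizer is $\bar{x}=0$), the grid search, and all function names are illustrative assumptions and are not taken from the paper.

```python
def dist_to_gamma(y):
    """Distance from y to Gamma = [0, +inf) (toy choice, assumed for illustration)."""
    return max(0.0, -y)


def penalty(x, alpha):
    """P_alpha(x) = f(x) + alpha * d_Gamma(F(x)) with f(x) = x and F(x) = x."""
    return x + alpha * dist_to_gamma(x)


def grid_argmin(alpha, lo=-1.0, hi=1.0, n=2001):
    """Minimize P_alpha over a uniform grid on [lo, hi] (crude, but enough here)."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(xs, key=lambda x: penalty(x, alpha))


# For alpha = 2 the penalty equals |x|, so the grid minimizer is the
# constrained solution x = 0; for alpha = 0.5 the penalty is too weak and
# the grid minimizer escapes to the infeasible left end point.
print(grid_argmin(2.0))
print(grid_argmin(0.5))
```

In this toy instance the constraint is linear, so MSCQ holds, and any $\alpha>1$ makes the penalty exact at $\bar{x}=0$, while $\alpha<1$ does not.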

Apart from the area of optimality conditions and exact penalization, MSCQ turns out to be essential also in second-order variational analysis and closely related areas of stability and sensitivity, cf. e.g., [8, 9, 30, 31] and references therein.

The main drawback of MSCQ is that it is difficult to verify efficiently. There exist two main approaches to ensure MSCQ. The first one makes use of the stronger property of metric regularity, which is closely related to other concepts such as the Aubin property, the (generalized) Mangasarian-Fromovitz constraint qualification (GMFCQ), and the no nonzero abnormal multiplier constraint qualification (NNAMCQ). Metric regularity can be characterized via co-derivatives [56, Theorem 9.40] (known as the Mordukhovich criterion) or via graphical derivatives [21, Theorem 4B.1]. For more information on metric regularity we refer to the monographs [21, 42, 47, 52, 56].

The second approach corresponds to Robinson’s famous result on polyhedral multifunctions [55, Proposition 1] and is, in turn, restricted to this special case. There are many situations, however, in which metric subregularity is provably satisfied, yet cannot be detected by either of these approaches.

Therefore, a lot of attention has been given to conditions that lie between metric regularity and metric subregularity. A very common approach is to provide sufficient conditions for subregularity/calmness in terms of various derivative-like objects [23, 33, 34, 43, 48, 54, 59]. An exception is the early and very interesting attempt by Klatte and Kummer [46, Theorem 3.6], where calmness of an intersection mapping is studied. For further details on metric subregularity and related notions as well as more bibliographical pointers on the topic, we refer to the paper and the textbook by Dontchev and Rockafellar [20, 21] and to the textbook by Ioffe [42].

We focus on the following two strategies. The first one is based on pseudo- and quasi-normality, first introduced for nonlinear programming in [10], and later extended to MPCCs in [45, 60] as well as to general programs of the form (1), see [32]. The second one, based on the directional approach recently developed by Gfrerer and co-authors [7, 25, 26, 27, 29, 30], was established and utilized in [25, 26, 29] under the name first/second-order sufficient condition for metric subregularity (FOSCMS/SOSCMS). FOSCMS can be viewed as a directional, less restrictive counterpart of the Mordukhovich criterion. The main advantage of these conditions is their point-based nature, which makes it possible to verify them efficiently. The point-based character of these conditions can be justified by the existence of suitable calculus rules for the (directional) limiting normal cones, despite the fact that these objects are defined with information taken from the neighborhood by using a limiting process.

In this paper, we synthesize the concepts of pseudo- and quasi-normality with the above-mentioned directional approach, which also serves as our main workhorse throughout the paper. Hence, we study constraint qualifications called directional pseudo-/quasi-normality, which are milder than both pseudo-/quasi-normality and FOSCMS, and imply MSCQ (cf. Theorems 3.2 and 3.5).

We would like to point out that, although we worked on this combined approach independently of Bai, Ye, and Zhang [3], the exact same definitions of directional pseudo- and quasi-normality were first published in said paper. Here, we present alternative or simpler proofs of certain common results using different techniques, which can further illuminate these novel tools for the reader. Moreover, we present a thorough investigation of the applicability of these new CQs, which goes beyond the material in [3].

Although the core material of our study is valid for general programs (1) with an arbitrary closed set $\Gamma$, we are particularly interested in situations when $\Gamma$ is not convex. Optimization problems with inherently nonconvex structures induced by imposing logical or combinatorial conditions on otherwise smooth or convex data [57] have been of increasing interest in recent years. Among the prominent examples are mathematical programs with complementarity constraints (MPCCs) [49, 53] and mathematical programs with vanishing constraints (MPVCs) [37]. For these optimization problems there are several applications in the natural and social sciences, economics and engineering. Moreover, they are very challenging from both a theoretical and numerical perspective. More examples of such programs are discussed in Section 4, where we apply our results to disjunctive programs in which $\Gamma$ is a finite union of polyhedra. In Section 5 we introduce the new notion of ortho-disjunctive programs. Ortho-disjunctive programs are disjunctive programs with an additional product structure of $\Gamma$ which allows us to address some issues that cannot be resolved in the general disjunctive setting. Both disjunctive and ortho-disjunctive programs provide a unified framework for the above-mentioned particular problem classes.

The main contributions of the paper are as follows:

• Pseudo-normality for disjunctive programs: For disjunctive programs, we observe that pseudo-normality can be cast in a simpler way which is, in fact, a proper extension of the definition of pseudo-normality that has already been used for NLPs and MPCCs in the literature. This new definition, however, reveals an interpretation of pseudo-normality via an extremal condition, see (26), which is neither visible from the general definition for (1) nor from the specially tailored ones for NLPs and MPCCs, respectively. This extremal condition then yields efficient tools to verify pseudo-normality. Indeed, apart from recovering Robinson’s result and Gfrerer’s SOSCMS, employing higher-order analysis yields a variety of new milder point-based sufficient conditions for pseudo-normality and MSCQ, see Section 4.3.

• Quasi-normality for ortho-disjunctive programs: A similar approach as the one to pseudo-normality can be made for (directional) quasi-normality if one moves from the disjunctive to the even more specialized ortho-disjunctive setting, designed to utilize an underlying product structure exhibited by the standard examples of disjunctive programs (MPCCs, MPVCs). The corresponding extremal condition characterizing quasi-normality leads to a surprising connection between quasi-normality and multi-objective optimization. Again, sufficient conditions of second or higher order are readily available.

• PQ-normality: In Section 3 we establish the new notion of (directional) PQ-normality, which includes both pseudo- and quasi-normality as extreme cases. This unified notion puts us in a position to better understand and to exploit certain product structures for which neither quasi- nor pseudo-normality is suitable.

The rest of the paper is organized as follows. In Section 2 we present some preliminary results and notions from variational analysis as well as key results regarding constraint qualifications. Section 3 contains fundamental results of our study dealing with CQs for the general program (1). In Section 4, we study disjunctive programs and obtain full results on pseudo-normality. Section 5 deals with disjunctive programs with additional product structures often present in the problems of interest (MPCCs, MPVCs, etc.). In particular, the notion of ortho-disjunctive programs is introduced and complete results on quasi-normality are obtained.

Notation: Most of the notation used is standard: The closed ball in $\mathbb{R}^{n}$ with center at $x$ and radius $r$ is denoted by $\mathbb{B}_{r}(x)$ and we use $\mathbb{B}:=\mathbb{B}_{1}(0)$ for the closed unit ball. The extended real line is given by $\overline{\mathbb{R}}:=\mathbb{R}\cup\{\pm\infty\}$. For $f:\mathbb{R}^{n}\to\overline{\mathbb{R}}$ its epigraph is given by $\mathrm{epi}\,f:=\left\{(x,\alpha)\in\mathbb{R}^{n}\times\mathbb{R}\,\left|\;f(x)\leq\alpha\right.\right\}$. For a nonempty set $S\subset\mathbb{R}^{n}$ we define the (Euclidean) distance function ${\rm d}_{S}:\mathbb{R}^{n}\to\mathbb{R}$ through ${\rm d}_{S}(x):=\inf_{y\in S}\|x-y\|$. The projection mapping $P_{S}:\mathbb{R}^{n}\rightrightarrows S$ associated with $S$ is defined by $P_{S}(x):=\mathop{{\rm argmin}}_{y\in S}\|x-y\|$. We write $\{x_{k}\}$ for a sequence of scalars and $\{x^{k}\}$ for a sequence of vectors. For a mapping $F:\mathbb{R}^{n}\to\mathbb{R}^{m}$ its Jacobian at $\bar{x}$ is denoted by $\nabla F(\bar{x})$. In particular, for $f:\mathbb{R}^{n}\to\mathbb{R}$, the Jacobian $\nabla f(\bar{x})$ at $\bar{x}$ is a row vector, and we denote its Hessian at $\bar{x}$ by $\nabla^{2}f(\bar{x})$. Moreover, for $\lambda\in\mathbb{R}^{m}$ the scalarized function $\left\langle\lambda,F\right\rangle:\mathbb{R}^{n}\to\mathbb{R}$ is given by $\left\langle\lambda,F\right\rangle(x)=\lambda^{T}F(x)$. Note that for $u\in\mathbb{R}^{n}$ we have $\nabla\left\langle\lambda,F\right\rangle(\bar{x})^{T}u=\left\langle\lambda,\nabla F(\bar{x})u\right\rangle$ and we often use the latter notation. For a matrix $A\in\mathbb{R}^{m\times n}$, its range or image is $\operatorname{Im}A:=\left\{Ax\,\left|\;x\in\mathbb{R}^{n}\right.\right\}$. For some vector $v\in\mathbb{R}^{n}$ we set $\mathbb{R}_{+}v:=\left\{tv\,\left|\;t\geq 0\right.\right\}$ and $\mathbb{R}_{-}v:=\left\{tv\,\left|\;t\leq 0\right.\right\}$.

2 Preliminaries

This section is divided into two parts. First, we introduce some basic notions from variational analysis. The second part is devoted to constraint qualifications for the general mathematical program (1).

2.1 Variational analysis

Given a closed set $C\subset\mathbb{R}^{n}$ and $z\in C$ , the tangent cone to $C$ at $z$ is defined by

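$$T_{C}(z):=\left\{d\in\mathbb{R}^{n}\,\middle|\,\exists\,\{d^{k}\}\to d,\ \{t_{k}\}\downarrow 0:\ z+t_{k}d^{k}\in C\ (k\in\mathbb{N})\right\}.$$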

The regular normal cone to $C$ at $z$ is given as the polar cone of the tangent cone, i.e.

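$$\widehat{N}_{C}(z):=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\left\langle z^{*},d\right\rangle\leq 0\ (d\in T_{C}(z))\right\}.$$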

The limiting normal cone to $C$ at $z$ is given by

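$$N_{C}(z):=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\exists\,\{\tilde{z}^{k}\}\to z^{*},\ \{z^{k}\}\to z:\ z^{k}\in C,\ \tilde{z}^{k}\in\widehat{N}_{C}(z^{k})\ (k\in\mathbb{N})\right\}.$$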

If $z\notin C$ we set $T_{C}(z):=\widehat{N}_{C}(z):=N_{C}(z):=\emptyset$ . Observe that $\widehat{N}_{C}(z)\subset N_{C}(z)$ holds. In case $C$ is a convex set, regular and limiting normal cone coincide with the classical normal cone of convex analysis, i.e.,

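$$\widehat{N}_{C}(z)=N_{C}(z)=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\left\langle z^{*},v-z\right\rangle\leq 0\ (v\in C)\right\},$$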

and we will use the notation $N_{C}(z)$ in this case. Finally, given a direction $d\in\mathbb{R}^{n}$ , the limiting normal cone to $C$ at $z$ in direction $d$ is defined by

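$$N_{C}(z;d):=\left\{z^{*}\in\mathbb{R}^{n}\,\middle|\,\exists\,\{t_{k}\}\downarrow 0,\ \{d^{k}\}\to d,\ \{\tilde{z}^{k}\}\to z^{*}:\ \tilde{z}^{k}\in\widehat{N}_{C}(z+t_{k}d^{k})\ (k\in\mathbb{N})\right\}.$$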

Note that, by definition, we have $N_{C}(z;0)=N_{C}(z)$ . Furthermore, observe that $N_{C}(z;d)\subset N_{C}(z)$ for all $d\in\mathbb{R}^{n}$ and $N_{C}(z;d)=\emptyset$ if $d\notin T_{C}(z)$ .
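As a worked illustration of how these cones differ for a nonconvex set, consider the complementarity set that underlies MPCCs. The set $C$ below, the point $z=(0,0)$ and the shorthand $\mathbb{R}_{-}:=(-\infty,0]$ are choices made for this illustration only; the values follow directly from the definitions above.

```latex
% Illustration (assumed example): the complementarity set at the origin.
% C := \{(a,b)\in\mathbb{R}^{2} \mid a\geq 0,\ b\geq 0,\ ab=0\},\quad z:=(0,0),\quad \mathbb{R}_{-}:=(-\infty,0].
\begin{align*}
T_{C}(z) &= C,\\
\widehat{N}_{C}(z) &= \mathbb{R}_{-}\times\mathbb{R}_{-},\\
N_{C}(z) &= (\mathbb{R}_{-}\times\mathbb{R}_{-})\cup(\{0\}\times\mathbb{R})\cup(\mathbb{R}\times\{0\}),\\
N_{C}\big(z;(1,0)\big) &= \{0\}\times\mathbb{R},\qquad
N_{C}\big(z;(0,1)\big) = \mathbb{R}\times\{0\},\\
N_{C}(z;d) &= \emptyset\quad(d\notin C).
\end{align*}
```

The two axes enter $N_{C}(z)$ as limits of regular normals at nearby points $(t,0)$ and $(0,t)$ with $t>0$. Fixing a nonzero direction keeps only the normals generated along that direction, which is why the directional cones are much smaller than $N_{C}(z)$.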

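For $f:\mathbb{R}^{n}\to\overline{\mathbb{R}}$ and $\bar{x}$ such that $f(\bar{x})$ is finite (hence $(\bar{x},f(\bar{x}))\in\mathrm{epi}\,f$) the sets

$$\hat{\partial}f(\bar{x}):=\big\{\xi\in\mathbb{R}^{n}\;\big|\;(\xi,-1)\in\widehat{N}_{\mathrm{epi}\,f}\big(\bar{x},f(\bar{x})\big)\big\},\quad\partial f(\bar{x}):=\big\{\xi\in\mathbb{R}^{n}\;\big|\;(\xi,-1)\in N_{\mathrm{epi}\,f}\big(\bar{x},f(\bar{x})\big)\big\}$$

denote the regular and limiting subdifferential of $f$ at $\bar{x}$, respectively. Observe that, in particular, for the indicator function of a set $C\subset\mathbb{R}^{n}$, given by

$$\delta_{C}:x\mapsto\left\{\begin{array}{ll}0&\text{if}\;x\in C,\\ +\infty&\text{else},\end{array}\right.$$

we have $\hat{\partial}\delta_{C}=\widehat{N}_{C}\ \mbox{ and }\ \partial\delta_{C}=N_{C}.$ The distance function enjoys a rich subdifferential calculus briefly summarized in the next result.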

Proposition 2.1 (Subdifferentiation of distance function).

Let $S\subset\mathbb{R}^{d}$ be closed and $F:\mathbb{R}^{n}\to\mathbb{R}^{d}$ continuously differentiable. Then the following hold:

(i) (**[56, Example 8.53]**) $\partial{\rm d}_{S}(y)=\left\{\begin{array}{ll}N_{S}(y)\cap\mathbb{B}&\text{if}\;y\in S,\\ \frac{y-P_{S}(y)}{{\rm d}_{S}(y)}&\text{if}\;y\notin S;\end{array}\right.$

(ii) (**[56, Theorem 10.6]**) $\partial({\rm d}_{S}\circ F)(x)\subset\nabla F(x)^{T}\partial{\rm d}_{S}(F(x)).$
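The formula in part (i) for points outside $S$ can be checked numerically. The sketch below does this for a convex box, where the projection is single-valued and the distance function is differentiable away from $S$. The box $S=[0,1]^{2}$, the test point and the helper names are assumptions made for illustration, and the finite-difference quotient is only an independent sanity check, not part of the result.

```python
import math


def project_box(y, lo=0.0, hi=1.0):
    """Projection onto the box S = [lo, hi]^n (unique because S is convex)."""
    return [min(max(t, lo), hi) for t in y]


def dist_box(y):
    """Euclidean distance d_S(y) from y to the box."""
    p = project_box(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)))


def formula_gradient(y):
    """(y - P_S(y)) / d_S(y): the expression in Proposition 2.1(i) for y outside S."""
    p = project_box(y)
    d = dist_box(y)
    return [(a - b) / d for a, b in zip(y, p)]


def finite_difference_gradient(y, h=1e-6):
    """Central differences of d_S, used only to cross-check the formula."""
    grad = []
    for i in range(len(y)):
        up = list(y)
        up[i] += h
        dn = list(y)
        dn[i] -= h
        grad.append((dist_box(up) - dist_box(dn)) / (2.0 * h))
    return grad


y = [2.0, 3.0]  # a point outside the box; its projection is (1, 1)
print(formula_gradient(y))
print(finite_difference_gradient(y))
```

Both vectors should agree up to discretization error, and the formula always returns a unit vector, in line with the fact that ${\rm d}_{S}$ is Lipschitz with constant one.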

2.2 Constraint qualifications

The purpose of this section is to recall several well-established CQs for the general program (1) and to highlight some basic relations between them. We commence with the CQ that is most important to our study.

Definition 2.2 (MSCQ).

Let $\bar{x}$ be feasible for (1). We say that the metric subregularity constraint qualification (MSCQ) holds at $\bar{x}$ if there exist a neighborhood $U$ of $\bar{x}$ and $\kappa>0$ such that

$${\rm d}_{\mathcal{X}}(x)\leq\kappa\,{\rm d}_{\Gamma}(F(x))\quad(x\in U).$$

Note that MSCQ is exactly metric subregularity in the set-valued sense of the feasibility mapping for the constraint system $\mathcal{X}=F^{-1}(\Gamma)$ which is given by $M(x):=F(x)-\Gamma$ , see e.g. [29].

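The stronger property of metric regularity holds for $M$ around $(\bar{x},0)$ if and only if there are neighborhoods $U$ of $\bar{x}$ and $V$ of $0$ and $\kappa>0$ such that

$${\rm d}_{M^{-1}(y)}(x)\leq\kappa\,{\rm d}_{M(x)}(y)=\kappa\,{\rm d}_{\Gamma}(F(x)-y)\quad((x,y)\in U\times V).$$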
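The link between the two properties can be spelled out in a short computation. The following lines are a sketch that uses only the definition $M(x):=F(x)-\Gamma$; the shorthand $\gamma$ for a generic element of $\Gamma$ is introduced here for convenience.

```latex
\begin{align*}
{\rm d}_{M(x)}(y)
  &= \inf_{\gamma\in\Gamma}\big\|y-(F(x)-\gamma)\big\|
   = \inf_{\gamma\in\Gamma}\big\|(F(x)-y)-\gamma\big\|
   = {\rm d}_{\Gamma}(F(x)-y),\\
M^{-1}(y)
  &= \{x \mid y\in F(x)-\Gamma\}
   = \{x \mid F(x)-y\in\Gamma\},
  \qquad M^{-1}(0)=F^{-1}(\Gamma)=\mathcal{X}.
\end{align*}
```

Fixing $y=0$ in the metric regularity estimate therefore gives ${\rm d}_{\mathcal{X}}(x)\leq\kappa\,{\rm d}_{\Gamma}(F(x))$ for $x\in U$, which is exactly MSCQ. Metric regularity asks for the same bound uniformly in all small perturbations $y$, which is why it is the stronger property.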

It is well-known that metric regularity of a multifunction is equivalent to the Aubin property of the inverse multifunction [56, Theorem 9.43]. Applying the Mordukhovich criterion to the feasibility mapping $M$ yields the condition that there is no nonzero multiplier $\bar{\lambda}\in N_{\Gamma}(F(\bar{x}))$ such that

$$\nabla F(\bar{x})^{T}\bar{\lambda}=0,\tag{4}$$

which is often known as generalized Mangasarian-Fromovitz constraint qualification (GMFCQ) at $\bar{x}$ . In the rest of the paper, we mainly stick to the GMFCQ terminology, but sometimes refer to GMFCQ also as the Mordukhovich criterion. Thanks to the calculus rules for limiting normal cones and subdifferentials, the Mordukhovich criterion often provides an efficient tool for verifying metric regularity. There are still plenty of situations, however, where GMFCQ is not fulfilled but MSCQ is. It is therefore an important and worthwhile endeavor to fill the gap between GMFCQ and MSCQ, ideally with verifiable conditions at that. Let us proceed with the next list of constraint qualifications for (1) relevant for our study, see, e.g., [32, 29].

Definition 2.3 (Constraint qualifications).

Let $\bar{x}\in\mathcal{X}$ be feasible for (1). We say that

(i) pseudo-normality holds at $\bar{x}$ if there is no nonzero $\bar{\lambda}\in N_{\Gamma}(F(\bar{x}))$ such that (4) holds and that satisfies the following condition: There exists a sequence $\{(x^{k},y^{k},\lambda^{k})\in\mathbb{R}^{n}\times\Gamma\times\mathbb{R}^{d}\}\to(\bar{x},F(\bar{x}),\bar{\lambda})$ with

$$\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})\ \text{ and }\ \left\langle\bar{\lambda},F(x^{k})-y^{k}\right\rangle>0\quad(k\in\mathbb{N});$$

(ii) quasi-normality holds at $\bar{x}$ if there is no nonzero $\bar{\lambda}\in N_{\Gamma}(F(\bar{x}))$ such that (4) holds and that satisfies the following condition: There exists a sequence $\{(x^{k},y^{k},\lambda^{k})\in\mathbb{R}^{n}\times\Gamma\times\mathbb{R}^{d}\}\to(\bar{x},F(\bar{x}),\bar{\lambda})$ with

$$\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})\ \text{ and }\ \bar{\lambda}_{i}(F_{i}(x^{k})-y_{i}^{k})>0\ \text{ if }\ \bar{\lambda}_{i}\neq 0\quad(k\in\mathbb{N});$$

(iii) first-order sufficient condition for metric subregularity (FOSCMS) holds at $\bar{x}$ if for every $0\neq u\in\mathbb{R}^{n}$ with $\nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x}))$ one has

$$\nabla F(\bar{x})^{T}\lambda=0,\ \lambda\in N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)\ \Longrightarrow\ \lambda=0;$$

(iv) second-order sufficient condition for metric subregularity (SOSCMS) holds at $\bar{x}$ if $F$ is twice differentiable at $\bar{x}$, $\Gamma$ is the union of finitely many convex polyhedra, and for every $0\neq u\in\mathbb{R}^{n}$ with $\nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x}))$ one has

$$\nabla F(\bar{x})^{T}\lambda=0,\ \lambda\in N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u),\ u^{T}\nabla^{2}\left\langle\lambda,F\right\rangle(\bar{x})u\geq 0\ \Longrightarrow\ \lambda=0.$$

We point out that imposing that the (nonexisting) multiplier $\bar{\lambda}$ is in $N_{\Gamma}(F(\bar{x}))$ in the definition of pseudo-/quasi-normality is, clearly, redundant, since it follows from $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$. Nevertheless, in order to be consistent with the literature and to emphasize the connection to GMFCQ and other CQs, we stick to the original definition. In particular, it is obvious from the definition that GMFCQ implies pseudo-normality, which in turn implies quasi-normality. The concepts of pseudo- and quasi-normality are well-established in the literature. Note that in [32], the condition $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$ in (i) and (ii) is replaced by $\lambda^{k}\in N_{\Gamma}(y^{k})$. In order to see that no difference arises, consider the following elementary lemma which follows readily from the definitions of continuity and of the limiting normal cone, respectively.

Lemma 2.4.

Let $\Gamma\subset\mathbb{R}^{d}$ be closed, $y\in\Gamma,\lambda\in N_{\Gamma}(y)$ and let $a:\mathbb{R}^{d}\times\mathbb{R}^{d}\to\mathbb{R}^{q}$ be continuous. Then for every $\varepsilon>0$ there exist $\tilde{y}\in\Gamma$ and $\tilde{\lambda}\in\widehat{N}_{\Gamma}(\tilde{y})$ such that $\left\|a(\tilde{y},\tilde{\lambda})-a(y,\lambda)\right\|<\varepsilon$.

Corollary 2.5.

Under the assumptions of Definition 2.3 let $\{(x^{k},y^{k},\lambda^{k})\in\mathbb{R}^{n}\times\Gamma\times\mathbb{R}^{d}\}\to(\bar{x},F(\bar{x}),\bar{\lambda})$ . Then the following hold:

(i) If $\lambda^{k}\in N_{\Gamma}(y^{k})$ and $\left\langle\bar{\lambda},F(x^{k})-y^{k}\right\rangle>0$ for all $k\in\mathbb{N}$ then there exists $\{(\tilde{y}^{k},\tilde{\lambda}^{k})\}\to(F(\bar{x}),\bar{\lambda})$ such that $\tilde{\lambda}^{k}\in\widehat{N}_{\Gamma}(\tilde{y}^{k})$ and $\left\langle\bar{\lambda},F(x^{k})-\tilde{y}^{k}\right\rangle>0$ for all $k\in\mathbb{N}$.

(ii) If $\lambda^{k}\in N_{\Gamma}(y^{k})$ and $\bar{\lambda}_{i}(F_{i}(x^{k})-y_{i}^{k})>0\;(i:\bar{\lambda}_{i}\neq 0)$ for all $k\in\mathbb{N}$ then there exists $\{(\tilde{y}^{k},\tilde{\lambda}^{k})\}\to(F(\bar{x}),\bar{\lambda})$ such that $\tilde{\lambda}^{k}\in\widehat{N}_{\Gamma}(\tilde{y}^{k})$ and $\bar{\lambda}_{i}(F_{i}(x^{k})-\tilde{y}_{i}^{k})>0\;(i:\bar{\lambda}_{i}\neq 0)$ for all $k\in\mathbb{N}$.

Proof.

We only prove part (i); part (ii) can be shown analogously. To this end, define the continuous maps

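$$a_{k}:(y,\lambda)\mapsto\left(y,\lambda,\left\langle\bar{\lambda},F(x^{k})-y\right\rangle\right)\quad(k\in\mathbb{N}),$$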

and set $\epsilon_{k}:=\min\left\{\frac{1}{k},\frac{1}{2}\left\langle\bar{\lambda},F(x^{k})-y^{k}\right\rangle\right\}$ . Applying Lemma 2.4 then generates the desired sequences. ∎

Corollary 2.5 guarantees that using $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$ instead of $\lambda^{k}\in N_{\Gamma}(y^{k})$ in the definition of pseudo- and quasi-normality does not matter. We note that this is also true for the directional versions of these CQs to be established in Definition 3.4.

The obvious drawback of pseudo- and quasi-normality is that they are expressed via sequences, which makes it quite difficult to check their validity and apply them. Another way of relaxing GMFCQ is provided by FOSCMS and SOSCMS, which can be easier to verify due to the calculus for directional limiting objects [7].

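In order to simplify the notation, given $\bar{x}$ feasible for (1), we define

$$\Lambda^{0}(\bar{x};u):=\ker\nabla F(\bar{x})^{T}\cap N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)\quad(u\in\mathbb{R}^{n})$$

and set

$$\Lambda^{0}(\bar{x}):=\Lambda^{0}(\bar{x};0)=\ker\nabla F(\bar{x})^{T}\cap N_{\Gamma}(F(\bar{x})),$$

i.e., the directional normal cone is replaced by the standard one. With these conventions, GMFCQ at $\bar{x}$ reads

$$\Lambda^{0}(\bar{x})=\{0\},$$

while FOSCMS now reads

$$\Lambda^{0}(\bar{x};u)=\{0\}\quad(u\neq 0:\ \nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x}))).$$

The fact that GMFCQ implies FOSCMS is clear from the inclusion

$$N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)\subset N_{\Gamma}(F(\bar{x}))\quad(u\in\mathbb{R}^{n}).$$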

The following example shows that this implication can be strict. In addition, it also illustrates that MSCQ is strictly weaker than quasi-normality, cf. Proposition 2.7(i).

Example 2.6.

Let $\Gamma:=\{y\in\mathbb{R}^{2}\,|\,y_{2}\geq|y_{1}|\}\subset\mathbb{R}^{2},\;F:\mathbb{R}\to\mathbb{R}^{2},\;F(x):=(x,-x^{2})^{T}$ and set $\bar{x}:=0$ . Clearly $\nabla F(\bar{x})=(1,0)^{T}$ and $N_{\Gamma}(F(\bar{x}))=\{y\in\mathbb{R}^{2}\,|\,y_{2}\leq-|y_{1}|\}$ , hence $0\neq\lambda:=(0,-1)^{T}\in\Lambda^{0}(\bar{x})$ and the Mordukhovich criterion (GMFCQ) is violated at $\bar{x}$ .

Moreover, setting $x_{k}:=1/k$, $y^{k}:=F(\bar{x})=(0,0)^{T}$ and $\lambda^{k}:=\lambda=(0,-1)^{T}$ we obtain $\lambda_{2}(F_{2}(x_{k})-y^{k}_{2})=(-1)\cdot(-1/k^{2})=1/k^{2}>0$, showing that quasi-normality is also violated at $\bar{x}$.

On the other hand, since $N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)=\emptyset$ for all $u\neq 0$ , FOSCMS and hence MSCQ are satisfied at $\bar{x}$ .

We point out that the set $\Gamma$ in Example 2.6 is convex, thus illustrating that even in the convex case one may not be able to verify MSCQ using the non-directional conditions (GMFCQ, pseudo- and quasi-normality), but one may invoke a directional one (here FOSCMS).
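The claim of Example 2.6 that MSCQ holds can also be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the elementary facts that $F^{-1}(\Gamma)=\{0\}$ for this example and that the distance to the cone $\Gamma$ has the closed form coded below. The helper names `dist_to_gamma` and `subregularity_ratio` are hypothetical.

```python
import math

def dist_to_gamma(y1, y2):
    """Euclidean distance from (y1, y2) to Gamma = {y : y2 >= |y1|}."""
    if y2 >= abs(y1):
        return 0.0
    if y2 <= -abs(y1):
        # the point lies in the polar cone, so its projection is the origin
        return math.hypot(y1, y2)
    # otherwise the projection lies on one of the two boundary rays
    return (abs(y1) - y2) / math.sqrt(2.0)

def subregularity_ratio(x):
    """d_X(x) / d_Gamma(F(x)) for F(x) = (x, -x^2), using X = F^{-1}(Gamma) = {0}."""
    return abs(x) / dist_to_gamma(x, -x * x)

# sample points x = +-1/k approaching the reference point 0
ratios = [subregularity_ratio(s / k) for k in range(1, 2001) for s in (1.0, -1.0)]
print(max(ratios))
```

On these samples the ratio stays below $\sqrt{2}$, which is consistent with metric subregularity at $\bar{x}=0$. A finite sample is of course only an illustration and not a proof.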

Although the directional conditions FOSCMS and SOSCMS are similar in flavor, we point out that SOSCMS is only applicable in the case where $\Gamma$ has disjunctive structure. In this setting, there is yet another condition due to Robinson [55] that ensures MSCQ.

The following proposition summarizes several important sufficient conditions for MSCQ, other than GMFCQ, which have already been established in the literature and that are important to our study. We point out, however, that the validity of these results will be a simple corollary of our refined analysis in Section 3.

Proposition 2.7 (Sufficient conditions for MSCQ).

Let $\bar{x}$ be feasible for (1). Then MSCQ holds at $\bar{x}$ under any of the following conditions.

(i)

(**[32, Theorem 5.2]**) quasi-normality (or pseudo-normality) holds at $\bar{x}$ ;

(ii)

(**[29, Corollary 1]**) FOSCMS holds at $\bar{x}$ ;

(iii)

(**[29, Corollary 1]**) SOSCMS holds at $\bar{x}$ ;

(iv)

(**[55, Proposition 1]**) $F$ is affine and $\Gamma$ is the union of finitely many convex polyhedra.

As we can see, two of these conditions are applicable for the general program (1) and are strictly milder than GMFCQ. The other two are restricted to the special structure of disjunctive constraints and hence are in general not comparable with GMFCQ. Interestingly, all four conditions are mutually incomparable and were obtained by different approaches. The only available comparison is for the disjunctive constraints, where FOSCMS clearly implies SOSCMS.

We will refer to (iv) in Proposition 2.7 as Robinson’s result. We point out, however, that [55, Proposition 1] in fact contains a stronger statement.

3 New constraint qualifications for GMP

In this section we are primarily concerned with constraint qualifications for the general mathematical program (1). In particular, we investigate directional counterparts of pseudo- and quasi-normality introduced in [3], and introduce a new CQ called PQ-normality that unifies pseudo- and quasi-normality. We then show that each of these CQs implies MSCQ, and hence recover statements (i) and (ii) of Proposition 2.7. Afterwards, we propose various sufficient conditions for these CQs under some additional structural assumptions. Hence, when applied to the disjunctive constraints in Section 4, these conditions also recover statements (iii) and (iv) of Proposition 2.7.

3.1 Directional constraint qualifications and PQ-normality

In [3, Corollary 4.1], it was shown that metric subregularity is implied by directional quasi-normality, see Definition 3.4. Here, we propose a different proof that follows the techniques used, e.g., in [45, Lemma 4.4] and [32, Lemma 5.1]. Note that our main tools are Ekeland’s variational principle and the rich subdifferential calculus for the distance function from Proposition 2.1, and that this approach is novel even in the nondirectional setting. We start with the following observation, where we invoke the definitions of $\Lambda^{0}(\bar{x};u)$ and $\Lambda^{0}(\bar{x})$ .

Lemma 3.1.

Let $\bar{x}$ be feasible for (1) such that MSCQ is violated at $\bar{x}$ . Then there exist sequences $\{x^{k}\notin\mathcal{X}\}\to\bar{x}$ and $\{\xi^{k}\in\partial\left({\rm d}_{\Gamma}\circ F\right)(x^{k})\}\to 0$ as well as $u\in\mathbb{R}^{n}\setminus\{0\}$ with $\|u\|=1$ such that

[TABLE]

Proof.

Violation of MSCQ at $\bar{x}$ readily yields a sequence $\{\tilde{x}^{k}\}\to\bar{x}$ with ${\rm d}_{\mathcal{X}}(\tilde{x}^{k})>k{\rm d}_{\Gamma}(F(\tilde{x}^{k}))$ . We put $\varepsilon_{k}:={\rm d}_{\Gamma}(F(\tilde{x}^{k}))$ and find that $\tilde{x}^{k}$ is an $\varepsilon_{k}$ -minimizer of ${\rm d}_{\Gamma}\circ F$ for all $k\in\mathbb{N}$ . Hence by Ekeland’s variational principle [56, Proposition 1.43] with $\delta=\frac{1}{k}\;(k\in\mathbb{N})$ , there exists a sequence $\{x^{k}\}$ such that $x^{k}=\mathop{{\rm argmin}}\left\{{\rm d}_{\Gamma}\circ F+\frac{1}{k}\|(\cdot)-x^{k}\|\right\}$ and $\|x^{k}-\tilde{x}^{k}\|\leq k\varepsilon_{k}<{\rm d}_{\mathcal{X}}(\tilde{x}^{k})$ for all $k\in\mathbb{N}$ . This implies $\{x^{k}\notin\mathcal{X}\}\to\bar{x}$ as well as $0\in\partial({\rm d}_{\Gamma}\circ F)(x^{k})+\frac{1}{k}\mathbb{B}$ for all $k\in\mathbb{N}$ by applying a nonsmooth Fermat’s rule (cf. [56, Theorem 10.1]) and invoking a sum rule for locally Lipschitz functions (cf. [56, Exercise 10.10]). In particular, there exists a sequence $\{\xi^{k}\in\partial({\rm d}_{\Gamma}\circ F)(x^{k})\}\to 0$ . As $x^{k}\neq\bar{x}$ , w.l.o.g. we may assume that $\frac{x^{k}-\bar{x}}{\|x^{k}-\bar{x}\|}\to u$ with $\|u\|=1$ . Now let $y^{k}\in P_{\Gamma}(F(x^{k}))$ for all $k\in\mathbb{N}$ . Then

[TABLE]

As $x^{k}$ minimizes ${\rm d}_{\Gamma}\circ F+\frac{1}{k}\|(\cdot)-x^{k}\|$ for all $k\in\mathbb{N}$ , we find that ${\rm d}_{\Gamma}(F(x^{k}))\leq 1/k\|\bar{x}-x^{k}\|$ . Hence we infer that the first term on the right in (7) satisfies

[TABLE]

The second term on the right in (7) goes to zero by differentiability of $F$ and we conclude from (7) that $\frac{y^{k}-F(\bar{x})}{\|x^{k}-\bar{x}\|}\to\nabla F(\bar{x})u$ . Finally, as $y^{k}\in\Gamma$ for all $k\in\mathbb{N}$ , we have $\nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x}))$ . ∎

Theorem 3.2.

Let $\bar{x}$ be feasible for (1) and assume that the following holds: For every $u\in\mathbb{R}^{n}$ with $\|u\|=1$ and $\nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x}))$ there does not exist a nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ that satisfies the following condition: There exists a sequence $\{(x^{k},y^{k},\lambda^{k})\in\mathbb{R}^{n}\times\Gamma\times\mathbb{R}^{d}\}\to(\bar{x},F(\bar{x}),\bar{\lambda})$ such that for all $k\in\mathbb{N}$ we have

[TABLE]

Then MSCQ is fulfilled at $\bar{x}$ .

Proof.

Assume that MSCQ is not satisfied at $\bar{x}$ . Consider sequences $\{x^{k}\notin\mathcal{X}\}\to\bar{x}$ , $\{\xi^{k}\in\partial\left({\rm d}_{\Gamma}\circ F\right)(x^{k})\}\to 0$ and $u\in\mathbb{R}^{n}$ with $\|u\|=1$ provided by Lemma 3.1. Recall that

[TABLE]

see Proposition 2.1 (ii). Moreover, by Proposition 2.1 (i), it holds that

[TABLE]

since $x^{k}\notin\mathcal{X}\;(k\in\mathbb{N})$ . Consequently, there exists $\{y^{k}\in P_{\Gamma}(F(x^{k}))\}$ such that with

[TABLE]

we have

[TABLE]

Moreover, by the definition of $\lambda^{k}$ in (8) and the fact that $y^{k}\in P_{\Gamma}(F(x^{k}))\,(k\in\mathbb{N})$ , [56, Example 6.16] implies that

[TABLE]

Since $\{\lambda^{k}\}$ is bounded, we may assume w.l.o.g. that $\lambda^{k}\to\bar{\lambda}$ for some $\bar{\lambda}\neq 0$ . Then from (8) we infer that $y^{k}\to F(\bar{x})$ . Hence, passing to the limit in (9) we obtain

[TABLE]

Now, if $\bar{\lambda}_{i}>0$ then w.l.o.g. $F_{i}(x^{k})-y^{k}_{i}={\rm d}_{\Gamma}(F(x^{k}))\lambda_{i}^{k}>0$ and hence $\bar{\lambda}_{i}(F_{i}(x^{k})-y_{i}^{k})>0$ . Analogously, we argue for $\bar{\lambda}_{i}<0$ . Altogether, we find that

[TABLE]

Finally, Lemma 3.1 yields that $(y^{k}-F(\bar{x}))/\left\|x^{k}-\bar{x}\right\|\to\nabla F(\bar{x})u$ , showing $\bar{\lambda}\in N_{\Gamma}(F(\bar{x});\nabla F(\bar{x})u)$ , which establishes a contradiction. ∎

Instead of directly extracting directional versions of quasi- and pseudo-normality from Theorem 3.2, we introduce the notion of PQ-normality, which serves as a bridge between pseudo- and quasi-normality: the latter two are then identified as the two extreme cases of PQ-normality. We strongly emphasize that introducing PQ-normality does not merely serve the academic purpose of unifying the two concepts. In fact, it has important consequences for the class of programs in Section 5 where the set $\Gamma$ possesses an underlying product structure in addition to its disjunctive nature.

First, we introduce additional notation. For $z\in\mathbb{R}^{d}$ we denote by $z_{i}\;(i\in I:=\{1,\ldots,d\})$ its scalar components. More generally, suppose that $\mathbb{R}^{d}$ is expressed via $l(\leq d)$ factors as $\mathbb{R}^{d_{1}}\times\ldots\times\mathbb{R}^{d_{l}}$ and introduce the multi-index $\delta:=(d_{1},\ldots,d_{l})\in\mathbb{N}^{l}$ with $|\delta|:=d_{1}+\ldots+d_{l}=d$ . Note that there is a one-to-one correspondence between such multi-indices and factorizations of $\mathbb{R}^{d}$ . The components of some $z\in\mathbb{R}^{d}$ we denote as $z_{\nu}$ for $\nu\in I_{\delta}$ , where $I_{\delta}$ is some (abstract) index set of $l$ elements. Note that we do not identify $I_{\delta}$ with $\{1,\ldots,l\}$ in order to avoid ambiguity of notation, e.g., $z_{1}\in\mathbb{R}$ stands only for the first, scalar, component of $z$ . Moreover, we use a Greek letter to indicate the vector components $z_{\nu}$ of $z$ and a Latin letter to indicate the scalar components $z_{i}$ .

Given a multi-index $\delta$ , fix $\nu\in I_{\delta}$ . The component $z_{\nu}$ , a vector in general, can also be written via its scalar components, i.e., there exists an index set, denoted by $I^{\nu}$ , such that $z_{\nu}=(z_{i})_{i\in I^{\nu}}$ . Note that $\cup_{\nu\in I_{\delta}}I^{\nu}=I$ . Finally, given two multi-indices $\delta,\delta^{\prime}$ with $|\delta|=|\delta^{\prime}|=d$ , we say that $\delta^{\prime}$ is a refinement of $\delta$ and write $\delta^{\prime}\subset\delta$ , provided for every $\nu\in I_{\delta}$ there exists an index set $I^{\nu}_{\delta^{\prime}}$ such that

[TABLE]

Note that the special multi-indices $\delta^{P}:=d\in\mathbb{N}^{1}$ and $\delta^{Q}:=(1,\ldots,1)\in\mathbb{N}^{d}$ are in fact maximal and minimal in the sense that for any multi-index $\delta\in\mathbb{N}^{l}$ with $|\delta|=d$ one has $\delta^{Q}\subset\delta\subset\delta^{P}$ .

The following example illustrates the use of the above notation.

Example 3.3.

Let $d=7$ , $I:=\{1,\ldots,7\}$ and consider a multi-index $\delta:=(1,4,2)$ corresponding to the factorization $\mathbb{R}^{7}=\mathbb{R}\times\mathbb{R}^{4}\times\mathbb{R}^{2}$ . Consider also an element $z=(z_{1},\ldots,z_{7})\in\mathbb{R}^{7}$ . Since $\delta$ has three components, we may set, e.g., $I_{\delta}=\{a,b,c\}$ yielding $z_{a}=z_{1}$ , $z_{b}=(z_{2},z_{3},z_{4},z_{5})$ and $z_{c}=(z_{6},z_{7})$ . Clearly, we have $I^{a}=\{1\}$ , $I^{b}=\{2,3,4,5\}$ and $I^{c}=\{6,7\}$ .

Moreover, the multi-index $\delta^{\prime}:=(1,3,1,1,1)$ is a refinement of $\delta$ , since we may set

[TABLE]

to obtain

[TABLE]
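The bookkeeping of Example 3.3 can be mirrored in a few lines of code. The sketch below is only an illustration under the assumption that a factorization groups consecutive scalar coordinates, so that a multi-index is determined by the end positions of its blocks. The function names `blocks` and `is_refinement` are hypothetical.

```python
from itertools import accumulate

def blocks(delta):
    """Scalar index sets I^nu (1-based) of the factorization encoded by delta."""
    ends = list(accumulate(delta))
    starts = [e - d + 1 for e, d in zip(ends, delta)]
    return [list(range(s, e + 1)) for s, e in zip(starts, ends)]

def is_refinement(delta_fine, delta_coarse):
    """True if delta_fine is a refinement of delta_coarse."""
    if sum(delta_fine) != sum(delta_coarse):
        return False
    fine_ends = set(accumulate(delta_fine))
    # every block boundary of the coarse factorization must also be a fine one
    return all(e in fine_ends for e in accumulate(delta_coarse))

delta = (1, 4, 2)
delta_prime = (1, 3, 1, 1, 1)
delta_P = (7,)             # single block: pseudo-normality
delta_Q = (1,) * 7         # scalar blocks: quasi-normality

print(blocks(delta))
print(is_refinement(delta_prime, delta))
print(is_refinement(delta_Q, delta), is_refinement(delta, delta_P))
```

For the data of Example 3.3 this reproduces the index sets $I^{a}$, $I^{b}$, $I^{c}$ and confirms $\delta^{\prime}\subset\delta$ as well as $\delta^{Q}\subset\delta\subset\delta^{P}$.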

We now proceed with the definition of PQ-normality which embeds quasi- and pseudo-normality as extremal cases in a whole family of constraint qualifications.

Definition 3.4 (PQ-normality).

Let $\bar{x}\in\mathcal{X}$ be feasible for (1), consider $u\in\mathbb{R}^{n}$ with $\|u\|=1$ , and let $\delta\in\mathbb{N}^{l}$ be a multi-index such that $|\delta|=d$ . We say that

(i)

PQ-normality w.r.t. $\delta$ holds at $\bar{x}$ , if there is no nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ that satisfies the following condition: There exists a sequence $\{(x^{k},y^{k},\lambda^{k})\in\mathbb{R}^{n}\times\Gamma\times\mathbb{R}^{d}\}\to(\bar{x},F(\bar{x}),\bar{\lambda})$ with $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$ and

[TABLE]

(ii)

PQ-normality w.r.t. $\delta$ in direction $u$ holds at $\bar{x}$ , if there is no nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ that satisfies the following condition: There exists a sequence $\{(x^{k},y^{k},\lambda^{k})\in\mathbb{R}^{n}\times\Gamma\times\mathbb{R}^{d}\}\to(\bar{x},F(\bar{x}),\bar{\lambda})$ with $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$ , (10) and

[TABLE]

We say that directional PQ-normality w.r.t. $\delta$ holds at $\bar{x}$ , if PQ-normality w.r.t. $\delta$ in direction $u$ holds at $\bar{x}$ for all $u\in\mathbb{R}^{n}$ with $\|u\|=1$ . In particular, we refer to PQ-normality w.r.t. $\delta^{P}$ (in direction $u$ ) as pseudo-normality (in direction $u$ ), while PQ-normality w.r.t. $\delta^{Q}$ we call quasi-normality.

It is clear from the definition that PQ-normality w.r.t. $\delta$ implies PQ-normality w.r.t. $\delta^{\prime}$ provided $\delta^{\prime}\subset\delta$ . In particular, since $\delta^{Q}\subset\delta\subset\delta^{P}$ for all $\delta\in\mathbb{N}^{l}$ with $|\delta|=d$ , we conclude that pseudo-normality implies PQ-normality w.r.t. any $\delta$ and this further implies quasi-normality. Naturally, all of the above comments remain true for the corresponding directional CQs.

For the sake of completeness, we reformulate Theorem 3.2 in terms of directional PQ-normality.

Theorem 3.5.

Let $\bar{x}$ be feasible for (1) and let the directional PQ-normality w.r.t. any $\delta\in\mathbb{N}^{l}$ , in particular directional pseudo- or quasi-normality, hold at $\bar{x}$ . Then MSCQ is fulfilled at $\bar{x}$ .

We point out that directional quasi-normality is strictly weaker than both FOSCMS (clear from the definition of the respective CQs) and quasi-normality, see Example 2.6. Hence it constitutes, to the best of our knowledge, one of the weakest conditions to imply MSCQ for the general optimization problem (1), which can still be efficiently verified in some very important cases as shown in Section 3.2 and Section 3.3 below.

The same directional versions of pseudo- and quasi-normality were independently introduced in a recent paper by Bai et al. [3]. In order to show that these imply MSCQ, Bai et al. build on results from [28, Corollary 1, Remarks 1 and 2]. Hence, we believe our alternative proof can provide additional insight on the role of pseudo- and quasi-normality in verifying MSCQ. More importantly, in what follows, we focus on simplifying these conditions under specific structural assumptions on the feasible set, which is a crucial step to facilitate their use.

3.2 Simplified CQs and second-order sufficient conditions: The standard case

For some important instances of the general program (1), the concepts of pseudo- and quasi-normality were introduced without the undesirable additional sequence $\{y^{k}\}$ , see [10] for standard NLPs and [45] for MPCCs. In the remaining part of this section, we address the question as to when this is possible for more general instances of (1), working with the generalized notion of PQ-normality. In turn, in the remainder of this section, $\delta$ denotes a multi-index in $\mathbb{N}^{l}$ for some $l\in\{1,\ldots,d\}$ with $|\delta|=d$ unless stated otherwise. As a result of dropping the sequence $\{y^{k}\}$ , we obtain a characterization of PQ-normality via an extremal condition, which, in turn, yields several sufficient conditions for PQ-normality.

For clarity of exposition, we split our analysis into the standard (non-directional) and the directional case.

We begin our study of the non-directional case with the following straightforward result, which follows readily from the definition of PQ-normality using the sequences $y^{k}:=F(\bar{x})$ and $\lambda^{k}:=\bar{\lambda}$ , taking also into account Lemma 2.4 and the arguments in the proof of Corollary 2.5.

Lemma 3.6.

Let $\bar{x}$ be feasible for (1). If PQ-normality w.r.t. $\delta$ holds at $\bar{x}$ then there is no nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ that satisfies the following condition: There exists a sequence $\{x^{k}\}\to\bar{x}$ with

[TABLE]

Note that in the case of MPCCs, by the geometry of the feasible set and the resulting normal cones, one always has $\left\langle\bar{\lambda},F({\bar{x}})\right\rangle=0$ . Thus the conditions used in [45] simplify to $\left\langle\bar{\lambda},F(x^{k})\right\rangle>0$ and $\bar{\lambda}_{i}F_{i}(x^{k})>0$ if $\bar{\lambda}_{i}\neq 0$ , respectively. However, in the general setting of problem (1), as well as in the case of general disjunctive constraints, we cannot make this simplification. Moreover, in order to obtain the reverse implication of Lemma 3.6, we have to impose some additional assumptions on the constraints of (1).

Assumption 3.7.

Let $\delta$ be a multi-index and let $\bar{x}$ be feasible for (1). Assume that for every $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ and every sequence $\{(y^{k},\lambda^{k})\in\Gamma\times\mathbb{R}^{d}\}\to(F(\bar{x}),\bar{\lambda})$ with $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$ , there exists a subsequence $K\subset\mathbb{N}$ such that

[TABLE]

Theorem 3.8 (Simplified PQ-normality under Ass. 3.7).

Let $\bar{x}$ be feasible for (1) and $\delta$ such that Assumption 3.7 holds. Then PQ-normality w.r.t. $\delta$ at $\bar{x}$ is equivalent to the following simplified PQ-normality w.r.t. $\delta$ at $\bar{x}$ , i.e.

There is no nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ such that there exists a sequence $\{x^{k}\}\to\bar{x}$ fulfilling (12).

Proof.

The fact that PQ-normality implies the simplified PQ-normality follows from Lemma 3.6.

In turn, if PQ-normality w.r.t. $\delta$ is violated, there exist $\bar{\lambda}\in\Lambda^{0}(\bar{x})\setminus\{0\}$ and $\{(x^{k},y^{k},\lambda^{k})\in\mathbb{R}^{n}\times\Gamma\times\mathbb{R}^{d}\}\to(\bar{x},F(\bar{x}),\bar{\lambda})$ with $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$ and $\left\langle\bar{\lambda}_{\nu},F_{\nu}(x^{k})-y_{\nu}^{k}\right\rangle>0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ . Relabeling $\{x^{k}\}$ by only using the indices $k\in K$ and then summing up the above expression with (13) for all $k\in K$ shows that the simplified PQ-normality is then violated as well. ∎

As the above theorem shows, under Assumption 3.7, the simplified PQ-normality is equivalent to PQ-normality, hence sufficient for MSCQ. Without Assumption 3.7 this is, in general, false, see Example 3.13. In the following sections, however, we deal with various types of optimization problems which automatically satisfy Assumption 3.7 for a suitable multi-index, including $\delta^{P}$ and $\delta^{Q}$ , at every feasible point.

As we will now show, Theorem 3.8 also reveals an interesting connection between PQ-normality and vector optimization. This, in turn, paves the way to a variety of sufficient conditions for PQ-normality, hence also for MSCQ.

Let us recall some standard terminology from multiobjective optimization [22, 44]. Given $\varphi:\mathbb{R}^{n}\to\mathbb{R}^{q}$ , a point $\bar{x}$ is called a local weak efficient solution of the unconstrained vector optimization problem $\max_{x\in\mathbb{R}^{n}}\varphi(x)$ if there exists a neighborhood $U$ of $\bar{x}$ such that no $x\in U$ satisfies $\varphi_{j}(x)>\varphi_{j}(\bar{x})$ for all $j=1,\ldots,q$ . Given $\delta=(d_{1},\ldots,d_{l})\in\mathbb{N}^{l}$ and $\lambda=(\lambda_{\nu})_{\nu\in I_{\delta}}\in\mathbb{R}^{d_{1}}\times\ldots\times\mathbb{R}^{d_{l}}=\mathbb{R}^{d}$ , we define the function

[TABLE]

Theorem 3.9.

Let $\bar{x}$ be feasible for (1) and let Assumption 3.7 for some $\delta$ be fulfilled. Then PQ-normality w.r.t. $\delta$ holds at ${\bar{x}}$ if and only if for every $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ , the vector $\bar{x}$ is a local weak efficient solution of the unconstrained vector optimization problem $\max_{x\in\mathbb{R}^{n}}\varphi^{\bar{\lambda}}(x)$ for $\varphi^{\bar{\lambda}}$ given by (14).

Proof.

If there exists $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ such that $\bar{x}$ is not a local weak efficient solution of $\max_{x\in\mathbb{R}^{n}}\varphi^{\bar{\lambda}}(x)$ , then $\bar{\lambda}\neq 0$ and there exists $\{x^{k}\}\to\bar{x}$ such that $\left\langle\bar{\lambda}_{\nu},F_{\nu}(x^{k})\right\rangle>\left\langle\bar{\lambda}_{\nu},F_{\nu}(\bar{x})\right\rangle$ for all $\nu\in I_{\delta}(\bar{\lambda})$ and all $k\in\mathbb{N}$ . This shows that PQ-normality w.r.t. $\delta$ is violated due to Theorem 3.8.

In turn, if PQ-normality w.r.t. $\delta$ is violated, there exists $\bar{\lambda}\in\Lambda^{0}(\bar{x})\setminus\{0\}$ and a sequence $\{x^{k}\}\to\bar{x}$ such that $\left\langle\bar{\lambda}_{\nu},F_{\nu}(x^{k})-F_{\nu}(\bar{x})\right\rangle>0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ and all $k\in\mathbb{N}$ , which shows that $\bar{x}$ is not a local weak efficient solution of $\max_{x\in\mathbb{R}^{n}}\varphi^{\bar{\lambda}}(x)$ . ∎

This simple observation has some significant consequences. In particular, it allows us to use the standard sufficient conditions for a local weak efficient solution to obtain the following point-based sufficient condition for PQ-normality.

Corollary 3.10 (Sufficient condition for PQ-normality).

Let $\bar{x}$ be feasible for (1) with $F$ twice differentiable at $\bar{x}$ and let Assumption 3.7 for some $\delta$ be fulfilled. Then PQ-normality w.r.t. $\delta$ , in particular MSCQ, holds at ${\bar{x}}$ under the following condition: For every $\bar{\lambda}\in\Lambda^{0}(\bar{x})\setminus\{0\}$ , every $u\in\mathbb{R}^{n}\setminus\{0\}$ with $\left\langle\bar{\lambda}_{\nu},\nabla F_{\nu}(\bar{x})u\right\rangle=0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ and every $w$ with $\left\langle w,u\right\rangle=0$ one has

[TABLE]

Proof.

Consider $\bar{\lambda}\in\Lambda^{0}(\bar{x})\setminus\{0\}$ and $\varphi^{\bar{\lambda}}$ given by (14) and let $z\in\mathbb{R}^{n}$ be arbitrary. Then

[TABLE]

since $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ . Hence, every $u$ with $\nabla\varphi^{\bar{\lambda}}_{\nu}(\bar{x})u\geq 0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ in fact fulfills $\nabla\varphi^{\bar{\lambda}}_{\nu}(\bar{x})u=\left\langle\bar{\lambda}_{\nu},\nabla F_{\nu}(\bar{x})u\right\rangle=0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ . The result thus follows from [11, Theorem 4] and Theorem 3.9. ∎

Remark 3.11.

Note that $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ implies the first-order necessary condition for a local efficient solution, $\min_{\nu\in I_{\delta}(\bar{\lambda})}\left\langle\bar{\lambda}_{\nu},\nabla F_{\nu}(\bar{x})w\right\rangle\leq 0$ for all $w\in\mathbb{R}^{n}$ , as can be seen from (16).

The above corollary motivates the following definition.

Definition 3.12.

Given a feasible point $\bar{x}$ for (1) and a multi-index $\delta$ , we say that the second-order sufficient condition for PQ-normality w.r.t. $\delta$ , SOSCPQN( $\delta$ ) for short, holds at $\bar{x}$ provided: For every $\bar{\lambda}\in\Lambda^{0}(\bar{x})\setminus\{0\}$ , every $u\in\mathbb{R}^{n}\setminus\{0\}$ with $\left\langle\bar{\lambda}_{\nu},\nabla F_{\nu}(\bar{x})u\right\rangle=0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ and every $w$ with $\left\langle w,u\right\rangle=0$ one has (15).

Moreover, we refer to SOSCPQN( $\delta^{P}$ ) and SOSCPQN( $\delta^{Q}$ ) as the second-order sufficient conditions for pseudo- and quasi-normality (SOSCPN and SOSCQN), respectively.

Naturally, one can also consider higher-order sufficient conditions. We do so in Section 4, where we focus on pseudo-normality. Note that pseudo-normality is connected to standard maximality since $\varphi^{\lambda}$ is a scalar function in that case.

The following example shows that SOSCPN on its own, i.e., without Assumption 3.7 for $\delta^{P}$ , guarantees neither pseudo-normality nor even MSCQ.

Example 3.13.

Consider $\Gamma\subset\mathbb{R}^{2}$ given by $\Gamma:=\{y\in\mathbb{R}^{2}\,|\,y_{2}\geq|y_{1}|^{3/2}\}$ and $F:\mathbb{R}\to\mathbb{R}^{2}$ defined by $F(x):=(x,x^{2})^{T}$ and let $\bar{x}:=0$ . Clearly $\nabla F(\bar{x})=(1,0)^{T}$ and $\Lambda^{0}(\bar{x})=\mathbb{R}_{+}(0,-1)^{T}.$ Thus, for every $\lambda\in\Lambda^{0}(\bar{x})\setminus\{0\}$ and every $u\in\mathbb{R}\setminus\{0\}$ we have $u^{T}\nabla^{2}\left\langle\lambda,F\right\rangle(\bar{x})u=-2\alpha u^{2}<0$ , where $\alpha>0$ is such that $\lambda=(0,-\alpha)$ , showing that SOSCPN holds at $\bar{x}$ . On the other hand, for a sequence $\{x_{k}\}\to 0$ we obtain ${\rm d}_{F^{-1}(\Gamma)}(x_{k})=|x_{k}|$ , while

[TABLE]

showing the violation of MSCQ and consequently of pseudo-normality as well.

We point out that the set $\Gamma$ in Example 3.13 equals $\mathrm{epi}\,|\cdot|^{3/2}$ and is therefore convex, yet SOSCPN still does not imply MSCQ.
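The failure of MSCQ in Example 3.13 can be illustrated numerically. The sketch below relies on two assumptions that are not spelled out in the example: for $0<|x|\leq 1/2$ one has ${\rm d}_{F^{-1}(\Gamma)}(x)=|x|$ , and the vertical gap $|x|^{3/2}-x^{2}$ between $F(x)$ and the boundary of $\Gamma$ is an upper bound for ${\rm d}_{\Gamma}(F(x))$ . It therefore computes only a lower bound for the subregularity ratio, and the name `ratio_lower_bound` is hypothetical.

```python
def ratio_lower_bound(x):
    """Lower bound on d_X(x) / d_Gamma(F(x)) for 0 < |x| <= 1/2.

    F(x) = (x, x^2) and Gamma = {y : y2 >= |y1|^(3/2)}.
    The point (x, |x|^(3/2)) lies in Gamma, so the vertical gap below
    bounds d_Gamma(F(x)) from above.
    """
    gap = abs(x) ** 1.5 - x * x
    return abs(x) / gap

xs = [10.0 ** (-j) for j in range(1, 7)]
bounds = [ratio_lower_bound(x) for x in xs]
for x, b in zip(xs, bounds):
    print(f"x = {x:.0e}   d_X / d_Gamma >= {b:.1f}")
```

The lower bound equals $1/(|x|^{1/2}-|x|)$ and grows without bound as $x\to 0$ , so no constant $\kappa$ with ${\rm d}_{F^{-1}(\Gamma)}(x)\leq\kappa\,{\rm d}_{\Gamma}(F(x))$ can exist near $\bar{x}$ .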

Theorem 3.14.

Let $\bar{x}$ be feasible for (1) with $F$ twice differentiable at $\bar{x}$ and consider two multi-indices $\delta\in\mathbb{N}^{l},\delta^{\prime}\in\mathbb{N}^{l^{\prime}}$ with $\delta^{\prime}\subset\delta$ . Then SOSCPQN( $\delta$ ) implies SOSCPQN( $\delta^{\prime}$ ). In particular, we have SOSCPN $\Rightarrow$ SOSCPQN( $\delta$ ) $\Rightarrow$ SOSCQN.

Proof.

Consider $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x})$ , $0\neq u\in\mathbb{R}^{n}$ and $w\in\mathbb{R}^{n}$ with $\left\langle\bar{\lambda}_{\nu^{\prime}},\nabla F_{\nu^{\prime}}(\bar{x})u\right\rangle=0$ for all $\nu^{\prime}\in I_{\delta^{\prime}}(\bar{\lambda})$ and $\left\langle w,u\right\rangle=0$ . For any $\nu\in I_{\delta}(\bar{\lambda})$ , we find some index set $I^{\nu}_{\delta^{\prime}}\subset I_{\delta^{\prime}}$ such that $z_{\nu}=(z_{\nu^{\prime}})_{\nu^{\prime}\in I^{\nu}_{\delta^{\prime}}}$ by $\delta^{\prime}\subset\delta$ . Summing up $\left\langle\bar{\lambda}_{\nu^{\prime}},\nabla F_{\nu^{\prime}}(\bar{x})u\right\rangle=0$ over $I^{\nu}_{\delta^{\prime}}$ yields $\left\langle\bar{\lambda}_{\nu},\nabla F_{\nu}(\bar{x})u\right\rangle=0$ . Thus, we can apply SOSCPQN( $\delta$ ) in order to infer the existence of $\bar{\nu}\in I_{\delta}(\bar{\lambda})$ such that

[TABLE]

This yields, however, that SOSCPQN( $\delta^{\prime}$ ) is fulfilled.

The second statement now follows from the obvious relation $\delta^{Q}\subset\delta\subset\delta^{P}$ valid for any $\delta$ . ∎

The above theorem holds regardless of Assumption 3.7. If one seeks to use any of the sufficient conditions to get metric subregularity, however, one clearly needs it, see Examples 3.13 and 3.16. Note also that if Assumption 3.7 holds for $\delta^{\prime}$ , then it also holds for any $\delta\supset\delta^{\prime}$ .

The following example shows that SOSCQN is, in fact, strictly milder than SOSCPN. Moreover, it demonstrates that one can effectively verify MSCQ by means of SOSCQN even when pseudo-normality is not fulfilled.

Example 3.15.

Let $\Gamma:=\Gamma_{1}\times\Gamma_{2}\subset\mathbb{R}^{2}$ for two convex polyhedral sets $\Gamma_{1}=\Gamma_{2}:=\mathbb{R}_{-}$ and let $F:=(F_{1},F_{2})^{T}:\mathbb{R}\to\mathbb{R}^{2}$ for $F_{1}(x):=-x$ and $F_{2}(x):=x+x^{2}$ and let $\bar{x}:=0$ . In particular, Assumption 3.7 for $\delta^{Q}$ is fulfilled by Corollary 5.1. Clearly, $\nabla F_{1}(\bar{x})=-1$ , $\nabla F_{2}(\bar{x})=1$ and hence $\Lambda^{0}(\bar{x})=\mathbb{R}_{+}(1,1)^{T}.$

SOSCQN is fulfilled since for any $\lambda=(\lambda_{1},\lambda_{2})=\alpha(1,1)^{T}$ with $\alpha>0$ and for $u=\pm 1$ one has $|\lambda_{i}\nabla F_{i}(\bar{x})u|=\alpha\neq 0$ , $i=1,2$ , so that the condition holds vacuously. In particular, quasi-normality and MSCQ follow.

On the other hand, let $\bar{\lambda}:=(1,1)^{T}$ and consider a sequence $\{x_{k}\}\downarrow 0$ . We obtain

[TABLE]

showing the violation of pseudo-normality.
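The difference between the two extreme multi-indices in Example 3.15 can be made concrete by evaluating the relevant inner products on sample points. The following is a minimal sketch with hypothetical helper names. It uses $F(\bar{x})=0$ and the multiplier $\bar{\lambda}=(1,1)^{T}$ from the example, and a finite sample serves only as an illustration.

```python
def F(x):
    """Constraint map of Example 3.15."""
    return (-x, x + x * x)

lam = (1.0, 1.0)

def pseudo_value(x):
    """<lam, F(x) - F(0)>: the single quantity tested for delta^P."""
    f1, f2 = F(x)
    return lam[0] * f1 + lam[1] * f2

def quasi_values(x):
    """lam_i * (F_i(x) - F_i(0)), i = 1, 2: the quantities tested for delta^Q."""
    f1, f2 = F(x)
    return (lam[0] * f1, lam[1] * f2)

samples = [s * 10.0 ** (-j) for j in range(1, 8) for s in (1.0, -1.0)]

# delta^P: every x > 0 yields a positive value, so pseudo-normality fails
pseudo_violated = all(pseudo_value(x) > 0 for x in samples if x > 0)

# delta^Q: a violating point would need both components to be positive
quasi_violating = [x for x in samples if all(v > 0 for v in quasi_values(x))]

print(pseudo_violated, quasi_violating)
```

For $x>0$ the first component $-x$ is negative, while for small $x<0$ the second component $x+x^{2}$ is negative, so no sample point violates the simplified quasi-normality condition although the sum $x^{2}$ is positive.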

The next example shows that, without Assumption 3.7 for $\delta^{Q}$ , the simplified form of quasi-normality from Lemma 3.6 does not imply MSCQ even if $\Gamma$ is a convex polyhedral set.

Example 3.16.

Let $\Gamma\subset\mathbb{R}^{2}$ be the convex polyhedral set given by $\Gamma:=\{y\in\mathbb{R}^{2}\,|\,y_{2}\geq y_{1}\}$ and $F:\mathbb{R}\to\mathbb{R}^{2}$ given by $F(x):=(x,\sin x)^{T}$ and let $\bar{x}:=0$ . Clearly $\nabla F(\bar{x})=(1,1)^{T}$ and we find that $\Lambda^{0}(\bar{x})=\mathbb{R}_{+}(1,-1)^{T}.$ For every $\lambda=(\lambda_{1},\lambda_{2})=\alpha(1,-1)^{T}$ with $\alpha>0$ and every $x\in\mathbb{R}$ close to $\bar{x}$ we have $\lambda_{1}(F_{1}(x)-F_{1}(\bar{x}))=\alpha x<0$ if $x<0$ and $\lambda_{2}(F_{2}(x)-F_{2}(\bar{x}))=-\alpha\sin x\leq 0$ if $x\geq 0$ , showing that the simplified form of quasi-normality holds at $\bar{x}$ . On the other hand, for a sequence $\{x_{k}\}\downarrow 0$ we obtain ${\rm d}_{F^{-1}(\Gamma)}(x_{k})=|x_{k}|$ , while

[TABLE]

showing the violation of MSCQ.
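The violation of MSCQ in Example 3.16 is easy to observe numerically. The sketch below assumes the elementary facts that $F^{-1}(\Gamma)=\{x\,|\,\sin x\geq x\}=(-\infty,0]$ and that the distance of a point $y\notin\Gamma$ to the half-plane $\Gamma$ equals $(y_{1}-y_{2})/\sqrt{2}$ . The function name `msc_ratio` is hypothetical.

```python
import math

def msc_ratio(x):
    """d_X(x) / d_Gamma(F(x)) for x > 0, F(x) = (x, sin x), Gamma = {y : y2 >= y1}."""
    dist_feasible = x                                   # X = (-inf, 0]
    dist_gamma = (x - math.sin(x)) / math.sqrt(2.0)     # distance to the half-plane
    return dist_feasible / dist_gamma

xs = [10.0 ** (-j) for j in range(1, 5)]
ratios = [msc_ratio(x) for x in xs]
for x, r in zip(xs, ratios):
    print(f"x = {x:.0e}   ratio = {r:.3e}")
```

Since $x-\sin x$ behaves like $x^{3}/6$ near zero, the ratio grows like $6\sqrt{2}/x^{2}$ , which rules out any subregularity constant at $\bar{x}=0$ .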

3.3 Simplified CQs and second-order sufficient conditions: The directional case

In this subsection, we consider the directional case, where the situation is slightly different.

Theorem 3.17.

Let $\bar{x}$ be feasible for (1) and consider $u\in\mathbb{R}^{n}$ with $\|u\|=1$ . Then under Assumption 3.7 for $\delta$ , PQ-normality w.r.t. $\delta$ at $\bar{x}$ in direction $u$ holds if: there is no nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ such that there exists a sequence $\{x^{k}\}\to\bar{x}$ with $(x^{k}-\bar{x})/\left\|x^{k}-\bar{x}\right\|\to u$ fulfilling (12).

Proof.

The proof follows by the same arguments as used in the proof of Theorem 3.8. ∎

In contrast to the standard case, the following example shows that the reverse implication in the above theorem is not true in general.

Example 3.18.

Consider $\Gamma\subset\mathbb{R}^{2}$ given by $\Gamma:=\{y\in\mathbb{R}^{2}\,|\,y_{2}\leq y_{1}^{2}\}$ and $F:\mathbb{R}\to\mathbb{R}^{2}$ defined by $F(x):=(x,x^{4})^{T}$ and let $\bar{x}:=0$ and $u:=1$ . Clearly $\nabla F(\bar{x})=(1,0)^{T}$ and $\Lambda^{0}(\bar{x};1)=\Lambda^{0}(\bar{x})=\mathbb{R}_{+}(0,1)^{T}.$ Set $\bar{\lambda}:=(0,1)^{T}$ and note that any sequence $\{x_{k}\}\downarrow 0$ fulfills $(x_{k}-\bar{x})/\left\|x_{k}-\bar{x}\right\|\to u$ as well as (12) for $\delta^{P}$ , since $\left\langle\bar{\lambda},F(x_{k})-F(\bar{x})\right\rangle=x_{k}^{4}>0$ .

On the other hand, for an arbitrary sequence $y^{k}=(y^{k}_{1},y^{k}_{2})^{T}\to F(\bar{x})=(0,0)^{T}$ with $N_{\Gamma}(y^{k})\neq\{0\}$ we have $y^{k}=(y_{1}^{k},(y_{1}^{k})^{2})^{T}$ . Hence, for any $\lambda\in\mathbb{R}_{+}(0,1)^{T}$ one has $\left\langle\lambda,y^{k}-F({\bar{x}})\right\rangle=\lambda_{2}(y_{1}^{k})^{2}\geq 0$ , showing that Assumption 3.7 for $\delta^{P}$ is fulfilled. Moreover $(y^{k}_{1}/x_{k},(y^{k}_{1})^{2}/x_{k})^{T}=(y^{k}-F(\bar{x}))/\left\|x_{k}-\bar{x}\right\|\to\nabla F(\bar{x})u=(1,0)^{T}$ yields $y^{k}_{1}/x_{k}\to 1$ . Then, however, we obtain

[TABLE]

showing that pseudo-normality at $\bar{x}$ in direction $u$ is fulfilled.
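The mechanism behind Example 3.18 can be seen on a concrete pair of sequences. The sketch below is only an illustration: the boundary points $y^{k}=(t_{k},t_{k}^{2})^{T}$ with $t_{k}:=x_{k}(1+1/k)$ are a hypothetical choice satisfying $t_{k}/x_{k}\to 1$ , and the function names are hypothetical as well.

```python
def simplified_value(x):
    """<lam, F(x) - F(0)> for lam = (0, 1) and F(x) = (x, x^4)."""
    return x ** 4

def full_value(x, t):
    """<lam, F(x) - y> for lam = (0, 1) and the boundary point y = (t, t^2)."""
    return x ** 4 - t ** 2

# x_k = 1/k and the hypothetical boundary parameter t_k = x_k * (1 + 1/k)
pairs = [(1.0 / k, (1.0 / k) * (1.0 + 1.0 / k)) for k in range(2, 200)]

simplified_positive = all(simplified_value(x) > 0 for x, _ in pairs)
full_negative = all(full_value(x, t) < 0 for x, t in pairs)
print(simplified_positive, full_negative)
```

The simplified quantity $x_{k}^{4}$ is positive, whereas the quantity involving $y^{k}$ is dominated by $-t_{k}^{2}$ and hence negative. This is why the simplified condition can fail in direction $u$ while pseudo-normality in direction $u$ still holds.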

Nevertheless, the previous theorem still allows us to use sufficient conditions. Consider the following second-order sufficient condition for directional PQ-normality w.r.t. $\delta$ , SOSCdirPQN( $\delta$ ) for short.

Proposition 3.19 (SOSCdirPQN( $\delta$ )).

Let $\bar{x}$ be feasible for (1) with $F$ twice differentiable at $\bar{x}$ and let Assumption 3.7 for some $\delta$ be fulfilled. Then directional PQ-normality w.r.t. $\delta$ , in particular MSCQ, holds at ${\bar{x}}$ if the following SOSCdirPQN( $\delta$ ) is fulfilled: For every $u\in\mathbb{R}^{n}$ with $\|u\|=1$ , every $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ with $\left\langle\bar{\lambda}_{\nu},\nabla F_{\nu}(\bar{x})u\right\rangle=0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ and every $w$ with $\left\langle w,u\right\rangle=0$ , condition (15) is fulfilled.

Proof.

Assume that directional PQ-normality w.r.t. $\delta$ is violated. Theorem 3.17 yields the existence of $u\in\mathbb{R}^{n}$ , $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ and a sequence $\{x^{k}\}\to\bar{x}$ with $(x^{k}-\bar{x})/\left\|x^{k}-\bar{x}\right\|\to u$ such that $\varphi^{\bar{\lambda}}_{\nu}(x^{k})-\varphi^{\bar{\lambda}}_{\nu}(\bar{x})>0$ for all $\nu\in I_{\delta}(\bar{\lambda})$ with $\varphi^{\bar{\lambda}}$ as in (14). Hence, by passing to a subsequence if necessary, we can assume that $(\varphi(x^{k})-\varphi(\bar{x}))/\left\|\varphi(x^{k})-\varphi(\bar{x})\right\|\to p$ with $p\geq 0$ and $\left\|p\right\|=1$ , where for simplification we dropped the upper index $\bar{\lambda}$ from $\varphi$ .

By Taylor expansion, we have

[TABLE]

where $u^{T}\nabla^{2}\varphi(\bar{x})u$ denotes the vector in $\mathbb{R}^{|I_{\delta}(\bar{\lambda})|}$ with components $u^{T}\nabla^{2}\varphi^{\bar{\lambda}}_{\nu}(\bar{x})u$ for $\nu\in I_{\delta}(\bar{\lambda})$ . If there exists a subsequence $K$ such that $\left\|\varphi(x^{k})-\varphi(\bar{x})\right\|/\left\|x^{k}-\bar{x}\right\|^{2}\to\infty$ , we conclude from (17) that

[TABLE]

where $q^{k}\to 0$ for $k\in K$ . Passing to a subsequence if necessary, and taking into account that $\nabla\varphi(\bar{x})(x^{k}-\bar{x})/\left\|\varphi(x^{k})-\varphi(\bar{x})\right\|\in\operatorname{Im}(\nabla\varphi(\bar{x}))$ with $\operatorname{Im}(\nabla\varphi(\bar{x}))$ being a closed set, we conclude that $p\in\operatorname{Im}(\nabla\varphi(\bar{x}))$ , i.e., there exists $z\in\mathbb{R}^{n}$ with $p=\nabla\varphi(\bar{x})z=(\left\langle\bar{\lambda}_{\nu},\nabla F_{\nu}(\bar{x})z\right\rangle)_{\nu\in I_{\delta}(\bar{\lambda})}$ . This is, however, a contradiction with $\left\|p\right\|=1$ , since we obtain that $p=0$ by $p\geq 0$ and (16), which clearly holds due to $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)\subset\Lambda^{0}(\bar{x})$ .

Consequently, $\left\|\varphi(x^{k})-\varphi(\bar{x})\right\|/\left\|x^{k}-\bar{x}\right\|^{2}$ remains bounded and by passing to a subsequence $K$ if necessary we assume that $\left\|\varphi(x^{k})-\varphi(\bar{x})\right\|/\left\|x^{k}-\bar{x}\right\|^{2}\to\alpha\geq 0$ . Note also that in this case we get $\nabla\varphi(\bar{x})u=0$ by (17). By similar arguments as before, (17) now yields the existence of $w$ such that

[TABLE]

Moreover, we can clearly take $w$ with $\left\langle w,u\right\rangle=0$ since $\mathbb{R}^{n}$ is the direct sum of the span of $u$ and its orthogonal complement and $\nabla\varphi(\bar{x})u=0$ . The assumed SOSCdirPQN( $\delta$ ) (15) implies the existence of $\nu\in I_{\delta}(\bar{\lambda})$ with $\alpha p_{\nu}<0$ , a contradiction. This completes the proof. ∎

Remark 3.20.

Note that $\Lambda^{0}(\bar{x};u)\neq\emptyset$ includes the condition $\nabla F(\bar{x})u\in T_{\Gamma}(F(\bar{x}))$ .

As before, we will refer to SOSCdirPQN( $\delta^{P}$ ) and SOSCdirPQN( $\delta^{Q}$ ) as the second-order sufficient conditions for directional pseudo- and quasi-normality (SOSCdirPN and SOSCdirQN), respectively.

The following directional counterpart of Theorem 3.14 follows by the same arguments.

Theorem 3.21.

Let $\bar{x}$ be feasible for (1) with $F$ twice differentiable at $\bar{x}$ and consider two multi-indices $\delta\in\mathbb{N}^{l},\delta^{\prime}\in\mathbb{N}^{l^{\prime}}$ with $\delta^{\prime}\subset\delta$ . Then SOSCdirPQN( $\delta$ ) implies SOSCdirPQN( $\delta^{\prime}$ ). In particular, we have SOSCdirPN $\Rightarrow$ SOSCdirPQN( $\delta$ ) $\Rightarrow$ SOSCdirQN.

We point out that, unlike in the non-directional case, we could not find an example showing that the above implications can indeed be strict, so this remains an open question.

3.4 Summary

We now summarize our findings of this section. We studied the directional versions of pseudo- and quasi-normality, first established in the paper by Bai et al. [3]. In addition, we introduced the new concept of PQ-normality, together with its directional counterpart, that unifies the two standard CQs. As a result, we obtained novel and improved results for the metric subregularity constraint qualification and we established interesting connections among the well-known CQs and the new ones.

In the following diagram, we summarize the relations between the various constraint qualifications weaker than GMFCQ that imply MSCQ. The point-based conditions are naturally of primary interest and are hence emphasized in double-framed boxes. Note that pseudo- and quasi-normality are included as special cases of PQ-normality for $\delta^{P}$ and $\delta^{Q}$ .

4 Programs with disjunctive constraints

In this section we study a special case of problem (1) in which the set $\Gamma$ is disjunctive, that is, it can be written as a union of finitely many convex polyhedra:

[TABLE]

where we refer the reader to Section 4.1 for a definition of convex polyhedral sets. Subsequently, we refer to problem (1) with $\Gamma$ disjunctive (in the sense of (18)) as a (mathematical) program with disjunctive constraints or simply a disjunctive program.

Disjunctive programs have been systematically studied for decades, see, e.g., [58] and the references therein. For more recent works on disjunctive programs, which are also more related to our approach, we refer to the papers [5, 24, 27, 50] and the thesis [6].

The most prominent examples of disjunctive programs are the aforementioned classes of MPCCs, MPVCs, as well as mathematical programs with relaxed cardinality constraints (MPrCCs), mathematical programs with relaxed probabilistic constraints (MPrPCs), and the recently introduced mathematical programs with switching constraints (MPSCs).

For the mathematical background and several applications we refer the reader to the textbooks [49, 53] for MPCCs as well as to the book [19] on the closely related class of bilevel programs. As for MPVCs we refer to the paper [1] and the thesis [37] and the references therein. For relaxed cardinality constrained problems we point to the papers [14, 17]. For MPrPCs see [2], and for MPSCs see [51].

Dropping standard constraints for brevity, all of these programs exhibit the general form

[TABLE]

where $f,G_{i},H_{i}:\mathbb{R}^{n}\to\mathbb{R}$ are continuously differentiable, $V$ is a finite index set and $\widetilde{\Gamma}$ is given by

(a)

(complementarity constraints)

[TABLE]

(b)

(vanishing constraints)

[TABLE]

(c)

(relaxed cardinality constraints)

[TABLE]

(d)

(relaxed probabilistic constraints)

[TABLE]

(e)

(switching constraints)

[TABLE]

Clearly, $\Gamma_{\text{CC}}$ , $\Gamma_{\text{VC}}$ , $\Gamma_{\text{rCC}}$ , $\Gamma_{\text{rPC}}$ and $\Gamma_{\text{SC}}$ are disjunctive, rendering the resulting optimization problem a disjunctive program. We point out that there is generally not a unique way to write the disjunctive sets in (a)-(e) as a union of convex polyhedral sets. For instance, $\Gamma_{\text{VC}}$ can be alternatively written as $\Gamma_{\text{VC}}=(\mathbb{R}_{-}\times\mathbb{R}_{+})\cup(\mathbb{R}\times\{0\})$ .
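The alternative representation of $\Gamma_{\text{VC}}$ can be sanity-checked numerically. The following Python sketch compares membership in the union $(\mathbb{R}_{-}\times\mathbb{R}_{+})\cup(\mathbb{R}\times\{0\})$ with an inequality description $\{(a,b)\,|\,b\geq 0,\ ab\leq 0\}$. The inequality form and the coordinate order $(a,b)$ are assumptions made for this illustration (the usual description of a vanishing constraint), and the function names are hypothetical.

```python
from itertools import product

def in_union(a, b):
    """Membership in (R_- x R_+) united with (R x {0})."""
    return (a <= 0 and b >= 0) or b == 0

def in_inequality_form(a, b):
    """Membership in {(a, b) : b >= 0, a*b <= 0} (assumed description)."""
    return b >= 0 and a * b <= 0

# Compare both descriptions on a small integer grid around the origin.
grid = [-2, -1, 0, 1, 2]
mismatches = [(a, b) for a, b in product(grid, grid)
              if in_union(a, b) != in_inequality_form(a, b)]
print(mismatches)
```

On this grid the two descriptions agree, which is consistent with the non-uniqueness of the polyhedral decomposition noted above.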

The main result of this section is that the crucial Assumption 3.7 is automatically fulfilled for disjunctive programs. In addition, we prove that directional pseudo-normality is not only implied by, but is, in fact, equivalent to its simplified form from Theorem 3.17, which suggests that our sufficient conditions are not too restrictive. Recall that Example 3.18 shows that, in general, the simplified form is strictly stronger. For these purposes, we commence our study with a preliminary section on the variational geometry of convex polyhedral sets and how their key properties extend to a more general setting.

4.1 Key properties of convex polyhedral sets

Recall that a set is said to be convex polyhedral (or a convex polyhedron) if it is the intersection of finitely many closed half-spaces. In particular, for a convex polyhedron $P\subset\mathbb{R}^{s}$ there exist $p\in\mathbb{N}$ and $a_{j}\in\mathbb{R}^{s},\;\beta_{j}\in\mathbb{R}\;(j=1,\dots,p)$ such that

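$$P=\left\{y\in\mathbb{R}^{s}\,\left|\;\left\langle a_{j},y\right\rangle\leq\beta_{j},\ j=1,\dots,p\right.\right\}.$$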

Clearly, every convex polyhedron is closed. Due to convexity of $P$ , the regular and limiting normal cone to $P$ coincide with the classical normal cone of convex analysis, see (3). Given $y\in P$ , we have

$$N_{P}(y)=\left\{\sum_{j\in J(y)}\mu_{j}a_{j}\,\left|\;\mu_{j}\geq 0,\ j\in J(y)\right.\right\},$$

where $J(y):=\left\{j\in\{1,\dots,p\}\,\left|\;\left\langle a_{j},y\right\rangle=\beta_{j}\right.\right\}$ , i.e., the normal cone of $P$ at $y$ is the convex cone generated by $\left\{a_{j}\,\left|\;j\in J(y)\right.\right\}$ , see e.g. [36, p. 67]. Therefore, there is only a finite number of different normal cones induced by a convex polyhedral set, in fact, this number is bounded by $2^{p}$ (as there can be at most $2^{p}$ active sets in $\{1,\dots,p\}$ ).
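To make the active-set description concrete, here is a minimal Python sketch that computes $J(y)$ and the generators $a_{j}$, $j\in J(y)$, of the normal cone for a polyhedron given by half-space data. The unit square, the tolerance `tol` and the function names are illustrative assumptions, not objects from the paper; indices are 0-based in the code.

```python
def active_set(A, beta, y, tol=1e-12):
    """Return the (0-based) indices j with <a_j, y> = beta_j for y in P."""
    J = []
    for j, (a, b) in enumerate(zip(A, beta)):
        val = sum(ai * yi for ai, yi in zip(a, y))
        if val > b + tol:
            raise ValueError("y is not in P")
        if abs(val - b) <= tol:
            J.append(j)
    return J

def normal_cone_generators(A, beta, y):
    """Generators a_j, j in J(y), of the normal cone to P at y."""
    return [A[j] for j in active_set(A, beta, y)]

# Unit square [0, 1]^2 written as an intersection of p = 4 half-spaces.
A = [(1, 0), (-1, 0), (0, 1), (0, -1)]
beta = [1, 0, 1, 0]

# Collect the distinct active sets over vertices, edge midpoints and centre.
points = [(u, v) for u in (0.0, 0.5, 1.0) for v in (0.0, 0.5, 1.0)]
distinct_active_sets = {tuple(active_set(A, beta, y)) for y in points}
print(len(distinct_active_sets), "active sets, bound 2**p =", 2 ** len(A))
```

For the square, only 9 of the 16 conceivable index sets occur as active sets, so the bound $2^{p}$ is typically far from tight.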

We will make use of two essential properties of convex polyhedra. The first one is the well-known exactness of tangent approximation, see [56, Exercise 6.47]: Given a convex polyhedron $P$ , for any $\bar{y}\in P$ there exists a neighborhood $U$ of $\bar{y}$ such that

[TABLE]

In particular, taking into account [56, Exercise 6.44], one has

[TABLE]

The second property is closely related to Assumption 3.7 as stated in the following lemma.

Lemma 4.1.

Let $P\subset\mathbb{R}^{s}$ be closed and convex, let $\{y^{k}\in P\}\to\bar{y}$ and $\{\lambda^{k}\in N_{P}(y^{k})\}\to\bar{\lambda}$ . Then there exists a subsequence $K\subset\mathbb{N}$ such that the following hold:

(i)

We have $\left\langle\bar{\lambda},y^{k}-\bar{y}\right\rangle\leq 0$ for all $k\in K$ ;

(ii)

Moreover, if $P$ is polyhedral then $\left\langle\bar{\lambda},y^{k}-\bar{y}\right\rangle=0$ for all $k\in K$ .

Proof.

(i) Taking the limit in $\lambda^{k}\in N_{P}(y^{k})$ yields $\bar{\lambda}\in N_{P}(\bar{y})$ . In particular, as $y^{k}\in P$ we get $\left\langle\bar{\lambda},y^{k}-\bar{y}\right\rangle\leq 0\;(k\in\mathbb{N})$ .

(ii) Recall from the discussion above that for a convex polyhedral set there are only finitely many different normal cones. Hence, there exists a subsequence $K\subset\mathbb{N}$ such that $N_{P}(y^{k})\equiv\mathcal{N}$ for all $k\in K$ and some closed convex cone $\mathcal{N}$ . Consequently, from $\lambda^{k}\in N_{P}(y^{k})$ we obtain $\bar{\lambda}\in\mathcal{N}=N_{P}(y^{k})$ and hence $\left\langle\bar{\lambda},y^{k}-\bar{y}\right\rangle\geq 0$ due to convexity of $P$ and $\bar{y}\in P$ . Together with (i), this yields $\left\langle\bar{\lambda},y^{k}-\bar{y}\right\rangle=0$ for all $k\in K$ . ∎
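The role of polyhedrality in part (ii) can be illustrated numerically. The sketch below, in Python, evaluates $\left\langle\bar{\lambda},y^{k}-\bar{y}\right\rangle$ along boundary sequences of a half-plane (polyhedral) and of the unit disc (convex but not polyhedral). Both sets, the sequences and the helper `inner` are hypothetical choices made for this illustration.

```python
import math

def inner(u, v):
    """Euclidean inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

y_bar = (1.0, 0.0)
lam_bar = (1.0, 0.0)  # limit of the normals lambda^k in both cases

# Polyhedral case: P = {y : y_1 <= 1}; y^k = (1, 1/k), lambda^k = (1, 0).
poly_vals = []
for k in range(1, 6):
    yk = (1.0, 1.0 / k)
    poly_vals.append(inner(lam_bar, (yk[0] - y_bar[0], yk[1] - y_bar[1])))

# Non-polyhedral case: unit disc; y^k = (cos(1/k), sin(1/k)) lies on the
# boundary and lambda^k = y^k is a normal at y^k, with lambda^k -> (1, 0).
disc_vals = []
for k in range(1, 6):
    yk = (math.cos(1.0 / k), math.sin(1.0 / k))
    disc_vals.append(inner(lam_bar, (yk[0] - y_bar[0], yk[1] - y_bar[1])))

print(poly_vals)
print(disc_vals)
```

In the half-plane case the inner products vanish, while for the disc they are strictly negative for every $k$, so only the inequality from part (i) survives without polyhedrality.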

The above lemma immediately yields that Assumption 3.7 for the multi-index $\delta^{P}:=d$ is fulfilled at every feasible point for program (1) with convex polyhedral $\Gamma$ , regardless of the constraint mapping $F$ . However, since we are not primarily interested in this convex polyhedral setting, we now state the desirable properties from (20) and Lemma 4.1 (ii) in a general form. To this end, given an arbitrary closed set $C\subset\mathbb{R}^{d}$ and $\bar{y}\in C$ , consider the following condition:

[TABLE]

where $U(\bar{y})$ denotes a neighborhood of $\bar{y}$ . Moreover, given also a multi-index $\delta\in\mathbb{N}^{l}$ with $|\delta|=d$ and $\bar{\lambda}\in\mathbb{R}^{d}$ , consider the condition:

[TABLE]

where $K$ is a subsequence of $\mathbb{N}$ . Note that (P2) is automatically fulfilled if $\bar{\lambda}\notin N_{C}(\bar{y})$ . We will repeatedly refer to these conditions in the subsequent study and hence we formulated them for an arbitrary multi-index $\delta$ . Clearly, if $\bar{x}$ is feasible for (1) and $\Gamma$ satisfies (P2) for $\delta$ , $\bar{y}=F(\bar{x})$ and every multiplier $\bar{\lambda}\in N_{\Gamma}(F(\bar{x}))$ , then Assumption 3.7 for $\delta$ is fulfilled at $\bar{x}$ .

Motivated by the disjunctive setting in (18), for the remainder of our study we deal with sets generated by unions and, in addition, Cartesian products of convex polyhedra (see the product setting in Section 5). Hence, we now examine properties (P1) and (P2) under these set operations on convex polyhedra.

Consider first a collection of closed sets $C^{i}\subset\mathbb{R}^{d}$ for $i=1,\ldots,q$ and set $C:=\bigcup_{i=1}^{q}C^{i}$ . We start with some elementary observations about tangent and normal cones. To this end, for $y\in C$ , let us denote $\mathcal{I}(y):=\left\{i\in\{1,\ldots,q\}\,\left|\;y\in C^{i}\right.\right\}$ and observe that, by the definition of the tangent cone, we have

[TABLE]

hence, by polarization

[TABLE]

This yields the following elementary estimate

[TABLE]

which can be derived, e.g., from the more general result [7, Proposition 3.1].

On the other hand, consider now $C=\prod_{i=1}^{r}C_{i}$ , where $C_{i}\subset\mathbb{R}^{d_{i}}$ is closed for $i=1,\ldots,r$ and let $y=(y_{1},\ldots,y_{r})\in C$ . By [56, Proposition 6.41], we have

[TABLE]

Note that for the tangent cones, [56, Proposition 6.41] in general yields only the inclusion $T_{C}(y)\subset\prod_{i=1}^{r}T_{C_{i}}(y_{i})$ . It can be easily seen, however, that

[TABLE]

holds, provided $C_{i}$ satisfies (P1) at $y_{i}$ for all $i=1,\ldots,r$ . Indeed, for $v=(v_{i})\in\prod_{i=1}^{r}T_{C_{i}}(y_{i})$ we readily obtain from (P1) for every $i=1,\ldots,r$ the existence of $\alpha_{i}>0$ such that $y_{i}+\alpha v_{i}\in C_{i}$ holds for all $\alpha\in[0,\alpha_{i}]$ . Taking $\bar{\alpha}:=\min_{i}\alpha_{i}$ yields $y+\alpha v\in C$ for all $\alpha\in[0,\bar{\alpha}]$ and $v\in T_{C}(y)$ follows.

Next we show that conditions (P1) and (P2) are preserved under unions and products, provided the obvious adjustments of multi-index, point and multiplier are made if needed.

Proposition 4.2.

Let $C=\bigcup_{i=1}^{q}C^{i}$ with $C^{i}\subset\mathbb{R}^{d}\;(i=1,\dots,q)$ closed and let $\bar{y}\in C$ .

(i)

If $C^{i}$ satisfies (P1) at $\bar{y}$ for all $i\in\mathcal{I}(\bar{y})$ , then $C$ also satisfies (P1) at $\bar{y}$ .

(ii)

If $C^{i}$ satisfies (P2) for some multi-index $\delta$ , the point $\bar{y}$ and some $\bar{\lambda}$ for all $i\in\mathcal{I}(\bar{y})$ , then $C$ also satisfies (P2) for $\delta$ , $\bar{y}$ and $\bar{\lambda}$ .

Proof.

Denoting $U^{i}(\bar{y})$ for $i\in\mathcal{I}(\bar{y})$ the neighborhoods given by the assumption (i) and taking into account (21), the first statement follows easily by setting $U(\bar{y}):=\bigcap_{i\in\mathcal{I}(\bar{y})}U^{i}(\bar{y})\cap\widetilde{U}(\bar{y})$ , where $\widetilde{U}(\bar{y})$ is a neighborhood of $\bar{y}$ such that $C\cap\widetilde{U}(\bar{y})=\bigcup_{i\in\mathcal{I}(\bar{y})}C^{i}\cap\widetilde{U}(\bar{y})$ . Clearly, the existence of $\widetilde{U}(\bar{y})$ is guaranteed by the closedness of $C^{i}$ $(i\notin\mathcal{I}(\bar{y}))$ .

In order to prove (ii), consider sequences $\{y^{k}\in C\}\to\bar{y}$ and $\{\lambda^{k}\in\widehat{N}_{C}(y^{k})\}\to\bar{\lambda}$ . From (22), closedness of $C^{i}$ and finiteness of $\mathcal{I}(\bar{y})$ one easily obtains the existence of $j\in\mathcal{I}(\bar{y})$ and a subsequence $\tilde{K}\subset\mathbb{N}$ such that

[TABLE]

The assumption now yields the existence of a subsequence $K\subset\tilde{K}$ such that $\left\langle\bar{\lambda}_{\nu},y_{\nu}^{k}-\bar{y}_{\nu}\right\rangle=0$ for $\nu\in I_{\delta}$ and $k\in K$ . ∎

Recall that if $\bar{\lambda}\notin N_{C^{i}}(\bar{y})$ for some $i\in\mathcal{I}(\bar{y})$ , then $C^{i}$ automatically satisfies (P2).

Proposition 4.3.

Let $C=\prod_{i=1}^{r}C_{i}$ with $C_{i}\subset\mathbb{R}^{d_{i}}$ $(i=1,\ldots,r)$ closed and $\bar{y}=(\bar{y}_{1},\ldots,\bar{y}_{r})\in C$ .

(i)

If $C_{i}$ satisfies (P1) at $\bar{y}_{i}$ for all $i=1,\ldots,r$ , then $C$ satisfies (P1) at $\bar{y}$ .

(ii)

If $C_{i}$ satisfies (P2) for multi-index $\delta_{i}$ with $|\delta_{i}|=d_{i}$ , the point $\bar{y}_{i}$ and $\bar{\lambda}_{i}$ for all $i=1,\ldots,r$ , then $C$ satisfies (P2) for $\delta=(\delta_{1},\ldots,\delta_{r})$ , $\bar{y}$ and $\bar{\lambda}=(\bar{\lambda}_{1},\ldots,\bar{\lambda}_{r})$ .

Proof.

Denoting by $U_{i}(\bar{y}_{i})\;(i=1,\ldots,r)$ the neighborhoods given by the assumption in (i), the first statement follows by simply setting $U(\bar{y}):=\prod_{i=1}^{r}U_{i}(\bar{y}_{i})$ and applying (25).

In order to prove (ii), consider sequences $\{y^{k}\in C\}\to\bar{y}$ and $\{\lambda^{k}\in\widehat{N}_{C}(y^{k})\}\to\bar{\lambda}$ . By (24), we have $\lambda_{i}^{k}\in\widehat{N}_{C_{i}}(y_{i}^{k})$ for every $i=1,\ldots,r$ and $k\in\mathbb{N}$ . By assumption, there exists a subsequence $K_{1}\subset\mathbb{N}$ with $\left\langle\bar{\lambda}_{1,\nu_{1}},y_{1,\nu_{1}}^{k}-\bar{y}_{1,\nu_{1}}\right\rangle=0$ $(\nu_{1}\in I_{\delta_{1}},k\in K_{1})$ . Consequently, by assumption, there exists a subsequence $K_{2}\subset K_{1}$ such that $\left\langle\bar{\lambda}_{2,\nu_{2}},y_{2,\nu_{2}}^{k}-\bar{y}_{2,\nu_{2}}\right\rangle=0$ $(\nu_{2}\in I_{\delta_{2}},k\in K_{2})$ . Repeating this argument another $r-2$ times, we find that there exists a subsequence $K(=K_{r})$ such that $\left\langle\bar{\lambda}_{i,\nu_{i}},y_{i,\nu_{i}}^{k}-\bar{y}_{i,\nu_{i}}\right\rangle=0$ $(\nu_{i}\in I_{\delta_{i}},k\in K)$ for all $i=1,\ldots,r$ . This proves the statement. ∎

We conclude this subsection by showing that the program (1), with $\Gamma$ satisfying properties (P1) and (P2), automatically satisfies the crucial Assumption 3.7, and in addition, that directional PQ-normality is equivalent to its simplified counterpart in this case. We point out that this result is the very foundation for all remaining results of the paper.

Proposition 4.4.

Let $\bar{x}$ be feasible for (1) with $\Gamma$ closed and satisfying (P1) at $\bar{y}=F(\bar{x})$ as well as (P2) for some multi-index $\delta$ , the point $\bar{y}=F(\bar{x})$ and every multiplier $\bar{\lambda}\in N_{\Gamma}(F(\bar{x}))$ . Then Assumption 3.7 for $\delta$ is fulfilled at $\bar{x}$ and, moreover, (directional) PQ-normality w.r.t. $\delta$ at $\bar{x}$ is equivalent to its simplified form (12) from Theorem 3.8 (Theorem 3.17).

Proof.

Assumption 3.7 for $\delta$ at $\bar{x}\in\mathcal{X}$ follows from (P2) for $\Gamma$ with $\delta$ at $\bar{y}=F(\bar{x})\in\Gamma$ . Hence, the statement for the nondirectional version follows from Theorem 3.8. Similarly, the implication from the directional simplified form to directional PQ-normality follows from Theorem 3.17.

It remains to show that PQ-normality w.r.t. $\delta$ in direction $u$ implies its simplified form. We do this by contraposition, so let us assume that there exists $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)\setminus\{0\}$ and $\{x^{k}\}\to\bar{x}$ such that $(x^{k}-\bar{x})/\left\|x^{k}-\bar{x}\right\|\to u$ and

[TABLE]

By the definition of the directional normal cone, there exists $\{t_{k}\}\downarrow 0$ and $\{w^{k}\}\to\nabla F(\bar{x})u$ as well as $\{\lambda^{k}\in\widehat{N}_{\Gamma}(F(\bar{x})+t_{k}w^{k})\}\to\bar{\lambda}$ . Taking into account (P1) together with [56, Exercise 6.44] we obtain

[TABLE]

for any $\alpha>0$ sufficiently small. Hence by setting $y^{k}:=F(\bar{x})+\left\|x^{k}-\bar{x}\right\|w^{k}$ we conclude $\lambda^{k}\in\widehat{N}_{\Gamma}(y^{k})$ . Moreover, (P2) for $\delta$ yields that, by passing to a subsequence if necessary, we may take $y^{k}$ such that $\left\langle\bar{\lambda}_{\nu},y_{\nu}^{k}-F_{\nu}(\bar{x})\right\rangle=0$ , for all $\nu\in I_{\delta}$ and $k\in\mathbb{N}$ . Consequently, we obtain

[TABLE]

Finally, $(y^{k}-F(\bar{x}))/\left\|x^{k}-\bar{x}\right\|=w^{k}\to\nabla F(\bar{x})u$ , showing the violation of PQ-normality w.r.t. $\delta$ in direction $u$ and the proof is complete. ∎

4.2 Pseudo-normality for disjunctive programs

The desired results for the disjunctive setting (18) can be viewed as a corollary of our analysis in Section 4.1. Indeed, Lemma 4.1 and Proposition 4.2 yield that a disjunctive set $\Gamma$ satisfies properties (P1) and (P2) for the multi-index $\delta^{P}:=d$ . In particular, due to (P1), the endeavor of computing the normal cone to disjunctive $\Gamma$ at some point can be reduced to computing the normal cone to a union of finitely many polyhedral cones at zero, i.e.,

[TABLE]

see [35, p. 59]. More importantly, the following corollary is a consequence of Proposition 4.4.

Corollary 4.5.

Let $\Gamma$ be disjunctive in the sense of (18). Then $\Gamma$ satisfies (P1) at every point $\bar{y}\in\Gamma$ as well as (P2) for the multi-index $\delta^{P}:=d$ at every point $\bar{y}$ and every $\bar{\lambda}$ . In particular, Assumption 3.7 for $\delta^{P}$ is fulfilled at every feasible point $\bar{x}$ for disjunctive programs. Moreover, (directional) pseudo-normality at $\bar{x}$ is equivalent to its simplified form: (for any $u\in\mathbb{R}^{n}$ with $\left\|u\right\|=1$ ) there is no nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ ( $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ ) such that there exists a sequence $\{x^{k}\}\to\bar{x}$ (with $(x^{k}-\bar{x})/\left\|x^{k}-\bar{x}\right\|\to u$ ) fulfilling

[TABLE]

We emphasize that Corollary 4.5 clarifies that the various definitions of pseudo-normality used in the literature stem from the same concept. In the general setting (1), pseudo-normality contains the additional sequence $\{y^{k}\}$ , but in the special cases of disjunctive programs it reduces to the simplified version without $\{y^{k}\}$ .

Corollary 4.5 also allows us to use all the sufficient conditions for pseudo-normality, hence also for MSCQ, studied in Section 3. These conditions now take on simpler forms since the vector optimization techniques reduce to standard optimization in the disjunctive setting. This can be seen from (26), which yields that pseudo-normality of $\bar{x}$ is equivalent to $\bar{x}$ being a local maximizer of $\left\langle\bar{\lambda},F(x)\right\rangle$ for all $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ , cf. Theorem 3.9. In particular, the second-order sufficient conditions from Corollary 3.10 and Proposition 3.19 read as follows.

Corollary 4.6.

Let $\bar{x}$ be feasible for (1) with $\Gamma$ disjunctive and $F$ twice differentiable at $\bar{x}$ . Consider the following two conditions:

(i)

second-order sufficient condition for pseudo-normality (SOSCPN): For every $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x})$ and every $0\neq u\in\mathbb{R}^{n}$ one has

[TABLE]

(ii)

second-order sufficient condition for directional pseudo-normality (SOSCdirPN): For every $u\in\mathbb{R}^{n}$ with $\left\|u\right\|=1$ and every $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ one has (27).

Then condition (i) (condition (ii)) implies (directional) pseudo-normality at $\bar{x}$ . In particular, either of the two conditions implies MSCQ at ${\bar{x}}$ .

Clearly, an affine $F$ can never fulfill the strict inequality of SOSCPN. The required maximality of $\bar{x}$ expressed in (26) can be secured nonetheless.

Corollary 4.7.

Let $\bar{x}$ be feasible for (1) with $\Gamma$ disjunctive. If $F$ is affine then pseudo-normality, and consequently also MSCQ, holds at ${\bar{x}}$ .

Proof.

For $F$ affine we have $F(x)=F(\bar{x})+\nabla F(\bar{x})(x-\bar{x})$ for all $x\in\mathbb{R}^{n}$ . Hence, taking into account $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ we find that

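$$\left\langle\bar{\lambda},F(x)-F(\bar{x})\right\rangle=\left\langle\bar{\lambda},\nabla F(\bar{x})(x-\bar{x})\right\rangle=\left\langle\nabla F(\bar{x})^{T}\bar{\lambda},x-\bar{x}\right\rangle=0\quad\text{for all }x\in\mathbb{R}^{n},$$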

showing that $\bar{x}$ is a local maximizer of $\left\langle\bar{\lambda},F\right\rangle$ and pseudo-normality thus follows. ∎
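The identity behind this proof is easy to check on a small instance. The Python sketch below takes an affine map $F(x)=Ax+c_{0}$ and a vector $\bar{\lambda}$ with $A^{T}\bar{\lambda}=0$ and evaluates $\left\langle\bar{\lambda},F(x)-F(\bar{x})\right\rangle$ at a few points. The data `A`, `c0`, `lam` are made up for illustration, and only the kernel condition on $\bar{\lambda}$ is modelled, not its membership in a normal cone.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical affine map F(x) = A x + c0 from R^2 to R^3.
A = [[1, 0], [0, 1], [1, 1]]
c0 = [2, -1, 5]

def F(x):
    return [v + c for v, c in zip(matvec(A, x), c0)]

# A multiplier in the kernel of A^T (the Jacobian of F is A everywhere).
lam = [1, 1, -1]
At_lam = [sum(A[i][j] * lam[i] for i in range(3)) for j in range(2)]

x_bar = [0, 0]
test_points = [[1, 2], [-3, 4], [7, -7]]
gaps = [inner(lam, [fx - fb for fx, fb in zip(F(x), F(x_bar))])
        for x in test_points]
print(At_lam, gaps)
```

Since every gap is zero, $x\mapsto\left\langle\bar{\lambda},F(x)\right\rangle$ is constant in this instance, so $\bar{x}$ is trivially a maximizer.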

We point out that the sufficiency of SOSCdirPN for MSCQ established in Corollary 4.6 corresponds to the sufficiency of Gfrerer’s SOSCMS for MSCQ (Proposition 2.7 (iii)). In turn, Corollary 4.7 corresponds to Robinson’s result (Proposition 2.7 (iv)). Hence, by employing the notion of (directional) pseudo-normality and its sufficiency for MSCQ, we found new proofs for these interesting results. Moreover, the notion of directional quasi-normality unifies all sufficient conditions for MSCQ from Proposition 2.7.

Note that analogous results were also obtained in [3, Theorem 4.1, Proposition 4.2]. What was not noticed there, however, is the underlying maximality principle (26), which clarifies the structure of these results and simplifies their proofs. In particular, it enables us to extend the above results by means of higher-order analysis.

4.3 Higher-order conditions

In order to proceed, we rely once more on the notion of multi-indices. First, we introduce the following standard notation: Given $\alpha=(\alpha_{1},\ldots,\alpha_{n})\in\mathbb{N}^{n}$ and $x=(x_{1},\ldots,x_{n})\in\mathbb{R}^{n}$ we set

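$$|\alpha|:=\alpha_{1}+\dots+\alpha_{n},\qquad\alpha!:=\alpha_{1}!\cdots\alpha_{n}!,\qquad x^{\alpha}:=x_{1}^{\alpha_{1}}\cdots x_{n}^{\alpha_{n}}.$$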

Given a function $g:\mathbb{R}^{n}\to\mathbb{R}$ , $m$ -times differentiable at $\bar{x}$ , and $\alpha\in\mathbb{N}^{n}$ with $|\alpha|\leq m$ we set

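$$D^{\alpha}g(\bar{x}):=\frac{\partial^{|\alpha|}g}{\partial x_{1}^{\alpha_{1}}\cdots\partial x_{n}^{\alpha_{n}}}(\bar{x}).$$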
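For readers who prefer code, the multi-index operations can be sketched in Python as follows, assuming the standard conventions $|\alpha|=\alpha_{1}+\dots+\alpha_{n}$, $\alpha!=\alpha_{1}!\cdots\alpha_{n}!$ and $x^{\alpha}=x_{1}^{\alpha_{1}}\cdots x_{n}^{\alpha_{n}}$. The function names are hypothetical.

```python
from math import factorial, prod

def order(alpha):
    """|alpha|: the sum of the entries of the multi-index."""
    return sum(alpha)

def multi_factorial(alpha):
    """alpha!: the product of the factorials of the entries."""
    return prod(factorial(a) for a in alpha)

def power(x, alpha):
    """x^alpha: the product of x_i raised to alpha_i."""
    return prod(xi ** ai for xi, ai in zip(x, alpha))

alpha = (2, 0, 1)
x = (3, 5, 2)
print(order(alpha), multi_factorial(alpha), power(x, alpha))
```

With $\alpha=(2,0,1)$ and $x=(3,5,2)$ this gives $|\alpha|=3$, $\alpha!=2$ and $x^{\alpha}=18$.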

Corollary 4.8.

Let $\bar{x}$ be feasible for a disjunctive program with $F$ $m$ -times differentiable at $\bar{x}$ . Consider the following two conditions:

(i)

for every $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x})$ , $1\leq q<m$ , $w\in\mathbb{R}^{n}$ and all $0\neq u\in\mathbb{R}^{n}$ one has

[TABLE]

(ii)

for every $u\in\mathbb{R}^{n}$ with $\left\|u\right\|=1$ , $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ , $1\leq q<m$ and all $w\in\mathcal{U}$ , where $\mathcal{U}$ denotes a neighbourhood of $u$ , one has (28).

Then condition (i) (condition (ii)) implies (directional) pseudo-normality at $\bar{x}$ . In particular, either of the two conditions implies MSCQ at ${\bar{x}}$ .

Proof.

Both statements follow from the same arguments, namely, given $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x})$ and $1\leq q<m$ and setting $u_{k}:=(x^{k}-\bar{x})/\left\|x^{k}-\bar{x}\right\|$ , Taylor expansion together with (28) yield

[TABLE]

∎

Similarly as in the case of affine $F$ , the strict inequality of the above higher-order sufficient conditions does not have to be fulfilled, as long as $F$ has polynomial structure, i.e., for every $i=1,\ldots,d,$ and every $x,$ we have

$$F_{i}(x)=\sum_{|\alpha|\leq m}c_{i,\alpha}x^{\alpha}\tag{29}$$

for some $m\in\mathbb{N}$ , denoting the degree of $F$ , and $c_{i,\alpha}\in\mathbb{R}$ . We point out that one actually has $c_{i,\alpha}=D^{\alpha}F_{i}(0)/\alpha!$ and (29) can be equivalently rewritten as

$$F_{i}(x)=\sum_{|\alpha|\leq m}\frac{D^{\alpha}F_{i}(\bar{x})}{\alpha!}(x-\bar{x})^{\alpha}\tag{30}$$

for arbitrary $\bar{x}\in\mathbb{R}^{n}$ .

Corollary 4.9.

Let $\bar{x}$ be feasible for a disjunctive program with $F$ being polynomial of degree $m$ , i.e., given by (29). Consider the following two conditions:

(i)

for every $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x})$ , $1\leq q\leq m,$ and for all $w\in\mathbb{R}^{n}$ one has

[TABLE]

(ii)

for every $u\in\mathbb{R}^{n}$ with $\left\|u\right\|=1$ , $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ , $1\leq q\leq m$ and all $w\in\mathcal{U}$ , where $\mathcal{U}$ denotes a neighbourhood of $u$ , one has (31).

Then condition (i) (condition (ii)) implies (directional) pseudo-normality at $\bar{x}$ . In particular, either of the two conditions implies MSCQ at ${\bar{x}}$ .

Proof.

Denoting $c_{\alpha}:=(c_{1,\alpha},\ldots,c_{d,\alpha})$ and taking into account (30), for any $\bar{\lambda}\neq 0$ , one has

[TABLE]

for every $x$ . Hence, given $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x})$ and $1\leq q\leq m$ , both statements follow from (31) since

[TABLE]

∎

Of course the above higher-order conditions are sufficient for pseudo-normality and MSCQ also for general programs (1) fulfilling Assumption 3.7 for $\delta^{P}$ .

4.4 Summary and example for the disjunctive case

For the sake of completeness, we summarize the sufficient conditions for pseudo-normality and MSCQ in the disjunctive setting in the following theorem.

Theorem 4.10 (Sufficient conditions for pseudo-normality and MSCQ).

Consider (1) with $\Gamma$ disjunctive in the sense of (18) and a feasible point $\bar{x}$ . Then any of the conditions from Corollaries 4.5, 4.6, 4.7, 4.8 and 4.9 implies (directional) pseudo-normality and MSCQ at $\bar{x}$ .

The following parametric example demonstrates the usefulness of our conditions based on pseudo-normality.

Example 4.11.

Let $\Gamma\subset\mathbb{R}^{3}$ be given by $\Gamma:=\mathbb{R}\times\{y\in\mathbb{R}^{2}\,|\,y_{2}\leq-|y_{1}|\}$ , $F:\mathbb{R}^{2}\to\mathbb{R}^{3}$ defined by $F(x):=(x_{1},x_{2},ax_{1}^{2}+bx_{1}^{4}+cx_{2}^{2}+dx_{2}^{4})^{T}$ for some parameters $a,b,c,d\in\mathbb{R}$ and let $\bar{x}:=(0,0)$ . Clearly,

[TABLE]

and for any $\lambda=(\lambda_{1},\lambda_{2},\lambda_{3})\in\Lambda^{0}(\bar{x})=\Lambda^{0}(\bar{x};(\pm 1,0)^{T})=\mathbb{R}_{+}(0,0,1)^{T}$ . Note also that $\Lambda^{0}(\bar{x};u)=\emptyset$ for all directions $u\neq(\pm 1,0)^{T}$ with $\left\|u\right\|=1$ since $T_{\Gamma}(F(\bar{x}))=\Gamma$ and $\nabla F(\bar{x})u=(u_{1},u_{2},0)^{T}$ . Moreover, observe that $\mathcal{X}=F^{-1}(\Gamma)=\left\{x\in\mathbb{R}^{2}\,\left|\;ax_{1}^{2}+bx_{1}^{4}+cx_{2}^{2}+dx_{2}^{4}\leq-|x_{2}|\right.\right\}.$

The most crucial parameter is $a$ . Indeed, if $a>0$ , then locally around $\bar{x}=0$ , the set $\mathcal{X}$ is the singleton $\{0\}$ , and thus it can be seen that sequence $\{x_{k}:=(1/k,0)^{T}\}$ shows violation of MSCQ. On the other hand, if $a<0$ , MSCQ holds and can be verified by SOSCMS. Hence, suppose now that $a=0$ .
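The failure of MSCQ for $a>0$ can be seen numerically. The Python sketch below bounds ${\rm d}_{\Gamma}(F(x_{k}))$ from above by the distance to one particular point of $\Gamma$ and uses ${\rm d}_{\mathcal{X}}(x_{k})=\|x_{k}\|$ , which relies on the statement above that $\mathcal{X}=\{0\}$ locally. It assumes the usual form of MSCQ, ${\rm d}_{\mathcal{X}}(x)\leq\kappa\,{\rm d}_{\Gamma}(F(x))$ near $\bar{x}$ ; the parameter values and helper names are chosen only for illustration.

```python
import math

def F(x, a, b, c, d):
    """Constraint mapping of the example."""
    x1, x2 = x
    return (x1, x2, a * x1**2 + b * x1**4 + c * x2**2 + d * x2**4)

def in_gamma(y):
    """Membership in Gamma = R x {(p, q) : q <= -|p|}."""
    _, p, q = y
    return q <= -abs(p)

a, b, c, d = 2.0, 1.0, -1.0, 3.0  # any choice with a > 0
lower_bounds = []
for k in (10, 100, 1000):
    xk = (1.0 / k, 0.0)
    y = F(xk, a, b, c, d)
    z = (y[0], y[1], 0.0)               # a point of Gamma close to y
    if not in_gamma(z):
        raise RuntimeError("comparison point is not in Gamma")
    upper_dist_gamma = math.dist(y, z)  # >= d_Gamma(F(x_k))
    dist_X = math.hypot(*xk)            # X = {0} locally when a > 0
    lower_bounds.append(dist_X / upper_dist_gamma)
print(lower_bounds)
```

The printed values are lower bounds for ${\rm d}_{\mathcal{X}}(x_{k})/{\rm d}_{\Gamma}(F(x_{k}))$ and grow roughly like $k/a$ , so no constant $\kappa$ can work.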

Next, let us look into parameter $b$ . If $b>0$ , $\{x_{k}:=(1/k,0)^{T}\}$ again shows violation of MSCQ regardless of other parameters. Note that if $b\leq 0<c$ , the sequence $\{\tilde{x}_{k}:=(1/k^{2},1/k^{3})^{T}\}$ satisfies $\tilde{x}_{k}/\left\|\tilde{x}_{k}\right\|\to(1,0)^{T}=:\bar{u}$ , but for $\bar{\lambda}:=(0,0,1)^{T}\in\Lambda^{0}(\bar{x};\bar{u})$ we get

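$$\left\langle\bar{\lambda},F(\tilde{x}_{k})-F(\bar{x})\right\rangle=\frac{b}{k^{8}}+\frac{c}{k^{6}}+\frac{d}{k^{12}}>0$$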

for sufficiently large $k$ , showing violation of pseudo-normality in direction $\bar{u}$ , hence we cannot use any of the stronger conditions to verify MSCQ. Clearly, a similar problem occurs if $b=c=0<d$ .
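The sequence $\{\tilde{x}_{k}\}$ can be examined with exact rational arithmetic. The Python sketch below evaluates the third component of $F$ (with $a=0$ ) along $\tilde{x}_{k}=(1/k^{2},1/k^{3})^{T}$ for one admissible parameter choice $b\leq 0<c$ and also tracks the first component of the normalized direction. The specific values $b=-1$ , $c=1$ , $d=0$ are assumptions made for this illustration.

```python
from fractions import Fraction
import math

def F3(x1, x2, b, c, d):
    """Third component of F for a = 0; equals <lambda, F(x) - F(0)> for lambda = (0, 0, 1)."""
    return b * x1**4 + c * x2**2 + d * x2**4

b, c, d = Fraction(-1), Fraction(1), Fraction(0)
values = {}
directions = {}
for k in (2, 10, 100):
    x1, x2 = Fraction(1, k**2), Fraction(1, k**3)
    values[k] = F3(x1, x2, b, c, d)  # = b/k^8 + c/k^6 + d/k^12
    directions[k] = float(x1) / math.hypot(float(x1), float(x2))
print(values)
print(directions)
```

All values are positive while the normalized first component tends to $1$ , matching the claim that the maximality condition fails in direction $\bar{u}=(1,0)^{T}$ .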

We conjecture that MSCQ holds in this case, but as the direct proof appears fairly technical and since for our purposes it is more interesting to see the limitations of sufficient conditions in this case (rather than determine if MSCQ holds), we skip the details for $b<0$ and only prove MSCQ in the simpler case $b=0$ below.

Let us mention, however, that if $b<0\geq c$ , we may use the directional version of the fourth-order sufficient condition based on Corollary 4.8 (ii) to verify MSCQ, even if $d>0$ .

Next, we prove that MSCQ holds if $b=0$ , regardless of parameters $c$ and $d$ . Since the feasible set $\mathcal{X}$ , locally around $\bar{x}=(0,0)$ , equals $\mathbb{R}\times\{0\}$ , we get ${\rm d}_{\mathcal{X}}(x)=|x_{2}|$ for any $x\in\mathbb{R}^{2}$ close enough to $\bar{x}$ . On the other hand, for $y,\tilde{y}\in\mathbb{R}^{3}$ with $y_{2}=\tilde{y}_{2}$ and $y_{3}\leq\tilde{y}_{3}$ we clearly have ${\rm d}_{\Gamma}(y)\leq{\rm d}_{\Gamma}(\tilde{y})$ . Given $\varepsilon\in(0,1)$ , let $x$ be sufficiently close to $\bar{x}$ so that $-\varepsilon|x_{2}|\leq cx_{2}^{2}+dx_{2}^{4}$ . One computes that

[TABLE]

Thus, setting $\kappa:=\frac{\sqrt{2}}{1-\varepsilon}$ yields

[TABLE]

for all $x\in\mathbb{R}^{2}$ close enough to $\bar{x}$ , and hence MSCQ follows.
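The estimate can be verified numerically for a concrete instance with $a=b=0$ . The Python sketch below uses a closed-form Euclidean distance to $\Gamma=\mathbb{R}\times\{(p,q)\,|\,q\leq-|p|\}$ , derived here by projecting onto the cone and not taken from the paper, and checks ${\rm d}_{\mathcal{X}}(x)=|x_{2}|\leq\kappa\,{\rm d}_{\Gamma}(F(x))$ with $\kappa=\sqrt{2}/(1-\varepsilon)$ on a grid. The values $c=-3$ , $d=5$ , $\varepsilon=0.5$ and the grid radius $0.1$ are illustrative assumptions.

```python
import math

def dist_gamma(y):
    """Euclidean distance from y in R^3 to Gamma = R x {(p, q) : q <= -|p|}."""
    _, p, q = y
    if q <= -abs(p):       # y lies in Gamma
        return 0.0
    if q >= abs(p):        # (p, q) lies in the polar cone: project to the origin
        return math.hypot(p, q)
    return (abs(p) + q) / math.sqrt(2.0)  # project onto a boundary ray

def F(x, c, d):
    """Constraint mapping of the example with a = b = 0."""
    x1, x2 = x
    return (x1, x2, c * x2**2 + d * x2**4)

c, d = -3.0, 5.0
eps = 0.5
kappa = math.sqrt(2.0) / (1.0 - eps)

grid = [i / 100.0 for i in range(-10, 11)]
violations = 0
worst = 0.0
for x1 in grid:
    for x2 in grid:
        # premise of the estimate: -eps*|x2| <= c*x2^2 + d*x2^4
        if -eps * abs(x2) > c * x2**2 + d * x2**4:
            raise RuntimeError("grid point too far from the origin")
        dist_X = abs(x2)   # X = R x {0} locally
        dist_G = dist_gamma(F((x1, x2), c, d))
        if dist_X > kappa * dist_G + 1e-12:
            violations += 1
        if dist_G > 0.0:
            worst = max(worst, dist_X / dist_G)
print(violations, worst, kappa)
```

On this grid no violation occurs and the worst observed ratio stays below $\kappa$ , in line with the bound derived above.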

In order to better illustrate the results of this paper, in the following tables corresponding to $a=0>b$ and $a=0=b$ , respectively, we provide sufficient conditions ensuring MSCQ for given parameters. Recall from above that, for $c>0$ and $b\leq 0$ , pseudo-normality-based conditions are not applicable, and thus we restrict ourselves to the case $c\leq 0$ . As mentioned above, the case $a=0>b$ can be handled by the directional fourth-order sufficient condition while for the case $a=0=b$ we provided the direct proof. We point out, however, that in both cases, unless $c=0<d$ , one can use also other sufficient conditions as indicated in the tables. In particular, one can see the meaning of parameter $d$ , which does not seem to influence the validity of MSCQ, but it influences which sufficient conditions can be invoked to verify it. Note also that if $a<0$ , depending on other parameters, conditions other than SOSCMS can be used as well. Nevertheless, the only condition that can never be used in case $a=0$ is SOSCPN, which is applicable if $a,c<0$ . Hence, we further detail only the case $a=0$ .

To illustrate the difference between the directional and non-directional approaches, observe that the mildest non-directional sufficient condition for MSCQ, pseudo-normality, characterized by the maximality condition (26), is satisfied if and only if $a,c\leq 0$ , with $a<0$ whenever $b>0$ and $c<0$ whenever $d>0$ . On the other hand, on top of the above situations, directional pseudo-normality can be applied whenever $a<0$ or also in case $a=0>b$ and $c\leq 0$ .

The power of our new sufficient conditions is nicely demonstrated for $a=0$ , when Gfrerer’s SOSCMS can never be used. Similarly, Robinson’s result cannot be applied unless all the parameters are zero.

5 Disjunctive programs with product structures

The simplified form of quasi-normality is not sufficient for metric subregularity even in case the set $\Gamma$ under consideration is a general convex polyhedral set, see Example 3.16. On the other hand, we realize that the set $\widetilde{\Gamma}$ in all cases (19) (a)-(e) is a union of products of closed intervals. This additional product structure motivates our study of ortho-disjunctive programs in Section 5.1, which enables us to recover and extend several known quasi-normality results for MPCCs and MPVCs and obtain new corresponding results for MPSCs, MPrCCs and MPrPCs.

In order to clarify the role of product structures in a broader context, consider first an instance of GMP (1), where

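$$\Gamma=\prod_{\nu\in I_{\delta}}\Gamma_{\nu},\qquad \Gamma_{\nu}:=\bigcup_{\ell=1}^{N_{\nu}}\Gamma_{\nu}^{\ell}\subset\mathbb{R}^{\delta_{\nu}}\ \text{ with }\ \Gamma_{\nu}^{\ell}\ \text{convex polyhedral},\quad \nu\in I_{\delta}, \tag{32}$$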

for some multi-index $\delta\in\mathbb{N}^{l}$ with $l\in\{1,\ldots,d\}$ and $|\delta|=d$ , i.e., $\Gamma$ is the Cartesian product of disjunctive sets. Note that all the prototypical disjunctive programs from (19) (a)-(e) exhibit such “outer” product structure.

We emphasize that $\Gamma$ given by (32) is still a disjunctive set in the sense of (18). Indeed, denoting $\mathcal{J}:=\prod_{\nu\in I_{\delta}}\{1,\ldots,N_{\nu}\}$ , for $\vec{\boldsymbol{\ell}}\in\mathcal{J}$ the set $\Gamma^{\vec{\boldsymbol{\ell}}}:=\prod_{\nu\in I_{\delta}}\Gamma_{\nu}^{\ell_{\nu}}$ is convex polyhedral and $\Gamma=\bigcup_{\vec{\boldsymbol{\ell}}\in\mathcal{J}}\Gamma^{\vec{\boldsymbol{\ell}}}.$ Regardless, it turns out to be advantageous to exploit the underlying product structure of $\Gamma$ rather than just treating $\Gamma$ as a disjunctive set. One of the reasons is that we deal with the unions of only $N_{\nu}$ sets, which is typically a small number ( $N_{\nu}=2$ for all $\nu$ in all cases (19) (a)-(e)), instead of dealing with the union of $|\mathcal{J}|=\prod_{\nu\in I_{\delta}}N_{\nu}$ sets. We point out that the newly developed concept of $\mathcal{Q}$ -stationarity from [4, 5] takes advantage of this observation.
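The counting argument can be made concrete with a small sketch. The following Python snippet compares the number of pieces one handles factor by factor with the number $|\mathcal{J}|$ of components in the flattened disjunctive representation. The function names and the list `N` of component counts are illustrative assumptions and do not appear in the paper.

```python
from itertools import product
from math import prod

def flattened_index_set(N):
    """Enumerate J = prod_nu {1, ..., N_nu} for component counts N = [N_1, ..., N_l]."""
    return list(product(*(range(1, n + 1) for n in N)))

def piece_counts(N):
    """Return (sum of N_nu, |J|): pieces seen factor-wise versus in the flat union."""
    return sum(N), prod(N)

# Ten factors with N_nu = 2 each, mimicking the cases (19) (a)-(e).
N = [2] * 10
factorwise, flat = piece_counts(N)
print(factorwise, flat)
```

Already for ten two-piece factors the flattened union has $2^{10}$ components, whereas the factor-wise view only ever touches two sets at a time.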

On the basis of Propositions 4.2 and 4.3 we readily infer that, on top of property (P1), $\Gamma$ given by (32) also satisfies (P2) for the multi-index $\delta$. Proposition 4.4 thus yields that in this case the (directional) PQ-normality w.r.t. $\delta$ coincides with its simplified form. In particular, standard NLPs, where $\Gamma=\{0\}^{r}\times\mathbb{R}_{-}^{d-r}$ for some $r\leq d$, fit into (32) with the multi-index $\delta^{Q}:=(1,\ldots,1)\in\mathbb{N}^{d}$, and hence we can readily handle quasi-normality for NLPs.

Utilizing the “outer” product structure on its own, however, does not enable one to analyze the quasi-normality for programs from (19) (a)-(e), where the factors $\Gamma_{\nu}=\widetilde{\Gamma}$ are two-dimensional. To overcome this, consider the GMP (1) with the “inner” product structure, i.e., where

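$$\Gamma=\bigcup_{\ell=1}^{N}\Gamma^{\ell},\qquad \Gamma^{\ell}:=\prod_{\nu\in I_{\delta}}\Gamma_{\nu}^{\ell}\ \text{ with }\ \Gamma_{\nu}^{\ell}\subset\mathbb{R}^{\delta_{\nu}}\ \text{convex polyhedral}, \tag{33}$$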

for some multi-index $\delta\in\mathbb{N}^{l}$ .

By the same arguments as before, $\Gamma$ again satisfies (P2) for $\delta$ and PQ-normality w.r.t. $\delta$ attains the simplified form. Moreover, the choice of the multi-index $\delta^{Q}:=(1,\ldots,1)\in\mathbb{N}^{d}$ now offers a richer setting.

5.1 Ortho-disjunctive constraints and quasi-normality

Motivated by the above discussion, we now introduce the new subclass of disjunctive programs containing the “inner” product structure with one-dimensional factors. To this end, consider the mathematical program of the form

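$$\min f(x)\quad\text{s.t.}\quad F(x)\in\Gamma:=\bigcup_{\ell=1}^{N}\Gamma^{\ell},\qquad \Gamma^{\ell}:=\prod_{i\in I}[a_{i}^{\ell},b_{i}^{\ell}], \tag{34}$$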

where $I=\{1,\ldots,d\}$ , $a_{i}^{\ell},b_{i}^{\ell}\in\mathbb{R}$ with $a_{i}^{\ell}\leq b_{i}^{\ell}$ and we also allow symbols $a_{i}^{\ell}=-\infty$ and $b_{i}^{\ell}=+\infty$ to include unbounded intervals. Note that we do not work with extended real numbers, i.e., given $a\in\mathbb{R}$ , $[a,\infty]$ stands for $\{x\in\mathbb{R}\,|\,x\geq a\}$ . This simply means that $\Gamma^{\ell}$ is a product of closed convex subsets of $\mathbb{R}$ , i.e., closed intervals. We refer to such sets $\Gamma$ as ortho-disjunctive and to such programs as mathematical programs with ortho-disjunctive constraints or briefly ortho-disjunctive programs.
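As a minimal illustration of this definition, an ortho-disjunctive set can be stored as a finite list of boxes, each box being a list of interval endpoints. The sketch below is only one possible encoding: the names `GAMMA_CC`, `in_box` and `in_ortho_disjunctive` are hypothetical, and `math.inf` stands in for the symbols $\pm\infty$ (points are assumed to have finite coordinates, in line with the remark on extended reals).

```python
import math

# Each box is a list of (a_i, b_i) pairs with a_i <= b_i;
# -math.inf / math.inf encode unbounded intervals.
GAMMA_CC = [
    [(0.0, math.inf), (0.0, 0.0)],  # R_+ x {0}
    [(0.0, 0.0), (0.0, math.inf)],  # {0} x R_+
]

def in_box(y, box):
    """True if every (finite) coordinate y_i lies in its interval [a_i, b_i]."""
    return all(a <= yi <= b for yi, (a, b) in zip(y, box))

def in_ortho_disjunctive(y, boxes):
    """True if y belongs to at least one box of the union."""
    return any(in_box(y, box) for box in boxes)

print(in_ortho_disjunctive((2.0, 0.0), GAMMA_CC))
print(in_ortho_disjunctive((1.0, 1.0), GAMMA_CC))
```

The first point lies on the positive $G$-axis and is accepted; the second has both coordinates positive and is rejected.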

Naturally, one can combine the “inner” and “outer” products and consider the Cartesian product of ortho-disjunctive sets, a setting that indeed fits the problem class (19) best. As before, it can be easily shown that such sets are still ortho-disjunctive. Moreover, only the “inner” products are important for our remaining analysis. Hence, we proceed without the “outer” product, which is also more consistent with the notion of disjunctive sets.
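The claim that a Cartesian product of ortho-disjunctive sets is again ortho-disjunctive can be checked constructively: picking one box from each factor and concatenating the intervals yields a box of the product. The following sketch assumes the same hypothetical list-of-boxes encoding as before (`cartesian_product` and `GAMMA_CC` are illustrative names only).

```python
import math
from itertools import product

GAMMA_CC = [
    [(0.0, math.inf), (0.0, 0.0)],  # R_+ x {0}
    [(0.0, 0.0), (0.0, math.inf)],  # {0} x R_+
]

def cartesian_product(*sets):
    """Flatten an 'outer' product of ortho-disjunctive sets into one list of boxes.

    Each input set is a list of boxes; each box is a list of (a, b) intervals.
    One box is chosen per factor and the chosen intervals are concatenated.
    """
    return [
        [interval for box in choice for interval in box]
        for choice in product(*sets)
    ]

# Gamma_CC^{|V|} with |V| = 3 becomes a union of 2^3 boxes in R^6.
gamma = cartesian_product(GAMMA_CC, GAMMA_CC, GAMMA_CC)
print(len(gamma), len(gamma[0]))
```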

On the basis of Propositions 4.2, 4.3 and 4.4 we obtain the following analogue of Corollary 4.5.

Corollary 5.1.

Set $\Gamma$ given by (34) satisfies (P1) at every point $\bar{y}\in\Gamma$ as well as (P2) for multi-index $\delta^{Q}:=(1,\ldots,1)\in\mathbb{N}^{d}$ at every $\bar{y}$ and every $\bar{\lambda}$ . In particular, for ortho-disjunctive program (34), Assumption 3.7 for $\delta^{Q}$ is fulfilled at every feasible point $\bar{x}$ and, moreover, the (directional) quasi-normality at $\bar{x}$ is equivalent to its simplified form: (for any $u\in\mathbb{R}^{n}\setminus\{0\}$ ) there is no nonzero $\bar{\lambda}\in\Lambda^{0}(\bar{x})$ ( $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ ) such that there exists a sequence $\{x^{k}\}\to\bar{x}$ (with $(x^{k}-\bar{x})/\left\|x^{k}-\bar{x}\right\|\to u$ ) fulfilling

[TABLE]

Just as in the case of pseudo-normality, cf. the comments after Corollary 4.5, we have now clarified that, in fact, there is only one concept of quasi-normality which, in general, contains the additional sequence $\{y^{k}\}$ , but in special cases, such as NLPs or MPCCs, simplifies to the known versions without $\{y^{k}\}$ . Moreover, the above corollary provides the definition of quasi-normality for all other ortho-disjunctive programs.

Before we state the main result of this subsection that parallels Theorem 4.10 for pseudo-normality, we write down explicitly the conditions from Theorem 3.9, Proposition 3.19 and Corollary 3.10 for multi-index $\delta^{Q}$ corresponding to quasi-normality.

Given $\lambda=(\lambda_{i})_{i\in I}$ , $\varphi^{\lambda}$ from (14) reads as

[TABLE]

Moreover, assuming that $F$ is twice differentiable at $\bar{x}$ , the second-order sufficient conditions from Corollary 3.10 and Proposition 3.19, respectively, read as follows:

• Second-order sufficient condition for quasi-normality (SOSCQN): For every $0\neq\bar{\lambda}\in\Lambda^{0}(\bar{x})$, every $0\neq u\in\mathbb{R}^{n}$ with $\nabla F_{i}(\bar{x})u=0$ for all $i\in I(\bar{\lambda})$ and every $w\in\mathbb{R}^{n}$ with $\left\langle w,u\right\rangle=0$ one has

[TABLE]

• Second-order sufficient condition for directional quasi-normality (SOSCdirQN): For every $u\in\mathbb{R}^{n}$ with $\|u\|=1$, every $\bar{\lambda}\in\Lambda^{0}(\bar{x};u)$ with $\nabla F_{i}(\bar{x})u=0$ for all $i\in I(\bar{\lambda})$ and every $w$ with $\left\langle w,u\right\rangle=0$ one has (37).

Moreover, for a closed interval $[a,b]$ and $c\in\mathbb{R}$ we have

[TABLE]

where $(q)^{-}:=-\min\{q,0\}$ and $(q)^{+}:=\max\{q,0\}$ denote the negative and the positive part of any number $q\in\mathbb{R}$, respectively, extended to the symbols $\pm\infty$ by the natural convention $(\infty)^{-}=(-\infty)^{+}=0$. Thus, depending on which norm we consider for the products, the penalty function now reads as

[TABLE]
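To see how the negative and positive parts enter the distance term of the penalty function, the following sketch computes the distance of a point to an ortho-disjunctive set. It assumes that the interval distance takes the form $\max\{(c-a)^{-},(c-b)^{+}\}$, which is consistent with the convention $(\infty)^{-}=(-\infty)^{+}=0$, and that the distance to a box is obtained by combining the coordinate distances with the $l_{1}$- or $l_{\infty}$-norm. All function names and the list-of-boxes encoding are hypothetical, and only the distance term is sketched, not the full penalty function.

```python
import math

def neg_part(q):
    """(q)^- = -min{q, 0}, written as max{-q, 0}; gives 0 for q = +inf."""
    return max(-q, 0.0)

def pos_part(q):
    """(q)^+ = max{q, 0}; gives 0 for q = -inf."""
    return max(q, 0.0)

def dist_interval(c, a, b):
    """Distance of c to [a, b]; a = -inf and b = +inf are allowed."""
    return max(neg_part(c - a), pos_part(c - b))

def dist_ortho_disjunctive(y, boxes, norm="linf"):
    """Distance of y to a union of boxes: minimum over boxes of the box distance."""
    combine = max if norm == "linf" else sum  # "l1" sums coordinate distances
    return min(
        combine(dist_interval(yi, a, b) for yi, (a, b) in zip(y, box))
        for box in boxes
    )

GAMMA_CC = [
    [(0.0, math.inf), (0.0, 0.0)],  # R_+ x {0}
    [(0.0, 0.0), (0.0, math.inf)],  # {0} x R_+
]

print(dist_ortho_disjunctive((-2.0, -1.0), GAMMA_CC, norm="linf"))
print(dist_ortho_disjunctive((-2.0, -1.0), GAMMA_CC, norm="l1"))
```

For the point $(-2,-1)$ the two norms give different values, which illustrates why the choice of norm for the product matters in the penalty term.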

Theorem 5.2 (Sufficient conditions for quasi-normality and MSCQ).

Consider an ortho-disjunctive program (34) and a feasible point $\bar{x}$ . Then each of the following conditions implies (directional) quasi-normality and MSCQ at $\bar{x}$ : (i) the weak efficiency of $\bar{x}$ for $\varphi^{\lambda}$ from (36), (ii) SOSCQN from (37) (SOSCdirQN).

Let us briefly comment on the importance of the previous theorem (together with Corollary 5.1). First, consider only the statement that the (simplified form of) quasi-normality (35) implies MSCQ and hence M-stationarity and exactness of the penalty function (5.1) at local minimizers. For MPCCs, we thus recover the following results: [45, Theorem 3.3] (quasi-normality implies M-stationarity), [45, Lemma 4.3 and 4.4] (pseudo-normality implies MSCQ), [45, Theorem 4.5 and Corollary 4.6] (pseudo-normality implies exactness of $l_{1}$ and $l_{\infty}$ penalty function), as well as [60, Theorem 3.1] (quasi-normality implies MSCQ). Similarly, for MPVCs we recover and improve [41, Theorem 3.1] (pseudo-normality implies exactness of the penalty function) and the fact that quasi-normality implies M-stationarity, which is not stated in the paper, but follows directly from [41, Theorem 2.1 and Definition 2.3]. Moreover, to the best of our knowledge, pseudo- and quasi-normality were not yet introduced for MPSCs, MPrCCs and MPrPCs and all our results are hence new when applied to these problem classes.

Second, we also provide verifiable sufficient conditions for quasi-normality, together with sufficient conditions for pseudo-normality (higher-order conditions, polynomiality of $F$ ) from Section 4, which enhances the applicability of our results.

Finally, we open a path for a refined analysis using directional quasi-normality as well as all the corresponding sufficient conditions (SOSCdirQN etc.).

In order to illuminate our results and compare them with the literature, we conclude this section with an application to MPCCs. The same exercise could be carried out for the other classes (19) (b)-(e). Recall that, omitting standard equality and inequality constraints, an MPCC is given as

$$\min f(x)\quad\text{s.t.}\quad G_{i}(x)\geq 0,\ H_{i}(x)\geq 0,\ G_{i}(x)H_{i}(x)=0,\quad i\in V.$$

The constraints of MPCCs fit the general setting $F(x)\in\Gamma$ with $F(x):=(G_{i}(x),H_{i}(x))_{i\in V}$ , and $\Gamma:=\Gamma_{\text{CC}}^{|V|}$ , where $\Gamma_{\text{CC}}=(\mathbb{R}_{+}\times\{0\})\cup(\{0\}\times\mathbb{R}_{+})$ is clearly ortho-disjunctive. As we mentioned, $\Gamma$ itself is also ortho-disjunctive, but we choose to rather keep the “outer” product as well, noting that the impact is only visible at the penalty function. We point out that the standard approach to MPCCs is to consider $\Gamma:=-\Gamma_{\text{CC}}^{|V|}$ and $F(x):=(-G_{i}(x),-H_{i}(x))_{i\in V}$ in order to work with nonnegative signs of certain multipliers, while in our case we obtain the opposite sign restrictions.

A simple computation yields that for $(G,H)\in\Gamma_{\text{CC}}$ we have

[TABLE]
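For orientation, the limiting normal cone to $\Gamma_{\text{CC}}=(\mathbb{R}_{+}\times\{0\})\cup(\{0\}\times\mathbb{R}_{+})$ can be encoded as a membership test. The sketch below uses the standard formula for this set under the sign convention adopted here (negative multipliers at the biactive point); it is offered as an assumption-based illustration, and the function name `in_normal_cone_cc` is hypothetical.

```python
def in_normal_cone_cc(G, H, lam_G, lam_H):
    """Membership of (lam_G, lam_H) in the limiting normal cone to Gamma_CC at (G, H).

    Gamma_CC = (R_+ x {0}) u ({0} x R_+), so the signs are opposite to the
    usual MPCC convention that works with -Gamma_CC.
    """
    if G > 0 and H == 0:
        return lam_G == 0                      # normal cone is {0} x R
    if G == 0 and H > 0:
        return lam_H == 0                      # normal cone is R x {0}
    if G == 0 and H == 0:
        # both multipliers negative, or at least one of them zero
        return (lam_G < 0 and lam_H < 0) or lam_G * lam_H == 0
    raise ValueError("(G, H) does not belong to Gamma_CC")

print(in_normal_cone_cc(0.0, 0.0, -1.0, -2.0))
print(in_normal_cone_cc(0.0, 0.0, 1.0, 1.0))
```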

Hence, denoting

[TABLE]

for some feasible point $\bar{x}$ , we conclude that $\lambda=(\lambda_{i}^{G},\lambda_{i}^{H})_{i\in V}\in N_{\Gamma_{\text{CC}}^{|V|}}(F(\bar{x}))$ if and only if

[TABLE]

Consequently, Corollary 5.1 yields that $\bar{x}$ satisfies quasi-normality provided there is no nonzero $\bar{\lambda}=(\bar{\lambda}_{i}^{G},\bar{\lambda}_{i}^{H})_{i\in V}$ fulfilling

[TABLE]

together with (39) such that there exists a sequence $\{x^{k}\}\to\bar{x}$ with

[TABLE]

On the other hand, $\bar{x}$ is M-stationary provided there exists $\bar{\lambda}=(\bar{\lambda}_{i}^{G},\bar{\lambda}_{i}^{H})_{i\in V}$ satisfying (39) and

[TABLE]

Moreover, using first the $l_{1}$ -norm to handle the “outer” product we get

[TABLE]

Next, using the $l_{\infty}$ -norm for the “inner” product, for arbitrary $(G,H)\in\mathbb{R}^{2}$ we have ${\rm d}_{\Gamma_{\text{CC}}}(G,H)=|\min\{G,H\}|$ . Note that this agrees with the corresponding expression from (5.1), which reads as $\min\big{\{}\max\{(G)^{-},|H|\},\max\{|G|,(H)^{-}\}\big{\}}$ . Consequently, we obtain

[TABLE]
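The agreement between the closed form $|\min\{G,H\}|$ and the two-box expression $\min\{\max\{(G)^{-},|H|\},\max\{|G|,(H)^{-}\}\}$ can be sanity-checked numerically. The following sketch samples random points in $\mathbb{R}^{2}$ and reports the largest discrepancy; the function names and the sampling range are arbitrary choices made for illustration.

```python
import random

def neg_part(q):
    """(q)^- = -min{q, 0}, written as max{-q, 0}."""
    return max(-q, 0.0)

def dist_cc_closed_form(G, H):
    """l_infinity distance to Gamma_CC in closed form."""
    return abs(min(G, H))

def dist_cc_two_boxes(G, H):
    """Minimum of the l_infinity distances to R_+ x {0} and {0} x R_+."""
    return min(max(neg_part(G), abs(H)), max(abs(G), neg_part(H)))

def max_discrepancy(samples):
    """Largest absolute difference between the two expressions over the samples."""
    return max(abs(dist_cc_closed_form(G, H) - dist_cc_two_boxes(G, H))
               for G, H in samples)

random.seed(0)
samples = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(10_000)]
print(max_discrepancy(samples))
```

Both expressions only involve comparisons, negation and absolute values, so no rounding occurs and the reported discrepancy is zero.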

Conclusion

Building on recently developed directional techniques from variational analysis, this paper presents a comprehensive and self-contained study of the metric subregularity constraint qualification (MSCQ) for broad classes of nonconvex optimization problems including, most importantly, disjunctive programs. Our findings reveal a common denominator of several prominent sufficient conditions for MSCQ occurring in the literature. Thus, our study improves the understanding of these seemingly independent approaches and provides additional insight. Moreover, it offers a wider spectrum of sufficient conditions for MSCQ, including point-based ones, and consequently also improves existing sufficient conditions. Furthermore, by introducing the new notion of ortho-disjunctive programs we have established an appropriate framework for a unified study of several nonconvex optimization problems such as mathematical programs with complementarity, vanishing or switching constraints. These ortho-disjunctive programs hence provide an intriguing area for future research.

Acknowledgments

The authors also thank two anonymous referees for their comments which helped improve the presentation of the material.

Dedication

The authors would like to dedicate this paper to Helmut Gfrerer in honor of his 60th birthday.

Funding

The research of the first author was supported by the Austrian Science Fund (FWF) under grant P29190-N32. The work on the revised version was supported by the FWF grant P32832-N. The research of the second author was supported by the Grant Agency of the Czech Republic (Grant No. 18-04145S). Part of this work was done while the second author was visiting McGill University, partially supported by H2020-MSCA-RISE project GEMCLIME-2020 under GA No. 681228. The research of the third author was supported by an NSERC discovery grant.

Bibliography

[1] Achtziger, W., Kanzow, C.: Mathematical programs with vanishing constraints: Optimality conditions and constraint qualifications. Math. Program. 114, 69–99 (2008)
[2] Adam, L., Branda, M.: Nonlinear chance constrained problems: Optimality conditions, regularization and solvers. J. Optim. Theory Appl. 170(2), 419–436 (2016)
[3] Bai, K., Ye, J. J., Zhang, J.: Directional quasi-/pseudo-normality as sufficient conditions for metric subregularity. SIAM J. Optim. 29(4), 2625–2649 (2019)
[4] Benko, M., Gfrerer, H.: On estimating the regular normal cone to constraint systems and stationary conditions. Optimization 66, 61–92 (2017)
[5] Benko, M., Gfrerer, H.: New verifiable stationary concepts for a class of mathematical programs with disjunctive constraints. Optimization 67, 1–23 (2018)
[6] Benko, M.: Numerical methods for mathematical programs with disjunctive constraints. PhD thesis, Univ. Linz (2016)
[7] Benko, M., Gfrerer, H., Outrata, J. V.: Calculus for directional limiting normal cones and subdifferentials. Set-Valued Var. Anal. 27(3), 713–745 (2019)
[8] Benko, M., Gfrerer, H., Mordukhovich, B. S.: Characterizations of tilt-stable minimizers in second-order cone programming. SIAM J. Optim. 29(4), 3100–3130 (2019)
