Strong Convexity for Risk-Averse Two-Stage Models with Fixed Complete Linear Recourse

Matthias Claus; Kai Spürkel

arXiv:1812.08109·math.OC·December 20, 2018


TL;DR

This paper extends the understanding of strong convexity in two-stage risk-averse models with linear recourse, providing conditions for various risk measures and implications for stability and optimization algorithms.

Contribution

It introduces the concept of partial strong convexity and derives verifiable conditions for strong convexity in models with distortion risk measures.

Findings

1. Conditions for strong convexity in models with CVaR and distortion risk measures.

2. Implications for stability under probability measure perturbations.

3. Relevance for convergence rates in stochastic optimization algorithms.





M. Claus
University Duisburg-Essen, Thea-Leymann-Straße 9, D-45127 Essen
Tel.: +49 201 183 6887
Email: [email protected]

K. Spürkel
University Duisburg-Essen, Thea-Leymann-Straße 9, D-45127 Essen
Tel.: +49 201 183 6890
Email: [email protected]


The authors gratefully acknowledge the support of the German Research Foundation (DFG) within the collaborative research center TRR 154 “Mathematical Modeling, Simulation and Optimization Using the Example of Gas Networks”.


Abstract

This paper generalizes results concerning strong convexity of two-stage mean-risk models with linear recourse to distortion risk measures. Introducing the concept of (restricted) partial strong convexity, we conduct an in-depth analysis of the expected excess functional with respect to the decision variable and the threshold parameter. These results allow us to derive sufficient conditions for strong convexity of models building on the conditional value-at-risk due to its variational representation. Via the Kusuoka representation these carry over to comonotonic and distortion risk measures, where we obtain verifiable conditions in terms of the distortion function. For stochastic optimisation models, we point out implications for quantitative stability with respect to perturbations of the underlying probability measure. Recent work in [Ba14] and [WaXi17] also gives testimony to the importance of strong convexity for the convergence rates of modern stochastic subgradient descent algorithms and in the setting of machine learning.

Keywords: Two-Stage Stochastic Programming · Linear Recourse · Strong Convexity · Conditional Value at Risk · Comonotonic Risk Measures · Stability

MSC: 90C15 · 90C31

1 Introduction

In [S94] Schultz considered the expectation-based two-stage optimisation problem

$$\min\big\{\mathbb{E}_{\omega}\big[h(\xi)+\varphi(z(\omega)-T\xi)\big]\;:\;\xi\in X\big\} \tag{1}$$

where $X$ is a subset of some $\mathbb{R}^{n}$, $h$ is convex and real-valued, $z(\omega)$ is an $s$-dimensional random vector on some probability space $(\Omega,\mathcal{F},\mathbb{P})$ and $\varphi$ is a recourse function of the form

$$\varphi(t)=\min\{q^{\intercal}y\;|\;Wy=t,\ y\geq 0\}, \tag{2}$$

which is the value function of a linear program with parametric right-hand side. For a general introduction to such models we refer to the standard textbooks [BL11] and [SDR14]. The optimisation problem (1) can be rewritten as

$$\min\{\hat{h}(x)+Q_{\mathbb{E}}(x)\;|\;x\in T(X)\} \tag{3}$$

with

$$\hat{h}(x)=\min\{h(\xi)\;|\;T\xi=x,\ \xi\in X\} \tag{4}$$

and the reduced expectation function

$$Q_{\mathbb{E}}(x)=\mathbb{E}_{\mu}[\varphi(z-x)], \tag{5}$$

where $\mu$ denotes the pushforward measure of $z$. It is well known that under mild assumptions $Q_{\mathbb{E}}$ is well-defined and convex on all of $\mathbb{R}^{s}$. Conditions for strong convexity are desirable both for the further structural analysis of the optimisation problem (1) (e.g. stability analysis, cf. Section 4) and for its algorithmic treatment with subgradient schemes (cf. [Nes04]). For the risk-neutral setting such conditions were given in [S94, Theorem 2.2]:

Theorem 1.1

Assume that the following conditions are satisfied:

A1

For every $t$ there exists some $y\geq 0$ such that $Wy=t$. (Complete recourse)

A2

There exists some $v$ with $W^{\intercal}v<q$ . (Strengthened sufficiently expensive recourse)

A3

$\|z\|$ is $\mu$-integrable. (Finite first moments)

A4

$\mu$ has a density $\theta$ with respect to the Lebesgue measure and there exist a convex open set $V$ and constants $r,\rho>0$ such that $\theta\geq r$ a.s. on $V+B_{\rho}(0)$.

Then $Q_{\mathbb{E}}$ is strongly convex on $V$ .

Remember that a real-valued function $f$ on some convex subset $V$ of a normed space is called $\kappa$ -strongly convex on that set if for all $x,y\in V$ and all $\lambda\in\left(0,1\right)$ it holds

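$$f(\lambda x+(1-\lambda)y)\leq\lambda f(x)+(1-\lambda)f(y)-\frac{\kappa}{2}\lambda(1-\lambda)\|x-y\|^{2}.$$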
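The defining inequality can be probed numerically. The following is a minimal sketch for scalar functions only; the helper names `strong_convexity_gap` and `looks_strongly_convex` are hypothetical and not part of the paper. A random search of this kind can refute a candidate modulus $\kappa$ but can never prove strong convexity.

```python
import random

def strong_convexity_gap(f, x, y, lam, kappa):
    """Right-hand side minus left-hand side of the kappa-strong-convexity
    inequality for a scalar function f; nonnegative if the inequality holds."""
    lhs = f(lam * x + (1 - lam) * y)
    rhs = (lam * f(x) + (1 - lam) * f(y)
           - 0.5 * kappa * lam * (1 - lam) * (x - y) ** 2)
    return rhs - lhs

def looks_strongly_convex(f, kappa, lo, hi, trials=2000, seed=0, tol=1e-12):
    """Sample random (x, y, lambda) on [lo, hi]^2 x (0, 1) and report whether
    any sample violates the inequality by more than tol (illustrative only)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        lam = rng.uniform(0.0, 1.0)
        if strong_convexity_gap(f, x, y, lam, kappa) < -tol:
            return False
    return True
```

For $f(x)=x^{2}$ the inequality holds with equality for $\kappa=2$ and fails for any larger $\kappa$, while $f(x)=|x|$ is convex but fails for every $\kappa>0$.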

We point out that a constant of strong convexity for $Q_{\mathbb{E}}$ in Theorem 1.1 can be computed from the model data, i.e. the geometry of the set $\{v\ |W^{\intercal}v\leq q\}$ , $\rho$ and $r$ .

In [CSS17] the analysis of (5) was extended to the upper-semideviation-based functional

$$Q_{D_{+}}(x)=Q_{\mathbb{E}}(x)+\mathbb{E}_{\mu}\big[\max\{0,\varphi(z-x)-Q_{\mathbb{E}}(x)\}\big]$$

and the expected-excess-based one

$$Q_{EE}(x,\eta)=\mathbb{E}_{\mu}\big[\max\{0,\varphi(z-x)-\eta\}\big],$$

which, for simplicity, we shall call upper semideviation and expected excess, respectively. For the latter, an additional assumption A5 on the magnitude of $\eta$ is needed: if $\eta$ is too large, $Q_{EE}(x,\eta)$ might not even depend on $x$ anymore.

1.1 On strong convexity of $Q_{EE}(\cdot,\eta)$ for fixed $\eta$

In order to formulate condition A5 we note the following properties of the value function $\varphi$ and its linearity complex (cf. Lemmas 32 and 34 in [CSS17]):

Lemma 1

Assume A1 and A2. Then $\{v\ |\ W^{\intercal}v\leq q\}$ is the convex hull of its finitely many extreme points $\{d_{i}\ |\ i\in I=\{1,\ldots,N\}\}$ and $\varphi$ has the following properties:

(i)

$\varphi(t)=\max_{i\in I}d_{i}^{\intercal}t$ , $\varphi(t)=d_{i}^{\intercal}t$ for all $t\in K_{i}=\{z\ |\ (d_{i}-d_{j})^{\intercal}z\geq 0\text{ for all }j\in I\}$ , i.e. $\varphi$ is finite and polyhedral.

(ii)

$\bigcup_{i\in I}K_{i}=\mathbb{R}^{s}$ with each $K_{i}$ being an $s$-dimensional, pointed polyhedral cone. Furthermore, each $K_{i}\cap K_{j}$ with $i\neq j$ is a common closed face of $K_{i}$ and $K_{j}$, and $\text{dim}(K_{i}\cap K_{j})=s-1$ holds if and only if $d_{i}$ and $d_{j}$ are adjacent.

(iii)

There is some $\alpha>0$ such that

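$$\sup_{j\in I}\,(d_{i}-d_{j})^{\intercal}u\geq\alpha\|u\|\quad\text{for all }i\in I\text{ and all }u\in K_{i}.$$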
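To make Lemma 1 concrete, the sketch below evaluates $\varphi$ through the vertex representation in (i) for a one-dimensional simple-recourse instance. The instance is an assumption chosen for illustration and does not appear in the paper: $W=(1,-1)$ and $q=(q^{+},q^{-})=(2,1)$, so that $\{v\ |\ W^{\intercal}v\leq q\}=[-q^{-},q^{+}]$ has the extreme points $d_{1}=q^{+}$ and $d_{2}=-q^{-}$. All function names are hypothetical.

```python
def dot(a, b):
    """Euclidean inner product of two equally long tuples."""
    return sum(a_k * b_k for a_k, b_k in zip(a, b))

def recourse_value(t, vertices):
    """phi(t) = max_i d_i^T t over the extreme points d_i (Lemma 1 (i))."""
    return max(dot(d, t) for d in vertices)

def active_index(t, vertices):
    """Index i of a cone K_i containing t, i.e. a maximiser of d_i^T t."""
    values = [dot(d, t) for d in vertices]
    return values.index(max(values))

def separation(u, vertices):
    """sup_j (d_i - d_j)^T u for an index i with u in K_i (Lemma 1 (iii))."""
    i = active_index(u, vertices)
    return max(dot(vertices[i], u) - dot(d, u) for d in vertices)

# Hypothetical simple-recourse data: cost 2 per unit of shortage, 1 per unit of surplus.
Q_PLUS, Q_MINUS = 2.0, 1.0
VERTICES = [(Q_PLUS,), (-Q_MINUS,)]
```

In this instance $\varphi(t)=\max\{q^{+}t,-q^{-}t\}$, and the quantity in (iii) equals $(q^{+}+q^{-})|u|$, so $\alpha=q^{+}+q^{-}=3$ works.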

Let us fix some more notation:

Since each $K_{i}$ is a polyhedral cone, we can write it as the conic hull of its finitely many extreme rays, i.e. $K_{i}=\text{cone}\{t^{i}_{1},\ldots,t^{i}_{n_{i}}\}$. With the shorthand $K_{i}^{+}=K_{i}\cap\{z\ |\ d_{i}^{\intercal}z\geq 0\}$ and $I^{+}=\{i\in I\ |\ \text{int}K_{i}^{+}\neq\emptyset\text{ and }d_{i}\neq 0\}$ we note that for $\eta_{0}>0$ and $i\in I^{+}$ the hyperplane $\{z\ |\ d_{i}^{\intercal}z=\eta_{0}\}$ intersects at least one extreme ray of $K_{i}$ in a single point:

$$\{rt^{i}_{s}\ |\ r\geq 0\}\cap\{z\ |\ d_{i}^{\intercal}z=\eta_{0}\}=\{\hat{y}^{i}_{s}\}$$

for at least one $s\in\{1,\ldots,n_{i}\}$. Let $\hat{y}^{i}(\eta_{0})$ denote one with minimum norm. Theorem 35 in [CSS17] can then be formulated as follows:

Theorem 1.2

Let A1-A4 hold. In addition assume

A5

$\eta_{0}$ is such that for all $i\in I^{+}$ we have $\|\hat{y}^{i}(\eta_{0})\|<\rho$ (where $\rho$ is the one given in A4).

Then $Q_{EE}(x,\eta)$ is strongly convex on $V$ (cf. A4 for the definition of $V$ ) with respect to $x$ for all $\eta\leq\eta_{0}$ . The modulus of strong convexity does not depend on $\eta$ .

The geometric situation is shown in Fig. 1.

In A5 it is in fact enough to show that for every $i\in I^{+}$ either $\|\hat{y}^{i}(\eta_{0})\|<\rho$ holds, or there exists an index set $J_{i}\subset I$ with $-K_{i}\subset\bigcup_{j\in J_{i}}K_{j}^{+}$ such that $\|\hat{y}^{j}(\eta_{0})\|<\rho$ for all $j\in J_{i}$. In this paper we shall use the slightly less general version of A5 stated above.

Let us make three remarks on the theorem:

Firstly, it is desirable to verify condition A5 for $\eta_{0}$ as large as possible (especially when considering Theorem 2.3 later). The larger $\rho$ and $\|d_{i}\|$ with $i\in I^{+}$ , the larger $\eta_{0}$ can be chosen.

Secondly, condition A5 might not be fulfilled on the entire set $V$ . By considering a subset $U\subset V$ instead, allowing for a possibly bigger value of $\rho$ , one can ensure that condition A5 holds on $U$ at least. Hence, strong convexity of $Q_{EE}(\cdot,\eta)$ can be shown on a smaller set. We shall demonstrate these two remarks in Example 1 below.

Thirdly, a modulus of strong convexity $\kappa$ can be computed directly in terms of the model data with the techniques employed in [CSS17]. It depends on $\rho$, the shape of $\{W^{\intercal}v\leq q\}$, the lower bound $r$ of $\mu$'s density and $\eta_{0}$ (all subsumed under “model data”). There is a trade-off between the size of $\kappa$ and the choice of $\eta_{0}$: the bigger $\eta_{0}$, the smaller $\kappa$. Example 2 illustrates this fact.

In Theorem 2.3 and Theorem 2.4 we will calculate moduli of strong convexity for $Q_{EE}$ in a more general setting than in Theorem 1.2.

Example 1

Consider $\varphi(t)=\max\{0,\alpha\,t\}$ and $\mu=\mathbb{U}_{|(-\rho,1+\rho)}$. For arbitrary $x\in V=(0,1)$ we compute for $\eta,\alpha,\rho>0$

$$Q_{EE}(x,\eta)=\begin{cases}0 & \text{for }x\geq 1+\rho-\alpha^{-1}\eta,\\ \frac{1}{2\alpha}\big[\alpha(1+\rho-x)-\eta\big]^{2} & \text{else.}\end{cases}$$

We see that $Q_{EE}(\cdot,\eta)$ is strongly convex as long as the condition $0<x<\min\{1+\rho-\alpha^{-1}\eta,1\}$ is fulfilled. In the notation of A5 there is only one cone $K_{i}^{+}$ with $i\in I^{+}$, namely $K=\mathbb{R}_{\geq 0}$ corresponding to $d=\alpha$. We get $\hat{y}=\alpha^{-1}\eta_{0}$ and condition A5 reads $\alpha^{-1}\eta_{0}<\rho$, so that $Q_{EE}(\cdot,\eta)$ is strongly convex on $V$ whenever $\eta<\alpha\rho$. If $\rho<\frac{1}{2}$ we may consider the set $U=(\rho,1-\rho)\subset V$ to get strong convexity for larger $\eta$ than before: taking $\tilde{\rho}=2\rho$ as the new radius, we see that $Q_{EE}(\cdot,\eta)$ is strongly convex on $U$ for all $\eta<2\alpha\rho$.

Note that the modulus of strong convexity does not depend on $\rho$ or $\eta$ which in higher dimensional cases cannot be expected:

Example 2

Let $\varphi(t)=\max\{0,t_{1},t_{2}\}$ , $\mu=\mathbb{U}_{\left(0,1\right)}$ .

We shall also assume $0<x_{2}<x_{1}$ , $\eta>0$ and $x_{1}+\eta\leq 1$ when computing

[TABLE]

For the tedious computation we refer to the appendix. The components of the Hessian $\mathcal{H}$ of $Q_{EE}$ are

[TABLE]

If $\eta\rightarrow 1-$, we can choose $x_{1}$ close to $1$ and $x_{2}$ close to [math], which gives $\text{det}\mathcal{H}_{Q_{EE}}\rightarrow 0+$. Since the determinant of the Hessian is equal to the product of its eigenvalues, we see that at least one of them approaches $0$, so that the modulus of strong convexity must depend on the choice of $\eta_{0}$ given in condition A5.

1.2 Variational representation of CVaR

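In Section 3 we shall consider the Conditional Value-at-Risk at a confidence level $\alpha\in(0,1)$, which can be characterized by a minimisation problem in terms of $Q_{EE}$:

$$Q_{\alpha CVaR}(x)=\min_{\eta\in\mathbb{R}}\Big\{\eta+\frac{1}{\alpha}Q_{EE}(x,\eta)\Big\}. \tag{6}$$

For smooth distributions this representation is due to [UR99, Theorem 1], while [UR02, Theorem 10] covers the general case and shows that the Value-at-Risk

$$Q_{\alpha VaR}(x)=\inf\Big\{t\in\mathbb{R}\;|\;\mu(\{z\;|\;\varphi(z-x)\leq t\})\geq 1-\alpha\Big\}$$

is an optimal solution to the above minimisation problem. Thus,

$$Q_{\alpha CVaR}(x)=Q_{\alpha VaR}(x)+\frac{1}{\alpha}Q_{EE}(x,Q_{\alpha VaR}(x)).$$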
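The two representations of the Conditional Value-at-Risk above can be compared on an empirical distribution. The sketch below is illustrative only: it assumes that $\mu$ is replaced by the uniform distribution on a finite list of losses $\varphi(z_{k}-x)$, and all function names are hypothetical. Since $\eta\mapsto\eta+\frac{1}{\alpha}Q_{EE}(x,\eta)$ is then piecewise linear and convex with kinks only at sample points, the minimum in (6) can be found by enumerating those points.

```python
import math

def expected_excess(losses, eta):
    """Empirical expected excess: mean of max(0, loss - eta)."""
    return sum(max(0.0, v - eta) for v in losses) / len(losses)

def value_at_risk(losses, alpha):
    """inf{t : P(loss <= t) >= 1 - alpha} for the empirical distribution."""
    ordered = sorted(losses)
    # smallest k such that k / n >= 1 - alpha (floating-point rounding of
    # (1 - alpha) * n can shift k by one for awkward alpha)
    k = math.ceil((1 - alpha) * len(ordered))
    return ordered[max(k, 1) - 1]

def cvar_variational(losses, alpha):
    """Minimise eta + expected_excess / alpha over the sample points."""
    return min(eta + expected_excess(losses, eta) / alpha for eta in set(losses))

def cvar_via_var(losses, alpha):
    """Plug the Value-at-Risk into the objective of the variational problem."""
    eta = value_at_risk(losses, alpha)
    return eta + expected_excess(losses, eta) / alpha
```

For the losses $1,\ldots,8$ and $\alpha=0.25$ both functions return $7.5$, the mean of the two largest losses.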

We shall however favor working with representation (6) due to inconvenient properties of $Q_{\alpha VaR}$ , i.e. absence of convexity. While it is straightforward to show convexity of $Q_{\alpha CVaR}$ through (6) essentially due to joint convexity of $Q_{EE}$ in both arguments, strong convexity does not follow trivially even if $Q_{EE}(x,\eta)$ is strongly convex in $x$ with strong convexity constant not depending on $\eta$ . A property that ensures strong convexity of $Q_{\alpha CVaR}$ can be defined as follows:

Definition 1

Let $V\subset\mathbb{R}^{n}$ and $W\subset\mathbb{R}^{m}$ be nonempty and convex.

A function $f:V\times W\rightarrow\mathbb{R}$ is called partially $\kappa$-strongly convex with respect to its first argument if

$$f(\lambda(x_{1},y_{1})+(1-\lambda)(x_{2},y_{2}))\leq\lambda f(x_{1},y_{1})+(1-\lambda)f(x_{2},y_{2})-\frac{\kappa}{2}\lambda(1-\lambda)\|x_{1}-x_{2}\|^{2}$$

holds for all $x_{1},x_{2}\in V,\,y_{1},y_{2}\in W,\,0<\lambda<1$.

Lemma 2

If $f$ (as above) is continuously differentiable, then partial strong convexity is equivalent to

$$[f^{\prime}(x_{1},y_{1})-f^{\prime}(x_{2},y_{2})]\big((x_{1},y_{1})-(x_{2},y_{2})\big)\geq\kappa\|x_{1}-x_{2}\|^{2} \tag{9}$$

for all $x_{1},x_{2}\in V,\,y_{1},y_{2}\in W$.

Although the proof is virtually the same as for strong convexity in both arguments (cf. [RhOr70]), we give a variant of it in the appendix for the reader’s convenience.

For the moment let us assume that $Q_{EE}(x,\eta)$ is partially strongly convex with respect to $x$ on some set $V\times W\subset\mathbb{R}^{s}\times\mathbb{R}$ . The following simple calculation shows that $Q_{\alpha CVaR}$ is strongly convex on $V$ with modulus $\frac{\kappa}{\alpha}$ if $Q_{\alpha VaR}(V)\subset W$ .

For any $\lambda\in[0,1]$ and $x_{1},x_{2}\in V$ set $\eta_{i}=Q_{\alpha VaR}(x_{i})\in W$ , $i=1,2$ . Then

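$$\begin{aligned}Q_{\alpha CVaR}(\lambda x_{1}+(1-\lambda)x_{2})&=\min_{\eta\in\mathbb{R}}\Big\{\eta+\frac{1}{\alpha}Q_{EE}(\lambda x_{1}+(1-\lambda)x_{2},\eta)\Big\}\\&\leq\lambda\eta_{1}+(1-\lambda)\eta_{2}+\frac{1}{\alpha}Q_{EE}\big(\lambda x_{1}+(1-\lambda)x_{2},\lambda\eta_{1}+(1-\lambda)\eta_{2}\big)\\&\leq\lambda\Big[\eta_{1}+\frac{1}{\alpha}Q_{EE}(x_{1},\eta_{1})\Big]+(1-\lambda)\Big[\eta_{2}+\frac{1}{\alpha}Q_{EE}(x_{2},\eta_{2})\Big]-\frac{\kappa}{2\alpha}\lambda(1-\lambda)\|x_{1}-x_{2}\|^{2}\\&=\lambda Q_{\alpha CVaR}(x_{1})+(1-\lambda)Q_{\alpha CVaR}(x_{2})-\frac{\kappa}{2\alpha}\lambda(1-\lambda)\|x_{1}-x_{2}\|^{2},\end{aligned}$$

where the first inequality holds because $\lambda\eta_{1}+(1-\lambda)\eta_{2}\in W$ is feasible for the minimisation, the second inequality follows from $Q_{EE}$ being partially strongly convex, and the final equality uses that $\eta_{i}=Q_{\alpha VaR}(x_{i})$ is optimal in (6).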

One might hope that conditions A1-A5 suffice to prove partial strong convexity for $Q_{EE}$. It turns out that this is not true:

Example 3

As a counterexample consider

$$Q_{EE}(x,\eta)=\int_{\mathbb{R}}\max\{0,\varphi(z-x)-\eta\}\,\mu(dz)$$

with $\varphi(t)=\max\{0,t\}$ and $\mu=\mathbb{U}_{|\,\left(0,1\right)}$ (i.e. uniformly distributed on $\left(0,1\right)$). Let $V\subset\left(0,1\right)$ so that conditions A1-A4 are satisfied. Also choose $\eta_{0}>0$ so that A5 holds. In case $\eta_{0}\geq\eta>0$ we get

$$Q_{EE}(x,\eta)=\int_{0}^{1-x-\eta}t\,dt=\frac{1}{2}(1-x-\eta)^{2}.$$

We calculate for $x,x+u\in V$

$$[Q_{EE}^{\prime}(x+u,\eta+\nu)-Q_{EE}^{\prime}(x,\eta)](u,\nu)=u^{2}+2u\nu+\nu^{2}=(u+\nu)^{2}$$

as long as $\eta+\nu\geq 0$. Since for small $\nu$ we can choose $u=-\nu$, the left-hand side vanishes although $u\neq 0$, so $Q_{EE}$ cannot be partially strongly convex with respect to $x$ on the set $V\times\left(-\infty,\eta_{0}\right]$.

In this example $Q_{EE}$ does not have the desired joint convexity properties on any open set contained in the support of $\mu$ .
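The computations of Example 3 are easy to verify numerically. The following sketch is only a sanity check under the stated data, $\varphi(t)=\max\{0,t\}$ and $\mu$ uniform on $(0,1)$; the function names are hypothetical. It compares the closed form with a midpoint-rule quadrature and evaluates the monotonicity expression from Lemma 2.

```python
def q_ee(x, eta):
    """Closed form of Example 3; valid for eta >= 0 and 0 <= x + eta <= 1."""
    return 0.5 * (1.0 - x - eta) ** 2

def q_ee_quadrature(x, eta, n=20000):
    """Midpoint rule for the integral of max(0, max(0, z - x) - eta) over (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * h
        total += max(0.0, max(0.0, z - x) - eta)
    return total * h

def grad_q_ee(x, eta):
    """Gradient of the closed form; both partial derivatives coincide."""
    g = -(1.0 - x - eta)
    return (g, g)

def monotonicity_term(x, eta, u, nu):
    """[Q'(x + u, eta + nu) - Q'(x, eta)](u, nu), which equals (u + nu)^2."""
    g1 = grad_q_ee(x + u, eta + nu)
    g0 = grad_q_ee(x, eta)
    return (g1[0] - g0[0]) * u + (g1[1] - g0[1]) * nu
```

Choosing $u=-\nu\neq 0$ makes `monotonicity_term` vanish up to rounding, which is exactly the obstruction to an estimate of the form $\kappa\|u\|^{2}$ with $\kappa>0$.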

Unsurprisingly, $Q_{\alpha CVaR}$ does not behave any better:

Example 4

With the same specifications as in Example 3 we can calculate the conditional value at risk at some level $\alpha\in\left(0,1\right)$ for any $x\in V$ as (cf. (6))

$$Q_{\alpha CVaR}(x)=\min_{\eta\in\mathbb{R}}\Big\{\eta+\frac{1}{2\alpha}(1-x-\eta)^{2}\Big\}.$$

The unique minimizer is $\eta^{*}=Q_{\alpha VaR}(x)=1-\alpha-x$. Note that we can restrict the minimisation to $\eta>0$ since the value at risk is obviously positive. We arrive at

$$Q_{\alpha CVaR}(x)=-x+\frac{1}{2}(2-\alpha),$$

so there is no subset $U\subset V$ on which $Q_{\alpha CVaR}$ is strongly convex.
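The closed form of Example 4 can be cross-checked by minimising the objective numerically. The sketch below uses a plain ternary search, which is an illustrative choice and not a method from the paper; the function names and the search interval $[0,1]$ are assumptions, and the comparison is meaningful only when the minimizer $1-\alpha-x$ lies in that interval.

```python
def cvar_objective(x, eta, alpha):
    """eta + Q_EE(x, eta) / alpha with the closed form of Example 3."""
    return eta + (1.0 - x - eta) ** 2 / (2.0 * alpha)

def cvar_by_search(x, alpha, lo=0.0, hi=1.0, iters=200):
    """Ternary search for the minimiser in eta; the objective is convex in eta."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cvar_objective(x, m1, alpha) <= cvar_objective(x, m2, alpha):
            hi = m2
        else:
            lo = m1
    eta = 0.5 * (lo + hi)
    return eta, cvar_objective(x, eta, alpha)

def cvar_closed_form(x, alpha):
    """Q_{alpha CVaR}(x) = -x + (2 - alpha) / 2 as derived in Example 4."""
    return -x + 0.5 * (2.0 - alpha)
```

For $x=0.2$ and $\alpha=0.3$ the search returns a minimizer close to $0.5=1-\alpha-x$ and a value close to $0.65$. The closed form is affine in $x$, so its second differences vanish, which matches the conclusion that strong convexity fails on every subset.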

We shall give a rather strict condition on the value function $\varphi$ that yields partial strong convexity for the expected excess in section 3 without additional assumptions on the distribution of $z$ .

Before that we will introduce an even weaker notion of strong convexity, namely restricted partial strong convexity, which can be shown to hold for $Q_{EE}$ under less restrictive assumptions on the recourse function. This property can also be characterized by monotonicity of the gradient as in Lemma 2.

Definition 2

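Let $V\subset\mathbb{R}^{n}$ and $W\subset\mathbb{R}^{m}$ be nonempty and convex.

A function $f:V\times W\rightarrow\mathbb{R}$ is called restricted partially $\kappa$-strongly convex on $\Omega\subset V\times W$ with respect to its first argument if

$$f(\lambda(x_{1},y_{1})+(1-\lambda)(x_{2},y_{2}))\leq\lambda f(x_{1},y_{1})+(1-\lambda)f(x_{2},y_{2})-\frac{\kappa}{2}\lambda(1-\lambda)\|x_{1}-x_{2}\|^{2}$$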

for all $(x_{1},y_{1}),(x_{2},y_{2})\in\Omega$ and all $0<\lambda<1$ .

Note that $\Omega$ does not need to be a cylindrical subset of $V\times W$ as in Definition 1.

In Section 3 we will show that conditions A1-A5 are sufficient for restricted partial strong convexity of $Q_{EE}$ on some nonempty set $\Omega\subset V\times\mathbb{R}$. The estimates above showing that partial strong convexity of $Q_{EE}$ implies strong convexity of $Q_{\alpha CVaR}$ can be used verbatim to show that restricted partial strong convexity of $Q_{EE}$ does so as well, provided one can additionally show that $(x,Q_{\alpha VaR}(x))\in\Omega$ for all $x\in V$ (or at least for all $x$ in some $U\subset V$). Example 4 shows that this cannot be done by relying only on assumptions A3 and A4 on the distribution of $z$.

2 Joint properties of $Q_{EE}$

Theorem 1.2 addresses properties of $Q_{EE}(\cdot,\eta)$ for fixed threshold $\eta$ . In Theorem 2.3 we will show that in conjunction with A1-A5 the following condition is sufficient for partial strong convexity of $Q_{EE}(x,\eta)$ wrt. $x$ :

A6

It holds $q>0$ , i.e. the gradient of the objective function of the second stage is positive componentwise (cf. (2)).

Note that this condition is stronger than A2 (choose $v=0$ there). Although it is a rather strict condition on the problem data, it might be well justifiable in the setting of simple recourse problems because compensating actions in the second stage should have negative impact on the total objective.

If A6 does not hold, all that can be shown is restricted partial strong convexity in the sense of Definition 2. We start with an elementary lemma to provide some geometrical insights that are used within the proof of Theorems 2.3 and 2.4:

Lemma 2.1

Let A1 and A2 hold. Then A6 is fulfilled if and only if one of the following conditions is fulfilled:

(i)

There is some $\alpha^{\prime}>0$ such that for all $u\in\mathbb{R}^{s}$ it holds

$$\varphi(u)\geq\alpha^{\prime}\|u\|,$$

(ii)

For all $i\in I$ we have $d_{i}\in\text{int cone}\{d_{i}-d_{j}\ |\ d_{j}\text{ adj. to }d_{i}\}$ .

Proof

This follows directly from well-known separating hyperplane theorems.

Lemma 2.2

Let A1-A4 hold. Then $Q_{EE}$ is continuously differentiable and we have the following formula:

[TABLE]

with suitable parametric sets $I(u,\nu)$ and sets $M_{l}(x,\eta)\subset\mathbb{R}^{s}$ to be constructed in the proof below.

Proof

By assumptions A1-A4 and standard arguments $Q_{EE}$ is continuously differentiable on $\mathbb{R}^{s}\times\mathbb{R}$ , so only (2.2) warrants a proof. We calculate

[TABLE]

with shorthand $M_{i}(x,\eta)=(x+K_{i})\cap\{\varphi(z-x)\geq\eta\}$ .

Let $M_{0}(x,\eta)=\{\varphi(z-x)\leq\eta\}$ and

[TABLE]

Define random variables $Y_{1}$ and $Y_{2}$ taking values $y^{1}_{i}$ with probability $\pi^{1}_{i}$ and $y^{2}_{i}$ with probability $\pi^{2}_{i}$ respectively. We observe that the quantities in (2) can be rewritten as Riemann-Stieltjes integrals with cdfs $F_{Y_{1}},F_{Y_{2}}$ as integrators:

[TABLE]

Integration by parts yields

[TABLE]

Note that the boundary terms cancel out because $y^{1}_{i}=y^{2}_{i}$ for all $i$ .

Introducing the index set

[TABLE]

and observing that the sets $M_{i}(x,\eta)$ only meet in lower dimensional sets (if they meet at all) and thus $\mu(M_{i}(x,\eta)\cap M_{j}(x,\eta))=0$ for $i\neq j$ , we can write down the cdfs $F_{Y_{1}}$ and $F_{Y_{2}}$ as follows:

[TABLE]

Note that $F_{Y_{1}}-F_{Y_{2}}$ can be written as the measure of a difference of sets if we can show that the following inclusion holds for all $\tau\in\mathbb{R}$ :

[TABLE]

First note that for all $\tau$ we have

[TABLE]

for if this were not true, there would be some $\bar{z}$ contained in the union on the left-hand side of the inclusion but not in the set on the right-hand side. This means there is some index $i_{1}$ with $\bar{z}\in x+u+K_{i_{1}}$ such that $-d_{i_{1}}^{\intercal}u-\nu\leq\tau$. Since the cones $x+K_{i}$ cover the entire space (cf. Lemma 1 (ii)), there is some index $i_{2}$ such that $\bar{z}\in x+K_{i_{2}}$ and $-d_{i_{2}}^{\intercal}u-\nu>\tau$. By using the definition of the sets $K_{i}$ we arrive at the contradiction

[TABLE]

Back to (14) we see that this inclusion reduces to (15) in case $\tau\geq 0$ which was just discussed. Now let $\tau<0$ and $\bar{z}\in M_{i_{1}}(x+u,\eta+\nu)$ for some $i_{1}\in I(u,\nu)(\tau)$ which implies $\bar{z}\in x+u+K_{i_{1}}$ , $-d_{i_{1}}^{\intercal}u-\nu\leq\tau<0$ and $d_{i_{1}}^{\intercal}(\bar{z}-x-u)\geq\eta+\nu$ . This yields

[TABLE]

Since by (15) we also have $\bar{z}\in x+K_{i_{2}}$ for some $i_{2}\in I(u,\nu)(\tau)$, we have shown that $\bar{z}\in\bigcup_{i\in I(u,\nu)(\tau)}M_{i}(x,\eta)$, and (14) is proven. We can now replace $F_{Y_{1}}-F_{Y_{2}}$ and conclude the proof with

[TABLE]

∎

Since $Q_{EE}$ is continuously differentiable we can prove (restricted) partial strong convexity of $Q_{EE}$ by showing that (9) (and its restricted counterpart) holds for $Q_{EE}$ , i.e. showing that there exists some $\kappa>0$ with

$$[Q_{EE}^{\prime}(x+u,\eta+\nu)-Q_{EE}^{\prime}(x,\eta)](u,\nu)\geq\kappa\|u\|^{2}$$

for relevant $x,\eta,u,\nu$ .

This is done by restricting the area of integration in (2.2) to some subset with measure not smaller than $\alpha\|u\|$ for some constant $\alpha>0$ . Then the $\mu$ -measure of the set within the integrand will be estimated from below by constructing a cylindrical subset with Lebesgue measure not smaller than $\alpha^{\prime}\|u\|$ (with some other constant $\alpha^{\prime}>0$ ) and which is contained in $V+B_{\rho}(0)$ , where a lower bound on $\mu$ ’s Lebesgue-density is available. We begin with the special case employing condition A6:

Theorem 2.3 (Partial strong convexity of $Q_{EE}$ )

Let A1-A6 hold.

Then $Q_{EE}(x,\eta)$ is partially strongly convex with respect to $x$ on the set $V_{\eta_{0}}=V\times\left(-\infty,\eta_{0}\right]$.

Proof

Crucially relying on A6 we may assume $\nu\geq 0$ , otherwise substitute $x^{\prime}=x+u,\eta^{\prime}=\eta+\nu$ and $u^{\prime}=-u,\nu^{\prime}=-\nu$ and consider $u^{\prime}$ and $\nu^{\prime}$ instead of $u$ and $\nu$ . Since the sets $K_{l}$ cover the entire space $\mathbb{R}^{s}$ we can pick an index $i$ such that $u\in K_{i}$ . Note that in condition A5 and the discussion before it, which will come into play soon, we have $I=I^{+}$ and $K_{l}^{+}=K_{l}$ for all indices $l\in I$ .

We shall now construct some $\eta^{-}>0$ and give the desired estimate of (2.2) for $-\infty<\eta\leq\eta^{-}$ first. By Lemma 1 (iii) there is some index $j$ different from $i$ such that

[TABLE]

Assume that $\tau\in(-d_{i}^{\intercal}u-\nu,-d_{j}^{\intercal}u-\nu)$ . This implies $i\in I(u,\nu)(\tau)$ and is implied by $k\in I(u,\nu)(\tau)$ for all $k$ with $d_{k}^{\intercal}u>d_{j}^{\intercal}u$ . We thus find the inclusion

[TABLE]

where the set on the right-hand side does not depend on $\tau$ anymore. We want to estimate the $\mu$ -measure of this set:

Remember that $K_{i}$ is a pointed cone and therefore has finitely many facets $\{F^{i}_{j}\}_{j=1,\ldots,f_{i}}$ and finitely many extreme-rays $\{rt^{i,j}_{k}\ |\ k=1,\ldots,g_{i,j}\}$ adjacent to facet $F^{i}_{j}$ . For notational convenience set

[TABLE]

Intersecting $F^{i}_{j}$ with a hyperplane $\{d_{i}^{\intercal}z=\eta^{-}\}$ - this really is a hyperplane since $i\in I^{+}$ - yields points $\{y^{j}_{1}(\eta^{-}),\ldots,y^{j}_{r_{j}}(\eta^{-})\}$ where the hyperplane meets the extreme rays of $K_{i}$ adjacent to $F^{i}_{j}$ :

[TABLE]

Among all $y^{j}_{k}(\eta^{-})$ pick some $\hat{y}^{j}(\eta^{-})$ with

[TABLE]

Choose $\eta^{-}>0$ such that

[TABLE]

which is possible since $\|\hat{y}^{j}(\eta^{-})\|\rightarrow 0$ as $\eta^{-}\rightarrow 0+$ , and let $\tilde{\rho}=\rho-\max_{j=1,\ldots,f_{i}}\|\hat{y}^{j}(\eta^{-})\|$ .

For all $-\infty<\eta\leq\eta^{-}$ we can now show the inclusions

[TABLE]

To this end let $x+z+\lambda u$ with $z\in F^{+}_{i,j}(0,\eta^{-})\cap B_{\tilde{\rho}}(\hat{y}^{j}(\eta^{-}))$ and $0\leq\lambda<1$ be given. It holds $x+\lambda u\in V$ due to the convexity of $V$ . Furthermore

[TABLE]

It remains to be shown

[TABLE]

Due to $x+z\in F^{+}_{i,j}(x,\eta^{-})\subset x+K_{i}$ and $u\in K_{i}$ we have $x+z+\lambda u\in x+K_{i}$, since $K_{i}$ is a convex cone. We also have

[TABLE]

which establishes $x+z+\lambda u\in M_{i}(x,\eta)$ . Let $k$ be any index in $I$ such that $d_{k}^{\intercal}u>d_{j}^{\intercal}u$ . We will show that $x+z+\lambda u$ does not even lie in $x+u+K_{k}$ :

[TABLE]

where we used the fact that $z\in F^{i}_{j}$ , implying $d_{k}^{\intercal}z\leq d_{i}^{\intercal}z=d_{j}^{\intercal}z$ , and $\lambda<1$ .

As the last prerequisite step we want to show that there exists some constant $\beta>0$ such that

[TABLE]

First note that we may as well set $x=0$ due to the translation invariance of the Lebesgue measure. The quantity to be estimated is then the Lebesgue measure of a cylindrical set with bases $F^{+}_{i,j}(0,\eta^{-})\cap B_{\tilde{\rho}}(\hat{y}^{j}(\eta^{-}))$ and $F^{+}_{i,j}(0,\eta^{-})\cap B_{\tilde{\rho}}(\hat{y}^{j}(\eta^{-}))+u$.

Let

[TABLE]

This constant still depends on the index $j$ which in turn depends on the direction $u\in K_{i}$ . Since there are only finitely many possible choices of $j$ we can robustify by picking the minimal $\beta$ .

The estimate (23) then follows from (17) and Cavalieri’s principle. As a side-remark - which comes into play when trying to maximize the constant of partial strong convexity - we note that the function

[TABLE]

is continuous, monotonically decreasing and tends to $\lambda_{s-1}\big{(}F^{i}_{j}\cap B_{\rho}(0)\big{)}$ as $\eta\rightarrow 0+$ .

We have gathered all necessary information to continue in (2.2) as

[TABLE]

In the first inequality the nonnegativity of the integrand and in the only equality the translation invariance of the Lebesgue measure was used.

Choose now some $0<\eta^{-}\leq\eta^{+}$ with

[TABLE]

and set

[TABLE]

Now consider the case $\eta^{-}<\eta\leq\eta^{+}$ :

We will choose $\left(-d_{i}^{\intercal}u-\nu,0\right)$ as the area of integration in (2.2). As the integration variable $\tau$ satisfies $-d_{i}^{\intercal}u-\nu<\tau<0$ we can find a subset of the one in (2.2) as

[TABLE]

In analogy with the notation used in the previous case set

[TABLE]

It holds

[TABLE]

Inclusion (27) follows in the same way as (21). The proof of (28) works similarly to the one for (22), with minor modifications:

Let $x+z+\lambda u$ with $z\in K_{i}\cap\{d_{i}^{\intercal}z^{\prime}=\eta\}$ and $0\leq\lambda<1$ . Again $x+z+\lambda u\in x+K_{i}$ due to $K_{i}$ being a convex cone. Furthermore

[TABLE]

because $d_{i}^{\intercal}z=\eta$ and $d_{i}^{\intercal}u\geq 0$ . Thus $x+z+\lambda u\in M_{i}(x,\eta)$ has been established. Pick any index $k\in I$ with $-d_{k}^{\intercal}u-\nu<0$ . It holds

[TABLE]

so we have $x+z+\lambda u\notin M_{k}(x+u,\eta+\nu)$ . In the last inequality we have used $\nu\geq 0$ .

As the last ingredient we will show that there exist constants $\alpha^{\prime},\beta^{\prime}>0$

[TABLE]

To this end set

[TABLE]

which is positive by construction. Applying Lemma 2.1 (i) and Cavalieri’s principle yields (29). Having a lower bound on the measure of the set $F_{i,0}(0,\eta^{-})\cap B_{\tilde{\rho}}(\hat{y}^{i}(\eta^{-}))$ is the reason why we need to distinguish the cases $-\infty<\eta\leq\eta^{-}$ and $\eta^{-}<\eta\leq\eta^{+}$ in the first place.

With this we can continue (2.2) as

[TABLE]

As before, nonnegativity of the integrand is used in the first step. Translation invariance of the Lebesgue measure is exploited in the only equation.

All constants $\alpha,\alpha^{\prime},\beta$ and $\beta^{\prime}$ computed until now implicitly depend on the index $i$ for which we have $u\in K_{i}$ . But since the index set $I$ is finite so is the number of such constants. Choosing minimal constants thus concludes the proof of the first part. ∎

The fact that the sets $M_{l}(x,\eta)$ have a simple geometry, i.e. they are pointed cones or truncated cones, is one of the key arguments in the proof (cf. Fig. 2 below). This geometry allowed us to explicitly construct cylindrical sets ((22) resp. (27)) for the estimates needed after (2.2). The size of these cylindrical sets is dictated by the model data, e.g. $\rho$, and a certain degree of freedom when choosing $\eta^{-}$ and $\eta^{+}$. With a little more effort the constants $\beta$ and $\beta^{\prime}$, which depend on the choice of $\eta^{-}$ and $\rho$ explicitly (and on $d_{1},\ldots,d_{N}$ implicitly), can be maximized to yield a partial strong convexity constant as large as possible. In principle the modulus of partial strong convexity resp. of restricted partial strong convexity (in the next theorem) can thus be computed directly in terms of model data. The remarks after Theorem 1.2 also apply in the setting of Theorem 2.3.

As a last remark on the preceding theorem we point out that once some $\eta^{-}>0$ has been fixed, the arguments in the proof can be modified to show that A6 implies strong convexity of $Q_{EE}$ in both arguments, with a strong convexity constant depending on $\eta^{-}$.

Next we consider the more general case in which A6 fails to hold and prove restricted partial strong convexity of $Q_{EE}$ with respect to the first argument in the sense of Definition 2. In the last theorem $u$ and $\nu$ varied independently of each other, but as Example 3 shows, we need new assumptions that tie $u$ and $\nu$ together in order for restricted partial strong convexity to hold. This necessitates more case distinctions and technicalities, mainly due to the following two facts:

1. The interplay between choosing the area of integration in (2.2) and constructing suitable subsets of the set in the integrand becomes more subtle.

2. The geometry of the sets $M_{i}^{+}$ can be slightly more complicated.

To avoid being overly repetitive in the proof of the next theorem, we will borrow notation from the preceding one and mostly point out where changes need to be made to the last proof to accommodate the new situation, i.e. where the two main consequences of A6 were used: being able to reflect to $\nu\geq 0$ when necessary, and the lower bound on $d_{i}^{\intercal}u$ from (10).

We feel that it is still convenient to separate the two theorems: Firstly, to not obscure the general structure of the proof by even more case distinctions. Secondly, because most of the discussions in section 3 only make use of Theorem 2.3 anyway.

We start with some simple geometric observations which apply when A6 is discarded:

Before Theorem 1.2 we introduced the notation $K_{i}^{+}$ and $I^{+}$. For $\eta>0$ we had observed that for $i\in I^{+}$ the hyperplane $\{d_{i}^{\intercal}z=\eta\}$ intersects $K_{i}$ (and also $K_{i}^{+}$) in some of its extreme rays. We denote a point of intersection (say with $\{r\,t^{i}\ |\ r\geq 0\}$) having minimum norm among all such points by $\hat{y}^{i}$. Assume that $t^{i}$ has norm $1$ and set $\gamma=d_{i}^{\intercal}t^{i}$. We then have the estimate

[TABLE]

The hyperplane $\{d_{i}^{\intercal}z=\eta\}$ also slices $K_{i}$ into two polyhedra $M_{i}^{+}(0,\eta)=K_{i}\cap\{d_{i}^{\intercal}z\geq\eta\}$ and $M_{i}^{-}(0,\eta)=K_{i}\cap\{d_{i}^{\intercal}z\leq\eta\}$. Denote by $I^{\pm}$ the indices $i\in I^{+}$ such that both polyhedra are unbounded (we have $I^{\pm}\neq\emptyset$ if and only if A6 fails to hold) and by $I^{++}$ the indices in $I^{+}$ such that only $M_{i}^{+}(0,\eta)$ is unbounded, which holds iff there is some $\alpha^{\prime}>0$ with

[TABLE]

Obviously $I^{++}\cap I^{\pm}=\emptyset$ and $I^{++}\cup I^{\pm}=I^{+}$ .

For $i\in I^{\pm}$ we note that by inequality (31) we can write

[TABLE]

with two full-dimensional polyhedral cones $K_{i}^{++}=K_{i}^{+}\cap\{u\ |\ d_{i}^{\intercal}u\geq\gamma^{\prime}\|u\|\}$ and $K_{i}^{\pm}=K_{i}^{+}\cap\{u\ |\ d_{i}^{\intercal}u\leq\gamma^{\prime}\|u\|\}$ (choose for example $\gamma^{\prime}=\frac{\gamma}{2}$ ).

Theorem 2.4 (Restricted partial strong convexity of $Q_{EE}$ )

*Let A1-A5 hold.

Then $Q_{EE}(x,\eta)$ is restricted partially strongly convex on the set*

[TABLE]

where $\delta=\max\{\alpha^{\prime}_{i},\gamma_{j}^{\prime}\ |\ i\in I^{++},j\in I^{\pm}\}$ with $\alpha^{\prime}_{i}$ and $\gamma_{j}^{\prime}$ from (32) and (33). For the definition of $V_{\eta_{0}}$ cf. Theorem 2.3 above.

Proof

Let $(x,\eta),(x+u,\eta+\nu)\in\Omega$ so that we have

[TABLE]

as required by the definition of $\Omega$. In (16) we may, after a suitable change of variables, assume that $u\in K_{i}^{+}$ with $i\in I^{+}=I^{++}\cup I^{\pm}$. We cannot, however, take $\nu\geq 0$ for granted. Since the case $i\in I^{++}$ is structurally more similar to the ones already treated, we shall start with this one. Drawing on condition A5, choose again $0<\eta^{-}\leq\eta^{+}$ so that conditions (20) and (25) are fulfilled. Let us first consider the case $\eta^{-}\leq\eta\leq\eta^{+}$:

We need to choose the area of integration in (2.2) differently than before. This time it shall be

[TABLE]

Consequently we need to replace (26) by

[TABLE]

Inclusions (27) and (28) must be replaced by

[TABLE]

where only

[TABLE]

needs justification:

For any $0\leq\lambda<\frac{1}{2}$ , $z\in F_{i,0}(x,\eta)$ and index $k\in I$ with $-d_{k}^{\intercal}u<\frac{\alpha}{3}\|u\|-d_{i}^{\intercal}u$ it holds

[TABLE]

With (35) and (36) at hand the remaining estimates are analogous to the ones made before. The case $-\infty<\eta\leq\eta^{-}$ is identical to the one in (i).

For $i\in I^{\pm}$ we shall construct $\eta^{-}$ a little differently than before.

Let us also assume that the set $K_{i}^{-}=K_{i}\cap\{z\ |\ d_{i}^{\intercal}z\leq 0\}$ has nonempty interior. The other case can be handled in a similar way.

We then find that for $\eta<0$ the hyperplane $\{d_{i}^{\intercal}z=\eta\}$ intersects the extreme rays of $K_{i}^{-}$ in singletons; let $\hat{y}^{i}=\hat{y}^{i}(\eta)$ denote one point of intersection with minimum norm. Choose $\eta^{-}<0$ so that $\|\hat{y}^{i}(\eta^{-})\|<\rho$ and set $\tilde{\rho}=\rho-\|\hat{y}^{i}(\eta^{-})\|$.

For arbitrary $-\infty<\eta\leq\eta^{-}$ we see that for all $j$ with $d_{j}$ adjacent to $d_{i}$ it holds

[TABLE]

and

[TABLE]

the first inclusion holding true since $\|\hat{y}^{i}(\eta)\|$ is monotonically increasing in $\eta$, the second one because $\eta^{-}<0$.

With this and resorting to (17) we can continue (2.2) using $-d_{i}^{\intercal}u-\nu<\tau<-d_{j}^{\intercal}u-\nu$ as the area of integration. In (21) and (22) the left-hand side needs to be replaced by

[TABLE]

everything else is straightforward from there on.

Now consider $\eta^{-}\leq\eta\leq\eta^{+}$ and, employing (33), look at the cases

$u\in K_{i}^{+}\cap\{d_{i}^{\intercal}u^{\prime}\leq\gamma^{\prime}\|u^{\prime}\|\}$ and $u\in K_{i}^{+}\cap\{d_{i}^{\intercal}u^{\prime}\geq\gamma^{\prime}\|u^{\prime}\|\}$ separately.

For each of the two cases the area of integration in (2.2) and the estimates for the integrands (as seen in (18), (21) and (22)) must be chosen appropriately, as demonstrated before. ∎

3 CVaR based models

We shall now discuss implications of the preceding results for models extending (1) by replacing the expectation functional with the conditional value-at-risk $\alpha CVaR$:

[TABLE]

By translation equivariance of $\alpha CVaR$ and the same arguments as above we can rewrite this problem as

[TABLE]

with

[TABLE]
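To make the role of the expected excess in this reformulation concrete, the following minimal sketch evaluates the conditional value-at-risk of an empirical sample in two ways: as the minimum over thresholds $\eta$ of $\eta+\frac{1}{\alpha}\,\mathbb{E}[(Y-\eta)_{+}]$, and as the average of the worst $\alpha$-fraction of outcomes. It assumes the convention suggested by this section, in which $\alpha=1$ recovers the expectation and small $\alpha$ is more risk-averse. The function names and the sample are illustrative and not taken from the paper.

```python
import numpy as np

def expected_excess(y, eta):
    """Sample mean of the excess max(y - eta, 0)."""
    return np.mean(np.maximum(y - eta, 0.0))

def cvar_variational(y, alpha):
    """Minimum over eta of eta + EE(eta) / alpha.

    The objective is convex and piecewise linear in eta with kinks at the
    sample points, so a minimiser can be found among the sample values.
    """
    return min(eta + expected_excess(y, eta) / alpha for eta in y)

def cvar_tail_average(y, alpha):
    """Mean of the largest alpha-fraction (alpha * len(y) is assumed integer)."""
    k = int(round(alpha * len(y)))
    return np.sort(y)[-k:].mean()

rng = np.random.default_rng(0)
y = rng.normal(size=200)          # illustrative sample of recourse costs

cvar_var = cvar_variational(y, 0.1)
cvar_tail = cvar_tail_average(y, 0.1)
cvar_full = cvar_variational(y, 1.0)   # alpha = 1 should give the sample mean
```

The factor $\frac{1}{\alpha}$ in front of the expected excess is what turns a modulus $\kappa$ for $Q_{EE}$ into the modulus $\frac{\kappa}{\alpha}$ appearing in the results of this section.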

Theorem 2.3 and the discussion after Lemma 9 yield

Theorem 3.1 (Strong convexity of $Q_{\alpha\,CVaR}$ )

Assume A1-A6 (in particular, there is some $\eta_{0}>0$ satisfying A5) and the following condition

[TABLE]

Then $Q_{\alpha\,CVaR}$ is $\frac{\kappa}{\alpha}$ -strongly convex on $V$ with $\kappa$ being the modulus of partial strong convexity for $Q_{EE}$ for $\eta\leq\eta_{0}$ . ∎

Let us make some remarks on this theorem:

Since $\alpha\mapsto Q_{\alpha VaR}(Y)$ is nonincreasing for fixed $Y$, condition (40) will hold for all $\alpha\leq\alpha^{\prime}\leq 1$ if it holds for $\alpha$. It follows that $Q_{\alpha^{\prime}CVaR}$ is strongly convex for all such $\alpha^{\prime}$.

There is a heuristic for when one can hope for $Q_{\alpha\,CVaR}$ to be strongly convex: We have $Q_{\alpha CVaR}\equiv Q_{\mathbb{E}}$ for $\alpha=1$, which is strongly convex given the usual assumptions made above. If $1\geq\alpha\geq\alpha_{0}$ for some $\alpha_{0}$ which is not too close to $0$, condition (40) might still hold. When $\alpha\rightarrow 0+$ the quantity $Q_{\alpha VaR}(x)$ will increase and condition (40) might be violated.
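The monotonicity in $\alpha$ behind these remarks can be observed on an empirical sample. The sketch below assumes that the value-at-risk at level $\alpha$ is the smallest $t$ with $\mathbb{P}(Y\leq t)\geq 1-\alpha$ (the convention used in Example 5) and that the conditional value-at-risk is the mean of the worst $\alpha$-fraction of outcomes. The sample and the chosen levels are hypothetical.

```python
import numpy as np

def value_at_risk(y, alpha):
    """Smallest t with empirical P(Y <= t) >= 1 - alpha, for 0 < alpha < 1."""
    ys = np.sort(y)
    n = len(ys)
    # Number of sample points that must lie at or below t (guarded against
    # floating-point noise in (1 - alpha) * n).
    k = int(np.ceil((1.0 - alpha) * n - 1e-12))
    return float(ys[max(k, 1) - 1])

def cvar(y, alpha):
    """Mean of the worst alpha-fraction of an equally weighted sample."""
    k = max(1, int(round(alpha * len(y))))
    return float(np.sort(y)[-k:].mean())

rng = np.random.default_rng(3)
y = rng.lognormal(size=200)       # illustrative cost sample

alphas = [0.05, 0.1, 0.25, 0.5, 0.9]
vars_ = [value_at_risk(y, a) for a in alphas]
cvars = [cvar(y, a) for a in alphas]
```

Both lists are nonincreasing along increasing $\alpha$, so a bound on the value-at-risk that holds at some level carries over to all larger levels.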

If not on $V$, it might still be possible to show strong convexity on some subset $U$ of $V$ for two reasons: Firstly, on $U$ condition A5 is weaker since a larger $\rho$ can be chosen so that condition A5 holds. Secondly, in (40) we have the obvious estimate $\sup_{x\in V}Q_{\alpha VaR}(x)\geq\sup_{x\in U}Q_{\alpha VaR}(x)$. We give an academic example to illustrate these points in the appendix, cf. Example 5 there.

If $0\in V$ , the following (very rough) upper bound for the value-at-risk might also be helpful: Set $d=\max_{i\in I}\|d_{i}\|$ , then for any $x\in U\subseteq V$ and $z\in\mathbb{R}^{s}$ we have

[TABLE]

Thus,

[TABLE]

The above quantity is finite by the tightness of the probability measure $\mu$ and the boundedness of $U$. Set $\bar{\eta}:=\inf\left\{t\;|\;\mu(B_{t}(0))\geq\alpha\right\}$. As a direct consequence of the above considerations, we obtain the following:

Proposition 1

If $Q_{EE}(\cdot;d\bar{\eta}+\epsilon)$ is $\kappa$ -strongly convex on $V$ for some $\epsilon>0$ , then $Q_{\alpha CVaR}$ is strongly convex with modulus $\frac{\kappa}{\alpha}$ on any nonempty, open, convex $U\subseteq V$ satisfying

[TABLE]

As a consequence of Theorem 2.4 we get

Proposition 2

Assume A1-A5, (40) and the condition

[TABLE]

for all $x,y\in V$ with $\delta$ as defined in Theorem 2.4. Then $Q_{\alpha\,CVaR}$ is $\frac{\kappa}{\alpha}$-strongly convex on $V$ with $\kappa$ being the modulus of partial strong convexity for $Q_{EE}$ for $\eta\leq\eta_{0}$.

3.1 Coherent risk measures and spectral risk measures

With verifiable conditions for strong convexity of CVaR-based models at hand, we shall now consider risk measures that can be represented as mixtures of CVaRs, so-called coherent risk measures. For a general discussion of such functionals we refer to ADEH99 and FS11.

Definition 3

Let $\mathcal{Z}=L_{1}(\Omega,\mathcal{F},\mathbb{P})$ . A proper function $\rho:\mathcal{Z}\rightarrow\overline{\mathbb{R}}$ is called a coherent risk measure if it satisfies the following four properties:

(1) $\rho(tZ+(1-t)Z^{\prime})\leq t\rho(Z)+(1-t)\rho(Z^{\prime})$ for all $Z,Z^{\prime}\in\mathcal{Z},\,0\leq t\leq 1$. (Convexity)

(2) $\rho(Z)\leq\rho(Z^{\prime})$ for $Z,Z^{\prime}\in\mathcal{Z}$ such that $Z\leq Z^{\prime}$ holds $\mathbb{P}$-almost surely. (Monotonicity)

(3) $\rho(a+Z)=a+\rho(Z)$ for all $Z\in\mathcal{Z},\,a\in\mathbb{R}$. (Translation equivariance)

(4) $\rho(t\,Z)=t\rho(Z)$ for all $Z\in\mathcal{Z},\,t>0$. (Positive homogeneity)
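The four properties can be checked numerically for the conditional value-at-risk on a finite probability space with equally weighted scenarios. The sketch below is an illustration under that assumption, not part of the paper: the arrays `z1` and `z2` are hypothetical random variables defined scenario by scenario, and `cvar` averages the worst $\alpha$-fraction of outcomes.

```python
import numpy as np

def cvar(z, alpha):
    """Mean of the worst alpha-fraction over equally weighted scenarios."""
    k = int(round(alpha * len(z)))
    return float(np.sort(z)[-k:].mean())

rng = np.random.default_rng(1)
n, alpha = 1000, 0.2
z1 = rng.normal(size=n)            # scenario-wise values of Z
z2 = rng.exponential(size=n)       # scenario-wise values of Z'
t, a = 0.3, 2.5

# (1) Convexity along the segment between Z and Z'.
convex_ok = cvar(t * z1 + (1 - t) * z2, alpha) <= (
    t * cvar(z1, alpha) + (1 - t) * cvar(z2, alpha) + 1e-12
)
# (2) Monotonicity: Z <= Z + |Z'| holds in every scenario.
monotone_ok = cvar(z1, alpha) <= cvar(z1 + np.abs(z2), alpha) + 1e-12
# (3) Translation equivariance.
translation_ok = bool(np.isclose(cvar(a + z1, alpha), a + cvar(z1, alpha)))
# (4) Positive homogeneity.
homogeneous_ok = bool(np.isclose(cvar(3.0 * z1, alpha), 3.0 * cvar(z1, alpha)))
```

On such a space the tail average is a maximum of linear functionals (averages over scenario subsets of fixed size), which is why all four checks succeed exactly rather than only approximately.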

Theorem 3.2 (Kusuoka Kusuoka2001 )

Assume $(\Omega,\mathcal{F},\mathbb{P})$ is nonatomic and $\rho$ is a law-invariant, coherent risk measure on $\mathcal{Z}$ . Then we have for any $Z\in\mathcal{Z}$

[TABLE]

where $\mathfrak{M}$ is a set of probability measures on the interval $[0,1)$ .
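As a rough illustration of such a representation, the following sketch evaluates a supremum of mixtures of conditional values-at-risk over a small, finite family of discrete mixing measures. It is only a toy instance: the family is hypothetical, the levels are parametrized so that level $1$ corresponds to the expectation (the convention used for $Q_{\alpha\,CVaR}$ in this section, which may differ from the parametrization on $[0,1)$ in the theorem), and the sample is an empirical stand-in for $Z$.

```python
import numpy as np

def cvar(y, alpha):
    """Mean of the worst alpha-fraction of an equally weighted sample."""
    k = max(1, int(round(alpha * len(y))))
    return float(np.sort(y)[-k:].mean())

def mixture_of_cvars(y, levels, weights):
    """Integral of CVaR against the discrete measure sum_j w_j * delta_{alpha_j}."""
    return sum(w * cvar(y, a) for a, w in zip(levels, weights))

def kusuoka_sup(y, family):
    """Supremum over a finite family of discrete mixing measures."""
    return max(mixture_of_cvars(y, levels, weights) for levels, weights in family)

rng = np.random.default_rng(2)
y = rng.gamma(shape=2.0, scale=1.0, size=400)

# Hypothetical family of mixing measures, each given as (levels, weights).
family = [
    ([1.0], [1.0]),                        # pure expectation
    ([0.1, 1.0], [0.5, 0.5]),              # half CVaR at level 0.1, half expectation
    ([0.05, 0.25, 0.5], [0.2, 0.3, 0.5]),  # three-point mixture
]
rho = kusuoka_sup(y, family)
```

Because every conditional value-at-risk dominates the expectation, any such mixture lies between the sample mean and the sample maximum.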

As in the preceding paragraphs we consider as probability space $(\Omega,\mathcal{F},\mathbb{P})=(\mathbb{R}^{s},\mathcal{B}(\mathbb{R}^{s}),\mu)$ which clearly is non-atomic due to $\mu$ having a Lebesgue-density. Now consider the random variables $Z_{x}(z)=\varphi(z-x)$ and the induced functional

[TABLE]

This gives the induced Kusuoka representation

[TABLE]

Formula (42) makes Theorem 3.1 applicable to derive sufficient conditions for strong convexity of $Q_{\rho}$ if information on $\mathfrak{M}$ is available. In the scope of this paper we shall only investigate comonotonic risk measures appearing in the Kusuoka representation as

[TABLE]

for some probability measure $\nu$ on $\left(0,1\right]$ (cf. Shapiro2013, Theorem 2). In particular, we shall consider measures $\nu_{g}$ induced by continuous, increasing, concave distortion functions $g:\left[0,1\right]\rightarrow\left[0,1\right]$ with $g(0)=0,g(1)=1$. These are defined on half-open intervals as

[TABLE]

if $t\in\left(0,1\right)$ and

[TABLE]

else (cf. BK17 and FS11 , section 4.6). For a comprehensive treatment of distortion functions and risk measures we refer to BGM09 , Pflug2006 and Tsukahara09 . Theorem 3.1 yields criteria for strong convexity of $Q_{\nu}$ and $Q_{\nu_{g}}$ :

Corollary 1 (Strong convexity for comonotonic risk measures)

*Assume that $Q_{\alpha_{0}\,CVaR}$ is strongly convex on some nonempty, open convex set $V$ with modulus of strong convexity $\kappa$ for some $0<\alpha_{0}<1$ and that the following inequality is fulfilled:*

[TABLE]

Then $Q_{\nu}$ is strongly convex on $V$ with modulus $\kappa\,c$ . If $\nu=\nu_{g}$ is generated by a distortion function, condition (45) is equivalent to

[TABLE]

Proof

Let $x,y\in V$ and $0<\lambda<1$ . By splitting the integral into two and using convexity resp. strong convexity of the integrands we get

[TABLE]

For distortion risk measures we have

[TABLE]

which completes the proof. ∎

To illustrate Corollary 1, we shall discuss condition (46) for various distortion functions.

The expectation is generated by the distortion function $g_{\mathbb{E}}(t):=t$ . By $1-g(t)+tg^{\prime}(t)=1$ for all $t$ , condition (46) is fulfilled with $c=1$ . However, the assumption of strong convexity of $Q_{\alpha_{0}CVaR}$ for some $\alpha_{0}\in(0,1)$ is generally more restrictive than the assumptions of Theorem 1.1.

The distortion function associated with the conditional value at risk $Q_{\alpha\,CVaR}$ is defined by $g_{\alpha\,CVaR}(t):=\min\{\frac{t}{\alpha},1\}$ . We have

[TABLE]

Thus, (46) does not constitute an additional assumption for the conditional value at risk.

The Wang Transform distortion is given by $g_{\beta\,WT}(t):=\Phi(\Phi^{-1}(t)-\beta)$, where $\beta>0$ is a parameter and $\Phi$ denotes the cdf of the standard normal distribution (cf. Wang2000). For arbitrary $t\in(0,1)$, we calculate

[TABLE]

Consequently, condition (46) is always fulfilled for the Wang Transform.

The mappings $g_{\gamma\,PH}(t):=t^{\gamma}$ with $\gamma\in(0,1]$ form the parametrized family of Proportional Hazard distortion functions. For any feasible $\gamma$ and any $t\in(0,1)$, we have

[TABLE]

which means that condition (46) holds for any Proportional Hazard distortion function.

The Lookback distortion is given by $g_{\gamma\,LB}(t):=t^{\gamma}(1-\gamma\ln(t))$ , where $\gamma\in(0,1]$ is a parameter. For any $t\in(0,1)$ , we calculate

[TABLE]

Thus, condition (46) is fulfilled.
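The distortion functions discussed above can be inspected numerically. The sketch below checks, on a grid, the properties required of a distortion function in this section ($g(0)=0$, $g(1)=1$, increasing, concave) for the expectation, CVaR, Proportional Hazard and Lookback distortions, and evaluates the associated risk measure on an empirical sample. The evaluation uses the standard Choquet-type weighting of order statistics, which is an assumption here since the paper's formula is not reproduced in the surviving text. Parameter values are hypothetical, and the Wang Transform is omitted.

```python
import numpy as np

def g_expectation(t):
    return np.asarray(t, dtype=float)

def g_cvar(t, alpha=0.25):
    return np.minimum(np.asarray(t, dtype=float) / alpha, 1.0)

def g_ph(t, gamma=0.5):
    return np.asarray(t, dtype=float) ** gamma

def g_lookback(t, gamma=0.5):
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0, t, 1.0)           # avoid log(0); g(0) = 0 by continuity
    return np.where(t > 0, safe ** gamma * (1.0 - gamma * np.log(safe)), 0.0)

def is_valid_distortion(g, m=401, tol=1e-9):
    """Grid check of g(0)=0, g(1)=1, monotonicity and concavity."""
    v = g(np.linspace(0.0, 1.0, m))
    endpoints = abs(v[0]) < tol and abs(v[-1] - 1.0) < tol
    increasing = bool(np.all(np.diff(v) >= -tol))
    concave = bool(np.all(np.diff(v, 2) <= tol))
    return endpoints and increasing and concave

def distortion_risk(y, g):
    """Weight the i-th smallest outcome by g((n-i+1)/n) - g((n-i)/n)."""
    ys = np.sort(y)
    n = len(ys)
    upper = g(np.arange(n, 0, -1) / n)
    lower = g(np.arange(n - 1, -1, -1) / n)
    return float(np.dot(ys, upper - lower))

rng = np.random.default_rng(4)
y = rng.normal(size=200)
valid = {g.__name__: is_valid_distortion(g)
         for g in (g_expectation, g_cvar, g_ph, g_lookback)}
risk_expectation = distortion_risk(y, g_expectation)
risk_cvar = distortion_risk(y, g_cvar)        # level alpha = 0.25
```

With $g_{\mathbb{E}}$ the weights are uniform and the sample mean is recovered. With the CVaR distortion only the worst quarter of the outcomes receives weight, which reproduces the tail average.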

4 Stability

While we have only considered $Q_{\alpha\,CVaR}$ and $Q_{\nu_{g}}$ as functions of the first-stage decision variable $x$ so far, these quantities also depend on the underlying probability measure $\mu$ . In stochastic programming, incomplete information about the true underlying distribution or the need for computational efficiency may lead to optimisation models that employ an approximation of $\mu$ . Stability analysis deals with the behaviour of optimal values and optimal solution sets of the perturbed models in comparison to the original one.

First, we shall recall some relevant results concerning stability and strong convexity of abstract parametric programs of the form

[TABLE]

where $g:\mathbb{R}^{n}\times\mathcal{T}\to\mathbb{R}$ is a function, $X:\mathcal{T}\rightrightarrows\mathbb{R}^{n}$ is a set-valued mapping and $t$ varies in some metric space $(\mathcal{T},\mathrm{d}_{\mathcal{T}})$. With inequality constraints and differentiable data, stability analysis for (P(t)) goes back to Alt Alt1983, while a more general setting is considered by Klatte in Klatte1987. For constant feasible set $X(t)\equiv X\subseteq\mathbb{R}^{n}$ for all $t\in\mathcal{T}$, a proof of the following result is also given in RoemischSchultz1989.

Lemma 3

Let $X$ be some nonempty, closed, convex subset of $\mathbb{R}^{n}$ and consider the mapping $\Psi:\mathcal{T}\rightrightarrows\mathbb{R}^{n}$ given by

[TABLE]

Suppose that $t_{0}\in\mathcal{T}$ is such that the following conditions are satisfied:

S1. $g(\cdot,t)$ is convex for all $t$ in a neighborhood of $t_{0}$.

S2. There exists a bounded open set $Q\subset\mathbb{R}^{n}$ such that $\Psi(t_{0})\subseteq Q$.

S3. There exist $x_{0}\in\Psi(t_{0})$ and $\alpha:\mathbb{R}^{n}\to[0,\infty)$ such that $\alpha(0)=0$ and

[TABLE]

S4. There exists a constant $L>0$ such that

[TABLE]

holds for all $x\in\mathrm{cl}\;Q$ and all $t$ in a neighborhood of $t_{0}$ .

Then there exists a neighborhood of $t_{0}$ on which we have $\Psi(t)\neq\emptyset$ and

[TABLE]

In the presence of strong convexity, assumptions S2 - S4 above can be weakened.

Lemma 4

Let $X\subseteq\mathbb{R}^{n}$ be nonempty, closed and convex and suppose that $t_{0}\in\mathcal{T}$ is such that S1 and the following conditions are satisfied:

C1. $g(\cdot,t_{0})$ is $\kappa$-strongly convex on some open convex set $V$ with $\Psi(t_{0})\cap V\neq\emptyset$.

C2. There is a constant $L>0$ such that (48) holds for all $t$ in a neighborhood of $t_{0}$ and all $x$ in a neighborhood of $\Psi(t_{0})$.

Then there exists a neighborhood of $t_{0}$ on which we have $\Psi(t)\neq\emptyset$ and

[TABLE]

Proof

Conditions S1 and C1 imply that $\Psi(t_{0})$ is a singleton $\{x_{0}\}\subset V$. By C2 and the openness of $V$, there thus exist constants $L>0$ and $r>0$ such that (48) holds for all $t$ in a neighborhood of $t_{0}$ and all $x\in\mathrm{cl}\;B_{r}(x_{0})\subset V$. In particular, setting $Q:=B_{r}(x_{0})$, assumptions S2 and S4 of Lemma 3 are fulfilled. By C1, $g_{t_{0}}(\cdot):=g(\cdot,t_{0})$ is $\kappa$-strongly convex on $V$ and (GoebelRockafellar2008, Proposition 4.2) yields

[TABLE]

for all $d\in\partial g_{t_{0}}(x_{0})$ and all $x\in V$ . As $x_{0}$ minimizes $g(\cdot,t_{0})$ over $X$ , there is a subgradient $d_{0}\in\partial g_{t_{0}}(x_{0})$ such that $d_{0}^{\top}(x-x_{0})$ is nonnegative for all $x\in X$ . Consequently, (50) implies

[TABLE]

for all $x\in X\cap Q$ . Choosing $\alpha(\cdot):=\frac{1}{2}\kappa\|\cdot\|$ we obtain S3. Lemma 3 yields the existence of a neighborhood $T_{0}$ of $t_{0}$ on which $\Psi(t)\neq\emptyset$ and (49) hold. Thus,

[TABLE]

holds for all $t\in T_{0}$ .
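The mechanism behind this lemma, namely that strong convexity keeps minimizers from moving far under small perturbations, can be seen in a one-dimensional toy problem. The sketch below is purely illustrative and does not reproduce estimate (49): the objective $g(x,t)=\frac{\kappa}{2}(x-c)^{2}+t\,x$, the feasible interval and all constants are hypothetical choices, and the parameter $t$ is a real number rather than a probability measure.

```python
import numpy as np

KAPPA, CENTER = 2.0, 0.3      # hypothetical modulus and minimiser at t = 0
LO, HI = -1.0, 1.0            # feasible set X = [LO, HI]

def objective(x, t):
    """g(x, t) = kappa/2 * (x - c)^2 + t * x, kappa-strongly convex in x."""
    return 0.5 * KAPPA * (x - CENTER) ** 2 + t * x

def minimiser(t):
    """Closed form: project the stationary point c - t/kappa onto X."""
    return float(np.clip(CENTER - t / KAPPA, LO, HI))

def minimiser_grid(t, m=200001):
    """Brute-force minimiser on a fine grid, used to check the closed form."""
    xs = np.linspace(LO, HI, m)
    return float(xs[np.argmin(objective(xs, t))])

ts = np.linspace(-4.0, 4.0, 41)
sols = np.array([minimiser(t) for t in ts])
# Largest observed ratio |x(t) - x(t')| / |t - t'| over consecutive parameters.
max_ratio = float(np.max(np.abs(np.diff(sols)) / np.diff(ts)))
```

In this example the minimizer moves at most at rate $\frac{1}{\kappa}$ in the perturbation parameter, so a larger modulus of strong convexity gives a tighter bound on the displacement of the solution.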

Returning to stochastic programming models, we shall endow the parameter space of Borel probability measures on $\mathbb{R}^{s}$ with finite moments of order $p\geq 1$

[TABLE]

with the $p$-th order Wasserstein distance (cf. RaRue98, Vil03 and Vil09)

[TABLE]
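For intuition about this metric, the one-dimensional case $s=1$ admits a simple computation for empirical measures: with the same number of equally weighted atoms, pairing sorted samples is an optimal coupling for every $p\geq 1$. The sketch below relies on that standard fact. The function name and the samples are illustrative assumptions, and the general multivariate case requires solving a transport problem instead.

```python
import numpy as np

def wasserstein_1d(x, y, p=1):
    """p-th order Wasserstein distance between two empirical measures on R.

    Both samples must have the same size and equal weights; in that case the
    sorted (monotone) pairing is an optimal coupling.
    """
    xs, ys = np.sort(x), np.sort(y)
    if xs.shape != ys.shape:
        raise ValueError("samples must have the same number of atoms")
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))

rng = np.random.default_rng(5)
x = rng.normal(size=500)
shifted = x + 0.7                      # translation of the same sample
other = rng.normal(loc=0.2, scale=1.5, size=500)

d_shift = wasserstein_1d(x, shifted, p=2)
d_self = wasserstein_1d(x, x, p=1)
d1 = wasserstein_1d(x, other, p=1)
d2 = wasserstein_1d(x, other, p=2)
```

A pure translation by $0.7$ has distance $0.7$ for every order $p$, and the distance is nondecreasing in $p$, which is why a stability estimate in the first-order distance is the strongest statement of this kind.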

To make the dependence of $Q_{\alpha\,CVaR}$ on the underlying measure explicit, we shall consider the mapping $Q_{\alpha CVaR}:\mathbb{R}^{s}\times\mathcal{M}^{1}_{s}\to\mathbb{R}$ defined by

[TABLE]

Let $g:\left[0,1\right]\rightarrow\left[0,1\right]$ be some continuous, increasing, concave distortion function with $g(0)=0,g(1)=1$ such that the mapping $Q_{\nu_{g}}:\mathbb{R}^{n}\times\mathcal{M}^{p}_{s}\to\mathbb{R}$ ,

[TABLE]

with $\nu_{g}\in\mathcal{P}((0,1])$ given by (43) and (44) is well defined. We shall consider the parametric optimisation problem

[TABLE]

where $X$ is some subset of $\mathbb{R}^{n}$ , $T:\mathbb{R}^{n}\to\mathbb{R}^{s}$ is linear and the mapping $\hat{h}$ is given by (4). Let $\Psi_{g}:\mathcal{M}^{p}_{s}\rightrightarrows\mathbb{R}^{n}$ ,

[TABLE]

denote the optimal solution set mapping of (P( $\mu$ )).

Theorem 4.1 (Quantitative Stability of (P( $\mu$ )))

Let $X\subseteq\mathbb{R}^{n}$ be nonempty, closed and convex and let $\mu_{0}\in\mathcal{M}^{p}_{s}$ be such that $Q_{\nu_{g}}(\cdot,\mu_{0})$ is $\kappa$ -strongly convex on some nonempty, open convex set $V$ satisfying $\Psi_{g}(\mu_{0})\cap V\neq\emptyset$ . Furthermore, assume that

[TABLE]

is finite. Then there exists a constant $r>0$ such that for any $\mu\in\mathcal{M}^{p}_{s}$ satisfying $d_{p}(\mu,\mu_{0})\leq r$ we have $\Psi_{g}(\mu)\neq\emptyset$ and

[TABLE]

Proof

By (Pichler2013, Corollary 12), the first part of Lemma 1 and finiteness of $\mathcal{L}$ imply

[TABLE]

for any $\mu,\mu^{\prime}\in\mathcal{M}^{p}_{s}$ . In addition, the linearity of $T$ implies that $T(X)$ is nonempty, closed and convex. The result is thus a direct consequence of Lemma 4.

5 Appendix

Example 2: Details

[TABLE]

We directly get

[TABLE]

The calculation of (B) is a little more involved:

[TABLE]

For the treatment of $(C)$ use that $x_{1}+\eta\leq 1$ which implies $x_{2}+\eta\leq 1$ :

[TABLE]

We thus get

[TABLE]

Proof of Lemma 9:

Proof

We shall first show that (9) implies

[TABLE]

Set $t_{k}=\frac{k}{m+1}$ for $k=0,\ldots,m+1$ and some arbitrary integer $m$ . Let $z_{1}=(x_{1},y_{1})$ and $z_{2}=(x_{2},y_{2})$ . By the mean-value theorem we get $t_{k}<s_{k}<t_{k+1}$ such that

[TABLE]

This yields

[TABLE]

Since we have

[TABLE]

this shows (51) which is the same as

[TABLE]

Now let $z=\lambda z_{1}+(1-\lambda)z_{2}$ . (52) yields

[TABLE]

Multiplying by $\lambda$ and $(1-\lambda)$ respectively and adding up gives

[TABLE]

due to the first bracketed term vanishing. Rearranging terms yields (8).

Example 5

We compute the $\alpha CVaR$ for an elementary example via representation (6).

Consider $\varphi(t)=\max\{t,-t\}$ , $\mu=\frac{1}{2}\mathbb{U}_{|(0,1)}+\frac{1}{2}\mathbb{U}_{|(1.5,2.5)}$ , $x\in(0,1)$ and $\eta>0$ :

For the expected excess we get

[TABLE]

In order to calculate $Q_{\alpha VaR}(x)$ we first compute

[TABLE]

For given $x$ and $\alpha$ now determine the minimal $t^{*}$ for which $\mu(\varphi(z-x)\leq t)\geq 1-\alpha$, i.e. $t^{*}=Q_{\alpha VaR}(x)$. For $0<\alpha<\frac{1}{4}$ we get $Q_{\alpha VaR}(x)>1$ on the entire set $V$; plugging this into (6) shows that $Q_{\alpha CVaR}$ is nowhere strongly convex on $V$. For values of $\alpha$ close to $1$ and $x$ close to [math] one also sees that $Q_{\alpha CVaR}$ fails to be strongly convex. This is due to the fact that $Q_{\alpha VaR}$ is decreasing in a neighborhood of [math]. For $\frac{1}{4}<\alpha<\frac{1}{2}$ one can show strong convexity on $U=(\frac{5}{4}-\alpha,1)$.
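The value-at-risk in this example can be computed directly from its definition. The sketch below evaluates $\mu(|z-x|\leq t)$ exactly for the mixture $\mu=\frac{1}{2}\mathbb{U}_{|(0,1)}+\frac{1}{2}\mathbb{U}_{|(1.5,2.5)}$ and finds the smallest admissible $t$ by bisection. The function names, the bisection bracket and the tolerance are illustrative choices, and $x$ is assumed to lie in $(0,1)$.

```python
def mu_ball(x, t):
    """mu(|z - x| <= t) for mu = 1/2 U(0,1) + 1/2 U(1.5,2.5)."""
    if t < 0:
        return 0.0

    def overlap(lo, hi):
        # Length of [x - t, x + t] intersected with [lo, hi]; both uniform
        # components have density 1 on intervals of length 1.
        return max(0.0, min(hi, x + t) - max(lo, x - t))

    return 0.5 * overlap(0.0, 1.0) + 0.5 * overlap(1.5, 2.5)

def value_at_risk(x, alpha, tol=1e-10):
    """Smallest t with mu(|z - x| <= t) >= 1 - alpha, found by bisection."""
    lo, hi = 0.0, 3.0          # |z - x| < 2.5 on the support for x in (0, 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mu_ball(x, mid) >= 1.0 - alpha:
            hi = mid
        else:
            lo = mid
    return hi

grid = [0.1 * k for k in range(1, 10)]              # x = 0.1, ..., 0.9
var_small_alpha = [value_at_risk(x, 0.2) for x in grid]
var_mid = value_at_risk(0.5, 0.6)
```

For $x\in(0,1)$ one has $\mu(|z-x|\leq 1)=\frac{1}{2}+\frac{1}{2}\max\{0,x-\frac{1}{2}\}<\frac{3}{4}$, so any level $\alpha<\frac{1}{4}$ forces the value-at-risk above $1$, in line with the observation made in the example.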

Bibliography

(1) W. Alt, Lipschitzian perturbations of infinite optimization problems, in Mathematical Programming with Data Perturbations II (ed. A.V. Fiacco), M. Dekker, New York and Basel, pp. 7-21 (1983).
(2) P. Artzner, F. Delbaen, J.-M. Eber, D. Heath, Coherent Measures of Risk, Math. Finance, 9, pp. 203-228 (1999).
(3) F. Bach, Adaptivity of Averaged Stochastic Gradient Descent to Local Strong Convexity for Logistic Regression, Journal of Machine Learning Research 15, pp. 595-627 (2014).
(4) A. Balbás, J. Garrido, S. Mayoral, Properties of distortion risk measures, Methodol. Comput. Appl. Probab., 11, pp. 385-399 (2009).
(5) D. Belomestny, V. Krätschmer, Optimal stopping under probability distortions and law invariant coherent risk measures, Mathematics of Operations Research 42, pp. 806-833 (2017).
(6) J. R. Birge, F. Louveaux, Introduction to Stochastic Programming, second edition, Springer, New York (2011).
(7) M. Claus, R. Schultz, K. Spürkel, Strong Convexity for Stochastic Programming with Deviation Risk-Measures, Computational Management Science, 15(3), pp. 411-429 (2018).
(8) H. Foellmer, A. Schied, Stochastic Finance, extended edition, Walter de Gruyter & Co., Berlin (2011).
