On Risk-Averse Stochastic Semidefinite Programs with Continuous Recourse
Matthias Claus, Rüdiger Schultz, Kai Spürkel, Tobias Wollenberg

TL;DR
This paper introduces mean-risk models for stochastic semidefinite programs with continuous recourse, analyzing their structural properties and stability under distribution perturbations, with implications for computational approaches.
Contribution
It develops a framework for risk-averse stochastic SDPs, exploring convexity, continuity, and stability, and presents extended formulations for finite discrete distributions.
Findings
Mean-risk models exhibit convexity and Lipschitz continuity.
Extended formulations lead to deterministic mixed-integer SDPs.
Models are stable under distribution perturbations.
Abstract
The vast majority of the literature on stochastic semidefinite programs (stochastic SDPs) with recourse is concerned with risk-neutral models. In this paper, we introduce mean-risk models for stochastic SDPs and study structural properties such as convexity and (Lipschitz) continuity. Special emphasis is placed on stability with respect to changes of the underlying probability distribution. Perturbations of the true distribution may arise from incomplete information or from working with (finite discrete) approximations for the sake of computational efficiency. We discuss extended formulations for stochastic SDPs under finite discrete distributions, which turn out to be deterministic (mixed-integer) SDPs that are (almost) block-structured for many popular risk measures.
M. Claus, R. Schultz, K. Spürkel, T. Wollenberg
University Duisburg-Essen
Thea-Leymann-Straße 9
D-45127 Essen
Tel.: +49 201 183 6887
E-mail: [email protected]
Acknowledgement: The authors gratefully acknowledge the support of the German Research Foundation (DFG) within the collaborative research center TRR 154 “Mathematical Modeling, Simulation and Optimization Using the Example of Gas Networks”.
Keywords:
Stochastic Semidefinite Programming · Mean-Risk Models · Stability Analysis · Extended Formulations
1 Introduction
Stochastic semidefinite programs with recourse were first considered by Ariyawansa and Zhu in AriyawansaZhu2006 , where, for finite discrete distributions, the authors reformulate the risk-neutral stochastic SDP as a block-structured deterministic SDP and discuss an application to the stochastic version of the minimum-volume covering ellipsoid problem (cf. SunFreund2004 , VandenbergheBoyd1996 ). In ZhuAriyawansa2011 , the same authors give a multitude of other applications, including problems in geometry, location aided routing, RC circuit design and structural optimization.
Some approaches to the algorithmic treatment of risk-neutral programs with linear recourse carry over to expectation-based stochastic SDPs. Extending the results of Zhao (cf. Zhao2001 ), Mehrotra and Özevin derive a polynomial logarithmic barrier algorithm employing Benders' decomposition (cf. MehrotraOezevin2007 ). Using the volumetric barrier of Vaidya (cf. Vaidya1996 ), Ariyawansa and Zhu construct algorithms of similar complexity in AriyawansaZhu2011 . Furthermore, in JinAriyawansaZhu2012 , Jin, Ariyawansa and Zhu propose homogeneous self-dual algorithms with complexities comparable to those of the methods mentioned before. Motivated by an application in multi-antenna wireless networks, Gaujal and Mertikopoulos establish a stochastic approximation algorithm in GaujalMertikopoulos2016 .
Chance constrained SDP models have been introduced by Ariyawansa and Zhu in (Zhu2006, Chapter 3), where an application to the stochastic minimum-volume covering ellipsoid problem is considered. A different approach towards risk-aversion is taken by Schultz and Wollenberg, who consider stochastic mixed-integer semidefinite programs arising from unit commitment problems in AC transmission systems. Based on Lagrangian relaxation of the nonanticipativity constraint, a decomposition algorithm for minimizing a weighted sum of the expectation and the probability of exceeding a certain threshold is proposed in SchultzWollenberg2017 .
The present work extends the models of SchultzWollenberg2017 and AriyawansaZhu2011 by considering more general risk measures. Instead of focussing on a certain application, we discuss structural properties such as convexity and (Lipschitz) continuity of the resulting objective functions. Consequences for quantitative stability of the stochastic SDP models under perturbations of the underlying distribution are pointed out. Such perturbations may arise from incomplete information about the distribution or the choice to work with a simpler (possibly finite discrete) approximation for reasons of computational efficiency.
Furthermore, we establish sufficient conditions for differentiability in the risk-neutral setting. Finally, for finite discrete distributions, we establish equivalent SDPs for various risk measures and give indications on how to exploit their special structure for numerical treatment.
2 Two-Stage Stochastic SDPs with Continuous Recourse
Let denote the cone of symmetric positive semidefinite matrices in . The componentwise Frobenius product of and is defined as $A \bullet x := \big(\mathrm{tr}(a_{1}x), \ldots, \mathrm{tr}(a_{s}x)\big)^{\top} \in \mathbb{R}^{s}$. Furthermore, the Frobenius norm on is given by .
We shall consider the parametric SDP
[TABLE]
where enters as a parameter. The data is comprised of , , , and a nonempty, closed, convex set . The set is usually given as a spectrahedron, i.e. the intersection of the solution sets of a finite number of affine matrix inequalities with the cone of positive semidefinite matrices.
Let be the realization of a random vector on some probability space . A two-stage stochastic SDP arises from (P()) if the decision has to be taken without knowledge of the particular realization , while can be chosen after observing the previously unknown parameter. In this setting, the optimal decision is governed by the recourse problem
[TABLE]
Let denote the optimal value function of (1) with respect to the right-hand side of the system of matrix equations in its constraints, i.e.
[TABLE]
Introducing the function , we may rewrite (P()) as
[TABLE]
Due to the assumed interplay between decision and observation, problem (2) is not well-defined without further modelling choices. For any , belongs to the space of extended real-valued random variables on the underlying probability space. We thus may fix any functional satisfying
[TABLE]
and consider the optimization problem
[TABLE]
where the mapping is given by .
We shall work with the following assumptions:
- A1 (Complete recourse) .
- A2 (Strict dual feasibility) There is some such that is positive definite.
Similar, yet more restrictive assumptions are also made in MehrotraOezevin2007 .
Lemma 1
Assume A2, then A1 holds if and only if is compact.
Proof
is closed due to the closedness of . Suppose that is unbounded, i.e. that there exists a sequence with . Define , then holds for all . Therefore, the sequence can be assumed to converge to some without loss of generality. By we have for all . Thus,
[TABLE]
Now select any . Then holds for any and we have
[TABLE]
verifying . By duality, the set has to be empty, which contradicts A1.
Let be compact, then once again by duality for arbitrary , there exists with , which implies and thus A1. ∎
The lemma above shows that is attained for any whenever A1 and A2 hold true.
Lemma 2
Assume A1 and A2, then is finite, convex and Lipschitz continuous on .
Proof
Due to A1 and A2, strong duality holds true for the SDP defining . We thus have
[TABLE]
As is nonempty and compact by Lemma 1, is finite on .
Furthermore, for arbitrary and , strong duality implies
[TABLE]
which proves the asserted convexity of .
To establish Lipschitz continuity, let be arbitrary and fixed. Then by strong duality and the compactness of , there exists such that and . By and we have
[TABLE]
and thus . Set , then
[TABLE]
holds for all , which completes the proof. ∎
Remark 1
Under assumptions A1 and A2, is finite and convex, which implies directional differentiability by (Rockafellar1970, Theorem 25.4). Furthermore, the subdifferential of is convex, compact and admits the representation
[TABLE]
By (Rockafellar1970, Theorem 25.1), is differentiable at if and only if is a singleton. In that case, we have .
Remark 2
In two-stage stochastic linear programming, the counterpart of is the optimal value function of a linear program:
[TABLE]
with and . By linear programming theory, is finite on iff and . In this situation, admits the representation
[TABLE]
where denote the vertices of the polytope . In particular, is piecewise linear, convex and Lipschitz continuous.
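As a small numerical illustration of the vertex representation in Remark 2, the following Python sketch evaluates a function given as the maximum of finitely many linear forms and checks convexity and Lipschitz continuity at two sample points. The vertex list `DUAL_VERTICES` and the helper names are hypothetical and not taken from the paper; they only serve to make the structure concrete.

```python
import numpy as np

# Hypothetical vertices d_1, ..., d_l of a dual polytope (one per row).
# These numbers are an assumption for illustration, not data from the paper.
DUAL_VERTICES = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])


def lp_value(t, vertices=DUAL_VERTICES):
    """Evaluate max_j d_j^T t, i.e. a maximum of finitely many linear forms."""
    return float(np.max(vertices @ np.asarray(t, dtype=float)))


def lipschitz_bound(vertices=DUAL_VERTICES):
    """Largest Euclidean norm of a vertex; a Lipschitz constant of lp_value."""
    return float(np.max(np.linalg.norm(vertices, axis=1)))


if __name__ == "__main__":
    t1, t2 = np.array([2.0, 1.0]), np.array([-1.0, -1.0])
    mid = 0.5 * (t1 + t2)
    # Convexity: the value at the midpoint does not exceed the average value.
    print(lp_value(mid) <= 0.5 * (lp_value(t1) + lp_value(t2)))
    # Lipschitz continuity with the constant returned by lipschitz_bound().
    print(abs(lp_value(t1) - lp_value(t2))
          <= lipschitz_bound() * np.linalg.norm(t1 - t2))
```

A maximum of linear forms is piecewise linear by construction, and the largest vertex norm bounds its slope in every direction, which is why it can serve as a Lipschitz constant in this sketch.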
The following example shows that the assumptions A1 and are not sufficient to ensure that the optimal value in the problem defining is attained for all .
Example 1
For , consider the SDP
[TABLE]
For any we have
[TABLE]
Consequently, A1 is fulfilled. Moreover, we have
[TABLE]
As (4) is strictly feasible for any right-hand side , strong duality holds and (5) implies that the infimum of (4) is zero. Furthermore, for any we have
[TABLE]
which yields the lower bound for any that is feasible for (4). Consequently, the optimal value in (4) is not attained if .
3 Structure of Risk-Averse Stochastic SDPs
Let us now return to problem (3) and consider various choices of . To ensure finiteness, we shall work with moment conditions on the Borel probability measure induced by the underlying random vector . Let denote the space of all Borel probability measures on and
[TABLE]
be the subspace of measures having finite moments of order .
Lemma 3
Assume A1, A2 and . Then for all and the mapping , is convex and Lipschitz continuous with constant .
Proof
For any we have
[TABLE]
by Lemma 2.
For any , and , the convexity of yields
[TABLE]
and thus in particular with respect to the -almost sure partial order, proving the asserted convexity of .
Finally,
[TABLE]
holds for all . ∎
Definition 1
A mapping defined on some linear subspace of containing the constants is called a convex risk measure if the following conditions are fulfilled:
1. (Convexity) For any and we have
[TABLE]
2. (Monotonicity) for all satisfying with respect to the -almost sure partial order.
3. (Translation equivariance) for all and .
A convex risk measure is coherent if the following holds true:
4. (Positive homogeneity) for all and .
Definition 2
A mapping is called law-invariant if for all with we have .
We shall give some examples of risk measures frequently used in stochastic programming as listed in RuszczynskiShapiro2003 , pp. 447-448, and ShapiroDentchevaRuszczynski2009 . Later we will give extensive formulations of discrete mean-risk SDPs based on these risk measures:
- (i) The expectation is a law-invariant coherent risk measure.
- (ii) The expected excess over threshold (as used in SchultzTiedemann2006 ) is the mapping defined by
[TABLE]
This is a nondecreasing, convex and law-invariant risk measure, but in general not translation-equivariant.
- (iii) The conditional value-at-risk at level
[TABLE]
is law-invariant and coherent (cf. Pflug2000 ).
- (iv) The value-at-risk at level
[TABLE]
is nondecreasing, law-invariant, translation-equivariant and positively homogeneous, but in general non-convex.
- (v) The upper semi-deviation of order is the mapping $\mathbb{M}\mathrm{ad}^{+}_{p} : L^{p}(\Omega,\mathcal{F},\mathbb{P}) \rightarrow \mathbb{R}$ defined by
[TABLE]
For this gives rise to the law-invariant and coherent risk measure (cf. ShapiroDentchevaRuszczynski2009 , p. 276).
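For a finite discrete distribution, the risk measures (i) to (v) can be evaluated directly. The Python sketch below uses the standard textbook definitions: expected excess as the mean of the positive part above a threshold, value-at-risk as the smallest value whose cumulative probability reaches the level, conditional value-at-risk via its minimization formula over a threshold, and upper semi-deviation as the p-th root of the mean p-th power of the positive deviation from the mean. These definitions are assumed here and may differ in detail from the displayed formulas of the paper; all function names are hypothetical.

```python
import numpy as np


def expectation(values, probs):
    return float(np.dot(probs, values))


def expected_excess(values, probs, eta):
    """E[max(X - eta, 0)] for threshold eta."""
    return float(np.dot(probs, np.maximum(np.asarray(values, float) - eta, 0.0)))


def value_at_risk(values, probs, alpha):
    """Smallest realization v with P(X <= v) >= alpha (assumed definition)."""
    values, probs = np.asarray(values, float), np.asarray(probs, float)
    order = np.argsort(values)
    cum = np.cumsum(probs[order])
    # Small tolerance guards against rounding in the cumulative sums.
    idx = min(int(np.searchsorted(cum, alpha - 1e-12)), len(values) - 1)
    return float(values[order][idx])


def conditional_value_at_risk(values, probs, alpha):
    """min over eta of eta + E[max(X - eta, 0)] / (1 - alpha), for 0 < alpha < 1.

    The objective is piecewise linear and convex in eta with kinks at the
    realizations, so it suffices to search over the realizations.
    """
    return min(eta + expected_excess(values, probs, eta) / (1.0 - alpha)
               for eta in np.asarray(values, float))


def upper_semideviation(values, probs, p=1):
    """(E[max(X - E[X], 0)^p])^(1/p)."""
    values = np.asarray(values, float)
    dev = np.maximum(values - expectation(values, probs), 0.0)
    return float(np.dot(probs, dev ** p) ** (1.0 / p))


if __name__ == "__main__":
    x = np.array([0.0, 10.0, 20.0, 100.0])  # hypothetical scenario costs
    pi = np.full(4, 0.25)                   # equal scenario probabilities
    print(expectation(x, pi), expected_excess(x, pi, 20.0))
    print(value_at_risk(x, pi, 0.75), conditional_value_at_risk(x, pi, 0.75))
    print(upper_semideviation(x, pi, p=1))
```

In this toy example the value-at-risk at level 0.75 ignores how bad the worst scenario is, while the conditional value-at-risk at the same level equals the cost of that worst scenario, which carries the remaining probability mass of 0.25.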
Proposition 1
Assume A1 and A2, let be a convex subset of that contains and fix a convex and nondecreasing mapping . Then is finite and convex on . In particular, problem (3) is convex.
Proof
Finiteness of follows directly from the finiteness of . Furthermore, for any and we have
[TABLE]
The first inequality above holds due to the monotonicity of and the convexity of (by Lemma 3), while the second one is justified by the convexity of . ∎
Proposition 2
Assume A1, A2 and that the support of is bounded. Furthermore, let be a coherent risk measure and assume that there is some such that . Then is finite and Lipschitz continuous with constant on .
Proof
is finite and Lipschitz continuous with constant with respect to the -norm on by (FoellmerSchied2004, Lemma 4.3).
For any , the mapping is continuous by Lemma 2, which implies
[TABLE]
Thus, , which implies the asserted finiteness of .
Furthermore, for any , we have
[TABLE]
by Lemma 3. ∎
If the support of is unbounded, may fail to be a subset of . While Lipschitz continuity with respect to any -norm with does not hold for general coherent risk measures, the conditional value-at-risk is known to be Lipschitz continuous with respect to the -norm with constant (cf. (Pichler2017, Corollary 3.7)). Using the Kusuoka representation (cf. Kusuoka2001 ), this allows one to replace the boundedness of the support of with a less restrictive assumption on the moments of for special classes of risk measures.
Definition 3
Random variables and are called comonotonic if is distributionally equivalent to where is uniformly distributed on .
A coherent risk measure is said to be comonotonic if for any two comonotonic random variables we have .
For a discussion of comonotonicity we refer to DhaeneEtAl2002 and DhaeneEtAl2006 . A proof of the following result is given in (Shapiro2013, Theorem 2):
Theorem 3.1
A law-invariant coherent risk measure with is comonotonic if and only if there exists a probability measure on such that
[TABLE]
holds for all . Furthermore, the measure in representation (7) is uniquely defined.
Example 2
Using to denote the Dirac measure at
[TABLE]
and, in particular,
[TABLE]
hold for all .
Proposition 3
Let with be a law-invariant, comonotonic coherent risk measure. Assume A1, A2, and
[TABLE]
where denotes the uniquely defined probability measure from representation (7). Then is Lipschitz continuous with constant on .
Proof
For any , we have
[TABLE]
The second inequality above holds due to (Pichler2017, Corollary 3.7), while the third one is justified by Lemma 3. ∎
We shall now study the dependence of on the underlying probability measure . This is motivated by the fact that in applications the true probability distribution of the random parameter may be unknown. In such situations, one may work with an approximation if the optimal value function and the optimal solution set mapping of (3) are at least semicontinuous with respect to changes of the underlying distribution.
Let be an atomless probability space, i.e. assume that for any with there exists some with and , and fix any . Then for any there exists some such that . Thus, given any law-invariant mapping , the function
[TABLE]
is well-defined. Furthermore, we can construct a mapping by setting . To ease the notation, we shall assume that itself is atomless. Given any law-invariant mapping , we shall consider the function
[TABLE]
For the following analysis, we equip the space with the topology of weak convergence, where a sequence converges to some , written if and only if
[TABLE]
holds for any bounded and continuous function . It is well known that even for linear recourse one cannot expect weak continuity of on the entire space . Along the lines of ClausKraetschmerSchultz2017 , we shall thus restrict the analysis to appropriate subspaces.
Definition 4
A set is called locally uniformly -integrating if for any and any there exists some open neighborhood of with respect to the topology of weak convergence such that
[TABLE]
Example 3
(a) For any and , the set
[TABLE]
of measures having uniformly bounded moments of order is locally uniformly -integrating (cf. (Claus2016, Lemma 2.69)).
(b) For any and compact set , the set
[TABLE]
of measures with support in is locally uniformly -integrating by (KraetschmerSchiedZaehle2017, Lemma 5.1).
(c) Any singleton is locally uniformly -integrating for any by (KraetschmerSchiedZaehle2017, Lemma 5.2).
Theorem 3.2
Let with be law-invariant, convex and nondecreasing. Assume A1 and A2 and let be locally uniformly -integrating. Then the following statements hold true:
1. The restriction of to the set is continuous with respect to the product topology of the standard topology on and the relative topology of weak convergence on .
2. The optimal value function
[TABLE]
is weakly upper semicontinuous.
Additionally assume that is compact. Then
3. is weakly continuous.
4. The optimal solution set mapping
[TABLE]
is weakly upper semicontinuous in the sense of Berge, i.e. for any and any open set with there exists a weakly open neighborhood of such that for all . Furthermore, is nonempty and compact for any .
Proof
Invoking Lemma 2, the result follows from (ClausKraetschmerSchultz2017, Corollary 2). ∎
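The practical meaning of Theorem 3.2 can be illustrated numerically in the risk-neutral case: if a continuous distribution is replaced by an empirical (finite discrete) one, the objective value should change only slightly once the sample is large. The Python sketch below does this in a one-dimensional toy setting. The recourse value `phi` (the absolute value), the standard normal distribution and all function names are assumptions chosen for illustration and do not come from the paper.

```python
import math

import numpy as np


def phi(t):
    """Toy convex, Lipschitz stand-in for the recourse value function."""
    return np.abs(t)


def expected_recourse(x, scenarios, probs):
    """Expected recourse value sum_k probs[k] * phi(scenarios[k] - x)."""
    return float(np.dot(probs, phi(np.asarray(scenarios, float) - x)))


def empirical_objective(x, n, seed=0):
    """Objective under the empirical measure of n standard normal samples."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    return expected_recourse(x, z, np.full(n, 1.0 / n))


if __name__ == "__main__":
    exact = math.sqrt(2.0 / math.pi)  # E|Z| for a standard normal Z
    for n in (100, 10_000, 200_000):
        approx = empirical_objective(0.0, n)
        print(n, approx, abs(approx - exact))
```

The printed errors depend on the random seed and are not guaranteed to decrease monotonically; the sketch only illustrates the qualitative behaviour one expects from a continuity result of this kind.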
Corollary 1
Let with be law-invariant, convex and nondecreasing and assume A1 and A2. Then is continuous.
Proof
By part (c) of Example 3 we may apply the first part of Theorem 3.2 to . The asserted continuity follows from for any . ∎
We shall now turn our attention to questions of differentiability, but confine the analysis to the risk-neutral model.
Lemma 4
Assume A1, A2 and , then the functional , is directionally differentiable and
[TABLE]
holds for all .
Proof
is finite-valued by Lemma 3, convex by Proposition 1 and thus directionally differentiable (cf. (Rockafellar1970, Theorem 25.4)). Furthermore, is a pointwise limit of measurable functions and thus measurable for any . The asserted representation of the directional derivative is justified by Lemma 2 and (Bertsekas1973, Proposition 2.1). ∎
Sufficient conditions for differentiability can be obtained using the same arguments as for linear recourse (cf. ShapiroDentchevaRuszczynski2009 ).
Lemma 5
Assume A1, A2 and and let be such that
[TABLE]
is a singleton for -almost all . Then is differentiable at .
Proof
For -almost all , , is differentiable with measurable derivative
[TABLE]
Consider the functions defined by
[TABLE]
then holds for -almost all . Furthermore, Lemma 2 implies for all and . Hence, by Lebesgue’s dominated convergence theorem, we have
[TABLE]
Consequently, is differentiable at and . ∎
Corollary 2
Assume A1, A2 and that is absolutely continuous with respect to the Lebesgue measure. Then is continuously differentiable on .
Proof
Let denote the set of points of nondifferentiability of . By (Rockafellar1970, Theorem 25.5),
[TABLE]
is a null set with respect to the Lebesgue measure for any , which implies . Consequently, is differentiable on . Continuity of the derivative follows from (Rockafellar1970, Theorem 25.5) and the convexity of . ∎
Remark 3
Assuming A1, A2 and , the subdifferential of admits the representation
[TABLE]
Further details are given in Bertsekas1973 .
Corollary 3
Assume A2 and that the underlying random variable follows a finite discrete distribution with realizations and respective probabilities . Furthermore, assume that is nonempty for any and . Then
[TABLE]
holds for any .
Proof
The result follows directly from (Rockafellar1970, Theorem 23.8). ∎
4 Extensive Formulations for Finite Discrete Distributions
Throughout this section, we shall assume A1, A2 and that the underlying random variable follows a finite discrete distribution with realizations and respective probabilities . Furthermore, we denote the index set by .
It is well known that in the risk-neutral setting, the stochastic SDP admits a reformulation as a block-structured SDP (cf. AriyawansaZhu2006 , MehrotraOezevin2007 ):
Proposition 4
The risk-neutral stochastic SDP
[TABLE]
is equivalent to the SDP
[TABLE]
in the sense that the infimal values of the problems coincide. Furthermore, is an optimal solution for (8) if and only if there exist and such that is an optimal solution for (9).
Proof
By definition of ,
[TABLE]
holds for any , satisfying for all . Thus, the infimal value of (8) is less than or equal to the infimal value of (9). Furthermore, (10) is satisfied with equality if and only if
[TABLE]
holds for all . The optimal solution set above is nonempty by strong duality, which holds due to A1 and A2. ∎
We continue with extensive formulations of the SDP (3) for mean-risk models based on the risk measures immediately following Definition 2. In this context, shall always be a nonnegative, predefined parameter indicating risk-aversion in the optimization.
Proposition 5
[TABLE]
with as a given parameter, can be equivalently restated as
[TABLE]
Proof
As the objective function of (12) is increasing with respect to , any optimal solution satisfies for all . The asserted equivalence of (11) and (12) then follows as in the proof of Proposition 4. ∎
Proposition 6
[TABLE]
can be equivalently restated as
[TABLE]
Proof
This follows directly from the variational representation of in (6). The expected-excess can be pushed into the restrictions by the same trick as in Proposition 5. ∎
As in the risk-neutral case, problems (12) and (13) exhibit a block structure, i.e. there is no coupling constraint involving variables associated with different scenarios. This allows for a direct adaptation of the decomposition algorithms established for the expectation-based model.
Proposition 7
Consider the problem
[TABLE]
with compact set . This problem can be equivalently restated as the following SDP with binary variables
[TABLE]
if is chosen sufficiently big.
Proof
As in the preceding propositions, introduce a dummy variable to push into the restrictions as and minimize over . Note that is equivalent to
[TABLE]
Denoting, for given , the feasible points of the second-stage problem corresponding to realization by , (15) can be rewritten as
[TABLE]
This conditional summation can in turn be cast into inequalities with binary variables , ,
[TABLE]
if is chosen such that for all feasible and all close to . The existence of follows from the compactness of , as for all . ∎
Unlike the previous models, (14) does not decompose scenariowise due to the coupling constraint , which involves variables from all scenarios. Furthermore, it has an additional binary variable for each scenario. Problems of a similar structure have been considered in the context of minimizing a weighted sum of the expectation and the probability of exceeding a fixed threshold in SchultzWollenberg2017 , where Lagrangian relaxation of the coupling constraint enables an approach based on Benders' decomposition. This direction also seems very promising for the algorithmic treatment of (14).
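The role of the binary variables and of the coupling constraint can be made concrete for fixed scenario costs. The Python sketch below assumes the common big-M formulation of the value-at-risk: minimize a threshold eta subject to costs[k] - eta <= M * theta[k] for every scenario and sum_k probs[k] * theta[k] <= 1 - alpha with binary theta[k]. This formulation is an assumption for illustration and need not coincide term by term with (14). The sketch enumerates all binary vectors, which is only sensible for a handful of scenarios, and compares the result with a direct quantile computation. All names are hypothetical.

```python
import itertools

import numpy as np


def var_direct(costs, probs, alpha):
    """Smallest cost v with P(cost <= v) >= alpha (assumed definition)."""
    costs, probs = np.asarray(costs, float), np.asarray(probs, float)
    order = np.argsort(costs)
    cum = np.cumsum(probs[order])
    idx = min(int(np.searchsorted(cum, alpha - 1e-12)), len(costs) - 1)
    return float(costs[order][idx])


def var_big_m(costs, probs, alpha):
    """Brute-force solution of the assumed big-M model for fixed scenario costs."""
    costs, probs = np.asarray(costs, float), np.asarray(probs, float)
    best = float("inf")
    for bits in itertools.product((0, 1), repeat=len(costs)):
        theta = np.array(bits)
        # Coupling constraint: it links the binaries of all scenarios.
        if float(np.dot(probs, theta)) > 1.0 - alpha + 1e-12:
            continue
        # For sufficiently large M, scenarios with theta[k] = 1 impose no
        # effective bound, so eta only has to cover those with theta[k] = 0.
        active = costs[theta == 0]
        if active.size == 0:
            continue
        best = min(best, float(active.max()))
    return best


if __name__ == "__main__":
    c = [0.0, 10.0, 20.0, 100.0]  # hypothetical scenario costs
    pi = [0.25, 0.25, 0.25, 0.25]
    print(var_big_m(c, pi, 0.75), var_direct(c, pi, 0.75))
```

In the setting of Proposition 7 the scenario costs themselves depend on the first- and second-stage decisions, so the enumeration above would be replaced by a mixed-integer SDP solver or by a decomposition scheme; the sketch only shows why a single constraint ties all scenarios together.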
Proposition 8
[TABLE]
can be equivalently restated as
[TABLE]
Proof
Analogous to Proposition 5. ∎
Unlike (14), the equivalent SDP in Proposition 8 contains an individual coupling constraint for each scenario. While Lagrangian relaxation is still possible, it remains to be examined whether this approach is sensible from a computational point of view.
References
- (1) K. A. Ariyawansa, Y. Zhu, Stochastic semidefinite programming: a new paradigm for stochastic optimization, 4OR, 4(3), pp. 239-253 (2006)
- (2) K. A. Ariyawansa, Y. Zhu, A class of polynomial volumetric barrier decomposition algorithms for stochastic semidefinite programming, Mathematics of Computation, 80, no. 275, pp. 1639-1661 (2011)
- (3) D. P. Bertsekas, Stochastic optimization problems with nondifferentiable cost functionals, Journal of Optimization Theory and Applications, 12, pp. 218-231 (1973)
- (4) M. Claus, Advancing stability analysis of mean-risk stochastic programs: bilevel and two-stage models, PhD thesis, University of Duisburg-Essen (2016)
- (5) M. Claus, V. Krätschmer, R. Schultz, Weak continuity of risk functionals with applications to stochastic programming, SIAM Journal on Optimization, 27(1), pp. 91-108 (2017)
- (6) J. Dhaene, M. Denuit, M. J. Goovaerts, R. Kaas, D. Vyncke, The concept of comonotonicity in actuarial science and finance: theory, Insurance: Math. Econom., 31, pp. 3-33 (2002)
- (7) J. Dhaene, S. Vanduffel, M. J. Goovaerts, R. Kaas, Q. Tang, D. Vyncke, Risk Measures and Comonotonicity: A Review, Stochastic Models, 22, pp. 573-606 (2006)
- (8) H. Föllmer, A. Schied, Stochastic Finance: An Introduction in Discrete Time, 2nd ed., de Gruyter Stud. Math. 27, de Gruyter, Berlin (2004)
