Two-stage Stochastic Programming under Multivariate Risk Constraints   with an Application to Humanitarian Relief Network Design

Nilay Noyan; Merve Merakli; Simge Kucukyavuz

arXiv:1701.06096·math.OC·June 2, 2020·Math. Program.

Two-stage Stochastic Programming under Multivariate Risk Constraints with an Application to Humanitarian Relief Network Design

Nilay Noyan, Merve Merakli, Simge Kucukyavuz

PDF

TL;DR

This paper develops a novel decomposition method for two-stage stochastic programs with multivariate risk constraints, applied to humanitarian logistics, demonstrating scalability and effectiveness in hurricane threat scenarios.

Contribution

It introduces an exact unified decomposition framework for multivariate risk-constrained stochastic programs, addressing computational challenges and demonstrating application in humanitarian relief network design.

Findings

01

The proposed algorithm is computationally scalable for large-scale problems.

02

Application to humanitarian logistics under hurricane threat shows practical effectiveness.

03

The method handles complex multivariate stochastic constraints that traditional approaches cannot.

Abstract

In this study, we consider two classes of multicriteria two-stage stochastic programs in finite probability spaces with multivariate risk constraints. The first-stage problem features a multivariate stochastic benchmarking constraint based on a vector-valued random variable representing multiple and possibly conflicting stochastic performance measures associated with the second-stage decisions. In particular, the aim is to ensure that the associated random outcome vector of interest is preferable to a specified benchmark with respect to the multivariate polyhedral conditional value-at-risk (CVaR) or a multivariate stochastic order relation. In this case, the classical decomposition methods cannot be used directly due to the complicating multivariate stochastic benchmarking constraints. We propose an exact unified decomposition framework for solving these two classes of optimization…

Tables1

Table 1. Table 1: Performance of the proposed delayed cut generation algorithms ( | I | = | J | = 30 𝐼 𝐽 30 |I|=|J|=30 )

		DCG-DEF				DCG-SD
		Time (s)		Number		Time (s)		Number
$α$	# Sce.	Total / [%gap]	Sep.	Sep.	$L$	Total / [%gap]	Sep.	Cuts	Sep.	$L$
	200	704.9	0.2	4.3	2.3	46.8	3.1	6101.3	62.0	3.0
	300	1483.5	0.7	5.3	3.0	108.7	5.1	9697.3	78.0	3.3
	400	2622.4	1.2	5.3	3.7	351.5	13.9	18569.7	122.0	5.3
0.99	500	3071.3	0.8	4.0	3.0	363.3	17.7	20805.0	105.3	4.3
	600	${[0.2]}^{†}^{* *}$	-	-	-	662.1	42.4	27068.7	157.0	5.0
	800	^∗∗∗	-	-	-	987.2	51.7	31994.7	129.7	4.7
	1000	^∗∗∗	-	-	-	1594.2	110.3	43934.0	129.3	5.3
	1500	^∗∗∗	-	-	-	3450.6 ${[0.4]}^{† †}$	506.0	62436.0	172.0	4.0
	200	408.9	0.7	2.7	0.7	148.6	17.4	9524.7	137.0	4.0
	300	1476.4	1.6	3.3	1.0	277.6	26.1	13280.0	128.7	4.0
	400	3392.7	2.1	3.7	1.7	536.8	45.2	20633.3	162.0	4.0
0.95	500	2753.2 ${[0.2]}^{†}$	1.4	2.5	1.5	671.9	150.6	23045.0	161.3	3.3
	600	${[0.1]}^{† †}^{*}$	-	-	-	878.0	156.2	26151.3	141.0	3.3
	800	^∗∗∗	-	-	-	1616.4	126.4	33607.7	115.7	3.0
	1000	^∗∗∗	-	-	-	1857.7	274.6	42229.3	140.3	3.3
	1500	^∗∗∗	-	-	-	3390.1 ${[1.4]}^{† †}$	867.2	52294.0	120.0	2.0
	200	397.4	1.0	2.7	0.7	303.5	36.5	11781.0	180.0	4.3
0.90	300	1128.5	0.6	3.0	0.7	352.5	48.7	13152.7	111.3	3.0
	400	2263.0	1.6	4.0	1.3	623.9	89.3	19482.3	130.3	3.3
	500	1660.2 ${[0.3]}^{†}$	2.8	2.0	1.0	1392.1	228.6	25358.0	150.3	3.3
	600	2264.3 ${[0.1]}^{† †}$	2.2	2.0	1.0	857.5 ${[7.9]}^{†}$	229.6	30051.0	153.5	3.0
$†$ :	Each dagger sign indicates one instance hitting the time limit with an integer feasible solution.
$*$ :	Each asterisk sign indicates one instance hitting the time limit with no integer feasible solution.

Equations134

min

min

CVaR_{α} (c^{⊤} G (x, ξ (ω))) \leq CVaR_{α} (c^{⊤} Z), \forall c \in C,

x \in X .

Q (x, ξ (ω_{s})) =

Q (x, ξ (ω_{s})) =

min

min

CVaR_{α} (c^{⊤} \hat{G} (x, y)) \leq CVaR_{α} (c^{⊤} Z), \forall c \in C,

x \in X,

Q (x, ξ (ω_{s})) = q_{s}^{⊤} y (ω_{s}), \forall s \in S,

y (ω_{s}) \in y \in Y (x, ξ (ω_{s})) arg min {q_{s}^{⊤} y}, \forall s \in S,

\displaystyle\min\quad\bigg{\{}\eta_{l}+\dfrac{1}{1-\alpha}\sum_{s\in S}p_{s}w_{sl}~{}:~{}

\displaystyle\min\quad\bigg{\{}\eta_{l}+\dfrac{1}{1-\alpha}\sum_{s\in S}p_{s}w_{sl}~{}:~{}

\displaystyle\forall~{}s\in S,~{}w_{sl}\geq 0\quad\forall~{}s\in S,\ \eta_{l}\in\mathbbm{R}\bigg{\}}.

min

min

η_{l} + \frac{1}{1 - α} s \in S \sum p_{s} w_{s l} \leq CVaR_{α} (\hat{c}_{(l)}^{⊤} Z), \forall l = 1, \dots, \overset{ˉ}{L},

w_{s l} \geq \hat{c}_{(l)}^{⊤} \hat{g}_{s} (x, y_{s}) - η_{l}, \forall s \in S, l = 1, \dots, \overset{ˉ}{L},

w_{s l} \geq 0, \forall s \in S, l = 1, \dots, \overset{ˉ}{L},

x \in X, η \in \mathbbm R^{\overset{ˉ}{L}},

T_{s} x + W_{s} y_{s} \geq h_{s}, \forall s \in S,

y_{s} \in \mathbbm R_{+}^{n_{2}}, \forall s \in S,

CVaR_{α} (\hat{c}_{(l)}^{⊤} G (x, ξ (ω))) \leq CVaR_{α} (\hat{c}_{(l)}^{⊤} Z), \forall l = 1, \dots, \overset{ˉ}{L} .

CVaR_{α} (\hat{c}_{(l)}^{⊤} G (x, ξ (ω))) \leq CVaR_{α} (\hat{c}_{(l)}^{⊤} Z), \forall l = 1, \dots, \overset{ˉ}{L} .

min

min

\eqref D E F F or m - c v a r - co n s t 1, \eqref D E F F or m - c v a r - co n s t 3 - \eqref D E F F or m - c v a r - co n s t 4,

\tilde{Q} (x, η, w, ξ (ω_{s})) = min

\tilde{Q} (x, η, w, ξ (ω_{s})) = min

w_{s l} \geq \hat{c}_{(l)}^{⊤} \hat{g}_{s} (x, y) - η_{l}, \forall l = 1, \dots, \overset{ˉ}{L},

T_{s} x + W_{s} y \geq h_{s}, y \in \mathbbm R_{+}^{n_{2}} .

c \in C max

c \in C max

(MP^{L}) min

(MP^{L}) min

(x, η, w, θ) \in O,

(x, η, w) \in F,

η_{l} + \frac{1}{1 - α} s \in S \sum p_{s} w_{s l} \leq CVaR_{α} (\tilde{c}_{(l)}^{⊤} Z), \forall l = 1, \dots, L,

\eqref D E F F or m - c v a r - co n s t 3 - \eqref D E F F or m - c v a r - co n s t 4 (with \overset{ˉ}{L} is replaced by L) .

(PS_{s}^{L}) \overset{ˉ}{Q}_{s}^{L} := min

(PS_{s}^{L}) \overset{ˉ}{Q}_{s}^{L} := min

- \tilde{c}_{(l)}^{⊤} \tilde{g}_{s} y \geq \tilde{c}_{(l)}^{⊤} \overset{ˉ}{g}_{s} \overset{ˉ}{x} - \overset{η}{ˉ}_{l} - \overset{w}{ˉ}_{s l}, \forall l = 1, \dots, L,

W_{s} y \geq h_{s} - T_{s} \overset{ˉ}{x},

y \in \mathbbm R_{+}^{n_{2}},

0 \geq γ_{s}^{⊤} (h_{s} - T_{s} x) + l = 1 \sum L β_{s l}^{⊤} (\tilde{c}_{(l)}^{⊤} \overset{ˉ}{g}_{s} x - η_{l} - w_{s l}) .

0 \geq γ_{s}^{⊤} (h_{s} - T_{s} x) + l = 1 \sum L β_{s l}^{⊤} (\tilde{c}_{(l)}^{⊤} \overset{ˉ}{g}_{s} x - η_{l} - w_{s l}) .

θ_{s} \geq γ_{s}^{⊤} (h_{s} - T_{s} x) + l = 1 \sum L β_{s l}^{⊤} (\tilde{c}_{(l)}^{⊤} \overset{ˉ}{g}_{s} x - η_{l} - w_{s l}) .

θ_{s} \geq γ_{s}^{⊤} (h_{s} - T_{s} x) + l = 1 \sum L β_{s l}^{⊤} (\tilde{c}_{(l)}^{⊤} \overset{ˉ}{g}_{s} x - η_{l} - w_{s l}) .

\mathbbm E ([c^{⊤} X - η]_{+}) \leq \mathbbm E ([c^{⊤} Z - η]_{+}), \forall η \in \mathbbm R, c \in C .

\mathbbm E ([c^{⊤} X - η]_{+}) \leq \mathbbm E ([c^{⊤} Z - η]_{+}), \forall η \in \mathbbm R, c \in C .

\mathbbm E ([c^{⊤} X - c^{⊤} z_{t}]_{+}) \leq \mathbbm E ([c^{⊤} Z - c^{⊤} z_{t}]_{+}), \forall t \in T, c \in C .

\mathbbm E ([c^{⊤} X - c^{⊤} z_{t}]_{+}) \leq \mathbbm E ([c^{⊤} Z - c^{⊤} z_{t}]_{+}), \forall t \in T, c \in C .

min

min

s \in S \sum p_{s} w_{s l t} \leq i \in T \sum \tilde{p}_{i} [\hat{c}_{(l)}^{⊤} z_{i} - \hat{c}_{(l)}^{⊤} z_{t}]_{+}, \forall t \in T, l = 1, \dots, \overset{ˉ}{L},

w_{s l t} \geq \hat{c}_{(l)}^{⊤} \hat{g}_{s} (x, y_{s}) - \hat{c}_{(l)}^{⊤} z_{t}, \forall s \in S, t \in T, l = 1, \dots, \overset{ˉ}{L},

w_{s l t} \geq 0, \forall s \in S, t \in T, l = 1, \dots, \overset{ˉ}{L},

x \in X, \eqref e q : d e f - co u pl e - \eqref e q : d e f - co u pl e 2 .

c \in C min

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Two-stage Stochastic Programming under Multivariate Risk Constraints with an Application to Humanitarian Relief Network Design

@addressi

@addressii

@addressiii

@addressiv

@addressv

Abstract

In this study, we consider two classes of multicriteria two-stage stochastic programs in finite probability spaces with multivariate risk constraints. The first-stage problem features a multivariate stochastic benchmarking constraint based on a vector-valued random variable representing multiple and possibly conflicting stochastic performance measures associated with the second-stage decisions. In particular, the aim is to ensure that the associated random outcome vector of interest is preferable to a specified benchmark with respect to the multivariate polyhedral conditional value-at-risk (CVaR) or a multivariate stochastic order relation. In this case, the classical decomposition methods cannot be used directly due to the complicating multivariate stochastic benchmarking constraints. We propose an exact unified decomposition framework for solving these two classes of optimization problems and show its finite convergence. We apply the proposed approach to a stochastic network design problem in a pre-disaster humanitarian logistics context and conduct a computational study concerning the threat of hurricanes in the Southeastern part of the United States. Our numerical results on these large-scale problems show that our proposed algorithm is computationally scalable.

January 10, 2017

1 Introduction

Two-stage stochastic programming is a vibrant research area, which provides a natural and widely applicable modeling framework for decision making problems under uncertainty in a large variety of fields. Such models are well-suited for the situations where decisions are made in two stages; the first-stage decisions are made before the uncertainty is revealed, and the second-stage (recourse) decisions are made given the predetermined first-stage decisions and the observed realization of the random parameters. For many practical decision making problems, it is essential to evaluate the decisions according to multiple and possibly conflicting stochastic criteria. Along these lines, we focus on two-stage stochastic programming models involving first-stage decisions leading to uncertain outcomes that can be evaluated according to multiple stochastic performance measures associated with the corresponding second-stage decisions.

In multicriteria stochastic optimization, it is common to employ a weighted-sum approach, which aggregates the specified multiple stochastic objective criteria to obtain a single objective function. Following this mainstream approach, we formulate the second-stage problem as a multiobjective optimization problem and focus on its expected optimal objective value in the first stage. In addition, one may also prefer to impose constraints on some performance measures associated with the second-stage decisions. To this end, we employ stochastic benchmarking preference relations between the vector-valued random outcomes of the second-stage decisions. In particular, in the first class of problems we study, the first-stage problem enforces the multivariate polyhedral CVaR constraint introduced by Noyan and Rudolf (2013). This approach considers a family of linear scalarization functions (a polyhedral set of scalarization weight vectors) and requires all scalarized versions of the decision-based random vector of outcomes to be preferable to the corresponding scalarizations of an existing reference (benchmark) outcome according to the univariate CVaR relation. Furthermore, allowing arbitrary polyhedra as scalarization sets also provides a good balance between flexibility and computational tractability. To the best of our knowledge, our study is a first in developing such a risk-averse two-stage model featuring multivariate risk measure-based constraints. We also note that our proposed modeling framework is not limited to the risk-neutral objective function; it can be easily extended to the case where the objective function features also a risk measure representing the decision makers’ risk preferences regarding the random recourse cost. For acceptable risk-averse objectives, which lead to computationally tractable two-stage stochastic programming models, we refer the reader to Ahmed (2006) and Dupǎcová (2008).

Optimization problems with the multivariate stochastic benchmarking constraints have been receiving increasing attention in the literature. Using multivariate stochastic relations is essential to capture the correlation between the multiple random outcomes. The majority of the existing studies extend a univariate stochastic preference relation, based on second-order stochastic dominance (SSD) or a coherent risk measure such as CVaR, to the multivariate case by considering a family of scalarization functions. While multivariate SSD-constrained problems are more frequently studied (see, e.g., Dentcheva and Ruszczyński, 2009; Homem-de-Mello and Mehrotra, 2009; Hu et al., 2012; Dentcheva and Wolfhagen, 2015; 2016), there is an increasing interest in risk measure-constrained variants (see, e.g., Noyan and Rudolf, 2013; Liu et al., 2015), which provide natural relaxations to overly demanding and conservative SSD-based models. More recently, studies focusing on both multivariate SSD- and risk measure-constrained models also appear in literature (Küçükyavuz and Noyan, 2016; Noyan and Rudolf, 2016). In these existing studies, the scalarization functions are almost exclusively of the linear type with the exception of Noyan and Rudolf (2016) who extend the multivariate risk-constrained models by incorporating a general class of scalarization functions (including a variety of non-linear scalarization functions). Moreover, we point out that the scalarization-based line of research has been dedicated almost entirely to single-stage (static) decision-making problems. The only exception is the work of Dentcheva and Wolfhagen (2016), which introduces a two-stage stochastic optimization problem involving a multivariate SSD (more precisely, referred to as the increasing convex ordering in the context of cost minimization) constraint on a random outcome vector of the second-stage decisions with respect to the unit simplex of scalarization vectors. The authors present two approximate decomposition-based solution algorithms which rely on the Lagrangian relaxation of the multivariate stochastic ordering constraint and show their finite convergence to an $\varepsilon$ -feasible $\varepsilon$ -optimal solution, even if the probability space is not finite; their results and algorithms are adaptations of those presented in their earlier study (Dentcheva and Wolfhagen, 2015) for the single-stage case. We contribute to this line of research by introducing an alternative two-stage model with multivariate CVaR constraints. In addition, we consider the SSD-based counterpart (as in Dentcheva and Wolfhagen, 2015) and provide a new computationally tractable and exact solution algorithm for this problem class. We defer a detailed discussion on the advantages of our solution method compared to the existing approaches for the multivariate SSD-constrained two-stage models to Section 3.

Although, it is not directly related to the multicriteria stochastic optimization problems of our interest, we note that there are a few studies on risk-averse two-stage models with univariate (first- or second-order) stochastic dominance and CVaR-based constraints (see, e.g., Fábián, 2008; Gollmer et al., 2011; Dentcheva and Martinez, 2012). However, they employ the risk constraints to compare scalar-based random variables. Our study extends such risk-averse modeling approaches to the multicriteria case, allowing us to consider additional stochastic criteria associated with the second-stage decisions other than the random recourse cost.

The proposed modeling approach with the multivariate risk constraints is a fairly recent research area, and it has promise to be applied in a wide variety of fields. This approach is particularly well-suited for the field of humanitarian logistics, since incorporating risk is crucial for rarely occurring disaster events (e.g., Noyan, 2012), and considering multiple conflicting performance criteria (such as efficiency and equity) is often essential for the effectiveness of the relief response systems (e.g., Vitoriano et al., 2011; Huang et al., 2012; Gutjahr and Nolz, 2016). Motivated by the significance of the long-term pre-disaster planning, we apply the proposed framework to a stochastic pre-disaster relief network design problem. Hence, our study also contributes to the humanitarian relief literature by introducing a new risk-averse two-stage optimization model, which provides a flexible and tractable way of considering decision makers’ risk preferences based on multiple stochastic criteria.

Next we summarize our contributions and give an outline of our paper. In Section 2, we describe a novel two-stage stochastic program with multivariate CVaR constraints. It is well-known that risk-neutral two-stage stochastic programs are generally hard to solve due to a potentially large number of scenario-dependent recourse decisions. Thus, introducing a multivariate stochastic benchmarking relation, which enforces a collection of risk constraints associated with a scalarization set, further complicates the solution of these optimization models. A common approach in solving such large-scale stochastic optimization models is to employ a Benders-type scenario decomposition approach (e.g., Ruszczyński and Shapiro, 2003). The classical decomposition methods cannot be directly applied to our model due to the complicating risk constraints. Utilizing successive relaxations of the multivariate polyhedral CVaR relation, we develop two types of exact and finitely convergent delayed cut generation solution algorithms; the iteratively generated cuts – common in both algorithms – are associated with the scalarization vectors for which the risk constraints are relaxed. In addition, the second algorithm adapts a scenario decomposition approach that exploits the decomposable structure of CVaR and the second-stage problems; this further decomposition proves to be useful in solving larger problem instances. We also develop strong duality results and optimality conditions under certain linearity assumptions. Despite our focus on the CVaR-based models, in Section 3, we show that our proposed solution methods also apply to the multivariate SSD-constrained two-stage models. In Section 4, we apply the proposed modeling approach to a stochastic pre-disaster relief network design problem and present the corresponding mathematical programming formulations. Section 5 is dedicated to the computational study.

2 Two-Stage Optimization with Multivariate CVaR

Constraints

In this section, we introduce the multicriteria stochastic decision-making framework of interest, and present the proposed multivariate CVaR-constrained two-stage stochastic programming model and the corresponding delayed cut generation solution algorithms.

We consider a finite probability space $(\varOmega,2^{\varOmega},\Pi)$ with $\varOmega=\{\omega_{1},\ldots,\omega_{m}\}$ and $\Pi(\omega_{s})=p_{s}$ . Let $S=\{1,\ldots,m\}$ be the index set of the elementary events (also referred to as scenarios). We assume that a benchmark random outcome vector and a polyhedron of scalarization vectors, each component of which represents the relative importance of each decision criterion, are given (specified by decision makers). Let $\mathbf{Z}$ be the benchmark vector with realizations $\mathbf{z}_{1},\ldots,\mathbf{z}_{|T|}$ and associated probabilities $\tilde{p}_{i},~{}i\in T$ ; it is often constructed from some benchmark decision in which case we have $T=S$ and $\tilde{p}_{i}=p_{i},~{}i\in S$ . The scalarization set $\mathcal{C}$ is naturally assumed to be a subset of the unit simplex without loss of generality. In our setting, smaller values of random variables and risk measures are considered to be preferable. We now introduce the class of two-stage stochastic programming problems with multivariate CVaR constraints, where the general form of the first-stage problem is given by

[TABLE]

Here, $f(\mathbf{x})$ is a convex objective function, $\mathcal{X}\subset\mathbbm{R}_{+}^{\bar{n}_{1}}\times\mathbb{Z}_{+}^{n_{1}-\bar{n}_{1}}$ is a non-empty convex set (such as a polyhedron) of the first-stage decision variables, defined by the deterministic constraints of the problem. $\bm{\xi}(\omega)\subset\mathbbm{R}^{r}$ and $Q(\mathbf{x},\bm{\xi}(\omega_{s}))$ designate the vector of the random input parameters of the second-stage problem and the optimal second-stage objective value under scenario $s$ , respectively. More specifically, introducing the notation $\bm{\xi}(\omega_{s})=(\mathbf{q}(\omega_{s}),T(\omega_{s}),W(\omega_{s}),\\ \mathbf{h}(\omega_{s}))=(\mathbf{q}_{s},T_{s},W_{s},\mathbf{h}_{s})$ ,

[TABLE]

where $\mathcal{Y}(\mathbf{x},\bm{\xi}(\omega_{s}))=\{\mathbf{y}\in\mathbbm{R}_{+}^{n_{2}}:T_{s}\mathbf{x}+W_{s}\mathbf{y}\geq\mathbf{h}_{s}\}$ . The second-stage objective function $\mathbf{q}_{s}^{\top}\mathbf{y}$ could be a weighted sum of multiple objectives as in Dentcheva and Wolfhagen (2016), or a single objective, such as cost, that is compatible with the first-stage objective. In addition, for some mapping $\mathbf{G}:\mathcal{X}\times\mathbbm{R}^{r}\rightarrow\mathbbm{R}^{d}$ , $\mathbf{G}(\mathbf{x},\bm{\xi}(\omega))$ is the decision-based random vector featured in (1b). In particular, $\mathbf{G}(\mathbf{x},\bm{\xi}(\omega_{s}))$ represents the multiple random performance measures of interest associated with the optimal second-stage decisions $\mathbf{y}(\omega_{s})\in\operatorname*{arg\,min}\limits_{\mathbf{y}\in\mathcal{Y}(\mathbf{x},\bm{\xi}(\omega_{s}))}\{\mathbf{q}^{\top}_{s}\mathbf{y}\},~{}s\in S,$ for given $\mathbf{x}$ and $\bm{\xi}(\omega_{s})$ . To highlight this correspondence we introduce the random vector $\hat{\mathbf{G}}(\mathbf{x},\mathbf{y})$ given by $\left[\hat{\mathbf{G}}(\mathbf{x},\mathbf{y})\right](\omega)=\hat{\mathbf{G}}(\mathbf{x},\mathbf{y}(\omega),\omega)=\mathbf{G}(\mathbf{x},\bm{\xi}(\omega))$ . Finally, we let $\hat{\mathbf{G}}(\mathbf{x},\mathbf{y}(\omega_{s}),\omega_{s})=\hat{\mathbf{g}}_{s}(\mathbf{x},\mathbf{y}_{s})=(\mathbf{g}_{s}^{1}(\mathbf{x},\mathbf{y}_{s}),\mathbf{g}_{s}^{2}(\mathbf{x},\mathbf{y}_{s}),\ldots,\mathbf{g}_{s}^{d}(\mathbf{x},\mathbf{y}_{s}))^{\top},~{}s\in S$ , and assume that $\mathbf{g}_{s}^{i}(\mathbf{x},\mathbf{y}_{s})$ is an affine function of $\mathbf{x}$ and $\mathbf{y}_{s}$ such that $\mathbf{g}_{s}^{i}(\mathbf{x},\mathbf{y}_{s})=\mathbf{\bar{g}}_{s}^{i}\mathbf{x}+\mathbf{\tilde{g}}_{s}^{i}\mathbf{y}_{s}$ for all $i\in\{1,\ldots,d\}$ and $s\in S$ . Throughout the paper, we assume that problem (1), if feasible, has a finite objective value, i.e., $f(\mathbf{x}),\mathbbm{E}(Q(\mathbf{x},\bm{\xi}(\omega)))>-\infty$ for all $\mathbf{x}\in\mathcal{X}.$

The multivariate risk constraint (1b) ensures that the decision-based random outcome vector $\mathbf{G}(\mathbf{x},\bm{\xi}(\omega))$ is preferable to the specified benchmark $\mathbf{Z}$ according to the multivariate CVaR relation introduced by Noyan and Rudolf (2013). According to this relation, all scalarized versions of $\mathbf{G}(\mathbf{x},\bm{\xi}(\omega))$ are preferable to the corresponding scalarizations of the benchmark outcome according to the univariate CVaR-based preference relation. We note that using a scalarization-based risk constraint such as (1b) is useful to address ambiguities and inconsistencies in the weight vectors; for detailed discussions we refer to Hu and Mehrotra (2012) and Liu et al. (2015). Moreover, CVaR is a widely applied popular risk measure with several desirable properties; it is a law invariant coherent risk measure (Artzner et al., 1999), serves as a fundamental building block for other coherent risk measures, and can be used to capture a wide range of risk preferences, including risk-neutral and worst-case approaches. In addition, Noyan and Rudolf (2013) highlight that CVaR-based relations provide flexible, meaningful, and computationally tractable relaxations for overly conservative SSD relations. In a broad sense, CVaR can be viewed as a weighted sum of the least favorable outcomes (those exceeding the $\alpha$ -quantile – also known as value-at-risk at confidence level $\alpha$ ).

Using the above notation, the proposed risk-averse two-stage stochastic programming model can alternatively be formulated as follows:

[TABLE]

where for a given $\hat{\mathbf{c}}_{(l)}\in\mathcal{C}$ , $\operatorname{CVaR}_{\alpha}(\hat{\mathbf{c}}_{(l)}^{\top}\hat{\mathbf{G}}(\mathbf{x},\mathbf{y}))$ is calculated as (see Rockafellar and Uryasev, 2000)

[TABLE]

For the risk-neutral version of the proposed two-stage optimization model (1)-(2), which does not feature the risk constraints (1b), it is well-known that the corresponding deterministic equivalent formulation (DEF) can simply be obtained from (3) by dropping the optimality condition on the second-stage decisions, i.e., by replacing (3e) with $\mathbf{y}(\omega_{s})\in\mathcal{Y}(\mathbf{x},\bm{\xi}(\omega_{s})),~{}\forall s\in S$ . DEF provides both the optimal first-stage and second-stage decisions without any issue of the so-called second-stage sub-optimality; this result relies on the interchangeability of the order of expectation and minimization operators associated with the second-stage objective function, and it is known as interchangeability principle (see, e.g., Ruszczyński and Shapiro, 2003). Such a result also exists for the risk-averse two-stage stochastic programming models, where the objective function features a non-decreasing functional of the recourse cost (Takriti and Ahmed, 2004). The next proposition shows that we can similarly drop the optimality condition in (3e) to obtain the DEF of our model even if it features a set of risk constraints.

Proposition 2.1

The following deterministic formulation is equivalent to the two-stage model given by (1)-(2) (and consequently, equivalent to (3)):

[TABLE]

where $\bar{L}$ is a finite integer and $\hat{\mathbf{c}}_{(l)},~{}l=1,\dots,\bar{L},$ are the projections of the vertices of the polyhedron $P=\{(\mathbf{c},\eta,\mathbf{w})\in\mathcal{C}\times\mathbbm{R}\times\mathbbm{R}^{|T|}_{+}:w_{i}\geq\eta-\mathbf{c}^{\top}\mathbf{z}_{i},\ i\in T\}$ .

Proof.

First, observe that it is sufficient to consider a finite number of scalarization vectors from set $\mathcal{C}$ , specifically the projections of the extreme points (given by $\hat{\mathbf{c}}_{(1)},\dots,\hat{\mathbf{c}}_{(\bar{L})}$ ) of the above defined polyhedron $P$ Noyan and Rudolf (2013), since $P$ is only characterized by the given fixed (decision-independent) benchmark outcome vector. Hence, constraints (1b) can be replaced by

[TABLE]

Second, the univariate $\operatorname{CVaR}$ -relations in (6) can be represented by linear inequalities of type (5b)-(5d), ignoring the issue of optimality of the $\mathbf{y}_{s},~{}s\in S,$ decisions for now. This observation is a simple consequence of the LP representation of CVaR in (2), and the fact that $\operatorname{CVaR}_{\alpha}(\mathbf{c}^{\top}\mathbf{Z})$ is a known constant given a vector $\mathbf{c}$ and the benchmark $\mathbf{Z}$ . It is also easy to see that CVaR constraints (5c) can be decomposed over scenarios. This allows us to consider some modified versions of the first- and second-stage problems given in (1)-(2); in particular, $w_{sl}$ and $\eta_{l}$ , for $s\in S,l=1,\dots,\bar{L}$ are considered as first-stage decision variables and (1b) is replaced by (5b) and (5d), whereas the inequalities associated with $s\in S$ in (5c) are added to the constraints of the second-stage problem in (2). Solving (2) ensures that $\mathbf{y}_{s}$ featured in (5c) satisfies the optimality condition in (3e). As a result, we obtain the following model which is equivalent to (1)-(2):

[TABLE]

Thus, we can reformulate the proposed risk-averse model (1)-(2) as a risk-neutral one (as seen from (7)-(8)). Then, by the well-known interchangeability principle of risk-neutral two-stage models, the assertion directly follows.

∎

Remark 2.1

Mean-Risk Objective Function. The proposed model can be easily extended to the case where the random objective outcomes are compared according to a mean-risk functional representing the decision makers’ risk preferences. In particular, $\mathbbm{E}(Q(\mathbf{x},\bm{\xi}(\omega))$ can be replaced by $\mathbbm{E}(Q(\mathbf{x},\bm{\xi}(\omega))+\lambda\rho(Q(\mathbf{x},\bm{\xi}(\omega))$ in (1), where $\rho$ is a risk functional such as $\operatorname{CVaR}_{\alpha}$ and $\lambda$ is a non-negative trade-off coefficient representing the exchange rate of mean cost for risk. When $\rho$ is a non-decreasing function and $\lambda\geq 0$ , the mean-risk function preserves the convexity, and consequently, slight variants of our cutting plane solution algorithms, which we will describe in the next section, are applicable (for similar developments see, e.g., Ahmed, 2006; Noyan, 2012). Thus, our proposed modeling and solution framework is not limited to the risk-neutral objective function.

In the appendix we provide strong duality results and optimality conditions for an important special case, which generalize the previous results for single-stage multivariate CVaR-constrained problems (Noyan and Rudolf, 2013). Related results for two-stage multivariate SSD-constrained optimization are given in Dentcheva and Wolfhagen (2016). Next we present two types of algorithms to solve the computationally challenging two-stage stochastic programming model (3).

2.1 Delayed Cut Generation for Deterministic

Equivalent Formulation

The deterministic equivalent formulation (5) features finite but potentially an exponential number of scalarization vectors, obtained as projections of the vertices of the polyhedron $P$ . Hence, it is natural to develop a delayed cut generation algorithm, which avoids the impractical vertex enumeration approach required for explicitly constructing the risk constraints in advance. The proposed algorithms rely on solving successive relaxations of the multivariate polyhedral CVaR relation, and they iteratively generate cuts associated with the scalarization vectors for which the risk constraints are relaxed.

In our first cutting plane approach, at an intermediate iteration, the relaxed master problem (RMP) includes the constraint (3b) for a subset of $\mathcal{C}$ , say $\{\tilde{\mathbf{c}}_{(1)},\dots,\tilde{\mathbf{c}}_{(L)}\}$ . Accordingly, RMP takes the form of (5), where $\bar{L}$ is replaced by $L$ . Given an optimal solution to the RMP $(\bar{\mathbf{x}},\bar{\bm{\eta}},\bar{\mathbf{w}})$ , the following separation problem (SP) is solved to identify if there is any violated inequality (3b):

[TABLE]

Noyan and Rudolf (2013), Küçükyavuz and Noyan (2016), and Liu et al. (2015) provide alternative mixed-integer programming (MIP) formulations for the cut generation problem (9). Because we can treat the DEF as a large-scale single-stage multivariate CVaR-constrained problem, the convergence of the resulting delayed cut generation algorithm follows from Noyan and Rudolf (2013).

Remark 2.2

The cut generation formulations in Noyan and Rudolf (2013), Küçükyavuz and Noyan (2016) and Liu et al. (2015) are based on the opposite convention that larger values of random variables, as well as larger values of risk measures, are considered to be preferable. Those existing MIP formulations should be altered to reflect this difference in convention.

2.2 Delayed Cut Generation with Scenario Decomposition

The deterministic equivalent formulation (5) contains a vector of $\mathbf{y}_{s}$ variables for each scenario $s\in S$ and its size grows very fast as the number of scenarios increases. Because the number of scenarios is typically large, it is generally impractical to solve the DEF directly, even without the multivariate stochastic benchmarking constraints. The L-shaped method (Van Slyke and Wets, 1969), which is a Benders-type scenario decomposition algorithm, is arguably the most commonly used computationally viable method to solve the classical (risk-neutral) two-stage stochastic programs. We propose a non-trivial extension of this approach to solve our multivariate CVaR-constrained two-stage model.

In the DEF (5), constraints (5b)–(5c), which capture the CVaR relation (1b) along with the associated auxiliary variables $\bm{\eta}$ and $\mathbf{w}$ , create further coupling of the scenarios in addition to the original coupling constraint (5f). Considering this structure, we handle the $\mathbf{x}$ , $\bm{\eta}$ , and $\mathbf{w}$ variables in the first stage, decompose the second-stage problems over scenarios once the first-stage variables are fixed, and solve iteratively the RMP involving only the first-stage decision variables and auxiliary decision variables ( $\theta_{s},~{}s\in S$ ) for approximating the optimal second-stage objective function values. We obtain the following RMP at an intermediate iteration, where a subset of the scalarization vectors of cardinality $L$ is generated so far:

[TABLE]

Here $\mathcal{O}$ is a polyhedral set defined by the constraints (referred to as the optimality cuts) that give valid lower bounding approximations of $\bm{\theta}$ , and $\mathcal{F}$ is a polyhedral set defined by the constraints (referred to as the feasibility cuts) that represent the conditions for the first-stage decision vectors to yield feasible second-stage problems. In what follows, we give the explicit forms of the inequalities in the sets $\mathcal{O}$ and $\mathcal{F}$ .

We start the algorithm with a small subset (could be empty) of the scalarization set $\mathcal{C}$ . At each iteration, given the subset $\{\tilde{\mathbf{c}}_{(1)},\dots,\tilde{\mathbf{c}}_{(L)}\}$ , we first solve the relaxed master problem $(\operatorname{MP}^{L})$ to obtain an optimal solution $\bar{\mathbf{v}}:=(\bar{\mathbf{x}},\bar{\bm{\eta}},\bar{\mathbf{w}},\bar{\bm{\theta}})$ . Using this solution of the RMP, we solve the second-stage subproblem (PS ${}_{s}^{L}$ ) for each scenario $s\in S$ given by

[TABLE]

where constraint (11b) is equivalent to constraint (5c) after partial substitution of the values of the first-stage variables with $\bar{\mathbf{x}},\bar{\bm{\eta}}$ and $\bar{\mathbf{w}}$ . Let $\beta_{sl},~{}l=1,\ldots,L,$ and $\bm{\gamma}_{s}$ for $s\in S$ , be the dual variables associated with constraints (11b) and (11c), respectively. If the primal subproblem (PS ${}_{s}^{L}$ ) is infeasible for some $s\in S$ , then let $(\bm{\gamma}_{s},\bm{\beta}_{s})$ provide an extreme ray (direction of unboundedness) of the corresponding dual feasible region and add the following Benders feasibility cut to the set $\mathcal{F}$ :

[TABLE]

If the primal subproblem is feasible and the current estimate of the optimal second-stage objective value ( $\bar{\theta}_{s}$ ) is less than the actual optimal second-stage objective value ( $\bar{Q}_{s}^{L}$ ), then let $(\bm{\gamma}_{s},\bm{\beta}_{s})$ provide an optimal dual extreme point and add the following Benders optimality cut to the set $\mathcal{O}$ :

[TABLE]

Now suppose that all primal subproblems are feasible with the finite objective values $\bar{Q}_{s}^{L},s\in S$ (recall our assumption that the original problem is not unbounded). In this case, given $(\bar{\mathbf{x}},\bar{\mathbf{y}})$ , we solve the separation problem (9) to obtain an optimal solution $\mathbf{c}^{*}$ . If the corresponding optimal objective value is non-positive, then there is no violation in the CVaR constraint (3b). Furthermore, recall that in this case, all second-stage problems are feasible as well. In other words, the current first-stage solution is feasible for the original problem, so we update the upper bound on the optimal objective value (denoted by $u$ ). On the other hand, if the optimal objective value of the SP is positive, then $\mathbf{c}^{*}\in\mathcal{C}$ creating the most violation in constraint (11b) is added as the $L+1$ th scalarization vector both to the RMP and the second-stage subproblems with $L\leftarrow L+1$ . We repeat these iterations until either an infeasible RMP is detected, or the RMP objective function value is within the given tolerance $\epsilon$ of the upper bound $u$ . The optimality tolerance is specified as a very small positive number such as $10^{-10}$ . The pseudo-code of the proposed decomposition algorithm is provided in Algorithm 1. Note that the feasible regions of the dual subproblems change depending on the scalarization vectors included in the formulations at that iteration. In addition, the RMP also grows both in terms of the number of variables and constraints as we add more scalarization vectors to its formulation due to the addition of the $\eta_{l}$ and $\mathbf{w}_{l}$ variables for each scalarization vector $\tilde{\mathbf{c}}_{(l)}$ . As a result, our proposed algorithm can be seen as a delayed column and cut generation algorithm.

Proposition 2.2

Suppose that the relaxed master problem $(\operatorname{MP}^{L})$ is given by (10), the second-stage subproblems (PS ${}_{s}^{L}$ ) are given by (11), the separation problem is given by (9), the feasibility and optimality cuts, respectively, are given by (12) and (13), and the first-stage solution vector is given by $\bar{\mathbf{v}}=(\bar{\mathbf{x}},\bar{\bm{\eta}},\bar{\mathbf{w}},\bar{\bm{\theta}})$ . Then, Algorithm 1 provides either an optimal solution to problem (1) or a proof of infeasibility in finitely many iterations.

Proof.

First, observe that the optimality cuts (13) generated from subproblems (PS ${}_{s}^{L}$ ) for $s\in S$ and the subset $\{\tilde{\mathbf{c}}_{(1)},\dots,\tilde{\mathbf{c}}_{(L)}\}$ of $\mathcal{C}$ are valid, because subproblems (PS ${}_{s}^{L}$ ) are relaxations of the original subproblems (8) given by (PS ${}_{s}^{\bar{L}}$ ), and hence the duals of the relaxed subproblems provide valid lower bounds on the optimal second-stage objective values. Similarly, the feasibility cuts (12) obtained from the relaxed subproblems are valid for the original subproblems (8). At an intermediate iteration, if there is a violation in the CVaR constraint (3b), then the exact solution of the SP guarantees that a corresponding violating scalarization vector is identified and the associated inequalities and variables are added to the RMP. Furthermore, Noyan and Rudolf (2013) provide a procedure to guarantee that the $\mathbf{c}^{*}$ vector generated from the SP is a vertex solution as defined in Proposition 2.1. Next, recall that the number of scalarization vectors of interest, $\bar{L}$ , is finite. Hence, in the worst case, in a finite number of iterations, all $\bar{L}$ scalarization vectors will be added, and the second-stage problems become exact. At this point, the convergence of the algorithm follows directly from the convergence of the classical L-shaped method Van Slyke and Wets (1969). ∎

3 Two-Stage Optimization with a Multivariate Stochastic

Ordering

The solution algorithms proposed in Section 2 can also be applied to the two-stage stochastic programs with the multivariate preference constraints based on increasing convex order (ICO). The counterpart of ICO in the opposite convention, where larger values of random variables are preferred, is referred to as the SSD. Similar to the CVaR case, the univariate ICO relation can be extended to the multivariate case by considering a family of linear scalarization functions; more specifically, a random vector $\mathbf{X}$ is said to dominate another random vector $\mathbf{Z}$ with respect to the ICO and the scalarization set $\mathcal{C}$ if

[TABLE]

The risk-averse two-stage optimization model of Dentcheva and Wolfhagen (2016) utilizes this multivariate stochastic preference relation to compare the decision-based random outcome vector $\hat{\mathbf{G}}(\mathbf{x},\mathbf{y})$ with the benchmark outcome vector $\mathbf{Z}$ . In particular, the two-stage stochastic programming model with the multivariate ICO constraints takes the form of (3), where the multivariate CVaR relation (3b) is replaced by (14) with $\mathbf{X}=\hat{\mathbf{G}}(\mathbf{x},\mathbf{y})$ . Assuming again that $\mathbf{Z}$ has finitely many realizations $\mathbf{z}_{1},\ldots,\mathbf{z}_{|T|}$ with associated probabilities $\tilde{p}_{i},~{}i\in T$ , the inequality (14) can be equivalently stated as

[TABLE]

Relation (15) directly follows from (14) due to the well-known result (Dentcheva and Ruszczyński, 2003) that for finite probability spaces the univariate SSD relation remains valid if the corresponding expected shortfall constraints (in our convention, constraints (14) featuring the expected excess values for a particular $\mathbf{c}$ ) only required for the realizations of the benchmark random variable (instead of all $\eta\in\mathbbm{R}$ ).

For finite probability spaces, Homem-de-Mello and Mehrotra (2009) show that it is sufficient to consider a finite number of scalarization vectors $\{\hat{\mathbf{c}}_{(1)},\dots,\hat{\mathbf{c}}_{(\bar{L})}\}$ in relation (15), specifically the projections of the extreme points of polyhedra $P_{t},~{}t\in T$ , defined for each $t\in T$ as $P_{t}=\{w_{i}\geq\mathbf{c}^{\top}\mathbf{z}_{i}-\mathbf{c}^{\top}\mathbf{z}_{t},\ i\in T,\ \mathbf{c}\in\mathcal{C},\ \mathbf{w}\in\mathbbm{R}^{|T|}_{+}\}$ . For an alternative finite representation of (15), which is based on the vertices of the polyhedron $P$ defined in Proposition 2.1, we refer to Noyan and Rudolf (2013). The fact that these finite representations of (15) are only characterized by the given benchmark vector allows us to develop computationally tractable DEFs of the optimization models featuring the multivariate ICO relation as a benchmarking constraint. In particular, for the two-stage multivariate ICO-constrained problem we can obtain the following DEF with finitely many constraints:

[TABLE]

Although the number of inequalities (16b)–(16c) is finite, it is possibly exponential. Hence, as in the $\operatorname{CVaR}$ -constrained optimization, the ICO constraints can be added as needed using a delayed cut generation approach. As in Section 2.1, we start with solving the DEF with a subset of constraints (16b)–(16c) for $l=1,\ldots,L$ . For a given solution $\bar{\mathbf{v}}:=(\bar{\mathbf{x}},\bar{\mathbf{w}},\bar{\mathbf{y}})$ to this relaxed problem, we check if there is a violation in (15) (with $\mathbf{X}=\hat{\mathbf{G}}(\bar{\mathbf{x}},\bar{\mathbf{y}})$ ). Note that constraints (15) are defined for each scalarization vector and for each realization of $\mathbf{Z}$ , as such possible violations in constraints (15) should be checked for all $\mathbf{z}_{t},~{}t\in T$ . In line with these discussions, given $(\bar{\mathbf{x}},\bar{\mathbf{w}},\bar{\mathbf{y}})$ , the separation problem corresponding to the $t^{th}$ realization of $\mathbf{Z}$ becomes

[TABLE]

This cut generation problem can be solved using the adaptations of the MIP formulations provided in Homem-de-Mello and Mehrotra (2009), Küçükyavuz and Noyan (2016), and Noyan and Rudolf (2016) for the opposite convention that larger outcomes are preferred.

As in the $\operatorname{CVaR}$ case, the structure of the ICO constraints can be exploited to further decompose the formulation (16) over scenarios. Observe that constraints (16b) and (5f) couple the scenarios together. However, if we handle the original first-stage variables $\mathbf{x}$ and the auxiliary $\mathbf{w}$ variables in the first stage, then the second-stage problems decompose over scenarios once the first-stage variables are fixed. Therefore, we can apply Algorithm 1 with the updated definitions of the problems $(\operatorname{MP}^{L})$ , (PS ${}_{s}^{L}$ ), and (SP), the feasibility and optimality cuts, and the first-stage solution vector $\mathbf{v}$ , which we describe next.

The RMP at an intermediate iteration, where a subset of the scalarization vectors of cardinality $L$ is generated so far, is formulated as

[TABLE]

Given a first-stage solution vector $\bar{\mathbf{v}}=(\bar{\mathbf{x}},\bar{\mathbf{w}},\bar{\bm{\theta}})$ of RMP and the $L$ scalarization vectors, the second-stage subproblem under $s\in S$ takes the form of

[TABLE]

Here, constraint (19b) is equivalent to constraint (16c) with partial substitution of the values of the first-stage variables $\bar{\mathbf{x}}$ and $\bar{\mathbf{w}}$ . Denoting the dual vectors associated with constraints (19b) and (11c) by $\beta_{stl},t\in T,l=1,\dots,L,$ and $\bm{\gamma}_{s},~{}s\in S,$ the feasibility and optimality cuts are given by

[TABLE]

Proposition 3.1

Suppose that the relaxed master problem $(\operatorname{MP}^{L})$ is given by (18), the second-stage subproblems (PS ${}_{s}^{L}$ ) are given by (19), the separation problem is given by (17), the feasibility and optimality cuts, respectively, are given by (20) and (21), and the first-stage solution vector is given by $\bar{\mathbf{v}}=(\bar{\mathbf{x}},\bar{\mathbf{w}},\bar{\bm{\theta}})$ . Then, Algorithm 1 provides either an optimal solution to problem (16) or a proof of infeasibility in finitely many iterations.

Proof.

The proof is similar to that of Proposition 2.2. ∎

Finally, we highlight four major advantages of our proposed solution framework over the existing methods (Dentcheva and Wolfhagen, 2016). First, our method reformulates the risk-averse model of interest as a risk-neutral two-stage stochastic program without using the Lagrangian relaxation of the multivariate ICO constraints, and consequently, manages to solve the problem within a single Benders decomposition framework. It maintains a single master problem to solve successively redefined (based on the iteratively enlarged subset of the scalarization vectors) risk-neutral relaxations of the problem. Recall that using a subset of $\mathcal{C}$ provides a relaxation of the problem. These risk-neutral formulations are constructed by using the decomposable structure of the ICO constraints over scenarios. Dentcheva and Wolfhagen (2016) also construct successive relaxations by considering a subset of $\mathcal{C}$ ; however, they obtain approximate risk-neutral relaxations by using the Lagrangian relaxation of the multivariate ICO constraints. In particular, at each iteration of their algorithms, a risk-neutral approximation with an updated Lagrangian objective function is solved (to calculate the dual function given a set of Lagrangian multipliers) using decomposition methods such as Benders decomposition. The need for solving a separate risk-neutral two-stage model at each iteration, in addition to the well-known computational challenges in solving the non-differentiable Lagrangian dual problem, could impose significant computational difficulties. Second, the Lagrangian-based existing algorithms are not exact even for the finite probability spaces; they are shown to finitely converge to an $\varepsilon$ -feasible (with respect to the ICO constraint) $\varepsilon$ -optimal solution both for continuous and finite probability spaces. In contrast, our finitely-convergent algorithm provides an exact (optimal and feasible) solution for finite probability spaces. Third, the existing algorithms do not provide valid upper bounds on the optimal objective value at intermediate stages, because the Lagrangian dual at an iteration is constructed using the information on a subset of $\mathcal{C}$ . In contrast, Algorithm 1 provides valid upper bounds (in line 20) at intermediate stages. Fourth, our method applies to problems that contain discrete variables in the first stage. In contrast, the Lagrangian-based methods cannot guarantee the integer feasibility (see the primal solution recovery step of the algorithms in Dentcheva and Wolfhagen, 2016). In summary, we contribute to the literature by providing a new computationally tractable and exact solution algorithm for the multivariate ICO-constrained two-stage models with integer variables in the first stage under finite probability spaces.

4 A Stochastic Optimization Model for Pre-disaster Relief Network Design

An effective and sound pre-disaster relief network design calls for modeling the risk associated with the high level of uncertainty inherent in rarely occurring disaster events (see, e.g., Noyan, 2012) and considering multiple and possibly conflicting performance criteria such as responsiveness and equity (see, e.g., Vitoriano et al., 2011; Huang et al., 2012; Gutjahr and Nolz, 2016). Moreover, a two-stage decision making framework is beneficial in this context, since the pre-positioning design decisions must be made before a disaster strikes, whereas the relief distribution decisions should be made in the post-disaster stage. Motivated by the significance of developing such effective pre-disaster plans (see, e.g., Balcik and Beamon, 2008; Salmerón and Apte, 2010), we apply our proposed approach to a stochastic pre-disaster relief network design problem. In this section, we first briefly review the relevant literature, and then describe the problem setting and present the corresponding mathematical programming formulations.

There is a rich and growing literature on developing stochastic programming models for humanitarian logistics (e.g., Çelik et al., 2012; Liberatore et al., 2013). The majority of the existing studies focus on pre-disaster relief network design and develop risk-neutral two-stage stochastic programs (e.g., Balcik and Beamon, 2008; Rawls and Turnquist, 2010; Salmerón and Apte, 2010; Döyen et al., 2012). However, this extensive literature includes only a few studies (e.g., Rawls and Turnquist, 2011; Noyan, 2012; Hong et al., 2015; Elçi, 2016) that provide risk-averse stochastic models, and lacks models with multivariate risk constraints. Here, we mainly discuss the studies that are most closely related to ours. Rawls and Turnquist (2010) develop a risk-neutral two-stage stochastic programming model; in the first-stage it determines the cost-wise optimal decisions on locations and capacities of the response facilities, and the inventory levels of relief supplies under uncertainty in demand, damage levels of pre-stocked supplies, and transportation network conditions. Their second-stage problem is formulated as a classical network flow model, which involves detailed distribution decisions representing the flow of relief supplies on each arc. Noyan (2012) obtains a risk-averse version of this model by incorporating CVaR as the risk measure on the total cost in addition to its expectation. There also exist chance-constrained variants (Rawls and Turnquist, 2011; Hong et al., 2015). A recent study Elçi (2016) proposes a more elaborate risk-averse extension, which features a mean-risk objective function on the random total cost with CVaR being the risk measure (as in Noyan, 2012), and enforces a joint chance constraint on the feasibility of the second-stage problem (as in Hong et al., 2015). In contrast to the other variants, the model of Elçi (2016) relies on an alternative formulation of the second-stage problem, which focuses on assigning the demand points to the facilities instead of determining the detailed arc-flow decisions. In our study, we follow this practically meaningful assignment-based modeling approach for the second-stage problem, and develop a new risk-averse variant of the widely-cited model of Rawls and Turnquist (2010).

Problem Description and Mathematical Formulations. We extend the model of Rawls and Turnquist (2011) by incorporating the multivariate CVaR relation into the first-stage to evaluate the decisions based on additional multiple stochastic criteria. In particular, we consider the following additional stochastic measures: the maximum proportion of unsatisfied demand and a responsiveness measure based on the total delivery amount-based average travel time. With the term responsiveness we refer to the expeditiousness of the response. A large variety of criteria have been employed in humanitarian logistics and they can be grouped in three main categories (see, e.g., Huang et al., 2012; Gutjahr and Nolz, 2016): efficiency (related to cost), efficacy or effectiveness (related to providing quick and sufficient distribution) and equity (related to providing a fair service). According to this classification, our model addresses the issues about efficiency and efficacy of the relief operations using a weighted-sum based objective, which aggregates the costs of opening facilities, demand shortages, and purchasing and shipping the relief supplies. In addition, it addresses the issues about responsiveness and equity (in terms of supply allocation) via the multivariate CVaR constraints. The CVaR-based relation is preferred, since it is a natural relaxation of its SSD-counterpart. Multivariate SSD constraints are typically very demanding and they can be excessively hard to satisfy in practice.

We model the relief distribution system using a network, where $I$ and $J$ denote the sets of nodes representing the demand and candidate facility locations, respectively; we assume without loss of generality that $J\subseteq I$ . The set $L$ denotes the multiple facility types, a facility of type $l$ has a given capacity level $K_{l}$ . Setting up a facility of type $l$ at node $j$ has a fixed cost $F_{jl}$ and a unit variable acquisition cost $a_{j}$ . We consider a single commodity (as in, e.g., Hong et al., 2015), since the relief items can be supplied as bundles. The demand values, travel times, demand shortage penalty costs, and damage levels of supplies are assumed to be random and their realizations are represented by a finite set of scenarios $S$ with probabilities $p_{s},~{}s\in S$ . We next introduce notation for the realizations of these random parameters under scenario $s$ : demand at node $i$ is $d_{i}^{s}$ , travel time from node $j$ to node $i$ is $\nu_{ji}^{s}$ , unit shortage penalty cost is $\pi^{s}$ , unit shipment cost from node $j$ to node $i$ is $c_{ji}^{s}$ , proportion of undamaged supplies at node $j$ is $\gamma_{j}^{s}$ . In contrast to Rawls and Turnquist (2010), each demand location can be served only by a facility node, which satisfies a responsiveness requirement. In particular, we consider a common upper bound on travel times to construct the scenario-dependent coverage sets: $N^{s}_{i}$ and $M^{s}_{j}$ , respectively, denote the set of facility nodes that can cover demand node $i$ and the set of demand nodes that can be served by facility node $j$ under scenario $s$ . Enforcing a common threshold serves the objective of providing equitable service in terms of response times.

In our two-stage decision making framework, the first-stage decisions are represented using the following notation: $x_{jl}=1$ if a facility of size category $l\in L$ is located at node $j\in J$ , and $x_{jl}=0$ otherwise; $R_{j}$ denotes the amount of supplies pre-located at node $j\in J$ . The notation for the second-stage decisions under scenario $s\in S$ is as follows: $y^{s}_{ji}$ denotes the amount of supplies delivered to demand node $i\in I$ from node $j\in J$ ; $u^{s}_{i}$ designates the amount of the unsatisfied demand at node $i\in I$ ; $u^{s}_{\max}$ is the maximum proportion of unsatisfied demand (maximum is calculated over the demand nodes), i.e., $u^{s}_{\max}=\max_{i\in I}u^{s}_{i}/d_{i}^{s}$ . We next present our risk-averse two-stage stochastic programming model, where the first-stage problem is formulated as follows:

[TABLE]

Here, $\bm{\xi}(\omega_{s})=(\mathbf{d}^{s},\bm{\gamma}^{s},\mathbf{c}^{s},\bm{\nu}^{s},\pi^{s})$ is the vector of the random parameter realizations under scenario $s\in S$ . Given the first-stage decision vectors $\mathbf{x}$ and $\mathbf{R}$ , the optimal second-stage objective value under scenario $s$ , $Q(\mathbf{x},\mathbf{R},\bm{\xi}(\omega_{s}))$ , is calculated by solving the following second-stage problem under scenario $s$ :

[TABLE]

The objective, defined by (22a) and (23a), is to minimize the expected total cost of opening facilities, purchasing and shipping the relief supplies, and demand shortages. As discussed in Section 2, constraint (22b) ensures that the decision-based random outcome vector $\mathbf{G}(\mathbf{x},\mathbf{R},\bm{\xi}(\omega))$ is preferable to the benchmark outcome $\mathbf{Z}$ according to the multivariate polyhedral CVaR relation. Here $\mathbf{G}(\mathbf{x},\mathbf{R},\bm{\xi}(\omega_{s}))=\hat{\mathbf{g}}_{s}(\mathbf{x},\mathbf{R},\mathbf{y}^{s},\mathbf{u}^{s},u_{\max}^{s})$ is a two-dimensional vector, the components of which are given by

[TABLE]

The maximum proportion of unsatisfied demand (24) is crucial for equity in terms of supply allocation, whereas the second measure (25), the total delivery amount-based average travel time score (ATS), is related to responsiveness. The delivery-based unit average travel time to demand node $i$ could ideally be calculated by replacing $d_{i}^{s}$ with the corresponding exact delivery amount (i.e., $y_{ji}^{s}$ ) in (25); we prefer not to use this calculation to avoid nonlinear terms. Moreover, (25) relies on a scaling with an upper bound ( $\max\limits_{j\in N^{s}_{i}\setminus\{i\}}\nu_{ij}^{s}$ ) of $\operatorname{ATS}_{i}$ ; this scaling ensures that both measures take values between 0 and 1 and prevents biased solutions due to the differences in the magnitude of the outcomes. We also note that considering a set of scalarization vectors in (22b) is essential to deal with the ambiguities and inconsistencies in the weight vectors. Eliciting the relative weights of even a single decision maker/expert can be challenging (for a related review, see, e.g., Liu et al., 2015), and is further exacerbated if multiple experts are involved as in the case of humanitarian logistics applications.

We next elaborate on the remaining constraints of the proposed model. Constraints (22c) ensure that at most one facility is opened at any node $j\in J$ . Constraints (22d) guarantee that the inventory levels at open facilities do not exceed the corresponding capacity limits and there is a facility located at node $j$ if its inventory level is positive. Constraints (23b)-(23e) correspond to a restricted version of the classical network flow formulation, which provides a more structured flow considering the coverage issues. More specifically, these constraints enforce that the nodes without any pre-stocked inventory receive service directly from the open facilities that are sufficiently close (according to the upper bound on travel times). Constraints (23d) ensure that the amount delivered from a facility node does not exceed its available inventory level and outgoing flow is not possible if there is no facility located. Constraints (23e) assure that there is no delivery to a node if a facility is located at that node, and the amount of delivery is limited by its demand, otherwise. Constraints (23f) calculate the maximum proportion of demand shortage.

We develop computationally effective solution methods for the model (22)-(23) by applying our cut generation-based algorithms presented in Section 2. Observing that the original second-stage problem (23) is always feasible, it is interesting to note that our scenario decomposition-based algorithm may need to generate the Benders feasibility cuts (12). This is the case since the algorithm solves iteratively modified versions of (23) with the constraints of type (11b), which may lead to infeasibility in a second-stage subproblem.

5 Computational Study

In this section, we illustrate the computational effectiveness of the two types of algorithms proposed in Section 2 by implementing them to solve the two-stage model (22)-(23) under varying parameter settings. We first describe how the test problem instances are generated and then provide numerical results.

We used the data sets of Elçi (2016), which are based on a case study concerning the threat of hurricanes in the Southeastern part of the United States (Rawls and Turnquist, 2010); in this case study, the region is represented by a network with 30 nodes and 55 arcs, and only a single set of hurricane scenarios with a cardinality of 51 is considered. In order to generate instances with a larger number of scenarios, Hong et al. (2015) propose a scenario generation method which takes into account the dependency structures inherent in disaster relief networks. Their approach randomly identifies the link degradation and node damage levels depending on the proximity to the center of the disaster, and generates the realizations of the random input parameters accordingly. Elçi (2016) follows this elaborate scenario generation approach, but makes the necessary modifications according to their assignment-based formulation of the second-stage problem. More specifically, the author ignores 55 links of the original network and introduces the links according to the coverage-based responsiveness requirement. In our computational study, we consider the most dense (fully-connected) network structure to distribute water (in units of 1000 gallons). We refer the reader to Hong et al. (2015) and Elçi (2016) for further details on the values of the relevant parameters, such as cost, facility capacity limits, base values of the travel times and water demand, and the random distribution information, etc. Finally, we note that all scenarios are assumed to be equally likely.

We next discuss the additional issues that arise due to the different features of our new risk-averse model. The stochastic benchmarking constraints of interest additionally require us to specify a benchmark random outcome vector and a scalarization set. For illustrative purposes, we obtain a benchmark vector by identifying a reasonable and feasible benchmark first-stage solution and calculating the performance measures of interest associated with the corresponding optimal second-stage decisions. In particular, we consider the first-stage solution, where a single large-sized facility with the maximum possible inventory level (5394 units) is located at Houston, Baton Rouge, Biloxi, Atlanta and Orlando. In real life applications, a benchmark decision vector can be constructed by considering the existing practices of relief organizations. In addition, the scalarization set is assumed to have an ordered preference structure as follows: $\mathcal{C}=\{\mathbf{c}\in\mathbbm{R}_{+}^{2}:c_{1}+c_{2}=1,c_{2}\geq c_{1}\}$ , where $c_{1}$ and $c_{2}$ correspond to the relative importance of the demand satisfaction criterion (24) and responsiveness criterion (25), respectively. This particular scalarization set description can be easily modified according the preferences of the decision makers.

Computational Performance of the Solution Methods. We evaluate the computational performance of the proposed solution algorithms in terms of the run times and the number of operations performed at each iteration.

All experiments are performed on a single thread of a Windows server with Intel(R) Xeon(R) CPU E5-2630 processor at 2.40 GHz and 32 GB of RAM using Java and Cplex 12.6.0. CPLEX is invoked with a time limit of 3600 seconds. MIP gap and feasibility tolerances are set to $10^{-5}$ and $10^{-9}$ , respectively; otherwise, we use the default CPLEX settings. The results are obtained for the instances with $\alpha=0.99,0.95,0.90$ and up to 1500 scenarios. For each setting, we report the average over three instances. We implement the cutting plane algorithms using the lazy constraint callback function of CPLEX. Considering the fact that using callbacks prevents CPLEX from utilizing dynamic search, we specify the MIP search method for both algorithms as traditional branch-and-cut. Following an acceleration technique suggested in Bodur and Luedtke (2016), we initialize the RMP of the scenario decomposition-based algorithm with optimality and feasibility cuts associated with an initial first-stage solution to benefit from the possible improvement obtained from the automatically created CPLEX cuts. In particular, our initial solution locates two facilities at Mobile and Orlando with the smallest size. Additionally, we employ the user cut callback function of CPLEX for adding the Benders cuts at the root node for the fractional incumbent solutions. The usercut callback is invoked at each iteration at the root node until the improvement in the relative MIP gap is less than $10^{-6}$ for the last five consecutive iterations.

In Table 1, we report several statistics on the performance of the delayed cut generation algorithm for DEF (DCG-DEF) and its scenario decomposition-based counterpart (DCG-SD). Under the “Time” columns, the solution times for only the instances solved within the time limit are reported. For both solution algorithms, we report the total run time and the time spent solving the separation problems under columns “Total/[%gap]”, and “Sep.”, respectively. If the algorithm terminated at the time limit with a feasible solution for some replications of the instance, then the average of the relative optimality gaps are reported under the column “Total/[%gap]”, inside square brackets. The relative optimality gap reported by CPLEX corresponds to the relative gap between the objective function value for the best available integer solution ( $z^{IP}$ ) and the best LP relaxation bound at the time of termination ( $z^{LP}$ ), i.e. $100|z^{IP}-z^{LP}|/z^{IP}$ . For both algorithms, we report the average number of separation problems solved and the average number of scalarization vectors added under columns “Sep.” and “ $L$ ”, respectively. For DCG-SD, we report the average total number of cuts ((10d), (12), and (13)) added to the master problem under column “Cuts”. We do not report the number of cuts for DCG-DEF, because in this case, constraints of type (5b) and (5c) are added when a new scalarization vector is included in the RMP. Hence, the total number of cuts added to the RMP for DCG-DEF is $(|S|+1)L$ , where $|S|$ is the number of scenarios and $L$ is the number of scalarization vectors generated.

The results show that DCG-SD outperforms DCG-DEF in terms of the computation times and number of instances that can be solved to optimality. Out of 63 instances, DCG-DEF is able to find an optimal solution for only 35 instances and it terminates with a feasible solution after one hour for seven instances. On the other hand, DCG-SD provides at least a feasible solution for all instances even if it fails to prove optimality for five of them. Considering the instances that can be solved to optimality by both methods, it can be seen that DCG-SD proves optimality in a shorter amount of time than DCG-DEF.

We observe that unlike the single-stage multivariate CVaR-constrained problems in Noyan and Rudolf (2013), the separation problem is no longer the bottleneck taking over 95% of the solution time. In fact, for DCG-DEF, the time spent on separation is negligible when compared to the overall solution time, with a small number of separation problems solved. On the other hand, a large number of separation problems are solved in DCG-SD, hence it is important to use the strongest formulations available for the separation problem (i.e., those provided in Liu et al., 2015 for the equiprobable scenario case) to reduce the time spent on generating the scalarization vectors. Nevertheless, a few instances cannot be solved with DCG-SD within the time limit mainly due to the long solution times of the separation MIPs. However, for these instances, our algorithm is able to find a feasible solution during its course, and hence it provides optimality gaps at termination. To avoid long solution times, one can implement heuristic separation methods. In addition, if $\varepsilon$ -feasible solutions are satisfactory, then the violation thresholds can be updated accordingly.

We close this section by noting that the problem sizes we consider are significantly larger (in terms of first- and second-stage variables and constraints and/or the number of scenarios) when compared to the data sets in Noyan and Rudolf (2013) and Dentcheva and Wolfhagen (2015) for related optimization problems with multivariate stochastic benchmarking constraints. In addition, our first-stage problem involves discrete decisions, which results in the most difficult data sets considered for this problem class to date. In conclusion, our computational experiments showcase the scalability of the proposed scenario decomposition-based method with respect to the number of scenarios.

Acknowledgements

Simge Küçükyavuz and Merve Meraklı are supported, in part, by National Science Foundation Grant 1537317. Nilay Noyan acknowledges the support from The Scientific and Technological Research Council of Turkey under grant #115M560.

Appendix A. Duality Results

In this section, we develop duality formulations and optimality conditions for the important special case of the multivariate CVaR-constrained two-stage model (1)-(2), where the mapping $f$ is linear, the set $\mathcal{X}$ is a polyhedral set, and the first-stage decisions are continuous. Let $\mathcal{X}=\{\mathbf{x}\in\mathbbm{R}_{+}^{n_{1}}~{}:~{}A\mathbf{x}\leq\mathbf{b}\}$ and $f(\mathbf{x})=\mathbf{f}^{\top}\mathbf{x}$ for some matrix $A\in\mathbbm{R}^{m_{1}\times n_{1}}$ , and the vectors $\mathbf{f}\in\mathbbm{R}^{n_{1}}$ and $\mathbf{b}\in\mathbbm{R}^{m_{1}}$ . Then, the model of interest becomes

[TABLE]

For a finite set $\tilde{\mathcal{C}}=\{\tilde{\mathbf{c}}_{(1)},\ldots,\tilde{\mathbf{c}}_{(L)}\}$ we consider the following LP, referred to as $(\operatorname{FiniteP}(\tilde{\mathcal{C}}))$ :

[TABLE]

The next lemma shows that $(\operatorname{FiniteP}(\tilde{\mathcal{C}}))$ is equivalent to $(\operatorname{LinearP})$ for suitable choices of $\tilde{\mathcal{C}}$ ; it is a simple consequence of Proposition 2.1.

Lemma .1

Let $\hat{\mathcal{C}}=\{\hat{\mathbf{c}}_{(1)},\ldots,\hat{\mathbf{c}}_{(L)}\}$ as defined in Proposition 2.1, and assume that the finite set $\tilde{\mathcal{C}}$ satisfies $\hat{\mathcal{C}}\subset\tilde{\mathcal{C}}\subset\mathcal{C}$ . Then a vector $(\mathbf{x},\mathbf{y})$ is a feasible (optimal) solution of $(\operatorname{LinearP})$ if and only if $(\mathbf{x},\mathbf{y},\bm{\eta}^{(\mathbf{x},\mathbf{y})},\mathbf{w}^{(\mathbf{x},\mathbf{y})})$ is a feasible (optimal) solution of $(\operatorname{FiniteP}(\tilde{\mathcal{C}}))$ , where $\eta^{(\mathbf{x},\mathbf{y})}_{\ell}=\operatorname{VaR}_{\alpha}(\tilde{\mathbf{c}}_{(\ell)}^{\top}\hat{\mathbf{G}}(\mathbf{x},\mathbf{y}))$ and $w^{(\mathbf{x},\mathbf{y})}_{i\ell}=[\eta^{(\mathbf{x},\mathbf{y})}_{\ell}-\tilde{\mathbf{c}}_{(\ell)}^{\top}\hat{\mathbf{g}}_{s}(\mathbf{x},\mathbf{y}_{s})]_{+}$ .

The LP formulation $(\operatorname{FiniteP}(\tilde{\mathcal{C}}))$ does not immediately offer a tractable solution approach, however it leads to a natural delayed cut generation algorithm as discussed in Section 2. Moreover, $(\operatorname{FiniteP}(\tilde{\mathcal{C}}))$ provides an important direct way to derive strong duality results and optimality conditions, presented below. First, recall the following notation: $\hat{\mathbf{g}}_{s}(\mathbf{x},\mathbf{y}_{s})=(\mathbf{g}_{s}^{1}(\mathbf{x},\mathbf{y}_{s}),\mathbf{g}_{s}^{2}(\mathbf{x},\mathbf{y}_{s}),\ldots,\mathbf{g}_{s}^{d}(\mathbf{x},\mathbf{y}_{s}))^{\top},~{}s\in S,$ and $\mathbf{g}_{s}^{i}(\mathbf{x},\mathbf{y}_{s})=\mathbf{\bar{g}}_{s}^{i}\mathbf{x}+\mathbf{\tilde{g}}_{s}^{i}\mathbf{y}_{s}$ , $i\in\{1,\ldots,d\},~{}s\in S$ . By abuse of notation, we introduce the random matrices $\bar{\mathbf{g}}~{}:~{}\varOmega\rightarrow\mathbbm{R}^{d\times n_{1}}$ and $\tilde{\mathbf{g}}~{}:~{}\varOmega\rightarrow\mathbbm{R}^{d\times n_{2}}$ , where $\bar{\mathbf{g}}^{\top}(\omega_{s})=\bar{\mathbf{g}}_{s}^{\top}$ and $\tilde{\mathbf{g}}^{\top}(\omega_{s})=\tilde{\mathbf{g}}_{s}^{\top}$ with the $i$ th columns defined by $\mathbf{\bar{g}}_{s}^{i}$ and $\mathbf{\tilde{g}}_{s}^{i}$ , respectively. Denoting the set of all finitely supported finite non-negative measures on the scalarization polyhedron $\mathcal{C}$ by $\mathcal{M}_{+}^{F}(\mathcal{C})$ , we obtain the following dual problem to $(\operatorname{LinearP})$ :

[TABLE]

The proof of the next duality theorem, which is similar to that of Theorem 3 in Noyan and Rudolf (2013), is omitted here for the sake of brevity. It mainly relies on Lemma .1 and the following facts for a finitely supported measure $\mu\in\mathcal{M}_{+}^{F}(\mathcal{C})$ : $\operatorname{support}(\mu)=\left\{\mathbf{c}\in\mathcal{C}~{}:~{}\mu\left(\{\mathbf{c}\}\right)>0\right\}$ and $\int_{\mathcal{C}}u(\mathbf{c})\,\mu(\mathrm{d}\mathbf{c})=\sum_{\mathbf{c}\in\operatorname{support}(\mu)}u(\mathbf{c})\mu\left(\{\mathbf{c}\}\right)$ for a function $u:\mathcal{C}\rightarrow\mathbbm{R}$ .

Theorem .1

The problem $(\operatorname{LinearP})$ has a finite optimum value if and only if $(\operatorname{LinearD})$ does, in which case the two optimum values coincide. In addition, a feasible solution $(\mathbf{x},\mathbf{y})$ of $(\operatorname{LinearP})$ and a feasible solution $(\bm{\lambda},\mu,\nu,\bm{\pi})$ of $(\operatorname{LinearD})$ are both optimal for their respective problems if and only if the following complementary slackness conditions hold:

[TABLE]

We remark that our dual formulation is analogous to Haar’s dual for semi-infinite linear programs (see, e.g., Bonnans and Shapiro, 2000). The finite representation of the multivariate CVaR relation (appearing in Proposition 2.1) proves to be useful to obtain strong duality results and optimality conditions directly from linear programming duality without the need for constraint qualifications.

Lagrangian Duality. The duality results established in Theorem .1 have a natural Lagrangian interpretation. In fact, measures on $\mathcal{C}$ are a natural choice to use as Lagrange multipliers, since the CVaR constraints in (1b) are indexed by the scalarization set $\mathcal{C}$ . Accordingly, we define the Lagrangian function $\mathcal{L}:(\mathcal{X},\mathcal{Y})\times\mathcal{M}_{+}^{F}(\mathcal{C})\rightarrow\mathbbm{R}$ as follows:

[TABLE]

Here, by abuse of notation, $\mathbf{y}\in\mathbbm{R}_{+}^{|S|n_{2}}$ represents the collection of decision vectors $\mathbf{y}_{s}\in\mathbbm{R}_{+}^{n_{2}},~{}s\in S$ , and $\mathbf{y}\in\mathcal{Y}$ refers to $\mathbf{y}_{s}\in\mathcal{Y}(\mathbf{x},\bm{\xi}(\omega_{s})),~{}s\in S$ . For the ease of exposition, we assume that the feasible sets $\mathcal{X}$ and $\mathcal{Y}$ are compact; this is a very mild assumption due to the finiteness of the first-stage and second-stage objective function values. Then, $(\operatorname{LinearP})$ is equivalent to $\min\limits_{(\mathbf{x},\mathbf{y})\in(\mathcal{X},\mathcal{Y})}\max\limits_{\mu\in\mathcal{M}_{+}^{F}(\mathcal{C})}\mathcal{L}(\mathbf{x},\mathbf{y},\mu)$ , while the corresponding Lagrangian dual problem is given by

[TABLE]

For any feasible solution of $(\operatorname{LinearP})$ and any $\mu\in\mathcal{M}_{+}^{F}(\mathcal{C})$ , $\mathcal{L}(\mathbf{x},\mathbf{y},\mu)\leq\mathbf{f}^{\top}\mathbf{x}+\sum_{s\in S}p_{s}\mathbf{q}^{\top}_{s}\mathbf{y}_{s}$ . The weak duality immediately follows: $\operatorname{OBF}_{P}\geq\operatorname{OBF}_{D}$ , where $\operatorname{OBF}_{P}$ and $\operatorname{OBF}_{D}$ , respectively, denote the optimum objective values of $(\operatorname{LinearP})$ and $(\operatorname{LinearD})$ . The next theorem, which is a consequence of Theorem .1, provides the strong duality result and optimality conditions.

Theorem .2

If the primal problem $(\operatorname{LinearP})$ has an optimal solution, then the dual problem $(\operatorname{LinearD})$ also has an optimal solution, and the optimal objective values coincide. A primal feasible solution $(\mathbf{x}^{*},\mathbf{y}^{*})$ and a dual feasible solution $\mu^{*}$ are simultaneously optimal if and only if they satisfy the equations

[TABLE]

Proof.

Let us first show the following claim: If the equations (30)-(31) hold for some primal feasible $(\mathbf{x}^{*},\mathbf{y}^{*})$ and dual feasible $\mu^{*}$ , then these solutions are simultaneously optimal with coinciding objective values. According to (30) and (31), we have

[TABLE]

On the other hand, the weak duality implies that $\operatorname{OBF}_{P}\geq\operatorname{OBF}_{D}$ , which proves the claim.

We next prove the claim that if $(\mathbf{x}^{*},\mathbf{y}^{*})$ is primal optimal and $\mu^{*}$ is dual optimal, then they satisfy the equations (30)-(31). To this end, we first show that for any given primal optimal solution $(\mathbf{x}^{*},\mathbf{y}^{*})$ there exists a corresponding dual feasible solution $\bar{\mu}$ such that (30) and (31) are satisfied for the choice $\mu^{*}=\bar{\mu}$ . Let us consider an optimal solution $(\mathbf{x}^{*},\mathbf{y}^{*})$ of $(\operatorname{LinearP})$ . By Theorem .1 there exists an optimal solution $(\bar{\bm{\lambda}},\bar{\mu},\bar{\nu},\bar{\bm{\pi}})$ of $(\operatorname{LinearD})$ with the same optimal objective value. According to the complementary slackness condition (i), the equality $\operatorname{CVaR}_{\alpha}(\mathbf{c}^{\top}\hat{\mathbf{G}}(\mathbf{x}^{*},\mathbf{y}^{*}))=\operatorname{CVaR}_{\alpha}(\mathbf{c}^{\top}\mathbf{Z})$ holds on the support of the measure $\bar{\mu}$ , which implies (30). To show that (31) also holds, we next prove that, substituting $\mu^{*}=\bar{\mu}$ , $\mathcal{L}(\mathbf{x},\mathbf{y},\bar{\mu})\geq\mathcal{L}(\mathbf{x}^{*},\mathbf{y}^{*},\bar{\mu})$ is valid for any primal feasible solution $(\mathbf{x},\mathbf{y})$ (and equality holds for $(\mathbf{x}^{*},\mathbf{y}^{*})$ ). For ease of exposition, let $U(\mathbf{x},\mathbf{y},\bar{\mu})=\int_{\mathcal{C}}\operatorname{CVaR}_{\alpha}(\mathbf{c}^{\top}\hat{\mathbf{G}}(\mathbf{x},\mathbf{y}))\,\bar{\mu}(\mathrm{d}\mathbf{c})$ and $U(\mathbf{Z},\bar{\mu})=\int_{\mathcal{C}}\operatorname{CVaR}_{\alpha}(\mathbf{c}^{\top}\mathbf{Z}))\,\bar{\mu}(\mathrm{d}\mathbf{c})$ . Then, using the definition of the Lagrangian function (28) and the dual feasibility conditions (27d)-(27e) we obtain the following relations:

[TABLE]

Now observe that the complementary slackness condition (iv) implies $-\bar{\bm{\lambda}}^{\top}A\mathbf{x}\geq-\bar{\bm{\lambda}}^{\top}\mathbf{b}=-\bar{\bm{\lambda}}^{\top}A\mathbf{x}^{*}$ , as $A\mathbf{x}\leq\mathbf{b}$ holds for all $\mathbf{x}\in X$ , and the Lagrange multiplier $\bar{\bm{\lambda}}$ is nonnegative. Similarly, as $T_{s}\mathbf{x}+W_{s}\mathbf{y}_{s}\geq\mathbf{h}_{s}$ holds for all $\mathbf{y}_{s}\in\mathcal{Y}(\mathbf{x},\bm{\xi}(\omega_{s})),~{}s\in S,$ and the Lagrange multiplier $\bar{\bm{\pi}}_{s}$ is nonnegative for all $s\in S$ , the complementary slackness condition (v) implies that $\bar{\bm{\pi}}_{s}^{\top}(T_{s}\mathbf{x}+W_{s}\mathbf{y}_{s})\geq\bar{\bm{\pi}}_{s}^{\top}\mathbf{h}_{s}=\bar{\bm{\pi}}^{\top}(T_{s}\mathbf{x}^{*}+W_{s}\mathbf{y}^{*}_{s})$ . Therefore, the following relation is valid for any primal feasible solution $(\mathbf{x},\mathbf{y})$ :

[TABLE]

Observing that $U(\mathbf{Z},\bar{\mu})$ is independent of the decision vectors $\mathbf{x}$ and $\mathbf{y}$ , by (32) and (33), it is sufficient to show that the inequality $-\mathbbm{E}\left(\int_{\mathcal{C}}\mathbf{c}^{\top}\bar{\mathbf{g}}\mathbf{x}\,\bar{\nu}(\mathrm{d}\mathbf{c})\right)-\sum_{s\in S}p_{s}\int_{\mathcal{C}}\mathbf{c}^{\top}\tilde{\mathbf{g}}_{s}\mathbf{y}_{s}[\bar{\nu}(\omega_{s})](\mathrm{d}\mathbf{c})+U(\mathbf{x},\mathbf{y},\bar{\mu})\geq$ $-\mathbbm{E}\left(\int_{\mathcal{C}}\mathbf{c}^{\top}\bar{\mathbf{g}}\mathbf{x}^{*}\,\bar{\nu}(\mathrm{d}\mathbf{c})\right)-\sum_{s\in S}p_{s}\int_{\mathcal{C}}\mathbf{c}^{\top}\tilde{\mathbf{g}}_{s}\mathbf{y}^{*}_{s}[\bar{\nu}(\omega_{s})](\mathrm{d}\mathbf{c})+U(\mathbf{x}^{*},\mathbf{y}^{*},\bar{\mu})$ is valid for any primal feasible solution $(\mathbf{x},\mathbf{y})$ (and equality holds for $(\mathbf{x}^{*},\mathbf{y}^{*})$ ). Accordingly, we prove that the following chain of inequalities holds for any $(\mathbf{x},\mathbf{y})\in(\mathcal{X},\mathcal{Y})$ , and that all inequalities hold with equality for $(\mathbf{x},\mathbf{y})=(\mathbf{x}^{*},\mathbf{y}^{*})$ .

[TABLE]

We note that an alternative definition of CVaR for a random variable $V$ is given by $\operatorname{CVaR}_{\alpha}(V)=\operatorname{VaR}_{\alpha}(V)+\dfrac{1}{1-\alpha}\mathbbm{E}([V-\operatorname{VaR}_{\alpha}(V)]_{+})$ , where $[z]_{+}=\max(z,0)$ Rockafellar and Uryasev (2000). Substituting this definition into the third term and expanding the expected value in the first term (34) becomes (35). Using the dual feasibility condition (27b) to replace $\bar{\mu}$ in the second term of (35) provides (36). Then, (37) follows from (27c); equality for $(\mathbf{x},\mathbf{y})=(\mathbf{x}^{*},\mathbf{y}^{*})$ is ensured by the complementary slackness condition (iii). Finally, (38) is a consequence of the trivial inequality $[a]_{+}\geq a$ and the non-negativity of the random measure $\nu$ . Equality is again ensured for $(\mathbf{x},\mathbf{y})=(\mathbf{x}^{*},\mathbf{y}^{*})$ , since the complementary slackness condition (ii) implies that $\mathbf{c}^{\top}\hat{\mathbf{g}}_{s}(\mathbf{x},\mathbf{y}_{s})-\operatorname{VaR}_{\alpha}(\mathbf{c}^{\top}\hat{\mathbf{G}}(\mathbf{x},\mathbf{y}))\geq 0$ holds on the support of $\bar{\nu}(\omega_{s})$ for all $s\in S$ .

Finally, let us consider a primal optimal solution $(\mathbf{x}^{*},\mathbf{y}^{*})$ and a dual optimal solution $\mu^{*}$ . We just showed above that there exists some dual feasible $\bar{\mu}$ such that $U(\mathbf{x}^{*},\mathbf{y}^{*},\bar{\mu})=U(\mathbf{Z},\bar{\mu})$ and $\mathcal{L}(\mathbf{x}^{*},\mathbf{y}^{*},\bar{\mu})=\min_{(\mathbf{x},\mathbf{y})\in(\mathcal{X},\mathcal{Y})}\mathcal{L}(\mathbf{x},\mathbf{y},\bar{\mu})=\mathbf{f}^{\top}\mathbf{x}^{*}+\sum_{s\in S}p_{s}\mathbf{q}^{\top}_{s}\mathbf{y}^{*}_{s}$ . As the set of feasible primal solutions is compact and the Lagrangian function $\mathcal{L}$ is continuous, there also exists some $(\hat{\mathbf{x}},\hat{\mathbf{y}})$ such that $\mathcal{L}(\hat{\mathbf{x}},\hat{\mathbf{y}},\mu^{*})=\min_{(\mathbf{x},\mathbf{y})\in(\mathcal{X},\mathcal{Y})}\mathcal{L}(\mathbf{x},\mathbf{y},\mu^{*})$ . Then, by the optimality of $\mu^{*}$ , we have $\mathbf{f}^{\top}\mathbf{x}^{*}+\sum_{s\in S}p_{s}\mathbf{q}^{\top}_{s}\mathbf{y}^{*}_{s}=\min_{(\mathbf{x},\mathbf{y})\in(\mathcal{X},\mathcal{Y})}\mathcal{L}(\mathbf{x},\mathbf{y},\bar{\mu})\leq\min_{(\mathbf{x},\mathbf{y})\in(\mathcal{X},\mathcal{Y})}\mathcal{L}(\mathbf{x},\mathbf{y},\mu^{*})=\mathcal{L}(\hat{\mathbf{x}},\hat{\mathbf{y}},\mu^{*})$ , and consequently, the following relation holds:

[TABLE]

On the other hand, by the primal feasibility of $(\mathbf{x}^{*},\mathbf{y}^{*})$ and the non-negativity of $\mu^{*}$ , $U(\mathbf{x}^{*},\mathbf{y}^{*},\mu^{*})-U(\mathbf{Z},\mu^{*})\leq 0$ holds, which immediately implies (30) and (31).

∎

Bibliography36

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Ahmed (2006) Ahmed, S.: Convexity and decomposition of mean-risk stochastic programs 106 (3), 433–446 (2006). DOI http://dx.doi.org/10.1007/s 10107-005-0638-8
2Artzner et al. (1999) Artzner, P., Delbaen, F., Eber, J., Heath, D.: Coherent measures of risk. Mathematical Finance 9 (3), 203–228 (1999)
3Balcik and Beamon (2008) Balcik, B., Beamon, B.: Facility location in humanitarian relief. International Journal of Logistics: Research and Applications 11 (2), 101–121 (2008)
4Bodur and Luedtke (2016) Bodur, M., Luedtke, J.R.: Mixed-integer rounding enhanced benders decomposition for multiclass service-system staffing and scheduling with arrival rate uncertainty. Management Science, in press (2016). DOI 10.1287/mnsc.2016.2455
5Bonnans and Shapiro (2000) Bonnans, J.F., Shapiro, A.: Perturbation analysis of optimization problems. Springer Series in Operations Research. Springer-Verlag, New York (2000)
6Çelik et al. (2012) Çelik, M., Ergun, O., Johnson, B., Keskinocak, P., Lorca, A., Pekgün, P., Swann, J.: Humanitarian logistics. In: Mirchandani, P. (ed.) INFORMS Tutorials in Operations Research, pp. 18–49. INFORMS, Maryland (2012)
7Dentcheva and Martinez (2012) Dentcheva, D., Martinez, G.: Two-stage stochastic optimization problems with stochastic ordering constraints on the recourse. European Journal of Operational Research 219 (1), 1 – 8 (2012)
8Dentcheva and Ruszczyński (2003) Dentcheva, D., Ruszczyński, A.: Optimization with stochastic dominance constraints. SIAM Journal on Optimization 14 (2), 548–566 (2003)