An Exact Method for Constrained Maximization of the Conditional Value-at-Risk of a Class of Stochastic Submodular Functions

Hao-Hsiang Wu; Simge Kucukyavuz

arXiv:1903.08318·math.OC·April 17, 2020·Oper. Res. Lett.

TL;DR

This paper introduces an exact method for maximizing the CVaR of stochastic submodular functions under constraints, providing a new approach for risk-averse optimization with practical applications.

Contribution

It develops valid inequalities and an exact solution method for constrained CVaR maximization of stochastic submodular functions, assuming efficient CVaR oracle access.

Findings

1. Effective solution method demonstrated on a stochastic set covering problem.

2. Valid inequalities improve computational efficiency.

3. Method applicable to problems with efficient CVaR oracles.

Abstract

We consider a class of risk-averse submodular maximization problems (RASM) where the objective is the conditional value-at-risk (CVaR) of a random nondecreasing submodular function at a given risk level. We propose valid inequalities and an exact general method for solving RASM under the assumption that we have an efficient oracle that computes the CVaR of the random function. We demonstrate the proposed method on a stochastic set covering problem that admits an efficient CVaR oracle for the random coverage function.

Tables

Table 1: Algorithm 1 with different inequalities (Time in seconds; the time limit is 1800 seconds)

			Oracle-LShape			Oracle-Ineq (7)			Oracle-Ineq (12)			Oracle-Ineq (7) and (12)
$\| 𝒱 \|$	$α$	$k$	Time	Cuts	Nodes	Time	Cuts	Nodes	Time	Cuts	Nodes	Time	Cuts	Nodes
50	0.025	3	9	2315	2363	3	801	1915	$\leq 1$	224	336	$\leq 1$	398	667
50	0.025	5	$\geq 1800$	33988	36405	681	16683	52677	3	726	1960	5	1462	2264
50	0.05	3	10	2319	2363	2	799	1734	$\leq 1$	225	366	$\leq 1$	397	787
50	0.05	5	$\geq 1800$	32548	36569	492	14978	43320	3	713	1859	5	1484	2095
100	0.025	3	1152	19661	20300	11	1845	4287	24	1511	2321	15	1816	1252
100	0.025	5	$\geq 1800$	21834	53054	1154	17741	97510	825	13156	46789	560	15648	45091
100	0.05	3	1205	19634	20685	9	1486	4476	24	1542	2400	15	1808	1224
100	0.05	5	$\geq 1800$	21980	52479	505	11408	74808	826	12911	47823	238	10684	21191
150	0.025	3	$\geq 1800$	18922	22045	25	1832	12753	457	6775	10868	91	3046	1389
150	0.025	5	$\geq 1800$	19364	43796	1640	17781	109795	$\geq 1800$	13468	72593	1603	20278	92923
150	0.05	3	$\geq 1800$	20394	21649	6	1021	1115	445	6705	10815	59	1802	8489
150	0.05	5	$\geq 1800$	20609	49523	$\geq 1800$	10832	150285	$\geq 1800$	14759	68158	1582	19978	75957

Equations

\text{VaR}_{\alpha}(\sigma(\bar{x}))=\max\Big\{\eta:\mathbb{P}(\sigma(\bar{x})\geq\eta)\geq 1-\alpha,\ \eta\in\mathbb{R}\Big\}.

\text{CVaR}_{\alpha}(\sigma(\bar{x}))=\max\Big\{\eta-\frac{1}{\alpha}\mathbb{E}([\eta-\sigma(\bar{x})]_{+}):\eta\in\mathbb{R}\Big\},

\max_{x\in\mathcal{X}}\text{CVaR}_{\alpha}(\sigma(x)).

\sigma(x)\leq\sigma(S)+\sum_{j\in V\setminus S}\big(\sigma(\{j\}\cup S)-\sigma(S)\big)x_{j},\quad\forall S\subseteq V,

\text{CVaR}_{\alpha}(\sigma(\bar{x}))=\max\Big\{\eta-\frac{1}{\alpha}\sum_{i\in[N]}p_{i}w_{i}:w_{i}\geq\eta-\sigma_{i}(\bar{x}),\ \forall i\in[N],\ w\in\mathbb{R}^{N}_{+},\ \eta\in\mathbb{R}\Big\},

\max\{\psi:(x,\psi)\in\mathcal{C},\ x\in\mathcal{X},\ \psi\in\mathbb{R}\},

\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)x_{j}.

\begin{aligned}
\hat{\psi}&\leq\text{CVaR}_{1}(\sigma(\hat{x}))\\
&\leq\text{CVaR}_{1}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big[\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{1}(\sigma(\bar{x}))\big]\hat{x}_{j}\\
&=\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))\hat{x}_{j}-\sum_{j\in V\setminus(\bar{X}\cup\{j^{\prime}\})}\text{CVaR}_{1}(\sigma(\bar{x}))\hat{x}_{j}\\
&\leq\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))\hat{x}_{j}-\sum_{j\in V\setminus(\bar{X}\cup\{j^{\prime}\})}\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j}\\
&=\bigg(\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))\hat{x}_{j}-\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j}\bigg)+\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j^{\prime}}\\
&=\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)\hat{x}_{j}.
\end{aligned}

\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\delta_{j_{i}}(\bar{x})x_{j_{i}},

\begin{aligned}
\delta_{j_{t}}(\bar{x})=-\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\max\ &\text{CVaR}_{\alpha}(\sigma(x))-\sum_{i=1}^{t-1}\delta_{j_{i}}(\bar{x})x_{j_{i}}\\
\text{s.t.}\ &x_{j_{t}}=1\\
&x_{j_{i}}=0,\quad i=t+1,\dots,r\\
&x\in\mathcal{X}.
\end{aligned}

\bar{\delta}_{j_{t}}(\bar{x})=\max\big\{\text{CVaR}_{\alpha}(\sigma(x)):x_{j_{t}}=1,\ x_{j_{i}}=0,\ i=t+1,\dots,r,\ x\in\mathbb{B}^{n}\big\}-\text{CVaR}_{\alpha}(\sigma(\bar{x}));

\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})x_{j_{i}},

\hat{\psi}\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\delta_{j_{i}}(\bar{x})\hat{x}_{j_{i}}\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})\hat{x}_{j_{i}}.

\psi\leq\text{CVaR}_{\alpha}(\sigma(x))\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)x_{j},

\psi\leq\text{CVaR}_{\alpha}(\sigma(x))\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})x_{j_{i}}.

\max\Big\{\psi:\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)x_{j}\quad\forall\bar{X}\subseteq V,\ x\in\mathcal{X},\ \psi\in\mathbb{R}\Big\}

\max\Big\{\psi:\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})x_{j_{i}}\quad\forall\bar{X}\subseteq V,\ x\in\mathcal{X},\ \psi\in\mathbb{R}\Big\},

\max\Big\{\text{CVaR}_{\alpha}(\sigma(x)):\sum_{i\in V_{1}}x_{i}\leq k,\ x\in\mathbb{B}^{n}\Big\},

\text{CVaR}_{\alpha}(\sigma(x))=\text{VaR}_{\alpha}(\sigma(x))\bigg(1-\frac{1}{\alpha}\sum_{j=0}^{\text{VaR}_{\alpha}(\sigma(x))-1}A(x,m,j)\bigg)+\frac{1}{\alpha}\sum_{j=0}^{\text{VaR}_{\alpha}(\sigma(x))-1}jA(x,m,j).

\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(V))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)x_{j}.

Taxonomy

Topics: Risk and Portfolio Optimization · Probabilistic and Robust Engineering Design · Complexity and Algorithms in Graphs

Full text

1 Introduction

We consider a class of risk-averse submodular maximization problems (RASM) recently formulated by Maehara [13]. Formally, let $V=\{1,\dots,n\}$ be a finite set, and $\bar{\varOmega}$ be a probability space. We define an outcome mapping $\sigma:2^{|V|}\times\bar{\varOmega}\rightarrow\mathbb{R}$ . The random outcome $\sigma(X):\bar{\varOmega}\rightarrow\mathbb{R}$ is defined by $\sigma(X)(\omega)=\sigma(X,\omega)$ for all $X\subseteq{V}$ and $\omega\in\bar{\varOmega}$ . We assume that $\sigma(X)(\omega)$ is a nondecreasing submodular set function. With a slight abuse of notation, we refer to the submodular function $\sigma(X)$ for $X\subseteq V$ as $\sigma(x)$ for $x\in\{0,1\}^{n}$ interchangeably, where $x$ is the characteristic vector of $X$ , and the usage will be clear from the context. We measure the risk associated with this function using conditional value-at-risk (CVaR), where larger function values correspond to less risky random outcomes. In this context, risk measures are referred to as acceptability functionals. CVaR was first introduced by Artzner et al. [3] and has been widely used, especially in finance, due to its desirable properties (e.g., coherence and tractability).

Formally, let $[z]_{+}=\max(z,0)$ be the positive part of a number $z\in\mathbb{R}$ . Let $\eta$ be a variable that measures the value-at-risk (VaR) at a given risk level. For a given $\bar{x}$ , the value-at-risk at a risk level $\alpha\in(0,1]$ is defined as

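$$\text{VaR}_{\alpha}(\sigma(\bar{x}))=\max\Big\{\eta:\mathbb{P}(\sigma(\bar{x})\geq\eta)\geq 1-\alpha,\ \eta\in\mathbb{R}\Big\}. \tag{1}$$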

For a given $\bar{x}$ , the conditional value-at-risk at a risk level $\alpha\in(0,1]$ is defined as

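$$\text{CVaR}_{\alpha}(\sigma(\bar{x}))=\max\Big\{\eta-\frac{1}{\alpha}\mathbb{E}([\eta-\sigma(\bar{x})]_{+}):\eta\in\mathbb{R}\Big\}, \tag{2}$$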

and it provides the conditional expected value of $\sigma(\bar{x})$ that is no larger than the value-at-risk at the risk level $\alpha$ [17]. In general, the risk level $\alpha$ is small, such as $\alpha=0.05$ or $0.01$ . Let $\mathcal{X}$ be the deterministic constraints on the variables $x$ . Given a risk level $\alpha\in(0,1]$ , a class of risk-averse submodular maximization problems (RASM) is defined as

$$\max_{x\in\mathcal{X}}\text{CVaR}_{\alpha}(\sigma(x)). \tag{3}$$

Let $\mathbb{E}[\sigma(x)]$ represent the expectation of $\sigma(x)$ with respect to $\omega$ . In the next observation, we review properties of CVaR that will be useful in developing valid inequalities for RASM.

Observation 1.1

From the definition of CVaR in (2), for a given $\bar{x}\in\mathcal{X}$ , we have

(i) $\text{CVaR}_{1}(\sigma(\bar{x}))=\mathbb{E}[\sigma(\bar{x})]$ , and because $\mathbb{E}[\sigma(\bar{x})]$ is submodular (see, e.g., [24]), so is $\text{CVaR}_{1}(\sigma(\bar{x}))$ .

(ii) $\text{CVaR}_{\alpha_{1}}(\sigma(\bar{x}))\leq\text{CVaR}_{\alpha_{2}}(\sigma(\bar{x}))$ for $\alpha_{1},\alpha_{2}\in(0,1]$ and $\alpha_{1}\leq\alpha_{2}$ .

Due to Observation 1.1(i), when $\alpha=1$ , problem (3) is equivalent to the $\mathcal{NP}$ -hard risk-neutral submodular maximization problem $\max_{x\in\mathcal{X}}\mathbb{E}[\sigma(x)]$ (see, e.g., [10, 11, 20] for the associated applications).

Note that for a nondecreasing submodular function $\sigma(\cdot)$ , Proposition 2 of [14] shows that the submodular inequality

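$$\sigma(x)\leq\sigma(S)+\sum_{j\in V\setminus S}\big(\sigma(\{j\}\cup S)-\sigma(S)\big)x_{j},\quad\forall S\subseteq V, \tag{4}$$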

is valid. Ahmed and Atamtürk [1] and Yu and Ahmed [26] strengthen the submodular inequalities by lifting, under the assumption that the submodular utility function is strictly concave, increasing, and differentiable, assumptions we do not make in this paper. However, Maehara [13] shows that CVaR of a stochastic submodular function is no longer submodular for any risk level $\alpha\in(0,1)$ . The author proves that unless $\mathcal{P}=\mathcal{NP}$ , there is no polynomial-time approximation algorithm with a multiplicative error for RASM under some reasonable assumptions on the given risk level. Hence it is theoretically intractable to find a solution that is close to optimal in polynomial time without any assumption on the feasible region or the probability distribution. Along this line of work, Zhou and Tokekar [28] propose a sequential greedy algorithm for the CVaR maximization problem in [13]. The authors show that the proposed algorithm gives a solution that is within a constant factor of optimal with an additional additive term that depends on the optimal value and a parameter related to the curvature of the submodular set function. However, the running time of this algorithm is exponential as it depends quadratically on the cardinality of the feasible set.

Instead of solving the generic CVaR maximization problem in [13] directly, Ohsaka and Yoshida [16] consider a specific risk-averse submodular optimization problem arising in social networks. The authors relax the problem by replacing the combinatorial decisions that determine a single set of a predetermined size with a choice of a portfolio (convex combination) of sets of the given size. They also give a polynomial-time algorithm that obtains a portfolio with a CVaR value that has a provable guarantee with a specified probability. Wilder [22] generalizes these results and proposes an approximation algorithm for maximizing the CVaR of a generic monotone continuous submodular function (or a portfolio of discrete sets) with a worst-case guarantee within a $(1-\frac{1}{e})$ factor of the optimal solution. Here, a function $F:\bar{V}\rightarrow\mathbb{R}$ is continuous submodular if and only if $F(a)+F(b)\geq F(a\vee b)+F(a\wedge b)$ for any $a,b\in\bar{V}\subseteq\mathbb{R}_{+}^{n}$ , where $\bar{V}_{i}$ is a compact subset of $\mathbb{R}$ , $\bar{V}=\prod_{i=1}^{n}\bar{V}_{i}\subseteq\mathbb{R}_{+}^{n}$ , and $\vee$ and $\wedge$ denote the coordinate-wise maximum and minimum operations, respectively (see, e.g., [6] and [7]). In contrast to this line of work that considers a portfolio of discrete sets as decision variables, we consider the discrete case, where we must choose one set of decisions for implementability purposes. In another line of work, Wu and Küçükyavuz [25] propose an exact method for the minimal cost selection of $x\in\mathcal{X}$ that satisfies the chance constraint $\mathbb{P}(\sigma(x)\geq\tau)$ for a given $\tau$ , where the authors assume that there exists a probability oracle for evaluating the chance constraint for a given solution. However, the exact method of [25] cannot be applied to the CVaR maximization problem (or its VaR maximization counterpart) directly.

Note that for a given $\bar{x}\in\mathcal{X}$ and a finite probability space $(\varOmega,2^{\varOmega},\mathbb{P})$ with a set of $N\in\mathbb{N}$ realizations (scenarios) $\varOmega=\{\omega_{1},\dots,\omega_{N}\}$ , $\mathbb{P}(\omega_{i})=p_{i}$ for $i=1,\dots,N$ , CVaR is given by [17]

$$\text{CVaR}_{\alpha}(\sigma(\bar{x}))=\max\Big\{\eta-\frac{1}{\alpha}\sum_{i\in[N]}p_{i}w_{i}:w_{i}\geq\eta-\sigma_{i}(\bar{x}),\ \forall i\in[N],\ w\in\mathbb{R}^{N}_{+},\ \eta\in\mathbb{R}\Big\}, \tag{5}$$

where $[N]=\{1,\dots,N\}$ represents the set of the first $N$ positive integers, $\sigma_{i}(\bar{x})=\sigma(\bar{x})(\omega_{i})$ , and $\sigma_{q}(\bar{x})=\text{VaR}_{\alpha}(\sigma(\bar{x}))$ for at least one $\omega_{q}\in\varOmega$ . Therefore, when an oracle for evaluating CVaR is not available, we can take a sampling-based approach. To this end, in [23], we apply the stochastic submodular optimization methods of [24] to problem (3) based on the CVaR definition (5). The resulting solutions are then tested out-of-sample and statistical performance guarantees are provided.
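As a small illustration of definitions (1) and (5), the following Python sketch evaluates VaR and CVaR for a finite list of scenario values. The function name `var_cvar` and the example numbers are assumptions made for this illustration and do not come from the paper. It relies on the fact that the objective in (5) is piecewise linear and concave in $\eta$ , so a maximizer can be found among the realized values.

```python
from typing import Sequence, Tuple


def var_cvar(values: Sequence[float], probs: Sequence[float],
             alpha: float) -> Tuple[float, float]:
    """Return (VaR_alpha, CVaR_alpha) of a finite-scenario outcome.

    Larger outcomes are better, so CVaR averages the worst (smallest)
    alpha-fraction of the distribution.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    tol = 1e-12  # guards the comparison against floating-point noise

    # VaR: largest realized value eta with P(sigma >= eta) >= 1 - alpha.
    var = min(values)
    for eta in sorted(set(values)):
        tail = sum(p for v, p in zip(values, probs) if v >= eta)
        if tail >= 1.0 - alpha - tol:
            var = eta

    # CVaR: maximize eta - E[(eta - sigma)_+] / alpha over realized values.
    def objective(eta: float) -> float:
        shortfall = sum(p * max(eta - v, 0.0) for v, p in zip(values, probs))
        return eta - shortfall / alpha

    cvar = max(objective(eta) for eta in set(values))
    return var, cvar


if __name__ == "__main__":
    # Hypothetical four-scenario outcome distribution.
    print(var_cvar([0, 1, 2, 3], [0.1, 0.2, 0.3, 0.4], alpha=0.25))
```

With $\alpha=1$ the same routine returns the expectation, in line with Observation 1.1 (i).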

In this paper, our goal is to give (near-)optimal solutions for RASM without sampling. We start by considering high-quality (ideally optimal) feasible solutions to RASM under a true (non-trivial) distribution of the uncertain parameters assuming that there is an efficient oracle to evaluate the true value of CVaR of the random function at a given risk level. To solve RASM, we propose a decomposition method with various classes of valid inequalities. We demonstrate our proposed methods on a risk-averse set covering problem (RASC) that admits an efficient CVaR oracle in our computational study.

2 RASM with an Exact CVaR Oracle

In this section, we assume that for a given incumbent solution $\bar{x}\in\mathcal{X}$ , we have an efficient oracle that computes $\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ exactly. Under this assumption, we solve problem (3) without sampling by using a two-stage optimization model and an exact algorithm with various valid inequalities. In the proposed method, we solve a relaxed master problem (RMP) at any iteration in the form

$$\max\{\psi:(x,\psi)\in\mathcal{C},\ x\in\mathcal{X},\ \psi\in\mathbb{R}\}, \tag{6}$$

where $\psi$ is a variable that is an upper bounding approximation of $\text{CVaR}_{\alpha}(\sigma(x))$ for a solution $x\in\mathcal{X}$ , and $\mathcal{C}$ is a set of optimality cuts to be defined later. It is an approximation in that not all necessary optimality cuts may have been generated at an intermediate iteration. A full master problem includes all optimality cuts in $\mathcal{C}$ such that for any $x\in\mathcal{X}$ , we have $\psi\leq\text{CVaR}_{\alpha}(\sigma(x))$ . Therefore, solving the full master problem is equivalent to solving the original problem (3). In Algorithm 1, we propose a delayed constraint generation algorithm for solving problem (3) using RMP (6). The algorithm starts with a set of optimality cuts $\mathcal{C}$ (could be empty). In the while loop, we solve RMP (6) and obtain an incumbent solution (Line 1). Based on the incumbent solution, we add an optimality cut to RMP (6) (Line 1). In this algorithm, $\epsilon$ is a user-defined optimality tolerance. Let UB be the upper bound obtained from the optimal objective value of RMP (6) at each iteration. Let LB be the lower bound equal to $\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ , obtained from calling the CVaR oracle with the given incumbent solution $\bar{x}\in\mathcal{X}$ as input. If the optimality gap is below $\epsilon$ , then we terminate the algorithm and return the near-optimal solution.

As we mentioned earlier, $\text{CVaR}_{\alpha}(\sigma(x))$ is not submodular in $x$ even if $\sigma(x)$ is submodular [13]. Hence, there is no direct way to use the submodular inequality (4) as a class of optimality cuts in $\mathcal{C}$ . Therefore, in this section, we propose new optimality cuts that are valid for (6). Throughout, we let $\mathbf{e}_{j}$ be a unit vector of dimension $|V|$ whose $j$ th component is 1, and let $\mathbf{1}$ be a $|V|$ -dimensional vector with all entries equal to 1. For a given $\bar{x}\in\mathcal{X}$ , we first propose an optimality cut given by

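$$\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)x_{j}. \tag{7}$$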

Before formally proving the validity of inequality (7), a few remarks are in order. It may be tempting to think that inequality (7) is in the form of a submodular inequality (4). However, we highlight that the coefficients $\big{(}\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big{)}$ for $j\in V\setminus\bar{X}$ are different from their submodular counterparts $\big{(}\text{CVaR}_{\alpha}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big{)}$ for $j\in V\setminus\bar{X}$ and because $\text{CVaR}_{\alpha}(\sigma(\cdot))$ is not submodular, the latter coefficients are not valid. To provide some intuition for the validity of the coefficients of the variables $x_{j},j\in V\setminus\bar{X}$ in inequality (7), we consider the upper bound on $\psi$ provided by the right-hand side of the inequality (recall that $\psi\leq\text{CVaR}_{\alpha}(\sigma(x))$ for all $x\in\mathcal{X}$ ). First, observe that for $x=\bar{x}$ , inequality (7) holds trivially, because $x_{j}=0$ for $j\in V\setminus\bar{X}$ . Now consider the point $x=\bar{x}+\mathbf{e}_{j}$ for some $j\in V\setminus\bar{X}$ . Then the right-hand side of inequality (7) is $\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))$ , which is a valid upper bound on $\psi$ (because $\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}+\mathbf{e}_{j}))\leq\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))$ from Observation 1.1 (ii)). The validity for other $x$ is proven by exploiting the properties of $\text{CVaR}_{\alpha}(\sigma(x))$ given in Observation 1.1 as we show next.

Proposition 2.1

Inequality (7) for a given $\bar{x}\in\mathbb{B}^{n}$ is valid for RMP (6).

Proof.

Consider a feasible point $(\hat{\psi},\hat{x})$ to the full master problem, in other words, let $\hat{x}\in\mathcal{X}$ such that $\hat{\psi}\leq\text{CVaR}_{\alpha}(\sigma(\hat{x}))$ . We show that $(\hat{\psi},\hat{x})$ satisfies inequality (7) written for any $\bar{x}\in\mathcal{X}$ . Let $\hat{X}=\{i\in V:{\hat{x}_{i}}=1\}$ . From the definition of CVaR in (2), it follows that because $\sigma(x)(\omega)$ is nondecreasing in $x$ for all $\omega\in\bar{\varOmega}$ , $\text{CVaR}_{\alpha}(\sigma(x))$ is also nondecreasing in $x$ . For the case that $\hat{X}\subseteq\bar{X}$ , since $\text{CVaR}_{\alpha}(\sigma(x))$ is a monotonically nondecreasing function in $x$ and $\hat{x}_{j}=0$ for all $j\in V\setminus\bar{X}$ , we have $\hat{\psi}\leq\text{CVaR}_{\alpha}(\sigma(\hat{x}))\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ , which shows that inequality (7) is valid for $\hat{X}\subseteq\bar{X}$ . For the case that $\hat{X}\setminus\bar{X}\neq\emptyset$ , we select an arbitrary $j^{\prime}\in\hat{X}\setminus\bar{X}$ , where $\hat{x}_{j^{\prime}}=1$ . Then

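$$\begin{aligned}
\hat{\psi}&\leq\text{CVaR}_{1}(\sigma(\hat{x}))&&\text{(8a)}\\
&\leq\text{CVaR}_{1}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big[\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{1}(\sigma(\bar{x}))\big]\hat{x}_{j}&&\text{(8b)}\\
&=\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))\hat{x}_{j}-\sum_{j\in V\setminus(\bar{X}\cup\{j^{\prime}\})}\text{CVaR}_{1}(\sigma(\bar{x}))\hat{x}_{j}&&\text{(8c)}\\
&\leq\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))\hat{x}_{j}-\sum_{j\in V\setminus(\bar{X}\cup\{j^{\prime}\})}\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j}&&\text{(8d)}\\
&=\bigg(\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))\hat{x}_{j}-\sum_{j\in V\setminus\bar{X}}\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j}\bigg)+\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j^{\prime}}&&\text{(8e)}\\
&=\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)\hat{x}_{j}.&&\text{(8f)}
\end{aligned}$$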

Inequality (8a) follows from Observation 1.1 (ii) that $\text{CVaR}_{\alpha}(\sigma(x))$ is nondecreasing in $\alpha$ . Inequality (8b) follows from the submodular inequality (4) written for the monotone submodular function $\text{CVaR}_{1}(\sigma(x))$ (see Observation 1.1 (i)) and for $S=\bar{X}$ evaluated at the point $x=\hat{x}$ . Arranging terms in inequality (8b) and recalling that $\hat{x}_{j^{\prime}}=1$ , we obtain equality (8c). Inequality (8d) follows from Observation 1.1 (ii), this time observing that $-\text{CVaR}_{\alpha}(\sigma(x))$ is nonincreasing in $\alpha$ . To obtain equality (8e), we add $\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j^{\prime}}-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\hat{x}_{j^{\prime}}$ to inequality (8d) and reorganize the terms. Finally, equality (8f) follows from $\hat{x}_{j^{\prime}}=1$ . This completes the proof. ∎
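To make Proposition 2.1 concrete, the sketch below builds inequality (7) on a tiny coverage instance and checks its validity by enumerating every pair $(\bar{X},X)$ . The three scenarios, the helper names (`sigma`, `cvar`, `cut7`, `rhs`), and the brute-force CVaR oracle are all hypothetical choices for this illustration; the paper assumes an efficient oracle rather than enumeration.

```python
from itertools import combinations

# Hypothetical toy data: (probability, {element j: items covered by j}).
SCENARIOS = [
    (0.5, {0: {0, 1}, 1: {1, 2}, 2: {3}}),
    (0.3, {0: {0}, 1: set(), 2: {3, 4}}),
    (0.2, {0: set(), 1: {2}, 2: set()}),
]
V = [0, 1, 2]


def sigma(X, cover):
    """Coverage count, a nondecreasing submodular function of X."""
    return len(set().union(*(cover[j] for j in X)))


def cvar(X, alpha):
    """Brute-force CVaR oracle over the scenario list (stand-in oracle)."""
    vals = [sigma(X, cover) for _, cover in SCENARIOS]
    probs = [p for p, _ in SCENARIOS]
    return max(
        eta - sum(p * max(eta - v, 0) for v, p in zip(vals, probs)) / alpha
        for eta in vals
    )


def cut7(Xbar, alpha):
    """Constant term and coefficients of inequality (7) written at Xbar."""
    base = cvar(Xbar, alpha)
    coef = {j: cvar(Xbar | {j}, 1.0) - base for j in V if j not in Xbar}
    return base, coef


def rhs(cut, X):
    """Right-hand side of a cut evaluated at the characteristic vector of X."""
    base, coef = cut
    return base + sum(c for j, c in coef.items() if j in X)


def all_subsets(ground):
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]


def cut7_is_valid(alpha):
    """Check CVaR_alpha(sigma(X)) <= rhs of cut (7) for all Xbar and X."""
    return all(
        cvar(X, alpha) <= rhs(cut7(Xbar, alpha), X) + 1e-9
        for Xbar in all_subsets(V) for X in all_subsets(V)
    )


if __name__ == "__main__":
    print(cut7_is_valid(0.3))
```

The check also confirms that the cut is tight at $x=\bar{x}$ , since all coefficients multiply zero there.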

Next, we introduce a class of valid inequalities obtained by a sequential lifting procedure [15, Proposition 1.1 in Section II.2.1]. Given $\bar{x}\in\mathcal{X}$ , which is a characteristic vector of the set $\bar{X}$ , consider a restriction with $x_{j}=0$ for $j\in V\setminus\bar{X}$ . For this restriction, we know that the base inequality $\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ is valid, because for any $x$ satisfying this restriction, we have $X\subseteq\bar{X}$ , and $\psi=\text{CVaR}_{\alpha}(\sigma(x))\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ due to the property that $\text{CVaR}_{\alpha}(\sigma(x))$ is monotonically nondecreasing in $x$ . However, this base inequality is not valid when the restriction is removed. To obtain a valid inequality, we sequentially lift the base inequality with the variables $x_{j},j\in V\setminus\bar{X}$ . Let $j_{1},\dots,j_{r}$ be an ordering of the elements in $V\setminus\bar{X}$ , where $r=|V\setminus\bar{X}|$ . Sequential lifting following this order produces a valid inequality

$$\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\delta_{j_{i}}(\bar{x})x_{j_{i}}, \tag{9}$$

where $\delta_{j_{t}}(\bar{x})$ for $t=1,\dots,r$ is an exact lifting function given by the $t$ -th lifting problem

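$$\begin{aligned}
\delta_{j_{t}}(\bar{x})=-\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\max\ &\text{CVaR}_{\alpha}(\sigma(x))-\sum_{i=1}^{t-1}\delta_{j_{i}}(\bar{x})x_{j_{i}}&&\text{(10a)}\\
\text{s.t.}\ &x_{j_{t}}=1&&\text{(10b)}\\
&x_{j_{i}}=0,\quad i=t+1,\dots,r&&\text{(10c)}\\
&x\in\mathcal{X}.&&\text{(10d)}
\end{aligned}$$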

In (10a), we use the convention that for $t=1$ , the term $\sum_{i=1}^{t-1}(\cdot)=0$ . Note that in the $t$ -th lifting problem, to obtain the coefficient of the variable $x_{j_{t}}$ in inequality (9), we let $x_{j_{t}}=1$ in constraint (10b), we remove the restriction on the variables preceding this variable in the lifting sequence, while keeping the restriction that the variables following $x_{j_{t}}$ in the lifting sequence are fixed to zero in constraint (10c). The exact lifting problem is hard to solve since it is related to the submodular maximization problem (3). Instead of solving problem (10) exactly, we propose to solve a relaxation that provides an upper bound for the lifting coefficients. Then, we can obtain another valid inequality by using the upper bounds on the coefficients instead of the exact coefficients. We describe this approach next.

Given $\bar{x}\in\mathcal{X}$ , we define $\bar{\delta}_{j_{t}}(\bar{x})$ as an upper bound of $\delta_{j_{t}}(\bar{x})$ for $t=1,\dots,r$ , where $\bar{\delta}_{j_{t}}(\bar{x})$ is given by solving the following relaxation of the exact lifting problem (10):

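$$\bar{\delta}_{j_{t}}(\bar{x})=\max\big\{\text{CVaR}_{\alpha}(\sigma(x)):\text{(10b)},\ \text{(10c)},\ x\in\mathbb{B}^{n}\big\}-\text{CVaR}_{\alpha}(\sigma(\bar{x})); \tag{11}$$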

Here we relax constraint (10d) to $x\in\mathbb{B}^{n}$ and remove the term $-\sum_{i=1}^{t-1}\delta_{j_{i}}(\bar{x})x_{j_{i}}$ from the objective function in problem (10). This is a valid relaxation, because $\delta_{j_{i}}(\bar{x})\geq 0$ when $\text{CVaR}_{\alpha}(\sigma(x))$ is monotonically nondecreasing in $x$ . Furthermore, because $x_{j_{t}}=1$ and $x_{j_{i}}=0$ for $i=t+1,\dots,r$ , the feasible solution with $x^{*}_{j_{i}}=1$ for $i=1,\dots,t$ has the largest $\text{CVaR}_{\alpha}(\sigma(x^{*}))$ in the objective function of (11). Thus, the optimal solution of the relaxed lifting problem (11) is given by $\bar{x}^{j_{t}}=\mathbf{1}-\sum_{i=t+1}^{r}\mathbf{e}_{j_{i}}$ , where we can use an efficient oracle for $\text{CVaR}_{\alpha}(\sigma(x))$ to obtain the optimal value of $\bar{\delta}_{j_{t}}(\bar{x})$ efficiently.

Proposition 2.2

Given $\bar{x}\in\mathcal{X}$ , its support $\bar{X}$ and an ordering of $V\setminus\bar{X}$ given by $\{j_{1},\ldots,j_{r}\}$ , inequality

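$$\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})x_{j_{i}}, \tag{12}$$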

where $\bar{\delta}_{j_{i}}(\bar{x})=\text{CVaR}_{\alpha}(\sigma(\bar{x}^{j_{i}}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ , is valid for RMP (6).

Proof.

Consider a feasible point $(\hat{\psi},\hat{x})$ . We show that $(\hat{\psi},\hat{x})$ satisfies inequality (12) written for $\bar{x}\in\mathbb{B}^{n}$ .

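$$\begin{aligned}
\hat{\psi}&\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\delta_{j_{i}}(\bar{x})\hat{x}_{j_{i}}&&\text{(13a)}\\
&\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})\hat{x}_{j_{i}}.&&\text{(13b)}
\end{aligned}$$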

Inequality (13a) follows from the valid inequality (9). Inequality (13b) follows from $\bar{\delta}_{j_{i}}(\bar{x})\geq\delta_{j_{i}}(\bar{x})$ for $i=1,\dots,r$ since problem (11) is a relaxation of problem (10). This completes the proof. ∎

While inequality (12) is derived by approximate lifting, next we provide some intuition on the validity of the coefficients of the variables $x_{j},j\in V\setminus\bar{X}$ in inequality (12). Consider the upper bound on $\psi\leq\text{CVaR}_{\alpha}(\sigma(x))$ for some $x\in\mathcal{X}$ provided by the right-hand side of the inequality. First, observe that for $x=\bar{x}$ , inequality (12) holds trivially, because $x_{j}=0$ for $j\in V\setminus\bar{X}$ . Now consider the point $x=\bar{x}+\mathbf{e}_{j_{t}}$ , for some $t\in\{1,\dots,r\}$ . Then the right-hand side of inequality (12) is $\text{CVaR}_{\alpha}(\sigma(\bar{x}^{j_{t}}))$ , which is a valid upper bound on $\psi=\text{CVaR}_{\alpha}(\sigma(\bar{x}+\mathbf{e}_{j_{t}}))$ from Observation 1.1 (ii). The approximate lifting argument establishes that the inequality is valid for other choices of $x$ as well, because the coefficients of the variables are no smaller than those obtained from an exact lifting problem.

In Algorithm 2, we propose a greedy method that generates an inequality (12) given an incumbent $\bar{x}$ . In the for loop, Line 2 determines the entry $j_{i}$ by choosing the candidate index $s$ for which $\bar{\delta}_{j_{i}=s}(\bar{x})$ attains its smallest value, where the candidate $s$ has not been chosen previously.
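Since the listing of Algorithm 2 is not reproduced here, the following Python sketch is only one plausible reading of the greedy rule described above: at each position it picks the not-yet-chosen element whose coefficient $\bar{\delta}$ would be smallest. It uses that $\bar{x}^{j_{t}}$ is the characteristic vector of $\bar{X}\cup\{j_{1},\dots,j_{t}\}$ . The toy scenarios, the brute-force oracle, and the name `cut12_greedy` are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical toy data: (probability, {element j: items covered by j}).
SCENARIOS = [
    (0.5, {0: {0, 1}, 1: {1, 2}, 2: {3}}),
    (0.3, {0: {0}, 1: set(), 2: {3, 4}}),
    (0.2, {0: set(), 1: {2}, 2: set()}),
]
V = [0, 1, 2]


def sigma(X, cover):
    """Coverage count, a nondecreasing submodular function of X."""
    return len(set().union(*(cover[j] for j in X)))


def cvar(X, alpha):
    """Brute-force CVaR oracle over the scenario list (stand-in oracle)."""
    vals = [sigma(X, cover) for _, cover in SCENARIOS]
    probs = [p for p, _ in SCENARIOS]
    return max(
        eta - sum(p * max(eta - v, 0) for v, p in zip(vals, probs)) / alpha
        for eta in vals
    )


def cut12_greedy(Xbar, alpha):
    """Return (constant, coefficients, lifting order) of an inequality (12)."""
    base = cvar(Xbar, alpha)
    chosen = set(Xbar)            # Xbar plus the elements lifted so far
    remaining = [j for j in V if j not in Xbar]
    coef, order = {}, []
    while remaining:
        # Greedy step: smallest coefficient among unchosen candidates.
        s = min(remaining, key=lambda j: cvar(chosen | {j}, alpha))
        chosen.add(s)
        remaining.remove(s)
        coef[s] = cvar(chosen, alpha) - base  # relaxed lifting coefficient
        order.append(s)
    return base, coef, order


def all_subsets(ground):
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]


if __name__ == "__main__":
    print(cut12_greedy(frozenset({0}), 0.3))
```

Any ordering yields a valid inequality by Proposition 2.2; the greedy order only aims at keeping early coefficients small.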

Finally, we show the correctness of Algorithm 1 based on the inequalities (7) or (12).

Proposition 2.3

Algorithm 1 with optimality cuts (7) and/or (12) finitely converges to an optimal solution.

Proof.

From Propositions 2.1 and 2.2, for all $\bar{X}\subseteq V$ and $x\in\mathcal{X}$ , we have

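$$\psi\leq\text{CVaR}_{\alpha}(\sigma(x))\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)x_{j},$$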

and

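$$\psi\leq\text{CVaR}_{\alpha}(\sigma(x))\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})x_{j_{i}}.$$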

Hence, the following two linear integer programs,

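$$\max\Big\{\psi:\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{j\in V\setminus\bar{X}}\big(\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))\big)x_{j}\quad\forall\bar{X}\subseteq V,\ x\in\mathcal{X},\ \psi\in\mathbb{R}\Big\}$$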

and

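$$\max\Big\{\psi:\psi\leq\text{CVaR}_{\alpha}(\sigma(\bar{x}))+\sum_{i=1}^{r}\bar{\delta}_{j_{i}}(\bar{x})x_{j_{i}}\quad\forall\bar{X}\subseteq V,\ x\in\mathcal{X},\ \psi\in\mathbb{R}\Big\},$$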

are equivalent to Problem (3). Since the number of feasible solutions and the number of inequalities (7) and (12) are finite, the result follows. ∎
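The delayed constraint generation loop can be sketched in Python on a toy instance. This is a minimal illustration under stated assumptions, not the authors' implementation: the relaxed master problem is solved by enumerating a cardinality-constrained feasible set instead of by branch-and-cut, the CVaR oracle is a brute-force scenario evaluation, the initial bound $\psi\leq\text{CVaR}_{1}(\sigma(V))$ is added to keep the first master problem bounded, and the data and names (`delayed_cut_generation`, `SCENARIOS`) are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: (probability, {element j: items covered by j}).
SCENARIOS = [
    (0.5, {0: {0, 1}, 1: {1, 2}, 2: {3}}),
    (0.3, {0: {0}, 1: set(), 2: {3, 4}}),
    (0.2, {0: set(), 1: {2}, 2: set()}),
]
V = [0, 1, 2]


def sigma(X, cover):
    return len(set().union(*(cover[j] for j in X)))


def cvar(X, alpha):
    """Brute-force CVaR oracle over the scenario list (stand-in oracle)."""
    vals = [sigma(X, cover) for _, cover in SCENARIOS]
    probs = [p for p, _ in SCENARIOS]
    return max(
        eta - sum(p * max(eta - v, 0) for v, p in zip(vals, probs)) / alpha
        for eta in vals
    )


def cut7(Xbar, alpha):
    base = cvar(Xbar, alpha)
    coef = {j: cvar(Xbar | {j}, 1.0) - base for j in V if j not in Xbar}
    return base, coef


def rhs(cut, X):
    base, coef = cut
    return base + sum(c for j, c in coef.items() if j in X)


def all_subsets(ground):
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]


def delayed_cut_generation(alpha, k, eps=1e-9):
    """Return (best set, its CVaR, number of cuts) for max CVaR with |X| <= k."""
    feasible = [X for X in all_subsets(V) if len(X) <= k]
    top = cvar(frozenset(V), 1.0)  # trivial valid upper bound on psi
    cuts, lb, best = [], float("-inf"), None
    for _ in range(len(feasible) + 1):  # each pass visits a new incumbent
        # Relaxed master problem, solved here by plain enumeration.
        ub, xbar = max(
            ((min([top] + [rhs(c, X) for c in cuts]), X) for X in feasible),
            key=lambda pair: pair[0],
        )
        value = cvar(xbar, alpha)       # oracle call gives a lower bound
        if value > lb:
            lb, best = value, xbar
        if ub - lb <= eps:              # optimality gap closed
            return best, lb, len(cuts)
        cuts.append(cut7(xbar, alpha))  # optimality cut at the incumbent
    raise RuntimeError("did not converge")


if __name__ == "__main__":
    print(delayed_cut_generation(alpha=0.3, k=2))
```

Because each cut is tight at the incumbent that generated it, revisiting an incumbent closes the gap, which mirrors the finiteness argument in the proof above.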

Next we report our computational experience with the proposed method.

3 An Application: the Risk-Averse Set Covering Problem (RASC)

We demonstrate the proposed methods on a variant of a risk-averse set covering problem (RASC). We represent RASC on a bipartite graph $G=(V_{1}\cup V_{2},E)$ . There are two groups of nodes $V_{1}$ and $V_{2}$ in $G$ , where all arcs in $E$ are from $V_{1}$ to $V_{2}$ . Let $V_{2}:=\{1,\dots,m\}$ be a set of items. Let $S_{j}\subseteq V_{2},j\in V_{1}:=\{1,\dots,n\}$ be a collection of $n$ subsets, where $\bigcup_{j=1}^{n}S_{j}=V_{2}$ . There exists an arc $(i,j)\in E$ if $j\in S_{i}$ for $i\in V_{1}$ representing the covering relationship. In RASC, there is uncertainty on whether an arc appears in the graph. To formulate RASC, first consider a deterministic set covering problem with a feasibility set $\{x\in\mathbb{B}^{n}|\sum_{j\in V_{1}}t_{ij}x_{j}\geq h_{i},\quad\forall i\in V_{2}\}$ , where $h_{i}=1$ for all $i\in V_{2}$ , and $t_{ij}=1$ if $i\in S_{j}$ ; otherwise, $t_{ij}=0$ for all $i\in V_{2}\setminus S_{j},j\in V_{1}$ . We say that an item $i\in V_{2}$ is covered, if there exists $x_{j}=1$ for $j\in V_{1}$ and $t_{ij}=1$ . Now suppose that there is uncertainty on whether a chosen subset can cover an item, where constraint $t_{ij}x_{j}\geq h_{i}$ has random constraint coefficients $t_{ij}$ or random right-hand side $h_{i}$ for $i\in V_{2},j\in V_{1}$ . A related class of risk-averse problems, referred to as the probabilistic set covering problems, consider a feasible set $\{x\in\mathbb{B}^{n}|\mathbb{P}(\sum_{j\in V_{1}}t_{ij}x_{j}\geq h_{i},\quad\forall i\in V_{2})\geq 1-\alpha\}$ , where $\alpha$ is a given risk level and $t_{ij}$ and/or $h_{i}$ are random variables for $i\in V_{2},j\in V_{1}$ . Here the objective is to select a minimum cost selection of subsets such that the probability of covering all items is at least the risk threshold $1-\alpha$ (see, e.g., [5, 19, 8, 2]). 
In contrast, in this paper, we assume that $h_{i}=1$ for all $i\in V_{2}$ , and that there is uncertainty on $t_{ij}$ for all $i\in V_{2}$ and $j\in V_{1}$ and consider a different type of risk aversion, where we aim to choose at most $k$ subsets from the collection so that the CVaR of the number of covered items at a risk level $\alpha$ is maximized. In this section, we consider an independent probability coverage model, where each node $j$ has an independent probability $a_{ij}$ of being covered by node $i\in V_{1}$ for $j\in S_{i}$ , i.e., $\mathbb{P}(t_{ij}=1)=a_{ij}$ .

Let $\sigma(x)$ be a random variable representing the number of covered items in $V_{2}$ for a given $x$ , i.e., $\sigma(x):=|\{i\in V_{2}:\exists j\in V_{1}\text{ with }x_{j}=1\text{ and }t_{ij}=1\}|$ . It is known that $\sigma(x)(\omega)$ is submodular for $\omega\in\varOmega$ [25]. Given an integer $k$ and $\alpha\in(0,1]$ , problem (3) for RASC is

$$\max_{x\in\mathcal{X}}\ \text{CVaR}_{\alpha}(\sigma(x)),\qquad(14)$$

which is in the form of (3), where $\mathcal{X}$ is given by $\mathcal{X}=\{x\in\mathbb{B}^{n}:\sum_{i\in V_{1}}x_{i}\leq k\}$ . Next we propose an efficient CVaR oracle to evaluate the objective function in problem (14) for a given $x\in\mathcal{X}$ under the probability distribution of interest.
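To make the definition of $\sigma(x)$ and the cardinality-constrained set $\mathcal{X}$ concrete, the following Python sketch evaluates the coverage count for a single fixed realization of the coefficients $t_{ij}$ and enumerates $\mathcal{X}$ for a toy instance. The function names `covered_count` and `feasible_selections`, the 0-based layout `t[i][j]` (item $i$, subset $j$), and the toy data are assumptions made for illustration. Enumeration is only a sanity check for tiny instances and is not the method proposed in the paper.

```python
from itertools import combinations

def covered_count(x, t):
    """sigma(x) for one realization t, where t[i][j] = 1 if subset j covers item i."""
    return sum(1 for row in t if any(xj and tij for xj, tij in zip(x, row)))

def feasible_selections(n, k):
    """Yield every binary vector x of length n with at most k ones."""
    for size in range(k + 1):
        for chosen in combinations(range(n), size):
            yield [1 if j in chosen else 0 for j in range(n)]

# Toy realization (assumed data): 3 items (rows) and 3 subsets (columns).
t = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]

# Diminishing returns for this realization: adding subset 2 to the smaller
# selection {0} gains at least as much as adding it to the larger selection {0, 1}.
gain_small = covered_count([1, 0, 1], t) - covered_count([1, 0, 0], t)
gain_large = covered_count([1, 1, 1], t) - covered_count([1, 1, 0], t)

# Best selection with at most k = 2 subsets under this single realization.
best = max(feasible_selections(3, 2), key=lambda x: covered_count(x, t))
```

Here the count is deterministic because $t$ is fixed. In RASC the coefficients are random, so the objective is the CVaR of this count rather than the count itself.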

Proposition 3.1

There exists a polynomial-time oracle that computes the function $\text{CVaR}_{\alpha}(\sigma(x))$ for $x\in\mathcal{X}$ for RASC under the independent probability coverage model.

Proof.

We follow the notation of [25] to describe the CVaR oracle. From [9, 18, 21], we know that $\mathbb{P}(\sigma(x)=b)$ is the probability mass function of a Poisson binomial distribution, and we use a dynamic program (DP) to evaluate $A(x,i,j)$ , which is defined as the probability that the selection $x$ covers exactly $j$ nodes among the first $i$ nodes of $V_{2}$ for $0\leq j\leq i,i\in V_{2}$ . Barlow and Heidtmann [4], Zhang et al. [27], and Wu and Küçükyavuz [25] use the DP to calculate $\mathbb{P}(\sigma(x)\geq b)=\sum_{j=b}^{m}A(x,m,j)$ . Next, we show that we can use the same recursion to evaluate $\text{CVaR}_{\alpha}(\sigma(x))$ . From the definition of $\text{VaR}_{\alpha}(\sigma(x))$ in (1), we have $\text{VaR}_{\alpha}(\sigma(x))=\min\{j\in\mathbb{Z}_{+}:\sum_{i=0}^{j}A(x,m,i)\geq\alpha\}.$ The function $\text{CVaR}_{\alpha}(\sigma(x))$ is given by

[TABLE]

For a given $x$ , the running time of the DP is $\mathcal{O}(nm+m^{2})$ , because obtaining $P(x,j)$ for all $j\in V_{2}$ is $\mathcal{O}(nm)$ , and computing the recursion is $\mathcal{O}(m^{2})$ . ∎
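The oracle of Proposition 3.1 can be sketched in Python as follows. This is a minimal illustration, not the authors' C++ implementation. It assumes the lower-tail form $\text{CVaR}_{\alpha}(\sigma(x))=\eta-\frac{1}{\alpha}\mathbb{E}[(\eta-\sigma(x))_{+}]$ with $\eta=\text{VaR}_{\alpha}(\sigma(x))$ , which agrees with the VaR expression in the proof and reduces to the expectation at $\alpha=1$ . The function names, the 0-based layout `a[i][j]` (item $i$, subset $j$, following the index order of $t_{ij}$), and the numerical tolerance are assumptions.

```python
from math import prod

def cover_probs(x, a):
    """Probability that each item is covered by selection x, in O(nm) time.
    a[i][j] = P(t_ij = 1) for item i and subset j (assumed layout)."""
    return [1.0 - prod(1.0 - a_i[j] for j, xj in enumerate(x) if xj) for a_i in a]

def coverage_pmf(p):
    """Poisson binomial pmf by dynamic programming, in O(m^2) time.
    After processing the first i items, pmf[j] plays the role of A(x, i, j)."""
    pmf = [1.0]
    for p_i in p:
        nxt = [0.0] * (len(pmf) + 1)
        for j, mass in enumerate(pmf):
            nxt[j] += mass * (1.0 - p_i)   # item not covered
            nxt[j + 1] += mass * p_i       # item covered
        pmf = nxt
    return pmf

def cvar(pmf, alpha, tol=1e-12):
    """Lower-tail CVaR of an integer random variable with the given pmf (assumed form)."""
    cdf, var = 0.0, len(pmf) - 1
    for j, mass in enumerate(pmf):
        cdf += mass
        if cdf >= alpha - tol:             # VaR = min{j : sum_{i<=j} pmf[i] >= alpha}
            var = j
            break
    shortfall = sum((var - i) * pmf[i] for i in range(var))
    return var - shortfall / alpha

# Toy instance (assumed data): 2 items, 2 subsets, only subset 0 selected.
a = [[0.5, 0.5],
     [1.0, 0.0]]
pmf = coverage_pmf(cover_probs([1, 0], a))
```

For this toy instance the number of covered items is 1 or 2 with probability 0.5 each, so `cvar(pmf, 1.0)` equals the mean 1.5, while `cvar(pmf, 0.5)` equals 1.0, the average over the worst half of the outcomes.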

We study the computational performance of our proposed methods on RASC. All experiments were run on a Windows 8.1 machine with an x64-based Intel Core i5-4200U 1.60 GHz CPU and 8 GB DRAM, using C++ with IBM ILOG CPLEX 12.7. We set the mixed-integer programming search method to traditional branch-and-cut, add cuts through the lazy constraint callback of CPLEX, and turn presolve off. The number of threads is set to one. All other CPLEX options are left at their default values. The time limit is 1800 seconds. For RASC, we generate a complete bipartite graph with arcs from all nodes $i\in V_{1}$ to all $j\in V_{2}$ . Let $\mathcal{V}=V_{1}\cup V_{2}$ . (Note that $V=V_{1}$ for RASC, because we only select nodes from $V_{1}$ .) We follow the data generation scheme of [25] to set $n$ , $m$ , and $a_{ij}$ for each arc $(i,j)$ . We let $k\in\{3,5\}$ and $\alpha\in\{0.025,0.05\}$ . The size of the bipartite graphs is $|\mathcal{V}|\in\{50,100,150\}$ .

In Table 1, we report the performance of Algorithm 1, referred to as “Oracle”, using three types of valid inequalities: the L-shaped cut of [12], inequality (7), and the lifted inequality (12). Note that for a given incumbent solution $\bar{x}$ , which is the characteristic vector of the set $\bar{X}$ , we consider the L-shaped cut

[TABLE]

For $x_{j}=1$ for any $j\in V\setminus\bar{X}$ , inequality (15) gives a valid upper bound for $\psi=\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ , given by $\text{CVaR}_{1}(\sigma(\bar{x}))$ for any $\alpha\in(0,1]$ , from Observation 1.1 (ii). Clearly, inequalities (7) are stronger than inequalities (15), because $\text{CVaR}_{1}(\sigma(\bar{x}+\mathbf{e}_{j}))\leq\text{CVaR}_{1}(\sigma(V))$ , for $j\in V\setminus\bar{X}$ .

Column “Oracle-LShape” denotes Algorithm 1 with the L-shaped cut (15). Column “Oracle-Ineq (7)” denotes Algorithm 1 with inequality (7). Column “Oracle-Ineq (12)” denotes Algorithm 1 with inequality (12). Column “Oracle-Ineq (7) and (12)” denotes Algorithm 1 with both inequalities (7) and (12). Column “Time” denotes the total solution time of RMP for each instance, in seconds. Column “Cuts” denotes the total number of user cuts added to RMP. Column “Nodes” denotes the number of branch-and-bound nodes explored in RMP. From Table 1, we observe that the solution time increases as $k$ and $|\mathcal{V}|$ increase for all methods. Oracle-LShape cannot solve most instances within the time limit. Oracle-Ineq (7) and Oracle-Ineq (12) are faster than Oracle-LShape on the instances that Oracle-LShape solves within the time limit. In addition, Oracle-Ineq (7) or Oracle-Ineq (12) generates fewer optimality cuts and explores fewer nodes than Oracle-LShape.

For most of the instances with $|\mathcal{V}|\leq 100$ , Oracle-Ineq (12) generates fewer optimality cuts and explores fewer nodes than Oracle-Ineq (7). However, for the instances with $|\mathcal{V}|=150$ , the solution time of Oracle-Ineq (12) is longer than that of Oracle-Ineq (7). Recall that for a given incumbent solution $\bar{x}$ , inequality (12) is generated by Algorithm 2. In Algorithm 2, we observe that for a given $\bar{x}$ and $1\leq i\leq i^{\prime}\leq r$ , we have $\bar{\delta}_{j_{i}}(\bar{x})\leq\bar{\delta}_{j_{i^{\prime}}}(\bar{x})\leq\text{CVaR}_{\alpha}(\sigma(V_{1}))-\text{CVaR}_{\alpha}(\sigma(\bar{x})).$ From this relation, if $|\mathcal{V}|$ is large, then there may exist a large number of nodes with a high value of the lifting function $\delta_{j_{i}}(\bar{x})$ , close to the coefficient in the L-shaped cut (15), i.e., $\text{CVaR}_{\alpha}(\sigma(V_{1}))-\text{CVaR}_{\alpha}(\sigma(\bar{x}))$ . This explains why Oracle-Ineq (12) does not perform well on instances with large $|\mathcal{V}|$ . To benefit from the complementary strengths of inequalities (7) and (12), we add both classes of inequalities at each iteration in Algorithm 2. In Table 1, we observe that for the instances that are not solvable by either Oracle-Ineq (12) or Oracle-Ineq (7) within 1800 seconds, the solution time of Oracle-Ineq (7) and (12) is shorter. Furthermore, for the hard instance $(|\mathcal{V}|,\alpha,k)=(150,0.05,5)$ , only Oracle-Ineq (7) and (12) provides an optimal solution within the time limit. In conclusion, our proposed inequalities enable the effective solution of difficult instances of the RASC problem that cannot be solved with existing methods within the set time limit.

Acknowledgments

We thank the AE and the reviewer for their comments that improved the presentation. This work is supported, in part, by the National Science Foundation Grant #1907463.

Bibliography

[1] S. Ahmed and A. Atamtürk. Maximizing a class of submodular utility functions. Mathematical Programming, 128(1):149–169, 2011.
[2] S. Ahmed and D. J. Papageorgiou. Probabilistic set covering with correlations. Operations Research, 61(2):438–452, 2013.
[3] P. Artzner, F. Delbaen, J. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance, 9(3):203–228, 1999.
[4] R. E. Barlow and K. D. Heidtmann. Computing $k$-out-of-$n$ system reliability. IEEE Transactions on Reliability, 33(4):322–323, 1984.
[5] P. Beraldi and A. Ruszczyński. The probabilistic set-covering problem. Operations Research, 50(6):956–967, 2002.
[6] A. A. Bian, B. Mirzasoleiman, J. Buhmann, and A. Krause. Guaranteed non-convex optimization: Submodular maximization over continuous domains. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS, pages 111–120, 2017.
[7] L. Chen, H. Hassani, and A. Karbasi. Online continuous submodular maximization. In Proceedings of the 21st International Conference on Artificial Intelligence and Statistics, AISTATS, 2018.
[8] M. Fischetti and M. Monaci. Cutting plane versus compact formulations for uncertain (integer) linear programs. Mathematical Programming Computation, 4(3):239–273, 2012.