Risk-Averse Models in Bilevel Stochastic Linear Programming
J. Burtscheidt, M. Claus, S. Dempe

TL;DR
This paper studies bilevel stochastic linear programming models where the leader's risk-averse decision-making is analyzed under distributional perturbations, providing stability, continuity, and reformulation results.
Contribution
It introduces stability and differentiability results for risk measures in bilevel stochastic problems and offers a reformulation approach for finite discrete distributions.
Findings
Qualitative stability under probability distribution perturbations
Lipschitz continuity and differentiability conditions for risk measures
Reformulation of finite discrete distribution problems as standard bilevel problems
Abstract
We consider bilevel linear problems, where some parameters are stochastic, and the leader has to decide in a here-and-now fashion, while the follower has complete information. In this setting, the leader's outcome can be modeled by a random variable, which we evaluate based on some law-invariant convex risk measure. A qualitative stability result under perturbations of the underlying probability distribution is presented. Moreover, for the expectation, the expected excess, and the upper semideviation, we establish Lipschitz continuity as well as sufficient conditions for differentiability. Finally, for finite discrete distributions, we reformulate the bilevel stochastic problems as standard bilevel problems and propose a regularization scheme for bilevel linear problems.
J. Burtscheidt, Faculty of Mathematics, University of Duisburg-Essen, Campus Essen, Thea-Leymann-Str. 9, D-45127 Essen, Germany, [johanna.burtscheidt][matthias.claus]@uni-due.de
M. Claus, Faculty of Mathematics, University of Duisburg-Essen, Campus Essen, Thea-Leymann-Str. 9, D-45127 Essen, Germany, [johanna.burtscheidt][matthias.claus]@uni-due.de
S. Dempe, Faculty of Mathematics and Computer Science, TU Bergakademie Freiberg, Akademiestraße 6, D-09599 Freiberg, Germany, [email protected]
Keywords: Bilevel Stochastic Programming, Risk Measures, Differentiability, Stability, Finite Discrete Models
AMS Subject Classification: 90C15, 90C26, 90C31, 90C34, 91A65
1 Introduction
Bilevel problems arise from the interplay of two decision makers at different levels of a hierarchy. The leader decides first and passes the upper level decision on to the follower. Incorporating the leader’s decision as a parameter, the follower then solves the lower level problem reflecting his or her own goals and returns an optimal solution back to the leader. The leader’s outcome depends on both his or her decision and the solution that is returned from the lower level. In bilevel optimization, it is assumed that the leader has full information about the influence of his or her decision on the lower level problem. As the latter may have more than one solution, models typically consider the case where the follower returns either the best (optimistic model) or the worst (pessimistic model) solution with respect to the leader’s objective. The bilevel optimization problem is to find an optimal upper level decision which, even in a linear setting, results in a nonconvex, nondifferentiable and NP-hard problem (cf. [9, Chapter 3]).
The present work is on bilevel stochastic linear problems, where the realization of some random vector whose distribution does not depend on the upper level decision enters the lower level problem as an additional parameter. It is assumed that the leader has to make his or her decision without knowing the realization of the randomness, while the follower decides under full information. This setting encapsulates two-stage stochastic programming with linear recourse as the special case, where the upper and lower level objective functions coincide.
In classical two-stage stochastic programming, the upper level objective function gives rise to a family of random variables defined by the optimal value function of the recourse problem. In contrast, the arising random variables in optimistic bilevel stochastic programming models depend on the optimal value of a problem where only optimal solutions of the lower level problem are feasible and the decision is made by a different actor. This is a crucial difference that entails a loss of convexity and poses additional challenges.
Nevertheless, bilevel stochastic problems are of great relevance for practical applications and have been discussed in the context of pricing of electricity swing options ([25]), economics ([6]), supply chain planning ([39]), telecommunications ([43]) and general agency problems ([18]). Other works focus on solution methods ([5]), bilevel stochastic problems with knapsack constraints ([24]) and SMPECs ([29]).
In [21], Ivanov examines bilevel stochastic linear problems with uncertainty in the right-hand side of the lower level problem and utilizes the Value-at-Risk to rank the arising random variables. The results include continuity of the objective function, the existence of a solution, and equivalence to a mixed-integer linear program, if the underlying distribution is finite discrete. The latter result has been extended to the fully random case in [12].
In the present work, we rank the random variables arising from right-hand side uncertainty in the lower level by law-invariant risk measures. In particular, we consider the expectation, the expected excess over a fixed target level, the mean upper semideviation and the Conditional Value-at-Risk and establish Lipschitz continuity of the resulting objective function.
It is well known that stochastic programming models may be smoother than their underlying deterministic counterparts. For instance, for a class of stochastic Stackelberg games employing the expectation, differentiability has been derived in [8]. Overcoming additional challenges arising from nondifferentiable integrands, we establish continuous differentiability for bilevel stochastic linear problems using the expectation, the expected excess or the mean upper semideviation.
Incomplete information or the need for computational efficiency may lead to optimization models where an approximation of the true underlying distribution is employed. This motivates the analysis of the behavior of optimal values and (local) optimal solution sets under perturbations of the underlying distribution (see e.g. [30], [32] and [33] for stability analysis of related models). For bilevel stochastic linear problems, we establish a qualitative stability result that holds for all law-invariant convex risk measures.
All our results regarding finiteness, (Lipschitz) continuity, differentiability and stability cover both the optimistic and the pessimistic approach of bilevel stochastic linear programming.
For finite discrete distributions and optimistic models, we show that the risk-averse bilevel stochastic linear problems using the expectation, the expected excess or the mean upper semideviation are equivalent to standard bilevel linear problems. The resulting problems for the expectation and expected excess have at most one coupling constraint involving variables from different scenarios, which paves the way for decomposition approaches.
Finally, we show that a simplified version of the regularization scheme in [41] can be used to solve bilevel linear problems.
2 Model
Using the optimistic model, we shall consider parametric bilevel linear problems of the form
[TABLE]
where is nonempty, and are vectors, and is the lower level optimal solution set mapping defined by
[TABLE]
with matrices , and a vector . A bilevel stochastic program arises if we assume that the parameter is the realization of a known random vector defined on some probability space . We impose an additional non-anticipativity constraint that creates the following pattern of decision and observation:
Leader decides → is revealed → Follower decides .
Throughout the analysis, we assume the stochasticity to be purely exogenous, i.e. the distribution of to be independent of . In this setting, the leader’s decision gives rise to the random variable
[TABLE]
and the problem can be understood as picking an optimal random variable from the family . We shall rank these random variables according to some mapping , i.e. consider the bilevel stochastic problem
[TABLE]
We shall assume that there is some such that the restriction is real-valued, convex, nondecreasing w.r.t. the -almost sure partial order. Furthermore, let be law-invariant, i.e. whenever the induced Borel measures and coincide.
Remark 2.1.
The above assumptions are fulfilled for any law-invariant convex risk measure in the sense of [14, 17] (see also [15, 16]). However, we do not assume translation equivariance for the present analysis.
Example 2.2.
1. The expectation ,
2. the expected excess over a fixed target level ,
3. any weighted sum of the expectation and the upper semideviation with and
4. the Conditional Value at Risk
[TABLE]
for a fixed level (cf. [37, Theorem 10])
are law-invariant and fulfill the above assumptions (see e.g. [42], [34]). In all of the above situations can be chosen as .
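For a finite discrete random variable, the four risk measures of Example 2.2 can be evaluated directly. The following is a minimal sketch under assumed conventions that are not visible in the text above: the mean upper semideviation is taken as the expectation plus `rho` times the expected excess over the mean with `rho` in [0, 1], and the Conditional Value at Risk at level `alpha` in (0, 1) is taken in its minimization form, the minimum over `eta` of `eta` plus the expected excess over `eta` divided by `1 - alpha`. All function and parameter names are hypothetical.

```python
def expectation(values, probs):
    """E[Y] for a finite discrete Y with realizations `values` and weights `probs`."""
    return sum(p * v for v, p in zip(values, probs))


def expected_excess(values, probs, eta):
    """E[max(Y - eta, 0)]: expected excess over the target level eta."""
    return sum(p * max(v - eta, 0.0) for v, p in zip(values, probs))


def mean_upper_semideviation(values, probs, rho):
    """E[Y] + rho * E[max(Y - E[Y], 0)] (assumed convention, rho in [0, 1])."""
    mean = expectation(values, probs)
    return mean + rho * expected_excess(values, probs, mean)


def cvar(values, probs, alpha):
    """min over eta of eta + E[max(Y - eta, 0)] / (1 - alpha), alpha in (0, 1).

    The objective is convex and piecewise linear in eta with kinks only at the
    realizations of Y, so it suffices to test eta at those realizations.
    """
    return min(
        eta + expected_excess(values, probs, eta) / (1.0 - alpha)
        for eta in values
    )


values = [1.0, 2.0, 3.0, 4.0]
probs = [0.25, 0.25, 0.25, 0.25]
mean_value = expectation(values, probs)
excess_over_2 = expected_excess(values, probs, 2.0)
semideviation = mean_upper_semideviation(values, probs, 0.5)
cvar_half = cvar(values, probs, 0.5)
```

With four equally likely outcomes and `alpha = 0.5`, the Conditional Value at Risk coincides with the mean of the two worst outcomes, which is a quick sanity check of the minimization form.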
3 Structural properties
In this section, we shall consider the case where is given by the , or and examine properties of the mapping given by
[TABLE]
First, we shall prove that the function defined above is Lipschitz continuous and hence Borel measurable.
Lemma 3.1.
Assume that , then is real-valued and Lipschitz continuous on the polyhedron .
Proof.
By [13], implies . Consequently, the linear program in the definition of is solvable for any by parametric linear programming theory (see [2]). Consider any . Without loss of generality, assume that and let be such that . Following [22] we obtain
[TABLE]
for any . Let denote the Euclidean unit ball, then Theorem 7.1 in the Appendix yields
[TABLE]
and hence . ∎
Remark 3.2.
In view of Theorem 7.1 in the Appendix, the above result can be easily extended to the case of a convex quadratic lower level problem.
The next result follows directly from linear programming theory and provides verifiable conditions for :
Lemma 3.3.
holds if and only if there exists such that
1. is nonempty,
2. there is some satisfying and , and
3. the function is bounded from below on .
Under these conditions,
[TABLE]
is attained for any .
Under an appropriate moment condition, Lemma 3.1 implies finiteness and Lipschitz continuity of , and . Let
[TABLE]
denote the set of Borel probability measures on with finite moments of order .
Theorem 3.4.
Assume and . Then the mappings , , and are real-valued and Lipschitz continuous on
[TABLE]
for any , and .
Proof.
: Let be the Lipschitz constant from Lemma 3.1. For any and we have
[TABLE]
Furthermore,
[TABLE]
holds for any .
: Invoking and the Lipschitz continuity of on , finiteness and Lipschitz continuity of can be shown by the same arguments as for .
: Finiteness and Lipschitz continuity follow from the corresponding results for and .
: Consider the mapping , . By the results for , we have and the restriction is Lipschitz continuous w.r.t. the -norm. Consequently, the composition is finite and Lipschitz continuous on by [35, Corollary 3.7 and the subsequent remark].
∎
Under the assumptions of Theorem 3.4, the bilevel stochastic linear problem is solvable whenever is a nonempty compact subset of . A similar result holds for a comprehensive class of risk measures and shall be discussed in Section 4 (cf. Corollary 4.9).
We shall now focus on differentiability of . It will be convenient to reformulate as
[TABLE]
where
[TABLE]
Setting
[TABLE]
we obtain
[TABLE]
As the rows of are linearly independent, we may consider the nonempty set
[TABLE]
of lower level base matrices. A base matrix is optimal for the lower level problem for a given if it is feasible, i.e. , and the associated reduced cost vector is nonnegative. Furthermore, for any optimal base matrix , there exists a feasible base matrix satisfying
[TABLE]
Set
[TABLE]
and assume , then
[TABLE]
holds for any .
Definition 3.5.
The region of stability associated with a base matrix is the set
[TABLE]
Lemma 3.6.
Assume and let be an inner point of . Then is continuously differentiable at for any , where
[TABLE]
with
[TABLE]
Furthermore, is contained in a finite union of affine hyperplanes in and we have
[TABLE]
Proof.
and imply by definition. In view of (2), we have
[TABLE]
If holds for some , there is a neighborhood of such that holds for all . In particular, is continuously differentiable at and .
Suppose that for all . The continuity of implies that there are pairwise different base matrices such that
[TABLE]
In particular, we have
[TABLE]
i.e. for all . Thus, implies
[TABLE]
For any we shall consider the sets
[TABLE]
By (3) we have
[TABLE]
for all . Thus,
[TABLE]
We have , i.e. . Thus, implies
. Consequently, there is a neighborhood of such that for all . This implies for continuity reasons. Hence, , which contradicts for all .
It remains to show that is contained in a finite union of affine hyperplanes. Suppose that is such that holds for some , then . Consequently,
[TABLE]
is contained in a finite union of affine hyperplanes. Similarly we have
[TABLE]
, where due to the regularity of . Finally, is an affine hyperplane for any satisfying . ∎
Theorem 3.7.
Assume , , and let be such that . Then is continuously differentiable at and
[TABLE]
Proof.
We shall prove that Lemma 7.2 in the Appendix is applicable. First, note that condition (a) is satisfied by and Lemma 3.6. Furthermore, by there is a neighborhood of that is contained in . In particular, is well-defined and finite by Theorem 3.4, i.e. the first part of condition (b) of Lemma 7.2 is satisfied. To see that the second part holds as well, let denote the Lipschitz constant from Lemma 3.1. Fix any and , then
[TABLE]
follows immediately from the characterization of the derivative in Lemma 3.6. Thus, Lemma 7.2 yields the differentiability of .
We shall now prove that the derivative is indeed continuous. By construction, there exists a neighborhood of such that holds for any . Consequently, by and the previous arguments, is differentiable at any and we have
[TABLE]
where and
[TABLE]
By Lemma 7.3 in the Appendix, the set-valued mapping ,
[TABLE]
is outer semicontinuous. Furthermore, by the arguments used in the proof of Lemma 3.6 we obtain
[TABLE]
and thus implies .
We shall use the above representation to prove that for any , the mapping , is continuous at . Consider any sequence that converges to . Without loss of generality we may assume that holds for all . We have
[TABLE]
where
[TABLE]
denotes the indicator function associated with the set and the final inequality is obtained by using Fatou’s Lemma. We shall show that
[TABLE]
holds for any . If the left-hand side in (4) equals zero, the above inequality holds because the right-hand side is nonnegative. On the other hand, implies that there is a subsequence of such that holds for all . Thus, by definition and (4) is satisfied.
Invoking (4) and the previous estimates we obtain
[TABLE]
where the second inequality holds due to the outer semicontinuity of and the monotonicity of the indicator function. Consequently, is upper semicontinuous at for any .
By and the arguments used in the proof of Lemma 3.6,
[TABLE]
holds for any . By for any satisfying , (5) implies
[TABLE]
for any . Consequently, as is upper semicontinuous at for any , we obtain that
[TABLE]
is representable as a sum of functions that are lower semicontinuous at . Thus, is continuous at for any , which implies the continuity of
[TABLE]
at . ∎
When working with the expected excess, the inner maximum may cause additional points of nondifferentiability.
Theorem 3.8.
Assume , , and let and be such that , where
[TABLE]
Then is continuously differentiable at .
Proof.
Consider the mapping given by
[TABLE]
which is finite and Lipschitz continuous on by Lemma 3.1. Consider any fixed $z_{0}\in\mathrm{supp}\,\mu_{Z}\setminus\big(\mathcal{N}_{x_{0}}\cup\mathcal{L}(x_{0},\eta)\big)$. If , there is a neighborhood of such that either for all or for all . In both cases is continuously differentiable at by Theorem 3.7.
Now consider the case where . The proof of Lemma 3.6 shows that there is some such that . In particular, we have and implies . Thus, and there is a neighborhood of such that for all . Hence, is continuously differentiable at .
Invoking Lemma 7.2 and the above considerations, the differentiability of and the continuity of
[TABLE]
at can be shown by a straightforward extension of the arguments used in the proof of Theorem 3.7. ∎
Theorem 3.9.
Assume , , and let be such that and . Then is continuously differentiable at for any .
Proof.
Fix any . By Theorem 3.7 and the definition of it is sufficient to show differentiability of the mapping . Consider the function defined by
[TABLE]
which is finite and Lipschitz continuous on by Lemma 3.1 and Theorem 3.4. Fix any $z_{0}\in\mathrm{supp}\,\mu_{Z}\setminus\big(\mathcal{N}_{x_{0}}\cup\mathcal{L}(x_{0},Q_{\mathbb{E}}(x_{0}))\big)$ and suppose that . By the proof of Lemma 3.6 there is some such that . In particular, we have and implies . Hence, , which contradicts the assumptions.
Thus, and there is a neighborhood of such that either for all or for all . In both cases is continuously differentiable at by Theorem 3.7.
Consequently, the differentiability of and the continuity of
[TABLE]
at can be shown by a straightforward extension of the arguments used in the proof of Theorem 3.7. ∎
Corollary 3.10.
Assume and that is absolutely continuous with respect to the Lebesgue measure. Fix any , then and are continuously differentiable at any . Furthermore, for any , is continuously differentiable at any satisfying .
Proof.
: Since is a finite union of affine hyperplanes, i.e. a set with Lebesgue measure zero, holds for all and the statement is a direct consequence of Theorem 3.7.
: By definition, is a finite union of affine hyperplanes, which implies for any and Theorem 3.8 is applicable.
: For any fixed , is a set of Lebesgue measure zero and the statement follows from Theorem 3.9. ∎
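The smoothing effect behind Corollary 3.10 can be seen on a one-dimensional toy that is much simpler than the bilevel setting and is not taken from the paper: the integrand max(x - z, 0) has a kink at x = z, but for Z uniformly distributed on [0, 1] the kink set has probability zero and the expectation equals x^2/2 on [0, 1], which is continuously differentiable with derivative x. The sketch below checks this numerically; the function names, the midpoint quadrature and the step sizes are illustrative choices.

```python
def expected_plus(x, n=10_000):
    """Midpoint-rule approximation of E[max(x - Z, 0)] for Z ~ U[0, 1]."""
    h = 1.0 / n
    return sum(max(x - (i + 0.5) * h, 0.0) for i in range(n)) * h


def central_difference(func, x, step=1e-3):
    """Symmetric finite-difference approximation of the derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


x0 = 0.3
value = expected_plus(x0)                      # close to x0**2 / 2
slope = central_difference(expected_plus, x0)  # close to x0
```

The derivative recovered here is the probability that Z lies below x0, i.e. the measure of the region on which the integrand has slope one.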
The previous results give sufficient conditions for differentiability of the objective function of problem (1). In the presence of differentiability, necessary optimality can be formulated in terms of directional derivatives.
Proposition 3.11.
Assume , , and . Furthermore, let be a local minimizer of problem (1) and assume that is differentiable at . Then
[TABLE]
holds for any feasible direction
[TABLE]
Proof.
, and imply that is real-valued on by Corollary 4.9 below. For a proof of the necessity of (6) we refer to [3, Proposition 2.1.2]. ∎
Corollary 3.12.
Assume , , and let be absolutely continuous with respect to the Lebesgue measure. Furthermore, assume
[TABLE]
then any local minimizer of
[TABLE]
is an element of .
Proof.
Suppose that is a local minimizer of (1), then Corollary 3.10 and Proposition 3.11 yield as . Invoking the proof of Theorem 3.7 we have
[TABLE]
and thus , which contradicts the assumptions. ∎
4 A stability result for bilevel stochastic linear problems
The aim of this section is to establish a qualitative stability result for the bilevel stochastic linear problem (1) with respect to perturbations of the underlying probability measure. Taking into account that the support of the perturbed measure may differ from the original support, we shall assume that to ensure that the objective function of (1) remains well defined.
Throughout this section, we shall consider the general case where is law-invariant and there exists some such that the restriction is a real-valued convex risk measure. Furthermore, for the sake of notational simplicity, we assume that the probability space is atomless (cf. Remark 4.1 below). Then for any and , Lemma 3.1 implies and the atomlessness ensures that there exists some such that . Thus, we may consider the mapping ,
[TABLE]
Note that the specific choice of does not matter due to the law-invariance of .
Remark 4.1.
The assumption that is atomless does not entail a loss of generality: We may just fix an arbitrary atomless probability space , consider a law-invariant convex risk measure and define the restriction via , where is an arbitrary random variable in satisfying .
Consider the parametric optimization problem
[TABLE]
As () may be nonconvex, we shall pay special attention to sets of local optimal solutions. For any open set we introduce the optimal value function ,
[TABLE]
as well as the localized optimal solution set mapping ,
[TABLE]
It is well known that additional assumptions are needed when studying stability of local solutions.
Definition 4.2.
Given and an open set , is called a complete local minimizing (CLM) set of () w.r.t. if .
Remark 4.3.
The set of global optimal solutions and any set of isolated minimizers are CLM sets. However, in general, sets of strict local minimizers may fail to be CLM sets (cf. [36]).
In the following, we shall equip with the topology of weak convergence, i.e. the topology where a sequence converges weakly to , written , iff
[TABLE]
holds for any bounded continuous function (cf. [4]). The example below shows that even may fail to be weakly continuous on the entire space .
Example 4.4.
The problem
[TABLE]
arises from a bilevel stochastic linear problem, where holds for any . Assume that is the Dirac measure at [math]. Then the above problem can be rewritten as and its optimal value is [math].
However, while the sequence converges weakly to , replacing with yields the problem
[TABLE]
whose optimal value is equal to for any .
In the present work, we shall follow the approach of [7] and confine the stability analysis to locally uniformly -integrating sets.
Definition 4.5.
A set is said to be locally uniformly -integrating iff for any there exists some open neighborhood of w.r.t. the topology of weak convergence such that
[TABLE]
A detailed discussion of locally uniformly -integrating sets is provided in [15], [26], [27], and [28]. The following examples demonstrate the relevance of the concept.
Example 4.6.
(a) Fix . Then by [15, Corollary A.47, (c)], the set
[TABLE]
of Borel probability measures with uniformly bounded moments of order is locally uniformly -integrating.
(b) Fix any compact set . By [15, Corollary A.47, (b)], the set
[TABLE]
of Borel probability measures whose support is contained in is locally uniformly -integrating.
(c) Any singleton is locally uniformly -integrating by [27, Lemma 5.2].
Theorem 4.7.
Assume and . Let be locally uniformly -integrating, then
(a) is real-valued and weakly continuous.
(b) is weakly upper semicontinuous.
In addition, assume that is such that is a CLM set of w.r.t. some open bounded set . Then the following statements hold true:
(c) is weakly continuous at .
(d) is weakly Berge upper semicontinuous at , i.e. for any open set with there exists a weakly open neighborhood of such that for all .
(e) There exists some weakly open neighborhood of such that is a CLM set for () w.r.t. for any .
Proof.
Fix any . By Lemma 3.1, is Lipschitz continuous on . Thus, there exists a constant such that
[TABLE]
and the result follows from [7, Corollary 2.4]. ∎
Remark 4.8.
The assumption is equivalent to and holds if and only if there is some such that . By Gordan’s Theorem ([19]), the latter holds iff is the only nonnegative solution to . Under this condition, the feasible set of the lower level is full dimensional for any leader’s decision and any parameter .
If the underlying distribution is fixed, the assumptions of Theorem 4.7 (a) can be weakened significantly.
Corollary 4.9.
Assume and . Then is real-valued and continuous on . In addition, assume that is nonempty and compact, then problem (1) is solvable.
Proof.
The set is locally uniformly -integrating by Example 4.6. Thus, continuity of can be established as in the proof of Theorem 4.7 (a) and the solvability of (1) is a direct consequence of the compactness of . ∎
Example 4.10.
(a) The assumptions of Section 2 are fulfilled for the expected excess of order given by
[TABLE]
where is a fixed target level (cf. [42, Example 6.22]). Thus, the mapping is continuous on under the assumptions of Corollary 4.9.
(b) The mean upper semideviation of order given by
[TABLE]
is a law-invariant coherent risk measure for any by [42, Example 6.20]. Thus, Corollary 4.9 gives sufficient conditions for continuity of .
Remark 4.11.
All results of Sections 3 and 4 can be easily extended to the pessimistic approach to bilevel stochastic linear optimization, where takes the form
[TABLE]
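The difference between the optimistic and the pessimistic approach only matters when the lower level has more than one optimal solution. A minimal sketch on a hypothetical scalar toy (not the paper's model): the follower solves min{ d*y : 0 <= y <= x + z }, and the leader's outcome is c*x + q*y. The names `c`, `q`, `d` and both functions are assumptions made for illustration.

```python
def follower_solution_set(x, z, d):
    """Optimal solutions of min{ d*y : 0 <= y <= x + z } as an interval (lo, hi)."""
    upper = x + z
    if upper < 0:
        raise ValueError("lower level problem is infeasible")
    if d > 0:
        return (0.0, 0.0)      # unique solution at the lower bound
    if d < 0:
        return (upper, upper)  # unique solution at the upper bound
    return (0.0, upper)        # d == 0: every feasible point is optimal


def leader_outcome(x, z, c, q, d, pessimistic=False):
    """c*x plus the best (optimistic) or worst (pessimistic) value of q*y."""
    lo, hi = follower_solution_set(x, z, d)
    # q*y is linear in y, so its extremes over [lo, hi] sit at the endpoints.
    candidates = (q * lo, q * hi)
    inner = max(candidates) if pessimistic else min(candidates)
    return c * x + inner


optimistic_value = leader_outcome(1.0, 1.0, c=1.0, q=1.0, d=0.0)
pessimistic_value = leader_outcome(1.0, 1.0, c=1.0, q=1.0, d=0.0, pessimistic=True)
unique_case = leader_outcome(1.0, 1.0, c=1.0, q=1.0, d=-1.0, pessimistic=True)
```

When the follower's solution is unique (here `d != 0`), both approaches return the same value.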
As any Borel probability measure is the weak limit of a sequence of measures having finite support, Theorem 4.7 justifies an approach where the true underlying measure is approximated by a sequence of finite discrete ones.
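One common way to obtain such finitely supported approximations is to use empirical measures of independent samples. The sketch below is only an illustration of that idea under stated assumptions: the integrand max(z, 0) is a stand-in for a Lipschitz continuous leader outcome, Z is taken uniform on [-1, 1] (so the exact expectation is 0.25), and the sample size and seed are arbitrary.

```python
import random


def integrand(z):
    """Hypothetical Lipschitz continuous stand-in for the leader's outcome."""
    return max(z, 0.0)


def empirical_expectation(sample):
    """Expectation of the integrand under the empirical measure of `sample`."""
    return sum(integrand(z) for z in sample) / len(sample)


rng = random.Random(0)  # fixed seed for reproducibility
sample = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]
estimate = empirical_expectation(sample)  # should be near the exact value 0.25
```

Each empirical measure is a finite discrete distribution with equal weights, so the reformulations for finite discrete distributions apply to the approximating problems.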
5 Finite discrete distributions
Throughout this section, we shall assume that the underlying random vector is discrete with a finite number of realizations and respective probabilities . Let denote the index set , then takes the form
[TABLE]
Suppose that is such that holds for some . Then the probability of is at least , i.e. should be considered as infeasible for problem (1). Consequently, can be understood as an induced constraint. Note that is a polyhedron if is a polyhedron.
We shall show that for models involving the expectation, the expected excess or the mean upper semideviation, problem (1) can be reduced to a standard bilevel linear program.
Theorem 5.1 (Expectation).
Assume and let be a polyhedron, then the risk neutral bilevel stochastic linear problem
[TABLE]
is equivalent to the optimistic bilevel linear program
[TABLE]
where is given by
[TABLE]
Proof.
We have
[TABLE]
and the result follows from . ∎
Remark 5.2.
The proof of Theorem 5.1 shows that the inner minimization problem in (8) can be decomposed into problems of similar structure.
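The scenariowise structure can be illustrated on a toy model in which each scenario has its own follower problem and the expectation is the probability-weighted sum of the scenario outcomes. Everything in this sketch is hypothetical: the lower level min{ y : y >= z - x, y >= 0 } has the unique solution max(z - x, 0), and the leader's outcome is c*x + q*y with illustrative coefficients; the grid search merely stands in for a proper solution method.

```python
def follower_response(x, z):
    """Unique solution of the toy lower level min{ y : y >= z - x, y >= 0 }."""
    return max(z - x, 0.0)


def scenario_outcome(x, z, c=1.0, q=2.0):
    """Leader's outcome c*x + q*y in a single scenario z."""
    return c * x + q * follower_response(x, z)


def expected_outcome(x, scenarios, probs):
    """Risk-neutral objective: each scenario is evaluated independently."""
    return sum(p * scenario_outcome(x, z) for z, p in zip(scenarios, probs))


scenarios = [1.0, 3.0]
probs = [0.5, 0.5]
grid = [0.5 * i for i in range(9)]  # candidate leader decisions 0.0, 0.5, ..., 4.0
best_x = min(grid, key=lambda x: expected_outcome(x, scenarios, probs))
best_value = expected_outcome(best_x, scenarios, probs)
```

Because the scenario evaluations do not interact for a fixed leader decision, they could be carried out in parallel.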
Theorem 5.3 (Expected excess).
Assume and let be a polyhedron, then for any , the risk-averse bilevel stochastic linear problem
[TABLE]
is equivalent to the optimistic bilevel linear program
[TABLE]
where is given by
[TABLE]
Proof.
We have , where
[TABLE]
holds for any . Thus,
[TABLE]
which completes the proof. ∎
Remark 5.4.
Let be given by
[TABLE]
then admits the representation
[TABLE]
Thus, the inner minimization problem in (9) decomposes into problems of similar structure.
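For orientation, reformulations of the expected excess as a linear program typically rest on the standard epigraph representation of the positive part. The display below states that identity in hypothetical notation (scenario outcomes $t_k$, probabilities $\pi_k \geq 0$, auxiliary variables $v_k$); it is a generic fact and not necessarily the exact form used in the proof above.

```latex
\[
  \max\{t - \eta,\, 0\} \;=\; \min_{v \in \mathbb{R}} \bigl\{\, v \;:\; v \geq t - \eta,\ v \geq 0 \,\bigr\},
\]
\[
  \sum_{k} \pi_k \max\{t_k - \eta,\, 0\}
  \;=\; \min_{v} \Bigl\{\, \sum_{k} \pi_k v_k \;:\; v_k \geq t_k - \eta,\ v_k \geq 0 \ \text{for all } k \,\Bigr\}.
\]
```

Since the weights are nonnegative and the constraints on each $v_k$ involve only scenario $k$, the minimization on the right separates across scenarios.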
Theorem 5.5 (Mean upper semideviation).
Assume and let be a polyhedron, then for any , the risk-averse bilevel stochastic linear problem
[TABLE]
is equivalent to the optimistic bilevel linear program
[TABLE]
where is given by
[TABLE]
Proof.
By
[TABLE]
and the representation of that was established in the proof of Theorem 5.1, we have
[TABLE]
which completes the proof. ∎
Remark 5.6.
The inner minimization problem in (10) does not decompose scenariowise due to the coupling constraints for in the description of .
Finally, we shall consider models involving the Conditional Value at Risk.
Theorem 5.7 (Conditional Value at Risk).
Assume and let be a polyhedron, then for any , the risk-averse bilevel stochastic linear problem
[TABLE]
is equivalent to
[TABLE]
Proof.
As
[TABLE]
the result follows directly from the representation of that was established in the proof of Theorem 5.3. ∎
Remark 5.8.
Every evaluation of the objective function in (11) corresponds to solving a bilevel linear problem with scalar upper level variable .
6 A regularization scheme for bilevel linear problems
In the setting of Theorems 5.1, 5.3 and 5.5, the risk-averse bilevel stochastic linear problem may be reformulated as a standard optimistic bilevel linear problem of the form
[TABLE]
where is given by
[TABLE]
for vectors , and , matrices and , and a nonempty polyhedron .
We shall discuss a solution approach for (12) that relies on replacing it with a regularized single level problem involving the Karush-Kuhn-Tucker (KKT) conditions of the lower level problem.
Theorem 6.1 (cf. [20, Theorem 3.7], [31]).
Assume that is nonempty for any . Then the following statements hold true:
(a) The optimal values of (12) and
[TABLE]
coincide.
(b) is a global minimizer of (12) if and only if there exists some such that is a global minimizer of (13).
(c) is a local minimizer of (12) if and only if there exists some such that is a local minimizer of (13).
Proof.
By assumption, the mapping
[TABLE]
is well-defined and for any there exists some such that . Furthermore, holds for any , which implies (a), (b) and the “if” part of (c).
To show the “only if” part of (c), suppose that is a local minimizer of (13). Then there exists some such that
[TABLE]
holds for any satisfying and . In particular, we have for any , which implies that is a local and thus global minimizer of the linear program
[TABLE]
Consider the mapping defined by
[TABLE]
As is Lipschitz continuous by Theorem 7.1 in the Appendix, Lipschitz continuity of follows from the same result. Suppose that is not a local minimizer of (12), then there exists a sequence such that and and
[TABLE]
hold for any and we have . The Lipschitz continuity of and imply
[TABLE]
Thus, there exists a sequence satisfying and for all . Consequently, by (15), there is some such that for any , we have and
[TABLE]
which contradicts (14). Thus, is a local minimizer of (12). ∎
Next, we use the KKT conditions of the lower level problem to replace (13) with the single-level problem
[TABLE]
The relationship between bilevel problems and mathematical programs with complementarity constraints arising from the lower level KKT system has been investigated in [10]. In the special case of bilevel linear problems, the following holds:
Theorem 6.2 (cf. [10, Theorem 3.2]).
(a) The optimal values of (13) and (16) coincide.
(b) is a global minimizer of (13) if and only if there exists some such that is a global minimizer of (16).
(c) is a local minimizer of (13) if and only if is a local minimizer of (16) for any satisfying and .
Proof.
As the lower level problem is linear, its KKT conditions are necessary and sufficient for optimality. Thus, we have if and only if there exists some such that and , which implies (a), (b) and the “if” part of (c).
To show the “only if” part of (c), let be a local minimizer of (16) for any satisfying and and suppose that is not a local minimizer of (13). Then there exist sequences and such that , and for any we have and
[TABLE]
As the mapping given by
[TABLE]
is outer semicontinuous by Lemma 7.3 in the Appendix, there exists some such that
[TABLE]
holds for all . Fix any converging sequence such that holds for any . By (19) we have . Thus, is a local minimizer of (16). In particular, there exists some such that for all , which contradicts (17). ∎
It is known that commonly used regularity conditions such as the Mangasarian-Fromovitz constraint qualification or Slater's constraint qualification are violated at every feasible point of (16) (cf. [40]). To overcome the difficulties related to this property, we propose to replace (16) by
[TABLE]
and solve this problem for . This approach and its use for solving general mathematical programs with equilibrium constraints have been investigated in [41]. For the special case of the bilevel linear optimization problem (13), we can prove the following result:
Theorem 6.3.
Let be an accumulation point of a sequence of local minimizers of problem for . Then is a local minimizer of (13).
Proof.
Without loss of generality, we may assume that converges. Suppose that is not a local minimizer of (13). Then, since is a polyhedron and is polyhedral (cf. [9, Theorem 3.1]), i.e. equal to the union of a finite number of polyhedra, there exist a direction and a sequence such that , and
[TABLE]
hold for any . As the mapping defined by (18) is outer semicontinuous, there exists a constant such that for any . In particular, there exists some vertex of such that is a vertex of for any . We shall prove that there exists some such that
[TABLE]
holds for any .
For any with and any , we have
[TABLE]
As and , this implies . Furthermore, since is feasible for , we conclude that
[TABLE]
for any .
Similarly, for any such that and , we obtain and thus
[TABLE]
for any .
Finally, for any such that and , the existence of some such that
[TABLE]
for any follows from the continuity of the mapping
[TABLE]
By the above considerations, we have
[TABLE]
and . Furthermore, as , is feasible for for any . Thus, we may assume that
[TABLE]
holds for any without loss of generality. (21), (22) and (23) imply that is feasible for for any .
Fix . We shall prove that for any , there is some such that
[TABLE]
is feasible for whenever . As $\lim_{m\to\infty}(1-\lambda)\lambda\alpha_{m}v_{n}^{\top}\big(Wd_{w}-Bd_{u}\big)=0$, there exists some such that
[TABLE]
for all . By (22), (23), and the feasibility of for , we have
[TABLE]
for any and feasibility follows from the linearity of the remaining restrictions.
As (21) implies ,
[TABLE]
holds for any and , which, by
[TABLE]
yields a contradiction to the local optimality of for . ∎
Remark 6.4.
Let be an accumulation point of a sequence of global minimizers of problem for . Then is a global minimizer of (13) (see the ideas in the proof of Theorem 2.1 in [11] in combination with [10]).
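The convergence statements in Theorem 6.3 and Remark 6.4 can be illustrated on a toy problem. The sketch below assumes that the regularization relaxes a complementarity condition of the form a * b = 0 to the inequality a * b <= eps and lets eps tend to zero; this form, the two-variable problem, and the function name `solve_relaxed` are hypothetical and stand in for the regularized problem, which is not reproduced here. A plain grid search replaces a proper solver.

```python
def solve_relaxed(eps, n=1000):
    """Grid search for  min -(a + b)  s.t.  0 <= a, b <= 1,  a * b <= eps.

    Toy stand-in (an assumption) for a complementarity-constrained problem
    whose condition a * b = 0 has been relaxed to a * b <= eps.
    Returns (optimal value, a, b).
    """
    best = None
    for i in range(n + 1):
        a = i / n
        # The objective decreases in b, so take the largest feasible b for this a.
        b = 1.0 if a <= eps else eps / a
        value = -(a + b)
        if best is None or value < best[0]:
            best = (value, a, b)
    return best


# Global minimizers of the relaxed problems for a decreasing sequence of eps.
results = {eps: solve_relaxed(eps) for eps in (0.1, 0.01, 0.001)}
limit = solve_relaxed(0.0)  # the complementarity-constrained problem itself
```

In this toy setting the relaxed optimal values equal -(1 + eps) and tend to -1, the optimal value of the complementarity-constrained problem, while one component of each minimizer tends to zero.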
7 Appendix
We shall recall some technical results used throughout the paper.
Theorem 7.1 ([23, Theorem 4.2]).
If positive semidefinite, the set-valued mapping given by
[TABLE]
is Lipschitz continuous on , i.e. there exists a constant such that holds for any .
The following result is a well-known direct consequence of Lebesgue’s Dominated Convergence Theorem:
Lemma 7.2.
Let be a Borel probability measure on , open, and such that the following conditions are satisfied:
- (a) is differentiable at for -almost all and the derivative is measurable with respect to .
- (b) There exists a neighborhood of such that
  - (i) the integral is well-defined and finite for all and
  - (ii) there is an integrable function such that holds for all and -almost all , where
[TABLE]
Then , is differentiable at and
[TABLE]
Proof.
Set
[TABLE]
By assumption, we have for -almost all and Lebesgue’s Dominated Convergence Theorem implies
[TABLE]
which completes the proof. ∎
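As a numerical illustration of Lemma 7.2, the following sketch compares a finite-difference derivative of an expectation with the expectation of the pointwise derivative. The integrand max(x - z, 0), the three-point distribution, and all names are assumptions chosen for illustration. The integrand is differentiable in x whenever x differs from z, and its difference quotients are bounded by 1, so a constant serves as the integrable bound.

```python
# Hypothetical finite discrete distribution: realizations z with probabilities p.
scenarios = [0.0, 1.0, 3.0]
probs = [0.2, 0.5, 0.3]


def f(x, z):
    """Integrand: excess of x over the level z."""
    return max(x - z, 0.0)


def df_dx(x, z):
    """Derivative of f with respect to x, valid whenever x != z."""
    return 1.0 if x > z else 0.0


def expectation(g, x):
    """Expected value of g(x, Z) under the discrete distribution above."""
    return sum(p * g(x, z) for p, z in zip(probs, scenarios))


x0, step = 2.0, 1e-4  # x0 differs from every realization of Z
finite_difference = (
    expectation(f, x0 + step) - expectation(f, x0 - step)
) / (2 * step)
expected_derivative = expectation(df_dx, x0)
```

Both quantities agree up to rounding. For a finite distribution the interchange of derivative and sum is elementary; the lemma covers general Borel probability measures.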
Lemma 7.3.
Let be closed, then the set-valued mapping ,
[TABLE]
is outer semicontinuous (cf. [38]), i.e. .
Proof.
By definition of the outer limit, holds if and only if there are sequences and such that
[TABLE]
For any such sequences we have for all and thus . Consequently, . ∎
Acknowledgement
The second author thanks the Deutsche Forschungsgemeinschaft for its support via the Collaborative Research Center TRR 154.
- [1] B. Bank, J. Guddat, D. Klatte, B. Kummer and K. Tammer, Non-Linear Parametric Optimization, Akademie-Verlag, Berlin (1982).
- [2] K. Beer, Lösung großer linearer Optimierungsaufgaben, Deutscher Verlag der Wissenschaften, Berlin (1977).
- [3] D. P. Bertsekas, Nonlinear Programming, 2nd edition, Athena Scientific, Belmont, Massachusetts (1999).
- [4] P. Billingsley, Convergence of Probability Measures, Wiley, New York (1968).
- [5] S. I. Birbil, G. Gürkan and O. Listes, Simulation-based solution of stochastic mathematical problems with complementarity constraints: Sample-path analysis, Technical report, ERIM Report Series Research in Management, ERS-2004-016-LIS (2014).
- [6] M. Carrion, J. M. Arroyo and A. J. Conejo, A bilevel stochastic programming approach for retailer futures market trading, IEEE Transactions on Power Systems, 24(3), pp. 1446-1456 (2009).
- [7] M. Claus, V. Krätschmer and R. Schultz, Weak continuity of risk functionals with applications to stochastic programming, SIAM Journal on Optimization, 27(1), pp. 91-108 (2017).
- [8] V. De Miguel and H. Xu, A stochastic multiple-leader Stackelberg model: analysis, computation, and application, Operations Research, 57(5), pp. 1220-1235 (2009).
