Newton-type algorithms for inverse optimization II: weighted span objective
Kristóf Bérczi, Lydia Mirabel Mendoza-Cadena, Kitti Varga

TL;DR
This paper introduces a new weighted span objective for inverse optimization, providing a min-max characterization and a polynomial-time Newton-type algorithm for finding optimal deviations, aiming for balanced cost modifications.
Contribution
It proposes the weighted span as a novel objective in inverse optimization and develops an efficient algorithm for its computation.
Findings
Min-max characterization of the weighted span
A strongly polynomial Newton-type algorithm for unit weights
Balanced cost modifications in inverse optimization
Kristóf Bérczi
Lydia Mirabel Mendoza-Cadena
Kitti Varga
MTA-ELTE Momentum Matroid Optimization Research Group and ELKH-ELTE Egerváry Research Group, Department of Operations Research, Eötvös Loránd University, Budapest, Hungary.
Abstract
In inverse optimization problems, the goal is to modify the costs in an underlying optimization problem in such a way that a given solution becomes optimal, while the difference between the new and the original cost functions, called the deviation vector, is minimized with respect to some objective function. The $\ell_1$- and $\ell_\infty$-norms are standard objectives used to measure the size of the deviation. Minimizing the $\ell_1$-norm is a natural way of keeping the total change of the cost function low, while the $\ell_\infty$-norm achieves the same goal coordinate-wise. Nevertheless, neither of these objectives is suitable for providing a balanced or fair change of the costs.
In this paper, we initiate the study of a new objective that measures the difference between the largest and the smallest weighted coordinates of the deviation vector, called the weighted span. We give a min-max characterization for the minimum weighted span of a feasible deviation vector, and provide a Newton-type algorithm for finding one that runs in strongly polynomial time in the case of unit weights.
Keywords: Algorithm, Bound-constraints, Inverse optimization, Min-max theorem, Span
1 Introduction
Informally, in an inverse optimization problem, we are given a feasible solution to an underlying optimization problem together with a linear objective function, and the goal is to modify the objective so that the input solution becomes optimal. Such problems were first considered by Burton and Toint [4] in the context of inverse shortest paths, and have found numerous applications in various areas ever since. We refer the interested reader to [12] for the basics of inverse optimization, to [9, 6] for surveys, and to [3, 1] for quick introductions.
There are several ways of measuring the difference between the original and the modified cost functions, the $\ell_1$- and $\ell_\infty$-norms being probably the most standard ones. Minimizing the $\ell_1$-norm of the deviation vector means that the overall change in the costs is small, while minimizing the $\ell_\infty$-norm results in a new cost function that is close to the original one coordinate-wise. However, these objectives do not provide any information about the relative magnitude of the changes on the elements compared to each other. Indeed, a deviation vector of $\ell_\infty$-norm $\delta$ might increase the cost on an element by $\delta$ while decreasing it on another by $\delta$, thus resulting in a large relative difference between the two. Such a solution may not be satisfactory in situations where the goal is to modify the costs in a fair manner, and the magnitude of the individual changes is not relevant.
To overcome these difficulties, we consider a new type of objective function that measures the difference between the largest and the smallest weighted coordinates of the deviation vector, called the weighted span. (The notion of span appears under several names in various branches of mathematics, such as range in statistics, amplitude in calculus, or deviation in engineering.) Although rather similar at first glance, the $\ell_\infty$-norm and the span behave quite differently: the $\ell_\infty$-norm measures how far the coordinates of the deviation vector are from zero, while the span measures how far the coordinates of the deviation vector are from each other. In particular, it might happen that one has to change the cost on each element by the same large number, resulting in a deviation vector with large $\ell_\infty$-norm but with span equal to zero. To the best of our knowledge, this objective has not been considered before in the context of inverse optimization.
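The contrast between the two objectives can be checked numerically. The following is a minimal sketch, assuming deviation vectors and weights are stored as Python dictionaries keyed by ground-set elements; the helper names `weighted_span` and `weighted_linf` are illustrative and not taken from the paper, and the weighted $\ell_\infty$-norm is assumed to be the largest value of $w(s)\cdot|p(s)|$.

```python
def weighted_span(p, w):
    """Largest minus smallest weighted coordinate w(s) * p(s)."""
    scaled = [w[s] * p[s] for s in p]
    return max(scaled) - min(scaled)


def weighted_linf(p, w):
    """Largest weighted absolute coordinate w(s) * |p(s)| (assumed norm definition)."""
    return max(w[s] * abs(p[s]) for s in p)


unit = {"a": 1, "b": 1, "c": 1}
uniform_shift = {"a": 5, "b": 5, "c": 5}     # every cost changed by the same amount
opposite_moves = {"a": 1, "b": -1, "c": 0}   # small changes in opposite directions

print(weighted_linf(uniform_shift, unit), weighted_span(uniform_shift, unit))    # 5 0
print(weighted_linf(opposite_moves, unit), weighted_span(opposite_moves, unit))  # 1 2
```

The uniform shift has a large norm but zero span, while the two opposite unit changes have a small norm but span 2.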
The present work is the second member of a series of papers that aims at providing simple combinatorial algorithms for inverse optimization problems under different objectives. In the first part [3] of the series, the authors considered the weighted bottleneck Hamming distance and weighted $\ell_\infty$-norm objectives, and proposed an algorithm that determines an optimal deviation vector efficiently. The algorithm is based on the following Newton-type scheme: in each iteration, we check if the input solution is optimal, and if it is not, then we “eliminate” the current optimal solution by balancing the cost difference between them.
Here we work out the details of an analogous algorithm for the weighted span objective. As it turns out, finding an optimal deviation vector under this objective is significantly more challenging than it was for the weighted $\ell_\infty$-norm. Intuitively, the complexity of the problem is caused by the fact that the underlying optimization problem may have feasible solutions of different sizes, and one has to balance very carefully between increasing the costs on certain elements and decreasing them on others to obtain a feasible deviation vector, especially when the coordinates of the deviation vector are required to fall within given lower and upper bounds.
Previous work.
In balanced optimization problems, the goal is to find a feasible solution such that the difference between the maximum and the minimum weighted variable defining the solution is minimized. Martello, Pulleyblank, Toth, and de Werra [10], Camerini, Maffioli, Martello, and Toth [5], and Ficker, Spieksma and Woeginger [8] considered the balanced assignment and the balanced spanning tree problems. Duin and Volgenant [7] introduced a general solution scheme for minimum deviation problems that is also suited for balanced optimization, and analyzed the approach for spanning trees, paths and Steiner trees in graphs. Ahuja [2] proposed a parametric simplex method for the general balanced linear programming problem as well as a specialized version for the balanced network flow problem. Scutellà [13] studied the balanced network flow problem in the case of unit weights, and showed that it can be solved by using an extension of Radzik’s [11] analysis of Newton’s method for linear fractional combinatorial optimization problems.
An analogous approach was proposed in the context of the $\ell_\infty$-norm objective by Zhang and Liu [14], who described a model that generalizes numerous inverse combinatorial optimization problems when no bounds are given on the changes. They exhibited a Newton-type algorithm that determines an optimal deviation vector when the inverse optimization problem can be reformulated as a certain maximization problem using dominant sets.
Problem formulation.
We denote the sets of real and positive real numbers by $\mathbb{R}$ and $\mathbb{R}_{+}$, respectively. For a positive integer $k$, we use $[k]\coloneqq\{1,\dots,k\}$. Let $S$ be a ground set of size $n$. Given subsets $X,Y\subseteq S$, the symmetric difference of $X$ and $Y$ is denoted by $X\triangle Y\coloneqq(X\setminus Y)\cup(Y\setminus X)$. For a weight function $w\in\mathbb{R}_{+}^{S}$, the total sum of its values over $X\subseteq S$ is denoted by $w(X)\coloneqq\sum\{w(s)\mid s\in X\}$, where the sum over the empty set is always considered to be $0$. Furthermore, we define $\frac{1}{w}(X)\coloneqq\sum\{\frac{1}{w(s)}\mid s\in X\}$, and set $\|w\|_{-1}\coloneqq\frac{1}{w}(S)$. When the weights are rational numbers, the values can be re-scaled so that $1/w(s)$ is an integer for each $s\in S$. Throughout the paper, we assume that $w$ is given in such a form without explicitly mentioning it, implying that $\frac{1}{w}(X)$ is a non-negative integer for every $X\subseteq S$.
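The re-scaling of rational weights can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: weights are exact `Fraction` values, all weights are divided by one common positive factor (which multiplies every weighted span by the same constant and hence does not change which deviation vectors are optimal), and the helper names are hypothetical.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+


def rescale_weights(w):
    """Divide all weights by a common factor so that every 1/w(s) is an integer."""
    reciprocals = [1 / Fraction(x) for x in w.values()]
    scale = lcm(*(r.denominator for r in reciprocals))
    return {s: Fraction(x) / scale for s, x in w.items()}


def inv_weight_sum(w, X):
    """(1/w)(X): sum of reciprocal weights over X; the empty sum is 0."""
    return sum((1 / Fraction(w[s]) for s in X), Fraction(0))


def norm_minus_one(w):
    """||w||_{-1} = (1/w)(S)."""
    return inv_weight_sum(w, w.keys())


scaled = rescale_weights({"a": Fraction(2, 3), "b": 4})
print(scaled)                  # a -> 1/6, b -> 1, so the reciprocals are 6 and 1
print(norm_minus_one(scaled))  # 7
```

For unit weights on $n$ elements no re-scaling is needed and $\|w\|_{-1}=n$.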
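Let $S$ be a finite ground set, $\mathcal{F}\subseteq 2^{S}$ be a collection of feasible solutions for an underlying optimization problem, $F^{*}\in\mathcal{F}$ be an input solution, $c\in\mathbb{R}^{S}$ be a cost function, $w\in\mathbb{R}_{+}^{S}$ be a positive weight function, and $\ell\colon S\to\mathbb{R}\cup\{-\infty\}$ and $u\colon S\to\mathbb{R}\cup\{+\infty\}$ be lower and upper bounds, respectively, such that $\ell\leq u$. We assume that an oracle $\mathcal{O}$ is also available that determines an optimal solution of the underlying optimization problem for any cost function $c'\in\mathbb{R}^{S}$.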
In the constrained minimum-cost inverse optimization problem under the weighted span objective $\big(S,\mathcal{F},F^{*},c,\ell,u,\operatorname{span}_{w}(\cdot)\big)$, we seek a deviation vector $p\in\mathbb{R}^{S}$ such that
(a) $F^{*}$ is a minimum-cost member of $\mathcal{F}$ with respect to $c-p$,
(b) $p$ is within the bounds $\ell\leq p\leq u$, and
(c) $\operatorname{span}_{w}(p)\coloneqq\max\{w(s)\cdot p(s)\mid s\in S\}-\min\{w(s)\cdot p(s)\mid s\in S\}$ is minimized.
Due to the lower and upper bounds $\ell$ and $u$, it might happen that there exists no deviation vector $p$ satisfying the requirements. A deviation vector is called feasible if it satisfies conditions (a) and (b), and optimal if, in addition, it attains the minimum in (c). We denote the problem by $\big(S,\mathcal{F},F^{*},c,-\infty,+\infty,\operatorname{span}_{w}(\cdot)\big)$ when no bounds are given on the coordinates of $p$ at all, and call such a problem unconstrained.
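Conditions (a) and (b) can be verified directly on a toy instance. The sketch below assumes that the family $\mathcal{F}$ is small enough to be listed explicitly (the paper only assumes oracle access) and that the modified cost function is $c-p$, which is the sign convention suggested by the update formulas of Section 3.4; all function names are illustrative.

```python
def cost(c, X):
    """Total cost of the set X."""
    return sum(c[s] for s in X)


def weighted_span(p, w):
    scaled = [w[s] * p[s] for s in p]
    return max(scaled) - min(scaled)


def is_feasible(family, F_star, c, p, lower, upper):
    """Check (a) optimality of F* under c - p and (b) the bound constraints."""
    new_cost = {s: c[s] - p[s] for s in c}  # assumed convention: costs become c - p
    within_bounds = all(lower[s] <= p[s] <= upper[s] for s in c)
    optimal = all(cost(new_cost, F_star) <= cost(new_cost, F) for F in family)
    return within_bounds and optimal


family = [frozenset("ab"), frozenset("bc"), frozenset("c")]
F_star = frozenset("ab")
c = {"a": 3, "b": 1, "c": 1}
w = {"a": 1, "b": 1, "c": 1}
no_lower = {s: float("-inf") for s in c}
no_upper = {s: float("inf") for s in c}

p = {"a": 2, "b": 1, "c": 0}
print(is_feasible(family, F_star, c, p, no_lower, no_upper), weighted_span(p, w))  # True 2
```

In this instance every member of the family has cost 1 under $c-p$, so $F^{*}$ is optimal with ties; tightening the upper bound on element `a` to 1 makes the same vector infeasible.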
As an extension, we also consider multiple underlying optimization problems at the same time. In this setting, instead of a single cost function, we are given $k$ cost functions $c^{1},\dots,c^{k}$ together with an input solution $F^{*}$, and our goal is to find a single vector $p$ with $\ell\leq p\leq u$ such that $F^{*}$ has minimum cost with respect to $c^{j}-p$ for all $j\in[k]$. In other words, condition (a) modifies to
(a’) $F^{*}$ is a minimum-cost member of $\mathcal{F}$ with respect to $c^{j}-p$ for $j\in[k]$.
In the case of multiple cost functions, we use $\{c^{j}\}_{j\in[k]}$ instead of $c$ when denoting the problem.
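The multi-cost condition (a’) simply repeats the single-cost check for every cost function with one shared deviation vector. A minimal sketch, again assuming an explicitly listed family and modified costs $c^{j}-p$; the names are hypothetical.

```python
def cost(c, X):
    return sum(c[s] for s in X)


def is_feasible_multi(family, F_star, costs, p, lower, upper):
    """F* must be optimal under c_j - p for every cost function c_j, with one common p."""
    if not all(lower[s] <= p[s] <= upper[s] for s in p):
        return False
    for c in costs:
        new_cost = {s: c[s] - p[s] for s in c}  # assumed convention: costs become c_j - p
        if any(cost(new_cost, F) < cost(new_cost, F_star) for F in family):
            return False
    return True


family = [frozenset("ab"), frozenset("bc"), frozenset("c")]
F_star = frozenset("ab")
lower = {s: -10 for s in "abc"}
upper = {s: 10 for s in "abc"}
p = {"a": 2, "b": 1, "c": 0}

c1 = {"a": 3, "b": 1, "c": 1}
c2 = {"a": 1, "b": 1, "c": 3}
c3 = {"a": 5, "b": 1, "c": 1}

print(is_feasible_multi(family, F_star, [c1, c2], p, lower, upper))      # True
print(is_feasible_multi(family, F_star, [c1, c2, c3], p, lower, upper))  # False
```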
Our results.
In [3], the authors gave Newton-type algorithms that determine an optimal deviation vector for bound-constrained minimum-cost inverse optimization problems under the weighted bottleneck Hamming distance and weighted $\ell_\infty$-norm objectives. Here our main result is an algorithm for the weighted span objective that works along the same lines. However, due to the different nature of the span, the algorithm and its analysis are significantly more complicated than those for the $\ell_\infty$-norm. We provide a min-max characterization for the weighted span of an optimal deviation vector in the unconstrained setting, i.e. when $\ell\equiv-\infty$ and $u\equiv+\infty$. Then we give an algorithm for finding an optimal deviation vector that makes $O\big(n^{2}\cdot\|w\|_{-1}^{6}\big)$ calls to the oracle $\mathcal{O}$. In particular, the algorithm runs in strongly polynomial time for unit weights if the oracle $\mathcal{O}$ can be realized by a strongly polynomial algorithm. Finally, we briefly explain how to solve the problem when multiple cost functions are given instead of a single one.
The rest of the paper is organized as follows. In Section 2, we show that it is enough to look for an optimal deviation vector having a special form. The algorithm for the case of a single cost function in the constrained setting is presented in Section 3. Finally, we give a min-max characterization of the weighted span of an optimal deviation vector in the unconstrained setting as well as a sketch of the extension of the algorithm for multiple cost functions in Section 4.
2 Optimal deviation vectors
For the weighted $\ell_\infty$-norm objective, the authors of [3] verified the existence of an optimal deviation vector that corresponds to decreasing the costs on the elements of $F^{*}$ and increasing the costs on the elements of $S\setminus F^{*}$ by the same value $\delta$, scaled by the reciprocal of the weights and truncated according to the lower and upper bound constraints.
We first show that an analogous statement holds for the weighted span as well, which serves as the basic idea of our algorithm. Assume for a moment that there are no bound constraints and that $w$ is the unit weight function. If the members of $\mathcal{F}$ have the same size, then one can always decrease the costs on the elements of $F^{*}$ by some $\delta$ in such a way that $F^{*}$ becomes a minimum-cost member of $\mathcal{F}$. As another extreme case, if $F^{*}$ is the unique minimum- or maximum-sized member of $\mathcal{F}$, then one can always shift the costs by the same number $\Delta$ in such a way that $F^{*}$ becomes a minimum-cost member of $\mathcal{F}$. The idea of our approach is to combine these two types of changes in the general case by tuning the parameters $\delta$ and $\Delta$, while also taking the weights and the bound constraints into account.
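The two extreme cases can be made concrete on small explicit families. The sketch below is an illustration under the assumptions of the paragraph above (unit weights, no bounds) plus two assumptions of its own: the family is listed explicitly, and the modified cost is $c-p$, so a positive $\delta$ or $\Delta$ lowers costs. The function names are hypothetical.

```python
from fractions import Fraction


def cost(c, X):
    return sum(c[s] for s in X)


def delta_equal_sizes(family, F_star, c):
    """All members have equal size: least delta >= 0 to subtract on F* so that F* is optimal."""
    ratios = [Fraction(cost(c, F_star) - cost(c, F)) / len(F_star - F)
              for F in family if F != F_star]
    return max([Fraction(0)] + ratios)


def shift_unique_extreme_size(family, F_star, c):
    """F* is the unique largest or smallest member: a uniform shift Delta (p = Delta everywhere)."""
    pairs = [(Fraction(cost(c, F_star) - cost(c, F)), len(F_star) - len(F))
             for F in family if F != F_star]
    if all(d > 0 for _, d in pairs):   # F* unique largest: need Delta >= gap / d
        return max([Fraction(0)] + [g / d for g, d in pairs])
    if all(d < 0 for _, d in pairs):   # F* unique smallest: need Delta <= gap / d
        return min([Fraction(0)] + [g / d for g, d in pairs])
    raise ValueError("F* is neither the unique largest nor the unique smallest member")


same_size = [frozenset("ab"), frozenset("bc"), frozenset("ac")]
print(delta_equal_sizes(same_size, frozenset("ab"), {"a": 3, "b": 1, "c": 1}))  # 2

largest = [frozenset("abc"), frozenset("a"), frozenset("bc")]
print(shift_unique_extreme_size(largest, frozenset("abc"), {"a": 1, "b": 2, "c": 3}))  # 5/2

smallest = [frozenset("a"), frozenset("bc")]
print(shift_unique_extreme_size(smallest, frozenset("a"), {"a": 5, "b": 1, "c": 1}))  # -3
```

A uniform shift has span zero, whereas the pure $\delta$ move in the first example has span 2; the general algorithm has to trade the two off.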
Consider an instance \big{(}S,\mathcal{F},F^{*},c,\ell,u,\operatorname*{span}_{w}(\cdot)\big{)} of the constrained minimum-cost inverse optimization problem under the weighted span objective, where is a positive weight function. For any , let be defined as
[TABLE]
We simply write when and . The following observation shows that there exists an optimal deviation vector of special form.
Lemma 1.
Let \big{(}S,\mathcal{F},F^{*},c,\ell,u,\operatorname*{span}_{w}(\cdot)\big{)} be a feasible minimum-cost inverse optimization problem and let be an optimal deviation vector. Then is also an optimal deviation vector, where and .
Proof.
The lower and upper bounds hold by definition, hence (b) is satisfied.
Now we show that (a) holds. The assumption and the definitions of and imply that and hold for every . Then for an arbitrary solution , we get
[TABLE]
where the last inequality holds by the feasibility of .
Finally, to see that (c) holds for , observe that and . That is, is also optimal, concluding the proof of the lemma. ∎
Corollary 2.
Let \big{(}S,\mathcal{F},F^{*},c,\ell,u,\operatorname*{span}_{w}(\cdot)\big{)} be a feasible minimum-cost inverse optimization problem. Then there exist such that is an optimal deviation vector with
[TABLE]
Moreover,
[TABLE]
Proof.
The first half is straightforward from Lemma 1. Since and hold for any , the second statement follows. ∎
3 Algorithm
Consider an instance $\big(S,\mathcal{F},F^{*},c,\ell,u,\operatorname{span}_{w}(\cdot)\big)$ of the constrained minimum-cost inverse optimization problem under the weighted span objective. The aim of this section is to give an algorithm that either finds a feasible deviation vector with minimum weighted span, or recognizes that the problem is infeasible. Let us highlight the main ideas towards achieving this.
Step 1. Deviation vector. If the problem is feasible, then by Corollary 2 there exists an optimal deviation vector of the special form introduced in Section 2 for some choice of the parameters $\delta$ and $\Delta$. Hence the problem reduces to identifying the values of $\delta$ and $\Delta$.
Step 2. Guessing $\delta$ and $\Delta$. With the help of Corollary 2, we identify two sets of intervals such that the optimal $\delta$ and $\Delta$ are each contained in a member of the corresponding set. Since each set contains $O(n)$ intervals, this enables us to reduce the original problem to $O(n^{2})$ subproblems, in each of which $\delta$ and $\Delta$ are restricted to lie within fixed intervals. The optimal solution of the original problem is then the best of the optimal solutions for these subproblems.
Step 3. Modifying the bound-constraints. Fixing the intervals for $\delta$ and $\Delta$ enables us to simplify the lower and upper bound-constraints on the coordinates of the deviation vector. This step changes neither the feasibility nor the set of optimal deviation vectors of the special form for the corresponding subproblem.
Step 4. Characterizing feasibility. Once the bound-constraints are simplified, we can characterize the feasibility of the problem. This characterization is essential as, by recognizing infeasible instances, it serves as a stopping rule in the algorithm.
Step 5. Solving the subproblems. The idea of the algorithm is to eliminate bad sets in $\mathcal{F}$ iteratively. Assume for a moment that no bounds are given on the deviation vector and that $w$ is the unit weight function. Let us call a set $F\in\mathcal{F}$ bad if it has smaller cost than the input solution $F^{*}$ with respect to the current cost function, small if $|F|<|F^{*}|$ and large if $|F|>|F^{*}|$. A small or large bad set can be eliminated by decreasing or increasing the cost on every element by the same value, respectively. Note that such a step does not change the span of the deviation vector. Therefore, it is not enough to concentrate on single bad sets, as otherwise it could happen that we jump back and forth between the same pair of small and large bad sets by alternately decreasing and increasing the costs. To prevent the algorithm from changing the costs in a cyclic way, we keep track of the small and large bad sets that were found most recently. If in the next iteration we find a bad set $F$ having the same size as $F^{*}$, then we drop the latest small and large bad sets, if they exist, and eliminate $F$ on its own. However, if we find a small or large bad set $F$, then we eliminate it together with the latest large or small bad set, if it exists, respectively. For arbitrary weights, the only difference is that the size of a set is measured in terms of $\frac{1}{w}$.
Nevertheless, the presence of bound constraints makes the problem more complicated. The difficulty is partially due to the fact that even if we hit one of the bounds on some element, the cost of that element might change later on. This is in sharp contrast to the $\ell_\infty$-norm in [3], where the cost of an element becomes fixed once it reaches one of the bounds. Fortunately, with the help of Step 3, we can overcome these difficulties.
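Setting bounds and weights aside, the Newton-type elimination loop can be sketched for the simplest situation, in which all members of $\mathcal{F}$ have the same size and only the parameter $\delta$ is needed. This is a deliberately restricted illustration and not Algorithm 1: it assumes unit weights, no bounds, equal-sized feasible sets, modified costs $c-p$, and a brute-force oracle over an explicit family; all names are hypothetical.

```python
from fractions import Fraction


def brute_force_oracle(family):
    """Stand-in for the optimization oracle: returns a minimum-cost member of the family."""
    def oracle(cost):
        return min(family, key=lambda F: sum(cost[s] for s in F))
    return oracle


def newton_equal_sizes(ground, F_star, c, oracle):
    """Raise delta until F* is optimal; each step eliminates the current optimum."""
    delta = Fraction(0)
    steps = 0
    while True:
        # Current costs: delta has been subtracted on the elements of F*.
        cur = {s: c[s] - (delta if s in F_star else 0) for s in ground}
        F = oracle(cur)
        gap = sum(cur[s] for s in F_star) - sum(cur[s] for s in F)
        if gap <= 0:
            return delta, steps
        # Balance the cost difference between F* and the bad set F.
        delta += Fraction(gap) / len(F_star - F)
        steps += 1


family = [frozenset("ab"), frozenset("bc"), frozenset("cd")]
c = {"a": 5, "b": 3, "c": 1, "d": 0}
delta, steps = newton_equal_sizes("abcd", frozenset("ab"), c, brute_force_oracle(family))
print(delta, steps)  # 4 2
```

On this instance the first step eliminates the set `cd` (raising $\delta$ to $7/2$) and the second eliminates `bc` (raising $\delta$ to $4$); since $\delta$ strictly increases and always equals the ratio associated with some member of the finite family, the loop terminates.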
3.1 Guessing $\delta$ and $\Delta$
Let us order the elements of the ground set in such a way that . By the second half of Corollary 2, we may assume that
[TABLE]
and
[TABLE]
Indeed, the corollary says that , hence we may assume that is exactly this maximum for the elements of . Similarly, , hence we may assume that is exactly this minimum for the elements of . Now order the elements of as follows: we start with the elements of in a decreasing order according to , then followed by the elements of in a decreasing order according to . This gives an ordering as requested.
By the first half of Corollary 2, we may assume that the value of falls in one of the intervals
[TABLE]
and similarly, the value of falls in one of the intervals
[TABLE]
Therefore, we pair up the intervals of and in every possible way, and consider the problem of finding an optimal deviation vector subject to the bounds on and .
3.2 Modifying the bound-constraints
By the definition of , if for some , then holds for each . Similarly, if for some , then holds for each . These observations together with Corollary 2 imply that it suffices to consider instances of the minimum-cost inverse optimization problem where the lower and upper bounds are of the form
$$\ell(s)\coloneqq\begin{cases}0&\text{if }s\in S_{0},\\ \ell^{\mathrm{in}}/w(s)&\text{if }s\in F^{*}\setminus S_{0},\\ \ell^{\mathrm{out}}/w(s)&\text{otherwise,}\end{cases}\qquad u(s)\coloneqq\begin{cases}0&\text{if }s\in S_{0},\\ u^{\mathrm{in}}/w(s)&\text{if }s\in F^{*}\setminus S_{0},\\ u^{\mathrm{out}}/w(s)&\text{otherwise}\end{cases}\tag{SPEC-LU}$$
for some , and satisfying , , if , if ,
$$\max\{w(s)\mid s\in S\setminus F^{*}\}\cdot\ell^{\mathrm{out}}\leq\min\{w(s)\mid s\in F^{*}\}\cdot\ell^{\mathrm{in}},\quad\text{and}$$
$$\max\{w(s)\mid s\in S\setminus F^{*}\}\cdot u^{\mathrm{out}}\leq\min\{w(s)\mid s\in F^{*}\}\cdot u^{\mathrm{in}}.$$
We refer to the set of these properties as (SPEC-LU). For ease of discussion, we define
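The shape of the bounds in (SPEC-LU) can be illustrated with a small helper. The sketch assumes finite values for $\ell^{\mathrm{in}},\ell^{\mathrm{out}},u^{\mathrm{in}},u^{\mathrm{out}}$ and checks only the two displayed inequalities, not the remaining side conditions of (SPEC-LU); the function names are hypothetical.

```python
from fractions import Fraction


def spec_lu_bounds(ground, F_star, S0, w, l_in, u_in, l_out, u_out):
    """Build the bounds: 0 on S0, the 'in' values on F* minus S0, the 'out' values elsewhere."""
    lower, upper = {}, {}
    for s in ground:
        if s in S0:
            lower[s], upper[s] = Fraction(0), Fraction(0)
        elif s in F_star:
            lower[s], upper[s] = Fraction(l_in) / w[s], Fraction(u_in) / w[s]
        else:
            lower[s], upper[s] = Fraction(l_out) / w[s], Fraction(u_out) / w[s]
    return lower, upper


def displayed_inequalities_hold(ground, F_star, w, l_in, u_in, l_out, u_out):
    """Check only the two displayed inequalities relating the 'out' and 'in' values."""
    w_out = max(w[s] for s in ground if s not in F_star)
    w_in = min(w[s] for s in F_star)
    return w_out * l_out <= w_in * l_in and w_out * u_out <= w_in * u_in


ground, F_star, S0 = "abcd", frozenset("ab"), frozenset("d")
w = {"a": 1, "b": 2, "c": 1, "d": 1}
lower, upper = spec_lu_bounds(ground, F_star, S0, w, l_in=0, u_in=4, l_out=-2, u_out=0)
print(lower["b"], upper["b"], lower["c"], upper["c"], upper["d"])  # 0 2 -2 0 0
print(displayed_inequalities_hold(ground, F_star, w, 0, 4, -2, 0))  # True
```

After scaling by the weights, the bounds are therefore uniform on $F^{*}\setminus S_{0}$ and on the remaining elements outside $S_{0}$.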
[TABLE]
The function $\mu$ plays a key role in the rest of the proof.
3.3 Characterizing feasibility
The feasibility of the modified problem is characterized by the following lemma.
Lemma 3.
Let $\big(S,\mathcal{F},F^{*},c,\ell,u,\operatorname{span}_{w}(\cdot)\big)$ be a minimum-cost inverse optimization problem, where $\ell$ and $u$ satisfy (SPEC-LU). Let
[TABLE]
(a) If and , then the minimum-cost inverse optimization problem is feasible if and only if is a feasible deviation vector.
(b) If and , then the minimum-cost inverse optimization problem is feasible if and only if is a feasible deviation vector, where
[TABLE]
(c) If and , then the minimum-cost inverse optimization problem is feasible if and only if is a feasible deviation vector, where
[TABLE]
(d) If and , then the minimum-cost inverse optimization problem is feasible if and only if is a feasible deviation vector, where
[TABLE]
Proof.
In all cases, the ‘if’ direction is straightforward, hence we prove the ‘only if’ direction.
Assume first that and , and let be a feasible deviation vector. For any , we have
[TABLE]
implying that is also feasible.
Now assume that and , let again be a feasible deviation vector and be arbitrary. If , i.e. , then we get
[TABLE]
where the last inequality follows from the definition of . On the other hand, if , i.e. , then we get
[TABLE]
implying that is also feasible.
The remaining two cases can be verified analogously. ∎
3.4 Solving the subproblems
Finally, we derive an algorithm for solving a subproblem obtained after modifying the bound-constraints. The algorithm is presented as Algorithm 1. By convention, undefined objects are denoted by . We slightly modify the notion of small and large bad sets from the beginning of Section 3 by calling a set $F$ small if $\mu(F)<\mu(F^{*})$ and large if $\mu(F)>\mu(F^{*})$. The high-level description of the algorithm is as follows. At each iteration $i$, we distinguish five main cases depending on the bad sets eliminated in that iteration. In Case 1, we eliminate a bad set having the same size as $F^{*}$. In Case 2, we eliminate a small bad set alone. In Case 3, we eliminate a small bad set together with a large bad set that was found before. In Case 4, we eliminate a large bad set alone. Finally, in Case 5, we eliminate a large bad set together with a small bad set that was found before.
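The top-level case distinction can be written as a small dispatch routine. This sketch covers only the five main cases listed above, not the sub-cases of Algorithm 1; it assumes that set sizes are compared through $\mu$, and the argument names are hypothetical.

```python
def main_case(mu_F, mu_F_star, latest_small, latest_large):
    """Return the main case (1 to 5) for a newly found bad set F.

    latest_small / latest_large hold the most recently found small / large
    bad set, or None if no such set is currently tracked.
    """
    if mu_F == mu_F_star:
        return 1                                        # same size as F*
    if mu_F < mu_F_star:                                # F is small
        return 3 if latest_large is not None else 2
    return 5 if latest_small is not None else 4         # F is large


print(main_case(2, 2, None, None))                      # 1
print(main_case(1, 2, None, None))                      # 2
print(main_case(1, 2, None, frozenset("cde")))          # 3
print(main_case(3, 2, None, None))                      # 4
print(main_case(3, 2, frozenset("c"), None))            # 5
```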
In all cases, we first determine the values of $\delta$ and $\Delta$ (which together govern the changes on the elements inside and outside $F^{*}$) needed to eliminate the bad sets in question as if no bound-constraints were given. In Cases 1.1, 2.1.1, 3.1.1, 4.1.1 and 5.1.1, the resulting deviation vector does not violate the bound-constraints, hence we apply the changes directly. In the remaining cases, however, we hit a bound-constraint either on all the elements of $F^{*}\setminus S_{0}$, or on those of $S\setminus(F^{*}\cup S_{0})$, or both (recall that the bound-constraints are assumed to satisfy (SPEC-LU) at this point). In such situations, we set the changes on the problematic elements to the extreme, that is, to the lower or upper bound that was violated, recompute the necessary changes on the remaining elements to eliminate the bad sets in question, and set the deviation vector accordingly if the bound-constraints are met; otherwise we conclude that the problem is infeasible. The algorithm considers all the possible scenarios, and shows how to handle these cases using the values computed by the functions defined below.
The algorithm and its analysis are rather technical, and require the discussion of several cases. Nevertheless, it is worth emphasizing that in each step we apply the natural modification to the deviation vector that is needed to eliminate the current bad set or sets. At this point, the reader may rightly ask whether it is indeed necessary to consider all these cases. The answer is unfortunately yes, as the proposed Newton-type scheme may run into any of them, see Figure 4.
For ease of discussion, we introduce some notation before stating the algorithm. Let
[TABLE]
In each iteration, the algorithm computes an optimal solution of the underlying optimization problem, and if the cost of the input solution is strictly larger than the current optimum, then it updates the costs using a value determined by one of the following functions:
(f1) [math],
(f2) [math],
(f3) [math],
(f4) [math],
(f5) $f_{5}(c,d,D,F^{\prime})\coloneqq\dfrac{c(F^{*})-c(F^{\prime})-(u^{\mathrm{in}}-d-D)\cdot\big(\mu(F^{*})-\mu(F^{\prime})\big)}{\mu(F^{\prime}\setminus F^{*})}$,
(f6) $f_{6}(c,D,F^{\prime})\coloneqq\dfrac{c(F^{*})-c(F^{\prime})-(u^{\mathrm{out}}-D)\cdot\big(\mu(F^{*})-\mu(F^{\prime})\big)}{\mu(F^{*}\setminus F^{\prime})}$,
(f7) $f_{7}(c,F^{\prime},F^{\prime\prime\prime})\coloneqq\dfrac{\dfrac{c(F^{*})-c(F^{\prime})}{\mu(F^{*})-\mu(F^{\prime})}-\dfrac{c(F^{*})-c(F^{\prime\prime\prime})}{\mu(F^{*})-\mu(F^{\prime\prime\prime})}}{\dfrac{\mu(F^{*}\setminus F^{\prime})}{\mu(F^{*})-\mu(F^{\prime})}-\dfrac{\mu(F^{*}\setminus F^{\prime\prime\prime})}{\mu(F^{*})-\mu(F^{\prime\prime\prime})}}$,
(f8) [math],
(f9) $f_{9}(c,D,F^{\prime\prime\prime})\coloneqq\dfrac{c(F^{*})-c(F^{\prime\prime\prime})-(\ell^{\mathrm{out}}-D)\cdot\big(\mu(F^{*})-\mu(F^{\prime\prime\prime})\big)}{\mu(F^{*}\setminus F^{\prime\prime\prime})}$,
(f10) [math],
(f11) $f_{11}(c,d,D,F^{\prime\prime\prime})\coloneqq\dfrac{c(F^{*})-c(F^{\prime\prime\prime})-(\ell^{\mathrm{in}}-d-D)\cdot\big(\mu(F^{*})-\mu(F^{\prime\prime\prime})\big)}{\mu(F^{\prime\prime\prime}\setminus F^{*})}$,
(f12) [math].
In the definitions above, we have , , , , and almost always, except – to avoid division by zero – for and where , and for where .
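The role of $f_{7}$ can be illustrated on a toy instance: eliminating a small bad set $F^{\prime}$ and a large bad set $F^{\prime\prime\prime}$ at the same time amounts to solving two linear equations in $\delta$ and $\Delta$. The sketch below rests on assumptions that are inferred rather than stated in the text above: the deviation is $(\delta+\Delta)/w(s)$ on $F^{*}$ and $\Delta/w(s)$ elsewhere, the modified cost is $c-p$, and $\mu$ is additive and given per element (here all ones). The helper names `matching_Delta` and `residual` are hypothetical.

```python
from fractions import Fraction


def total(values, X):
    return sum((Fraction(values[s]) for s in X), Fraction(0))


def f7(c, mu, F_star, F1, F3):
    """delta that eliminates the bad sets F1 (small) and F3 (large) simultaneously."""
    d1 = total(mu, F_star) - total(mu, F1)
    d3 = total(mu, F_star) - total(mu, F3)
    num = (total(c, F_star) - total(c, F1)) / d1 - (total(c, F_star) - total(c, F3)) / d3
    den = total(mu, F_star - F1) / d1 - total(mu, F_star - F3) / d3
    return num / den


def matching_Delta(c, mu, F_star, F1, delta):
    """Back-substitute delta into the equation of F1 to obtain Delta."""
    d1 = total(mu, F_star) - total(mu, F1)
    return (total(c, F_star) - total(c, F1) - delta * total(mu, F_star - F1)) / d1


def residual(c, mu, F_star, F, delta, Delta):
    """Cost of F* minus cost of F after the change; zero means F has been eliminated."""
    return (total(c, F_star) - total(c, F)
            - delta * total(mu, F_star - F)
            - Delta * (total(mu, F_star) - total(mu, F)))


c = {"a": 4, "b": 4, "c": 1, "d": 1, "e": 1}
mu = {s: 1 for s in c}
F_star, F1, F3 = frozenset("ab"), frozenset("c"), frozenset("cde")

delta = f7(c, mu, F_star, F1, F3)
Delta = matching_Delta(c, mu, F_star, F1, delta)
print(delta, Delta)  # 3 1
print(residual(c, mu, F_star, F1, delta, Delta), residual(c, mu, F_star, F3, delta, Delta))  # 0 0
```

Here $\delta=3$ and $\Delta=1$, so the costs drop by $4$ on $F^{*}$ and by $1$ elsewhere, after which all three sets have the same cost. The division fails when the denominator vanishes, which is the kind of degenerate situation the exceptions above are meant to exclude.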
Figure 4 in Appendix B provides toy examples for the different cases occurring in Algorithm 1, while Table 1 in Appendix C gives a summary of the cases which might be helpful when reading the proof. For proving the correctness and the running time of the algorithm, we need the following lemmas. The proofs of these statements follow from the definitions in a fairly straightforward way, hence those are deferred to Appendix A.
Our first lemma shows that if $F^{*}$ is not optimal with respect to the current cost function, then in the next step it either has the same cost as the current optimal solution with respect to the modified cost function, or the problem is infeasible.
Lemma 4.
If is not a minimum -cost member of , then either or Algorithm 1 declares the problem to be infeasible.
The following lemmas together imply an upper bound on the total number of iterations.
Lemma 5.
Let be indices such that both steps and corresponds to Cases 1.1 or 1.2.1. Then .
Lemma 6.
Let be a 4-tuple of integers satisfying for any .
(a) There is at most one index such that is not a minimum -cost member of , , and $\big(\mu(F_{i}),\mu(Z_{i}),\mu(F_{i}\cap F^{*}),\mu(Z_{i}\cap F^{*})\big)=(a_{1},a_{2},a_{3},a_{4})$.
(b) There is at most one index such that is not a minimum -cost member of , , and $\big(\mu(X_{i}),\mu(F_{i}),\mu(X_{i}\cap F^{*}),\mu(F_{i}\cap F^{*})\big)=(a_{1},a_{2},a_{3},a_{4})$.
Lemma 7.
(a) If and , then at least one of and holds.
(b) If and , then at least one of and holds.
The essence of the next lemma is that for a feasible instance satisfying (SPEC-LU), there exists an optimal deviation vector for which $\delta$ and $\Delta$ can be bounded.
Lemma 8.
Let \big{(}S,\mathcal{F},F^{*},c,\ell,u,\operatorname*{span}_{w}(\cdot)\big{)} be a feasible minimum-cost inverse optimization problem where the bound-constraints satisfy (SPEC-LU). Then there exist with and such that is an optimal deviation vector.
Finally, we show that analogous bounds hold for the values and computed throughout the algorithm.
Lemma 9.
Either and , or Algorithm 1 declares the problem to be infeasible. Moreover, in Cases 2.1.2.1, 2.2.2.1, 3.1.2.1, and 3.2.2.1, we have , and in Cases 4.2.1 and 5.2.1, we have .
Now we are ready to prove the correctness and discuss the running time of the algorithm.
Theorem 10.
Algorithm 1 determines an optimal deviation vector, if one exists, for the minimum-cost inverse optimization problem $\big(S,\mathcal{F},F^{*},c,\ell,u,\operatorname{span}_{w}(\cdot)\big)$ with bound-constraints satisfying (SPEC-LU) using $O\big(\|w\|_{-1}^{6}\big)$ calls to $\mathcal{O}$.
Proof.
We discuss the time complexity and the correctness of the algorithm separately.
Time complexity. Recall that is scaled so that is an integer for each . We show that the algorithm terminates after at most iterations of the while loop. By Lemma 5, there are at most iterations corresponding to Cases 1.1 and 1.2.1. By Lemma 6, there are at most iterations corresponding to Cases 3.1.1, 3.1.2.1, 3.2.1, 3.2.2.1, 5.1.1, 5.1.2.1, 5.2.1, and 5.2.2.1. Between two such iterations, there are at most iterations corresponding to the remaining cases by Lemma 7. Hence the total number of iterations is at most .
Infeasibility. By the above, the algorithm terminates after a finite number of iterations. First, we show that if the algorithm returns Infeasible, then it correctly recognizes the problem to be infeasible. Assume that the algorithm terminates in the th iteration and declares the problem to be infeasible. We distinguish different scenarios depending on which case the last step belongs to.
Case 1.2.2. Note that and . In addition,
[TABLE]
Thus,
[TABLE]
so by Lemma 3, the problem is infeasible.
Case 2.1.2.2 when . Note that . If , then
[TABLE]
and if , then the same calculation applies for . So by Lemma 3, the problem is infeasible.
Cases 2.1.2.2, 2.2.2.2, 3.1.2.2 and 3.2.2.2 when . Note that and . In addition,
[TABLE]
Using that , we obtain
[TABLE]
so by Lemma 3, the problem is infeasible.
Cases 2.2.2.2 and 3.2.2.2 when . Note that . In addition,
[TABLE]
Therefore, the exact same calculation applies as in Case 2.1.2.2 when . So by Lemma 3, the problem is infeasible.
Case 3.1.2.2 when . Note that . In addition,
[TABLE]
Therefore, the exact same calculation applies as in Case 2.1.2.2 when . So by Lemma 3, the problem is infeasible.
Case 4.1.2.2 when . Note that . If , then
[TABLE]
and if , then the same calculation applies for . So by Lemma 3, the problem is infeasible.
Cases 4.1.2.2, 4.2.2.2, 5.1.2.2 and 5.2.2.2 when . Note that and . Using that , we obtain
[TABLE]
so by Lemma 3, the problem is infeasible.
Cases 4.2.2.2 and 5.2.2.2 when . Note that . In addition,
[TABLE]
Therefore, the exact same calculation applies as in Case 4.1.2.2 when . So by Lemma 3, the problem is infeasible.
Case 5.1.2.2 when . Note that and
[TABLE]
Therefore, the exact same calculation applies as in Case 4.1.2.2 when . So by Lemma 3, the problem is infeasible.
Optimality. Assume now that the algorithm terminates with returning a vector whose feasibility follows from the fact that the while loop ended. If is a minimum -cost member of , then we are clearly done. Otherwise, there exists an index such that is a minimum -cost member of . Suppose to the contrary that is not optimal. By Lemma 8, there exists such that , the deviation vector is optimal, , and . By Lemma 9, we know that and hold. If all steps correspond to Cases 2.1.1 and 4.1.1, then , a contradiction. Otherwise, let be the largest index for which . Note that . We arrive to a contradiction using different arguments, depending on which case step belongs to.
Cases 1.1 and 1.2.1. By Lemma 4, we have
[TABLE]
a contradiction.
Cases 2.1.2.1, 2.2.2.1, 3.1.2.1, and 3.2.2.1. Note that
[TABLE]
Therefore, by Lemma 4, we have
[TABLE]
a contradiction.
Cases 2.2.1 and 3.2.1. Note that holds. Therefore, by Lemma 4, we have
[TABLE]
a contradiction.
Case 3.1.1. We have
[TABLE]
and
[TABLE]
These together imply
[TABLE]
Therefore, we have
[TABLE]
By the above, we obtain
[TABLE]
a contradiction.
Cases 4.1.2.1, 4.2.2.1, 5.1.2.1 and 5.2.2.1. Note that holds. Thus, by Lemma 4,
[TABLE]
a contradiction.
Cases 4.2.1 and 5.2.1. Note that
[TABLE]
Thus, by Lemma 4,
[TABLE]
a contradiction.
Case 5.1.1. Similarly as in Case 3.1.1, we obtain
[TABLE]
Therefore, we get
[TABLE]
a contradiction. ∎
Since we showed in Section 3 that an arbitrary instance $\big(S,\mathcal{F},F^{*},c,\ell,u,\operatorname{span}_{w}(\cdot)\big)$ of the constrained minimum-cost inverse optimization problem under the weighted span objective can be reduced to solving subproblems, we get the following result.
Corollary 11.
There exists an algorithm that determines an optimal deviation vector, if one exists, for the bound-constrained minimum-cost inverse optimization problem under the weighted span objective using calls to .
4 Min-max characterization and multiple costs
With the help of Corollary 2, we give a min-max characterization for the weighted span of an optimal deviation vector in the unconstrained setting, even for the case of multiple cost functions. The theorem may seem complicated at first glance as it involves a rather complex formula. However, let us note that the fractions appearing on the maximum side of the characterization are natural lower bounds for the minimum span of a feasible deviation vector, and the complexity of the formula is simply due to the presence of weights and bound constraints.
Theorem 12.
Let $\big(S,\mathcal{F},F^{*},\{c^{j}\}_{j\in[k]},-\infty,+\infty,\operatorname{span}_{w}(\cdot)\big)$ be a feasible minimum-cost inverse optimization problem. Then
[TABLE]
where
[TABLE]
Proof.
Let be an optimal deviation vector. By Corollary 2, we may assume that is of the form for some such that and . Note that clearly holds. For ease of discussion, let . Using this notation, we define
[TABLE]
Let and be an arbitrary solution. Since is feasible, we have
[TABLE]
Thus for any and with and , if such exists,
[TABLE]
for any and with , if such exists,
[TABLE]
and for any and with , if such exists,
[TABLE]
By the above, for any and with , if such and exist, we have
[TABLE]
implying
[TABLE]
Therefore holds. To prove , it suffices to show that is a feasible deviation vector, that is, has minimum cost with respect to for every . For any and with , if such exists,
[TABLE]
For any and with and , if such exists,
[TABLE]
Let and with be arbitrary, if such exists. First note that
[TABLE]
holds since otherwise there would exist with such that
[TABLE]
contradicting the definition of . Thus we get
[TABLE]
This shows that is indeed feasible, concluding the proof of the theorem. ∎
In [3], the authors showed that the algorithm for the weighted $\ell_{\infty}$-norm objective naturally extends to the case of multiple cost functions. Roughly, this is possible since it suffices to compute an optimal deviation vector (of a special form) for each cost function separately, and one of these vectors can be shown to be optimal for the multiple-cost version as well.
Unfortunately, a similar approach does not apply for the weighted span objective, see Figure 1 for an example. To overcome this, instead of considering the cost functions separately, one can run Algorithm 1 for them simultaneously. That is, we still keep track of the small and large bad sets that were found the latest, but now together with the cost function for which those were bad. In each iteration, we pick an index such that is not optimal with respect to the th modified cost function, and update the deviation vector according to the same rules that were applied to the case of a single cost function. If the number of cost functions is , then this adds an additional factor of to the running time of the algorithm, that is, it makes calls to oracle .
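The selection step described above (picking an index for which the input solution is not optimal with respect to the corresponding modified cost function) can be illustrated on a toy instance. The following is only a minimal sketch, not Algorithm 1: it assumes that the modified cost is obtained as c - p, it replaces the optimization oracle by brute-force enumeration of an explicitly listed family, and all names (`find_violated_index`, `min_cost_member`, `family`, `F_star`) are hypothetical.

```python
from typing import Dict, FrozenSet, Hashable, List, Optional

Cost = Dict[Hashable, int]
Solution = FrozenSet[Hashable]


def cost_of(F: Solution, c: Cost) -> int:
    """Total cost of a solution F under the cost function c."""
    return sum(c[s] for s in F)


def min_cost_member(family: List[Solution], c: Cost) -> Solution:
    """Brute-force stand-in for the oracle (assumption: the family is listed explicitly)."""
    return min(family, key=lambda F: cost_of(F, c))


def find_violated_index(
    family: List[Solution], F_star: Solution, costs: List[Cost], p: Cost
) -> Optional[int]:
    """Return some j such that F_star is not optimal for the j-th modified cost, else None."""
    for j, c in enumerate(costs):
        # Assumed sign convention: the modified cost is c - p.
        modified = {s: c[s] - p[s] for s in c}
        best = min_cost_member(family, modified)
        if cost_of(best, modified) < cost_of(F_star, modified):
            return j
    return None


# Toy instance: all 2-element subsets of {a, b, c}, with two cost functions.
family = [frozenset("ab"), frozenset("bc"), frozenset("ac")]
F_star = frozenset("ab")
costs = [{"a": 3, "b": 1, "c": 1}, {"a": 1, "b": 1, "c": 1}]

zero = {"a": 0, "b": 0, "c": 0}
p = {"a": 2, "b": 0, "c": 0}

violated_before = find_violated_index(family, F_star, costs, zero)
violated_after = find_violated_index(family, F_star, costs, p)
```

With the zero deviation vector, the first cost function is violated; with the deviation vector `p`, the input solution has minimum cost for both modified cost functions, so `p` is feasible in this toy instance. Checking all cost functions in each round is what gives rise to the extra factor in the number of oracle calls mentioned above.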
5 Conclusions
In this paper, we introduced an objective for minimum-cost inverse optimization problems that measures the difference between the largest and the smallest weighted coordinates of the deviation vector, thus leading to a fair or balanced solution. We presented a purely combinatorial algorithm that efficiently determines an optimal deviation vector, assuming that an oracle for solving the underlying optimization problem is available. The running time of the algorithm is pseudo-polynomial due to the presence of weights, and finding a strongly polynomial algorithm remains an intriguing open problem.
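To make the objective concrete, the following minimal sketch evaluates the weighted span of a deviation vector. It assumes that the weighted coordinate of an element s is the product w(s)·p(s); the function name `weighted_span` and the sample data are hypothetical.

```python
from typing import Dict, Hashable


def weighted_span(p: Dict[Hashable, float], w: Dict[Hashable, float]) -> float:
    """Difference between the largest and the smallest weighted coordinate of p."""
    # Assumption: the weighted coordinate of s is w[s] * p[s].
    weighted = [w[s] * p[s] for s in p]
    return max(weighted) - min(weighted)


# A deviation vector with unequal weighted coordinates (2, -2, 2).
p = {"a": 2.0, "b": -1.0, "c": 0.5}
w = {"a": 1.0, "b": 2.0, "c": 4.0}
span_value = weighted_span(p, w)

# A uniform shift under unit weights: every cost changes by the same amount.
uniform = {"a": 3.0, "b": 3.0, "c": 3.0}
unit = {"a": 1.0, "b": 1.0, "c": 1.0}
uniform_span = weighted_span(uniform, unit)
```

The second example shows the sense in which the objective favors balanced changes: a uniform shift has span zero under unit weights, although its total change is nonzero.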
Acknowledgement.
The work was supported by the Lendület Programme of the Hungarian Academy of Sciences – grant number LP2021-1/2021 and by the Hungarian National Research, Development and Innovation Office – NKFIH, grant number FK128673.
Appendix
Appendix A Deferred proofs
A.1 Proof of Lemma 4
Proof of Lemma 4.
Assume that is not a minimum -cost member of and that Algorithm 1 does not declare the problem to be infeasible in the th step. We distinguish the different cases the th step might belong to.
Cases 1.1 and 1.2.1. Since
[TABLE]
we have
[TABLE]
Cases 2.1.1, 2.2.1, 3.1.1, 3.2.1, 4.1.1, 4.1.2.1, 4.2.2.1, 5.1.1, 5.1.2.1, and 5.2.2.1. In Cases 2.1.1, 3.1.1, 4.1.1, and 5.1.1, we have
[TABLE]
while in Cases 2.2.1, 3.2.1, 4.1.2.1, 4.2.2.1, 5.1.2.1, and 5.2.2.1, we have
[TABLE]
Therefore in all these cases, we obtain
[TABLE]
Cases 2.1.2.1, 2.2.2.1, 3.1.2.1, and 3.2.2.1. Since and , we have
[TABLE]
Cases 4.2.1 and 5.2.1. Since and , we have
[TABLE]
This concludes the proof of the lemma. ∎
A.2 Technical claims
Before moving on to the proofs of the other lemmas, we need the following technical claims.
Claim 13.
If and , then
[TABLE]
If and , then
[TABLE]
Proof.
The statement directly follows from the definitions of the functions , , and . ∎
Claim 14.
(a) If and , then either or Algorithm 1 declares the problem to be infeasible.
(b) If and , then either or Algorithm 1 declares the problem to be infeasible.
Proof.
Assume that Algorithm 1 does not declare the problem to be infeasible in the th step. First, consider the case when and . Then the th step corresponds to Case 3.1.1, and . Thus, by Claim 13, we have
[TABLE]
Now consider the case when and . Then the th step corresponds to Case 5.1.1, and . Thus, by Claim 13, we have
[TABLE]
This concludes the proof of the claim. ∎
Claim 15.
(a) If , then either or Algorithm 1 declares the problem to be infeasible.
(b) If , then either or Algorithm 1 declares the problem to be infeasible.
Proof.
Assume that Algorithm 1 does not declare the problem to be infeasible in the th step. First, consider the case when . Then in the th step we were not in Cases 1.1, 1.2.1, or 1.2.2, thus . If , then and the statement follows from Lemma 4. If , then and the statement follows from Claim 14.
The case when can be proved analogously. ∎
Claim 16.
If is not a minimum -cost member of , then either and equality holds if and only if the th step belongs to Case 2.1.1 or 4.1.1, or Algorithm 1 declares the problem to be infeasible. Furthermore, if the th step belongs to Case 2.1.1, and if the th step belongs to Case 4.1.1. In addition, .
Proof.
Clearly, . Assume that is not a minimum -cost member of and that Algorithm 1 does not declare the problem to be infeasible in the th step. We distinguish the different cases the th step might belong to.
Cases 1.1 and 1.2.1. We have
[TABLE]
Cases 2.1.1 and 4.1.1. Clearly, . Furthermore, in Case 2.1.1,
[TABLE]
and in Case 4.1.1,
[TABLE]
Case 2.1.2.1. We have
[TABLE]
Case 2.2.1. We have
[TABLE]
Case 2.2.2.1. Note that
[TABLE]
Thus, we have
[TABLE]
Case 3.1.1. By Claim 15, we have
[TABLE]
Case 3.1.2.1. By Claim 15, similarly as in Case 3.1.1, holds, implying
[TABLE]
Thus we have
[TABLE]
Case 3.2.1. By Claim 15, similarly as in Case 3.1.1, holds. Thus we have
[TABLE]
Case 3.2.2.1. By Claim 15, similarly as in Case 3.1.1, holds, so
[TABLE]
Thus, we have
[TABLE]
This implies
[TABLE]
Case 4.1.2.1. We have
[TABLE]
Case 4.2.1. We have
[TABLE]
Case 4.2.2.1. Note that
[TABLE]
Thus, we have
[TABLE]
Case 5.1.1. By Claim 15,
[TABLE]
Case 5.1.2.1. By Claim 15, similarly as in Case 5.1.1, holds, so
[TABLE]
Thus, we have
[TABLE]
Case 5.2.1. By Claim 15, similarly as in Case 5.1.1, holds, so
[TABLE]
Thus, we have
[TABLE]
Case 5.2.2.1. By Claim 15, similarly as in Case 5.1.1, holds, and similarly as in Case 5.2.1, , so
[TABLE]
Thus, we have
[TABLE]
This concludes the proof of the claim. ∎
Claim 17.
If and , then . If and , then .
Proof.
We distinguish the different cases the th step might belong to.
Cases 3.1.1 and 5.1.1. In this case and we are done.
Case 3.1.2.1. Note that
[TABLE]
Thus, we have
[TABLE]
Case 3.2.1. We have
[TABLE]
Case 3.2.2.1. Note that
[TABLE]
Thus, we have
[TABLE]
Case 5.1.2.1. We have
[TABLE]
Case 5.2.1. Note that
[TABLE]
Thus, we have
[TABLE]
Case 5.2.2.1. Similarly as in Case 5.2.1, we obtain
[TABLE]
Thus, we have
[TABLE]
This concludes the proof of the claim. ∎
Claim 18.
(a) If and for some , then
[TABLE]
where equality implies .
(b) If and for some , then
[TABLE]
where equality implies .
Proof.
First, consider the case when . Without loss of generality, we may assume that and . Suppose to the contrary that . By Claim 16,
[TABLE]
which is only possible if , , and . This implies that each of the th, th, , th steps belong to Case 2.1.1, thus . Note that is not a minimum -cost member of since . Hence, by Lemma 4,
[TABLE]
a contradiction.
Now assume . Then
[TABLE]
Thus, we have
[TABLE]
The second statement of the claim can be proved analogously. ∎
Claim 19.
Let be such that , and let be such that . Then neither
[TABLE]
nor
[TABLE]
hold.
Proof.
First, suppose to the contrary that
[TABLE]
Then
[TABLE]
and
[TABLE]
Therefore, we get
[TABLE]
Using this, we get
[TABLE]
a contradiction.
The second half of the claim can be proved analogously. ∎
A.3 Proof of Lemma 5
Proof of Lemma 5.
Without loss of generality, we may assume and .
First, we show . By Lemma 4 and Claim 16,
[TABLE]
Now suppose to the contrary that . Then by the above observation and by Claim 16, we get
[TABLE]
a contradiction. ∎
A.4 Proof of Lemma 6
Proof of Lemma 6.
We prove (a); the proof of (b) is analogous. Suppose to the contrary that there exist two indices and with satisfying the conditions of part (a) of the lemma. Then, by Claim 18, we have . Moreover, since $\big(\mu(F_{i_{1}}),\mu(Z_{i_{1}}),\mu(F_{i_{1}}\cap F^{*}),\mu(Z_{i_{1}}\cap F^{*})\big)=\big(\mu(F_{i_{2}}),\mu(Z_{i_{2}}),\mu(F_{i_{2}}\cap F^{*}),\mu(Z_{i_{2}}\cap F^{*})\big)$, it also follows that for any by the assumption that (SPEC-LU) holds. This implies for any . By Claims 16 and 17, we get
[TABLE]
In addition, by Claim 15, we have . Thus, applying Claim 19 for , , and , we get that cannot occur, contradicting the fact that is a minimum -cost member of while is not. ∎
A.5 Proof of Lemma 7
Proof of Lemma 7.
Suppose to the contrary that and . Then by Claim 16, we get
[TABLE]
a contradiction. ∎
A.6 Proof of Lemma 8
Proof of Lemma 8.
Recall that we assumed (SPEC-LU) to hold. By Corollary 2, there exist with and for which is an optimal deviation vector.
If and , then set and . If and , then set and . If and , then set and . Finally, if and , then set and . It is not difficult to check that in all cases, we get that , and hold. ∎
A.7 Proof of Lemma 9
Proof of Lemma 9.
Recall that we assumed (SPEC-LU) to hold. We prove the lemma by induction on .
First we show that the statements hold for . If and , then there are two cases. If , then and , thus and . If , then and , thus and . If and , then and , thus and . If and , then and , thus and . If and , then and , thus and .
Now let be arbitrary and assume that and and that Algorithm 1 does not declare the problem to be infeasible in the th step. We distinguish the different cases the th step might belong to.
Case 1.1. We have and , hence we are done by Claim 16 and by the induction hypothesis.
Case 1.2.1. We have and . Then . Thus we obtain and , hence we are done by the induction hypothesis.
Case 2.1.1. We have and . Since and
[TABLE]
we are done by the induction hypothesis.
Case 2.1.2.1. We have and
[TABLE]
In addition,
[TABLE]
so .
Case 2.2.1. We have , and thus . Therefore, we are done by Claim 16 and by the induction hypothesis.
Cases 2.2.2.1 and 3.2.2.1. We have . Also, note that
[TABLE]
Thus, we have
[TABLE]
and
[TABLE]
Case 3.1.1. We have and . By Claim 13 and Claim 15,
[TABLE]
so we are done by Claim 16 and by the induction hypothesis.
Case 3.1.2.1. We have and
[TABLE]
In addition,
[TABLE]
Thus we get .
Case 3.2.1. We have , and so . In addition, by the induction hypothesis, we obtain . Thus we are done by Claim 16 and by the induction hypothesis.
Case 4.1.1. We have and . Since and
[TABLE]
we are done by the induction hypothesis.
Case 4.1.2.1. We have and . In addition,
[TABLE]
Case 4.2.1. We have and
[TABLE]
In addition, by the induction hypothesis,
[TABLE]
so we are done by the induction hypothesis.
Case 4.2.2.1. We have and . Note that
[TABLE]
Thus, we have
[TABLE]
Case 5.1.1. We have and . By Claim 13 and Claim 15,
[TABLE]
Thus, by the induction hypothesis, . In addition, by the above and by Claim 16,
[TABLE]
so by the induction hypothesis, .
Case 5.1.2.1. We have and . In addition,
[TABLE]
Case 5.2.1. We have . By the induction hypothesis,
[TABLE]
thus, again by the induction hypothesis, . In addition,
[TABLE]
Case 5.2.2.1. We have and . In addition,
[TABLE]
This concludes the proof of the lemma. ∎
Appendix B Toy examples
Appendix C Summary of the cases occurring in Algorithm 1
[TABLE]
[TABLE]
References
[1] S. Ahmadian, U. Bhaskar, L. Sanità, and C. Swamy. Algorithms for inverse optimization problems. In 26th Annual European Symposium on Algorithms (ESA 2018), volume 112 of Leibniz International Proceedings in Informatics (LIPIcs). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2018.
[2] R. K. Ahuja. The balanced linear programming problem. European Journal of Operational Research, 101(1):29–38, 1997.
[3] K. Bérczi, L. M. Mendoza-Cadena, and K. Varga. Newton-type algorithms for inverse optimization I: weighted bottleneck Hamming distance and $\ell_{\infty}$-norm objectives. arXiv preprint arXiv:2302.13411, 2023.
[4] D. Burton and P. L. Toint. On an instance of the inverse shortest paths problem. Mathematical Programming, 53:45–61, 1992.
[5] P. M. Camerini, F. Maffioli, S. Martello, and P. Toth. Most and least uniform spanning trees. Discrete Applied Mathematics, 15(2–3):181–197, 1986.
[6] M. Demange and J. Monnot. An introduction to inverse combinatorial problems. In Paradigms of Combinatorial Optimization: Problems and New Approaches, pages 547–586. John Wiley & Sons, Inc., second edition, 2014.
[7] C. Duin and A. Volgenant. Minimum deviation and balanced optimization: A unified approach. Operations Research Letters, 10(1):43–48, 1991.
[8] A. M. C. Ficker, F. C. R. Spieksma, and G. J. Woeginger. Robust balanced optimization. EURO Journal on Computational Optimization, 6(3):239–266, 2018.
