A bilevel optimization model for load balancing in mobile networks through price incentives

Marianne Akian; Mustapha Bouhtou; Jean Bernard Eytard; Stéphane Gaubert

arXiv:1901.02363 · math.OC · January 9, 2019 · WiOpt


TL;DR

This paper introduces a bilevel optimization model using price incentives to balance load in mobile networks, employing advanced mathematical techniques and demonstrating effectiveness with real data to reduce congestion peaks.

Contribution

It develops a polynomial-time decomposition algorithm for a bilevel pricing model in mobile networks, integrating tropical geometry and discrete convexity methods.

Findings

1. Efficient load balancing reduces network congestion peaks.
2. The model performs well on real Orange network data.
3. The approach is scalable to large networks.

Abstract

We propose a model of incentives for data pricing in large mobile networks, in which an operator wishes to balance the number of connections (active users) of different classes of users in the different cells and at different time instants, in order to ensure them a sufficient quality of service. We assume that each user has a given total demand per day for different types of applications, which he may assign to different time slots and locations, depending on his own mobility, on his preferences and on price discounts proposed by the operator. We show that this can be cast as a bilevel programming problem with a special structure allowing us to develop a polynomial time decomposition algorithm suitable for large networks. First, we determine the optimal number of connections (which maximizes a measure of balance); next, we solve an inverse problem and determine the prices generating…


Equations

$$\max_{u_{k}^{a}\in\{0,1\}^{T}}\;\sum_{a\in[A]}\sum_{t=1}^{T}\left[\rho_{k}^{a}(t)+\alpha_{k}^{a}y^{a,b}(t,L_{t}^{k})\right]u_{k}^{a}(t)$$

$$\text{s.t.}\quad\forall a\in[A],\;\sum_{t=1}^{T}u_{k}^{a}(t)=R_{k}^{a},\qquad\forall t\in[T],\;\sum_{a\in[A]}u_{k}^{a}(t)\leq 1,$$

$$\forall t\in\mathcal{I}_{k}^{a},\;\forall a\in[A],\;u_{k}^{a}(t)=0.$$

$$\begin{aligned}s(N^{a,b})&=\sum_{t=1}^{T}\sum_{a\in[A]}\sum_{b\in[B]}\sum_{k\in\mathcal{K}^{b}}\gamma_{b}\,s_{L_{t}^{k}}^{a,b}(N(t,L_{t}^{k}))\,u_{k}^{a}(t)\\&=\sum_{t=1}^{T}\sum_{a\in[A]}\sum_{b\in[B]}\sum_{k\in\mathcal{K}^{b}}\sum_{l=1}^{L}\gamma_{b}\,s_{l}^{a,b}(N(t,l))\,\mathds{1}(L_{t}^{k}=l)\,u_{k}^{a}(t)\\&=\sum_{t=1}^{T}\sum_{l=1}^{L}\sum_{a\in[A]}\sum_{b\in[B]}\gamma_{b}\,N^{a,b}(t,l)\,s_{l}^{a,b}(N(t,l))\end{aligned}$$

$$\max_{y^{a,b}\in\mathbb{R}_{+}^{T\times L}}\;\sum_{t=1}^{T}\sum_{l=1}^{L}\sum_{a\in[A]}\sum_{b\in[B]}\gamma_{b}\,N^{a,b}(t,l)\,s_{l}^{a,b}(N(t,l))$$

$$\max_{y\in\mathbb{R}_{+}^{T\times L}}\;\sum_{t=1}^{T}\sum_{l=1}^{L}N(t,l)\,s_{l}(N(t,l))$$

$$\max_{u_{k}\in\{0,1\}^{T}}\;\sum_{t=1}^{T}\left[\rho_{k}(t)+\alpha_{k}y(t,L_{t}^{k})\right]u_{k}(t)$$

$$\text{s.t.}\quad\sum_{t=1}^{T}u_{k}(t)=R_{k},\qquad\forall t\in\mathcal{I}_{k},\;u_{k}(t)=0,$$

$$\max_{u_{k}\in F_{k}}\;\sum_{t,l}\left[\rho_{k}(t,l)+y(t,l)\right]u_{k}(t,l)$$

$$\max_{y\in\mathbb{R}_{+}^{T\times L}}\;\sum_{t,l}f_{l}(N(t,l))\quad\text{s.t.}\quad\forall(t,l),\;N(t,l)\leq N_{l}^{C},\quad N(t,l)=\sum_{k=1}^{K}u_{k}(t,l)$$

$$\max_{u_{k}\in F_{k}}\;\sum_{i=1}^{n}\left[\rho_{k}(i)+y_{i}\right]u_{k}(i)$$

$$\max_{y\in\mathbb{R}_{+}^{n}}\;\sum_{i=1}^{n}f_{i}(N_{i})\quad\text{s.t.}\quad\forall i,\;N_{i}\leq N_{i}^{C},\quad N_{i}=\sum_{k=1}^{K}u_{k}^{*}(i)$$

$$\forall x\in[0,N_{i}^{C}],\quad f_{i}^{\prime\prime}(x)=x\,s_{i}^{\prime\prime}(x)+2s_{i}^{\prime}(x)\leq 0.$$

$$\begin{aligned}t f_{i}(x)+(1-t)f_{i}(y)&=(tx+(1-t)y)\left[\frac{tx}{tx+(1-t)y}s_{i}(x)+\frac{(1-t)y}{tx+(1-t)y}s_{i}(y)\right]\\&\leq(tx+(1-t)y)\,s_{i}\!\left(\frac{tx^{2}+(1-t)y^{2}}{tx+(1-t)y}\right)\end{aligned}$$

$$(tx+(1-t)y)^{2}\leq tx^{2}+(1-t)y^{2}.$$

$$s_{i}\!\left(\frac{tx^{2}+(1-t)y^{2}}{tx+(1-t)y}\right)\leq s_{i}(tx+(1-t)y),$$

$$t f_{i}(x)+(1-t)f_{i}(y)\leq(tx+(1-t)y)\,s_{i}(tx+(1-t)y)=f_{i}(tx+(1-t)y),$$

$$\max_{u_{k}\in F_{k}}\;\sum_{i}\left[\rho_{k}(i)+y_{i}\right]u_{k}(i)=\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle.$$

$$P_{k}(y)=\bigoplus_{u_{k}\in F_{k}}\;\bigodot_{i\in[n]}\left(\rho_{k}(i)\odot y_{i}\right)^{\odot u_{k}(i)}$$

$$u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle\;\Leftrightarrow\;u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y+\beta e,u_{k}\rangle$$

$\rho_{1}$, $\rho_{3}$, $\rho_{5}$

$P_{2}(y)$, $P_{3}(y)$, $P_{4}(y)$, $P_{5}(y)$

$$\max_{y\in\mathbb{R}^{n}}\;\sum_{i=1}^{n}f_{i}(N_{i})\quad\text{s.t.}\quad\forall i,\;N_{i}\leq N_{i}^{C},$$

$$\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle.$$

$$\left\{\begin{array}{c}u^{-}(i)=u(i)-\varepsilon\text{ and }u^{+}(i)=u(i)+\varepsilon\\ u^{-}(j)=u(j)+\varepsilon\text{ and }u^{+}(j)=u(j)-\varepsilon\\ u^{-}(k)=u^{+}(k)=u(k)\text{ otherwise }\end{array}\right.$$

$$\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle=\sup_{u_{k}\in\Delta_{k}}\langle\rho_{k}+y,u_{k}\rangle=\sup_{u_{k}\in\Delta_{k}}\langle y,u_{k}\rangle-\varphi_{k}(u_{k})$$


Taxonomy

Topics: ICT Impact and Policies · Auction Theory and Applications · Advanced Queuing Theory Analysis

Full text


M. Akian
INRIA, CMAP, Ecole Polytechnique, CNRS
Route de Saclay
91128 Palaiseau Cedex, France
Tel.: +331 69 33 46 39

M. Bouhtou
Orange Labs
44, avenue de la République
92320 Chatillon, France

J.B. Eytard
INRIA, CMAP, Ecole Polytechnique, CNRS
Route de Saclay
91128 Palaiseau Cedex, France

S. Gaubert
INRIA, CMAP, Ecole Polytechnique, CNRS
Route de Saclay
91128 Palaiseau Cedex, France
Tel.: +331 69 33 46 03

Preprint submitted to EURO Journal on Computational Optimization October 31, 2018


Abstract

We propose a model of incentives for data pricing in large mobile networks, in which an operator wishes to balance the number of connections (active users) of different classes of users in the different cells and at different time instants, in order to ensure them a sufficient quality of service. We assume that each user has a given total demand per day for different types of applications, which he may assign to different time slots and locations, depending on his own mobility, on his preferences and on price discounts proposed by the operator. We show that this can be cast as a bilevel programming problem with a special structure allowing us to develop a polynomial time decomposition algorithm suitable for large networks. First, we determine the optimal number of connections (which maximizes a measure of balance); next, we solve an inverse problem and determine the prices generating this traffic. Our results exploit a recently developed application of tropical geometry methods to mixed auction problems, as well as algorithms in discrete convexity (minimization of discrete convex functions in the sense of Murota). We finally present an application on real data provided by Orange and we show the efficiency of the model to reduce the peaks of congestion.

Keywords: Bilevel programming · Mobile data networks · Tropical geometry · Discrete convexity · Graph algorithms

1 Introduction

With the development of new mobile data technologies (3G, 4G), the demand for using the Internet with mobile phones has increased rapidly. Mobile service providers (MSP) have to confront congestion problems in order to guarantee a sufficient quality of service (QoS).

Several approaches have been developed to improve the quality of service, coming from different fields of telecommunication engineering and economics. For instance, one can refer to Bonald and Feuillet bonald2013network for some models of performance analysis to optimize the network in order to improve the QoS. One promising alternative consists in using efficient pricing schemes to encourage customers to shift their mobile data consumption. In maille2006pricing , Maillé and Tuffin describe an auction mechanism based on game-theoretic methods for pricing an Internet network, see also maille2014telecommunication . In altman2006pricing , Altman et al. study how to price different services by using a noncooperative game. These different approaches are based on congestion games. In the present work, we are interested in how an MSP can improve the QoS by balancing the traffic in the network. We wish to determine in which locations, and at which time instants, it is relevant to propose price incentives, and to evaluate the influence of these incentives on the quality of service.

This kind of problem belongs to smart data pricing. We refer the reader to the survey of Sen et al. sen2013survey and also to the collection of articles sen2014smart . Finding efficient pricing schemes is a revenue management issue. The first approach consists in usage-based pricing: the prices are fixed monthly by analysing the usage of the previous months. It is possible to improve this scheme by identifying peak and off-peak hours and proposing incentives in off-peak hours, in order to decrease the demand at peak hours and to make better use of the network capacity at off-peak hours. This leads to time-dependent pricing. Such a scheme for mobile data is developed by Ha et al. in ha2012tube . The prices are determined at different time slots and based on the usage of the previous day in order to maximize the utility of the customers and the revenue of the MSP. This pricing scheme was concretely implemented by AT&T, showing the relevance of such a model. In another approach, Tadrous et al. propose a model in which the MSP anticipates peak hours and determines incentives for proactive downloads tadrous2013pricing .

These models address only the temporal aspect. One must also take the spatial aspect into account in order to optimize the demand between the different locations. In ma2014time , Ma, Liu and Huang present a model depending on both the time and the location of the customers, in which the MSP proposes prices and optimizes its profit while taking into account the utility of the customers.

Here, we assume (as in ma2014time ) that the MSP proposes incentives at different times and places. Customers then optimize their data consumption knowing these incentives, and the MSP optimizes a measure of the QoS. We thus introduce a bilevel model in which the provider proposes incentives in order to balance the traffic in the network and to avoid congestion as much as possible (high-level problem), and customers optimize their own consumption for the given incentives (low-level problem).

Bilevel programs have been widely studied, see the surveys of Colson, Marcotte and Savard colson2007overview and of Dempe dempe2003bilevel . They represent an important class of pricing problems in the sense that they model a leader who wants to maximize his profit and proposes prices to some followers, who in turn maximize their own utility. Most classes of bilevel programs are known to be NP-hard. Several methods have been introduced to solve such problems. For instance, if the low-level program is convex, it can be replaced by its Karush-Kuhn-Tucker optimality conditions and the bilevel problem becomes a classical one-stage optimization problem, which is however generally nonconvex. If some variables are binary or discrete, and the objective function is linear, the global bilevel problem can be rewritten as a mixed integer program, as in Brotcorne et al. brotcorne2000bilevel .

In the present work, we optimize the consumption of each customer in a large area (large urban agglomeration) during typically one day divided into time slots of one hour, taking into account the different types of customers and of applications that they use. Therefore, we have to confront both the difficulties inherent to bilevel programming and the large number of variables (around $10^{7}$). Hence, we need to find polynomial time algorithms, or fast approximate methods, for classes of problems of a very large scale, which, if treated directly, would lead to mixed integer linear or nonlinear programming formulations beyond the capacities of current off-the-shelf solvers.

This motivated us to introduce a different approach, based on tropical geometry. Tropical geometry methods have been recently applied by Baldwin and Klemperer in baldwin2012tropical to an auction problem. This has been further developed by Yu and Tran tran2015product . In these approaches, the response of an agent to a price is represented by a certain polyhedral complex (arrangement of tropical hypersurfaces). This approach is intuitive since it allows one to visualize geometrically the behavior of the agents: each cell of the complex corresponds to the set of incentives leading to a given response. Then, we visualize the collective response of a group of customers by “superposing” (refining) the polyhedral complexes attached to every customer in this group. We apply here this idea to represent the response of the low-level optimizers in a bilevel problem. This leads to the following decomposition method: first we compute, among all the admissible consumptions of the customers, the one which maximizes a measure of balance of the network; then, we determine the price incentive which achieves this consumption. In this way, a bilevel problem is reduced to the minimization of a convex function over a certain Minkowski sum of sets. We identify situations in which the latter problem can be solved in polynomial time, by exploiting the discrete convexity results developed by Murota murota2003discrete . In this approach, a critical step is to check the membership of a vector to a certain Minkowski sum of sets of integer points of polytopes. In our present model, these polytopes, which represent the possible consumptions of one customer, have a remarkable combinatorial structure (they are hypersimplices). Exploiting this combinatorial structure, we show that this critical step can be performed quickly, by reduction to a shortest path problem in a graph. This leads to an exact solution method when there is only one type of contract and one type of application sensitive to price incentive, and to a fast approximate method in the general case.

We finally present the application of this model on real data from Orange and show how price incentives can improve the QoS by balancing the number of active customers in an urban agglomeration during one day. These results indicate that a price incentive mechanism can effectively improve the satisfaction of the users by displacing their consumption from the most loaded regions of the space-time domain to less loaded regions.

The paper is organized as follows. In Section 2, we present the bilevel model. In Section 3, we explain how a certain polyhedral complex can be used to represent the users’ responses, and we describe the decomposition method. In Section 4, we deal with the high-level problem and identify special cases which are solvable in polynomial time. In Section 5, we develop accelerated algorithms which make it possible to solve bilevel problems with a large number of customers. In Section 6, we propose a general relaxation method. The application to the instance provided by Orange is presented in Section 7.

The first results of this article (without proofs) were published in the proceedings of the conference WiOpt 2017 eytard2017bilevel .

2 A bilevel model

We consider a time horizon of one day, divided into $T$ time slots numbered $t\in\left[T\right]=\{1,\dots,T\}$, and a network divided into $L$ different cells numbered $l\in\left[L\right]$. We assume that $K$ customers, numbered $k\in\left[K\right]$, are in the network. The customers have different types of contracts $b\in\left[B\right]$ and they make requests for different types of applications $a\in\left[A\right]$ (web/mail, streaming, download, …). We denote by $\mathcal{K}^{b}$ the set of customers with the contract $b$. A given customer $k\in\mathcal{K}^{b}$ is characterized by the following data. We denote by $L_{t}^{k}\in\left[L\right]$ the position of the customer $k$ at each time $t\in\left[T\right]$, so that the sequence $(L^{k}_{1},\ldots,L^{k}_{T})$ represents the trajectory of this customer. We assume that this trajectory is deterministic, so we consider customers with a regular daily mobility (for example, the trip between home and work). We denote by $\rho_{k}^{a}(t)$ the inclination of a customer $k$ to make a request for an application of type $a$ at time $t\in\left[T\right]$. We suppose that customer $k$ wishes to make a fixed number of requests $R_{k}^{a}\leq T$ using the application $a$ during the day. We consider a set of time slots $\mathcal{I}_{k}^{a}\subset\left[T\right]$ in which the customer $k$ decides not to consume the application $a$.

We denote by $u_{k}^{a}(t)$ the consumption of the customer $k$ for the application $a$ at time $t$ , setting $u_{k}^{a}(t)=1$ if $k$ is active at time $t$ and makes a request of type $a$ and $u_{k}^{a}(t)=0$ otherwise. Therefore, the number $N^{a,b}(t,l)$ of active customers with contract $b$ for the application $a$ at time $t$ and location $l$ is given by $N^{a,b}(t,l)=\sum_{k\in\mathcal{K}^{b}}{u_{k}^{a}(t)\mathds{1}(L_{t}^{k}=l)}$ , where $\mathds{1}$ denotes the indicator function, and the total number of active customers $N(t,l)$ at time $t$ and location $l$ is given by $N(t,l)=\sum_{a}{\sum_{b}{N^{a,b}(t,l)}}$ .
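The aggregation of individual consumptions into the traffic $N(t,l)$ can be sketched as follows, in the special case of a single application and a single contract. This is a minimal illustration only: the function name `traffic_counts`, the toy trajectories and the 0-based indexing of time slots and cells are assumptions made for the example, not part of the model.

```python
from collections import Counter

def traffic_counts(trajectories, consumption):
    """Return N(t, l), the number of active customers at time t in cell l.

    trajectories[k][t] is the cell L_t^k of customer k at time slot t.
    consumption[k][t] is u_k(t) in {0, 1}.
    Slots and cells are 0-based here (an assumption of this sketch).
    """
    counts = Counter()
    for trajectory, u in zip(trajectories, consumption):
        for t, (cell, active) in enumerate(zip(trajectory, u)):
            if active:
                counts[(t, cell)] += 1
    return counts

# Hypothetical toy data: 3 customers, T = 3 time slots, cells 0 and 1.
trajectories = [[0, 0, 1], [0, 1, 1], [1, 1, 1]]
consumption = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
N = traffic_counts(trajectories, consumption)
```

Pairs $(t,l)$ that no active customer visits are simply absent from the counter and read as zero.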

We consider the following two-stage model of price incentives. In the first stage, the operator announces a discount $y^{a,b}(t,l)$ at time $t$ and location $l$ for the customers of contract $b$ making requests of type $a$. We consider only nonnegative discounts, so $y^{a,b}(t,l)\geq 0$. The second stage models the behavior of customers who modify their consumption by taking the discounts into account. We will assume that the preference of a customer $k$ of contract $b$ for consuming at time $t$ becomes $\rho_{k}^{a}(t)+\alpha_{k}^{a}y^{a,b}(t,L_{t}^{k})$, where $\alpha_{k}^{a}$ denotes the sensitivity of customer $k$ to price incentives for the application $a$. This corresponds to classical linear utility functions, see e.g. baldwin2012tropical . We also assume that the customers cannot make more than one request at each time, that is $\forall t\in\left[T\right]$, $\sum_{a}{u_{k}^{a}(t)}\leq 1$. Therefore, each customer $k$ determines his consumptions $u_{k}^{a}=(u_{k}^{a}(t))_{t\in\left[T\right]}\in\{0,1\}^{T}$ for the applications, as an optimal solution of the linear program:

Problem 1 (Low-level, customers).

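$$\max_{u_{k}^{a}\in\{0,1\}^{T}}\;\sum_{a\in[A]}\sum_{t=1}^{T}\left[\rho_{k}^{a}(t)+\alpha_{k}^{a}y^{a,b}(t,L_{t}^{k})\right]u_{k}^{a}(t)$$

$$\text{s.t.}\quad\forall a\in[A],\;\sum_{t=1}^{T}u_{k}^{a}(t)=R_{k}^{a},\qquad\forall t\in[T],\;\sum_{a\in[A]}u_{k}^{a}(t)\leq 1,$$

$$\forall t\in\mathcal{I}_{k}^{a},\;\forall a\in[A],\;u_{k}^{a}(t)=0.$$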

Consequently, each price $y^{a,b}=(y^{a,b}(t,l))_{t\in\left[T\right],\;l\in\left[L\right]}$ determines the possible individual consumptions $u_{k}^{a}$ for the users with contract $b$, and so the possible cumulated traffic vectors $N^{a,b}=(N^{a,b}(t,l))_{t\in\left[T\right],\;l\in\left[L\right]}$ and $N=\sum_{a}{\sum_{b}{N^{a,b}}}$. The aim of the operator is, through price incentives, to balance the load in the network across the different locations and time slots in order to improve the quality of service perceived by each customer. We introduce a coefficient $\gamma_{b}$ relative to the kind of contracts of the different customers in order to favor some classes of premium customers. In lee2005non , Lee et al. suppose that the satisfaction of a customer depends on his perceived throughput, which can be considered as inversely proportional to the number of customers in the cell. Here, we assume that the satisfaction of each customer $k$ in the cell $l\in\left[L\right]$ is a nonincreasing function $s_{l}^{a,b}$ of the total number of active customers in the cell $N(t,l)$, depending on the characteristics of the cell, on the type of application the user wants to use (some applications like streaming need a higher rate than others) and on the type of contract. We also assume that the satisfaction of all the customers with contract $b$ using a given application $a$ in a given cell is maximal until the number of active customers reaches a certain threshold $N_{l}^{a,b}$, that is, $s_{l}^{a,b}(N(t,l))=1$ for $N(t,l)\leq N_{l}^{a,b}$. After this threshold, the satisfaction decreases until a critical value $N_{l}^{C}$. We add the constraint $\forall t\in\left[T\right],\;\forall l\in\left[L\right],\;N(t,l)\leq N_{l}^{C}$ to prevent congestion.

For non-real time services like web, mail, download, the satisfaction function can be viewed as a concave function of the throughput, like $1-e^{-\delta/\delta_{c}}$ where $\delta$ denotes the throughput, see Moety et al. moety2016satisfaction . Hence, we will consider that for contents like web, mail and download, $N_{l}^{a,b}=N_{l}^{1}$, $s_{l}^{a,b}(n)=1$ for $n\leq N_{l}^{1}$ and $s_{l}^{a,b}(n)=1-\lambda_{b}\exp\left(-\frac{2N_{l}^{C}}{n-N_{l}^{1}}\right)$ for $N_{l}^{1}\leq n\leq N_{l}^{C}$ where $\lambda_{b}$ is a positive parameter depending on the kind of contract of the customer. The more expensive the contract of the customer is, the larger is $\lambda_{b}$. We can prove that this function is concave for $0\leq n\leq N_{l}^{C}$. For real time services like video streaming, the customers need a higher throughput to ensure a good QoS lee2005non . We will here consider the same type of functions $s_{l}^{a,b}$ but with $N_{l}^{1}$ replaced by $N_{l}^{a,b}=0$, that is $s_{l}^{a,b}(n)=1-\lambda_{b}\exp\left(-\frac{2N_{l}^{C}}{n}\right)$ for $0<n\leq N_{l}^{C}$.
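As an illustration, the satisfaction function described above can be written as a short routine. This is a sketch under stated assumptions: the function name `satisfaction`, the argument names and the numerical values are hypothetical, and the formula is only meant for $n\leq N_{l}^{C}$.

```python
import math

def satisfaction(n, n_crit, n_thresh=0.0, lam=1.0):
    """Satisfaction s(n) for n active customers in a cell, with n <= n_crit.

    s(n) = 1                                          if n <= n_thresh,
    s(n) = 1 - lam * exp(-2 * n_crit / (n - n_thresh)) otherwise.
    n_thresh = 0 corresponds to the real time (streaming) case of the text.
    """
    if n <= n_thresh:
        return 1.0
    return 1.0 - lam * math.exp(-2.0 * n_crit / (n - n_thresh))

# Hypothetical cell: threshold N^1 = 20, critical value N^C = 100.
values = [satisfaction(n, n_crit=100, n_thresh=20) for n in range(0, 101)]
```

Treating the threshold itself as part of the first branch avoids the division by zero at $n=N_{l}^{1}$, where the exponential term tends to zero anyway.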

So, the first stage consists in maximizing the global satisfaction function $s$ which depends on the vectors $N^{a,b}\in\mathbb{N}^{T\times L}$ and is defined by:

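$$\begin{aligned}s(N^{a,b})&=\sum_{t=1}^{T}\sum_{a\in[A]}\sum_{b\in[B]}\sum_{k\in\mathcal{K}^{b}}\gamma_{b}\,s_{L_{t}^{k}}^{a,b}(N(t,L_{t}^{k}))\,u_{k}^{a}(t)\\&=\sum_{t=1}^{T}\sum_{a\in[A]}\sum_{b\in[B]}\sum_{k\in\mathcal{K}^{b}}\sum_{l=1}^{L}\gamma_{b}\,s_{l}^{a,b}(N(t,l))\,\mathds{1}(L_{t}^{k}=l)\,u_{k}^{a}(t)\\&=\sum_{t=1}^{T}\sum_{l=1}^{L}\sum_{a\in[A]}\sum_{b\in[B]}\gamma_{b}\,N^{a,b}(t,l)\,s_{l}^{a,b}(N(t,l))\end{aligned}$$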

with $\forall b\in\left[B\right],\gamma_{b}>0$ . Our final model consists in solving the following bilevel program:

Problem 2 (High-level, provider).

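$$\max_{y^{a,b}\in\mathbb{R}_{+}^{T\times L}}\;\sum_{t=1}^{T}\sum_{l=1}^{L}\sum_{a\in[A]}\sum_{b\in[B]}\gamma_{b}\,N^{a,b}(t,l)\,s_{l}^{a,b}(N(t,l))$$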

where $\forall t\in\left[T\right],\;l\in\left[L\right],\;N(t,l)=\sum_{a=1}^{A}{\sum_{b=1}^{B}{N^{a,b}(t,l)}}$ , and $N(t,l)\leq N_{l}^{C}$ , $\forall t\in\left[T\right],\;l\in\left[L\right],\;a\in\left[A\right],\;b\in\left[B\right],\;N^{a,b}(t,l)=\sum_{k\in\mathcal{K}^{b}}{u_{k}^{a}(t)\mathds{1}(L_{t}^{k}=l)}$ , and $\forall k\in\left[K\right]$ , the vectors $u_{k}^{a}$ are solutions of Problem 1.

3 A decomposition approach for solving the first model

We will present a decomposition method for solving the previous bilevel problem. In this section, and in the next two, we suppose that there is only one kind of application and one kind of contract. This special case is already relevant in applications: it covers the case when, for instance, only the download requests are influenced by price incentives, whereas other requests like streaming or web are fixed. Whereas the analytical results of the present section carry over to the general model, the results of the next two sections (polynomial time solvability) are only valid under these restrictive assumptions. We shall return to the general case in Section 6, developing a fast approximate algorithm for the general model based on the present principles.

In the above special case, the bilevel model can be rewritten:

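$$\max_{y\in\mathbb{R}_{+}^{T\times L}}\;\sum_{t=1}^{T}\sum_{l=1}^{L}N(t,l)\,s_{l}(N(t,l))$$

where $\forall t,l\;N(t,l)\leq N_{l}^{C}$ and $N(t,l)=\sum_{k\in\left[K\right]}{u_{k}^{*}(t)\mathds{1}(L_{t}^{k}=l)}$, and for each $k\in\left[K\right]$ the vectors $u_{k}^{*}$ are solutions of the problem:

$$\max_{u_{k}\in\{0,1\}^{T}}\;\sum_{t=1}^{T}\left[\rho_{k}(t)+\alpha_{k}y(t,L_{t}^{k})\right]u_{k}(t)\quad\text{s.t.}\quad\sum_{t=1}^{T}u_{k}(t)=R_{k},\quad\forall t\in\mathcal{I}_{k},\;u_{k}(t)=0.$$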

In order to deal more abstractly with the bilevel model, we introduce the notation $u_{k}(t,l)=u_{k}(t)\mathds{1}(L_{t}^{k}=l)$ . Hence, we have $u_{k}(t,l)=0$ if $L_{t}^{k}\neq l$ . By defining the set $\mathcal{J}_{k}=\{(t,l)\;\mid\;t\in\mathcal{I}_{k}$ or $L_{t}^{k}\neq l\}$ , we have that $(t,l)\in\mathcal{J}_{k}$ implies that $u_{k}(t,l)=0$ . We can then define $\rho_{k}(t,l)=\rho_{k}(t)/\alpha_{k}$ if $(t,l)\notin\mathcal{J}_{k}$ and $\rho_{k}(t,l)=-\infty$ otherwise. Then, we can rewrite each low-level problem as:

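$$\max_{u_{k}\in F_{k}}\;\sum_{t,l}\left[\rho_{k}(t,l)+y(t,l)\right]u_{k}(t,l)$$

where $F_{k}=\{u\in\{0,1\}^{T\times L}\mid\sum_{t,l}u(t,l)=R_{k}\;\text{and}\;\forall(t,l)\in\mathcal{J}_{k},u(t,l)=0\}$, and the global bilevel problem becomes:

$$\max_{y\in\mathbb{R}_{+}^{T\times L}}\;\sum_{t,l}f_{l}(N(t,l))\quad\text{s.t.}\quad\forall(t,l),\;N(t,l)\leq N_{l}^{C},\quad N(t,l)=\sum_{k=1}^{K}u_{k}(t,l)$$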

with $f_{l}:x\in\mathbb{R}_{+}\mapsto xs_{l}(x)$ . Notice that the set $\mathcal{J}_{k}$ corresponds to the set of couples $(t,l)$ such that $\rho_{k}(t,l)=-\infty$ . It is possible to enumerate all the couples $(t,l)\in\left[T\right]\times\left[L\right]$ . Let us define $n=T\times L$ and associate each couple $(t,l)$ to an integer $i\in\left[n\right]$ . The quantities $\rho_{k}(t,l)$ , $u_{k}(t,l)$ , $N(t,l)$ and $y(t,l)$ can be respectively denoted by $\rho_{k}(i)$ , $u_{k}(i)$ , $N_{i}$ and $y_{i}$ . The function $f_{l}$ and the integer $N_{l}^{C}$ can be respectively denoted by $f_{i}$ and $N_{i}^{C}$ . It means that for two indices $i$ and $j$ associated to two couples $(t,l)$ and $(t^{\prime},l)$ with the same $l$ , we have $f_{i}=f_{j}:=f_{l}$ and $N_{i}^{C}=N_{j}^{C}:=N_{l}^{C}$ . The low-level problem can be rewritten:

Problem 3 (Abstract low-level problem).

$$\max_{u_{k}\in F_{k}}\;\sum_{i=1}^{n}\left[\rho_{k}(i)+y_{i}\right]u_{k}(i)$$

where $F_{k}=\{u\in\{0,1\}^{n}|\sum_{i=1}^{n}u(i)=R_{k}$ and $\forall i\in\mathcal{J}_{k},u(i)=0\}$ .
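Since the objective of Problem 3 is linear and $F_{k}$ consists of the 0/1 vectors with exactly $R_{k}$ ones outside $\mathcal{J}_{k}$, one optimal response is obtained by selecting the $R_{k}$ allowed indices with the largest values of $\rho_{k}(i)+y_{i}$. The sketch below illustrates this observation. The name `best_response`, the explicit `forbidden` set (used instead of the convention $\rho_{k}(i)=-\infty$), the tie-breaking rule and the numerical values are assumptions made for the example.

```python
def best_response(rho, y, requests, forbidden=frozenset()):
    """Return one maximizer u in F_k of sum_i (rho[i] + y[i]) * u[i].

    rho, y    : sequences of length n (preferences and discounts)
    requests  : R_k, the number of slots the customer must use
    forbidden : indices of J_k, where u[i] is forced to 0
    Ties are broken in favour of the smallest index (arbitrary choice).
    """
    n = len(rho)
    allowed = [i for i in range(n) if i not in forbidden]
    if requests > len(allowed):
        raise ValueError("F_k is empty: not enough allowed slots")
    # Rank allowed slots by decreasing utility rho[i] + y[i].
    ranked = sorted(allowed, key=lambda i: (-(rho[i] + y[i]), i))
    chosen = set(ranked[:requests])
    return [1 if i in chosen else 0 for i in range(n)]

# Hypothetical customer with n = 4 slots, R_k = 2, slot 3 forbidden.
rho = [0.0, 1.0, 0.5, 2.0]
no_discount = best_response(rho, [0.0, 0.0, 0.0, 0.0], 2, forbidden={3})
with_discount = best_response(rho, [2.0, 0.0, 0.0, 0.0], 2, forbidden={3})
```

In this toy instance, a discount on slot 0 moves one request from slot 2 to slot 0, which is the kind of displacement the operator tries to induce.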

The global bilevel problem is:

Problem 4 (Bilevel problem).

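$$\max_{y\in\mathbb{R}_{+}^{n}}\;\sum_{i=1}^{n}f_{i}(N_{i})\quad\text{s.t.}\quad\forall i,\;N_{i}\leq N_{i}^{C},\quad N_{i}=\sum_{k=1}^{K}u_{k}^{*}(i)$$

where, for all $k\in\left[K\right]$, $u_{k}^{*}$ is a solution of Problem 3.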

Lemma 1.

Suppose that the functions $s_{i}$ are nonincreasing and concave on $\left[0,N_{i}^{C}\right]$ . Then, the functions $f_{i}$ are also concave on $\left[0,N_{i}^{C}\right]$ .

Proof.

The result comes easily if we suppose that the functions $s_{i}$ are twice differentiable, because we have:

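$$\forall x\in[0,N_{i}^{C}],\quad f_{i}^{\prime\prime}(x)=x\,s_{i}^{\prime\prime}(x)+2s_{i}^{\prime}(x)\leq 0.$$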

We could deduce that the same is true without the differentiability assumption by a density argument, writing a concave function as a pointwise limit of smooth concave functions. However, we prefer to provide the following elementary argument. Consider $0\leq x\leq y\leq N_{i}^{C}$ and $t\in\left[0,1\right]$ . Because $s_{i}$ is nonincreasing, we have $s_{i}(x)\geq s_{i}(y)$ . We have:

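$$\begin{aligned}t f_{i}(x)+(1-t)f_{i}(y)&=(tx+(1-t)y)\left[\frac{tx}{tx+(1-t)y}s_{i}(x)+\frac{(1-t)y}{tx+(1-t)y}s_{i}(y)\right]\\&\leq(tx+(1-t)y)\,s_{i}\!\left(\frac{tx^{2}+(1-t)y^{2}}{tx+(1-t)y}\right)\end{aligned}$$

where the inequality follows from the concavity of $s_{i}$, the two weights being nonnegative and summing to one (if $tx+(1-t)y=0$, both sides of the concavity inequality for $f_{i}$ vanish and there is nothing to prove).

Because of the well-known inequality $2xy\leq x^{2}+y^{2}$, we have:

$$(tx+(1-t)y)^{2}\leq tx^{2}+(1-t)y^{2}.$$

Then, because $s_{i}$ is nonincreasing, we have:

$$s_{i}\!\left(\frac{tx^{2}+(1-t)y^{2}}{tx+(1-t)y}\right)\leq s_{i}(tx+(1-t)y),$$

so that:

$$t f_{i}(x)+(1-t)f_{i}(y)\leq(tx+(1-t)y)\,s_{i}(tx+(1-t)y)=f_{i}(tx+(1-t)y),$$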

and $f_{i}$ is concave. ∎
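Lemma 1 can be checked numerically on a grid for a particular choice of $s_{i}$. The sketch below is only a sanity check, not a proof: the satisfaction function $s(x)=1-(x/N^{C})^{2}$, the value $N^{C}=10$ and the helper name `concavity_gap` are hypothetical choices made for the illustration.

```python
def concavity_gap(f, x, y, t):
    """f(t*x + (1-t)*y) - (t*f(x) + (1-t)*f(y)); nonnegative when f is concave."""
    return f(t * x + (1 - t) * y) - (t * f(x) + (1 - t) * f(y))

N_CRIT = 10.0  # hypothetical critical value N_i^C

def s(x):
    """A concave, nonincreasing satisfaction function on [0, N_CRIT] (assumed)."""
    return 1.0 - (x / N_CRIT) ** 2

def f(x):
    """f(x) = x * s(x), the function shown to be concave in Lemma 1."""
    return x * s(x)

grid = [N_CRIT * j / 20 for j in range(21)]
weights = [j / 10 for j in range(11)]
worst_gap = min(concavity_gap(f, x, y, t)
                for x in grid for y in grid for t in weights)
```

Up to rounding errors, `worst_gap` is nonnegative, in agreement with the lemma; here $f(x)=x-x^{3}/100$, whose second derivative $-6x/100$ is indeed nonpositive on $[0,10]$.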

3.1 A tropical representation of customers’ response

The lower-level component of our bilevel problem can be studied thanks to tropical techniques. Tropical mathematics refers to the study of the max-plus semifield $\mathbb{R}_{\max}$ , that is the set $\mathbb{R}\cup\{-\infty\}$ endowed with two laws $\oplus$ and $\odot$ defined by $a\oplus b=\max(a,b)$ and $a\odot b=a+b$ , see bcoq ; itenberg2009tropical ; butkovicbook ; MacLaganSturmfels for background. We first consider the relaxation in which the price vector $y$ can take any real value, i.e. $y\in\mathbb{R}^{n}$ . Each customer $k$ defines his consumption $u_{k}^{*}$ by solving the problem:

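$$\max_{u_{k}\in F_{k}}\;\sum_{i}\left[\rho_{k}(i)+y_{i}\right]u_{k}(i)=\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle.$$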

The map $P_{k}:y\mapsto\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle$ is convex, piecewise affine, and the gradients of its linear parts are integer valued. It can be thought of as a tropical polynomial function in the variable $y$ . Indeed, with the tropical notation, we have

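$$P_{k}(y)=\bigoplus_{u_{k}\in F_{k}}\;\bigodot_{i\in[n]}\left(\rho_{k}(i)\odot y_{i}\right)^{\odot u_{k}(i)}$$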

where $z^{\odot p}:=z\odot\dots\odot z=p\times z$ denotes the $p$ th tropical power. In this way, we see that all the monomials of $P_{k}$ have degree $\sum_{i}{u_{k}(i)}=R_{k}$ , so that $P_{k}$ is homogeneous of degree $R_{k}$ , in the tropical sense. This remark leads to the following lemma:

Lemma 2.

Let $e=(1,\dots,1)\in\mathbb{R}^{n}$. Let $y$ be a solution of the relaxation $y\in\mathbb{R}^{n}$ of Problem 4. Then, for all $\beta\in\mathbb{R}$, $y+\beta e$ is a solution of the relaxation $y\in\mathbb{R}^{n}$ of Problem 4.

Proof.

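Consider a solution $y\in\mathbb{R}^{n}$ of the relaxed problem. Because $P_{k}$ is homogeneous of degree $R_{k}$, we have for all $\beta\in\mathbb{R}$, $P_{k}(y+\beta e)=P_{k}(y)+\beta R_{k}$. In particular:

$$u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle\;\Leftrightarrow\;u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y+\beta e,u_{k}\rangle$$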

Hence, $y+\beta e$ leads to the same distribution of the customers $N^{*}$ and also corresponds to an optimal solution of the relaxed bilevel problem. ∎
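The tropical homogeneity used in this proof can be verified by brute force on a small instance. The sketch below enumerates $F_{k}$ explicitly, which is only practical for very small $n$. It assumes $\mathcal{J}_{k}=\emptyset$, and the names `tropical_polynomial` and `maximizers` as well as the numerical values are hypothetical.

```python
from itertools import combinations

def tropical_polynomial(rho, y, requests):
    """P_k(y) = max over u in F_k of <rho + y, u>, with J_k assumed empty.

    Each u in F_k is identified with its support, a set of `requests` indices.
    """
    n = len(rho)
    return max(sum(rho[i] + y[i] for i in support)
               for support in combinations(range(n), requests))

def maximizers(rho, y, requests, tol=1e-9):
    """Supports of all u in F_k attaining P_k(y), up to the tolerance tol."""
    n = len(rho)
    best = tropical_polynomial(rho, y, requests)
    return {support for support in combinations(range(n), requests)
            if abs(sum(rho[i] + y[i] for i in support) - best) <= tol}

# Hypothetical customer: n = 3, R_k = 2, shift beta = 5.
rho, y, requests, beta = [0.0, 1.0, 0.5], [0.3, -0.2, 0.4], 2, 5.0
y_shifted = [value + beta for value in y]
p_before = tropical_polynomial(rho, y, requests)
p_after = tropical_polynomial(rho, y_shifted, requests)
```

Here `p_after` equals `p_before + beta * requests` up to rounding, and the set of optimal responses is the same before and after the shift.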

Corollary 1.

The bilevel problem 4 has the same value as its relaxation $y\in\mathbb{R}^{n}$ .

Proof.

Consider a solution $y^{*}\in\mathbb{R}^{n}$ of the relaxed problem, and take $\beta\geq-\min_{i}y^{*}_{i}$. Then $y^{*}+\beta e\in\mathbb{R}_{+}^{n}$, and it is a solution of the relaxed problem according to Lemma 2. Consequently, $y^{*}+\beta e$ is a solution of Problem 4. ∎

By definition, the tropical hypersurface associated to a tropical polynomial function is the nondifferentiability locus of this function. Since the polynomial $P_{k}$ is homogeneous, its associated tropical hypersurface is invariant under translation by a constant vector. Therefore, it can be represented as a subset of the tropical projective space $\mathbb{T}\mathbb{P}^{n-1}$. The latter is defined as the quotient of $\mathbb{R}^{n}$ by the equivalence relation which identifies two vectors which differ by a constant vector, and it can be identified with $\mathbb{R}^{n-1}$ by the map

$\mathbb{T}\mathbb{P}^{n-1}\to\mathbb{R}^{n-1}$ , $y\mapsto(y_{i}-y_{n})_{i\in\left[n-1\right]}$ .

Example 1.

Consider a simple example with $T=3$ time steps (for instance morning, afternoon and evening), $L=1$ (that is $n=3$ ), $K=5$ and $\mathcal{J}_{k}=\emptyset$ for each $k$ . The parameters of the customers are

[TABLE]

The tropical polynomial of the first customer is $P_{1}(y)=\max\left(y_{1},y_{2},y_{3}\right)$, meaning that this customer has no preference and consumes when the incentive is the best. Its associated tropical hypersurface is a tropical line (since $P_{1}$ has degree $1$), so it splits $\mathbb{T}\mathbb{P}^{2}$ into three different regions corresponding to a choice of the vector $u_{1}$ among $(1,0,0)$, $(0,1,0)$ and $(0,0,1)$, see Figure 2. E.g., the cell labeled by $(1,0,0)$ represents a consumption concentrated in the morning, induced by a price such that $y_{1}>y_{2}$ and $y_{1}>y_{3}$.

To study jointly the responses of the five customers, we represent the arrangement of the tropical hypersurfaces associated to the $P_{k},\;k\in\left[5\right]$ (see Figure 3), with

[TABLE]

Lemma 3 (Corollary of (tran2015product, §4, Lemma 3.1)).

Each cell of the arrangement of tropical hypersurfaces corresponds to a collection of customer responses $(u_{1},...,u_{K})$ and to a unique traffic vector $N$ , defined by $N=\sum_{k}{u_{k}}$ .

3.2 Decomposition theorem

We next show that the present bilevel problem can be solved by decomposition. We note that the function to optimize for the higher level problem, i.e. the optimization problem of the provider, depends only on $N$ . The variables $y_{i}$ allow one to generate the different possible vectors $N$ .

Definition 1.

A vector $N\in\mathbb{Z}^{n}$ is said to be feasible if there exist $K$ vectors $u_{1}^{*},\dots,u_{K}^{*}$ such that $N=\sum_{k=1}^{K}u_{k}^{*}$ and there exists $y\in\mathbb{R}^{n}$ such that for each $k\in\left[K\right]$ , $u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle$ .
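Definition 1 can be illustrated numerically. The following is a minimal sketch, assuming that $F_{k}$ is the set of binary vectors with exactly $R_{k}$ ones and zeros on $\mathcal{J}_{k}$ (as in Lemma 4); the function names `best_response` and `traffic`, the 0-based indexing and the tie-breaking by smallest index are illustrative assumptions, not part of the paper.

```python
from typing import List, Set, Tuple

def best_response(rho_k: List[float], y: List[float], R_k: int, J_k: Set[int]) -> List[int]:
    """Return one maximizer of <rho_k + y, u> over F_k.

    F_k is assumed to be the set of 0/1 vectors with exactly R_k ones
    and zeros on the forbidden slots J_k. Ties are broken by smallest index.
    """
    allowed = [i for i in range(len(y)) if i not in J_k]
    # Rank allowed slots by decreasing value of rho_k(i) + y_i.
    ranked = sorted(allowed, key=lambda i: (-(rho_k[i] + y[i]), i))
    chosen = set(ranked[:R_k])
    return [1 if i in chosen else 0 for i in range(len(y))]

def traffic(rho, y, R, J) -> Tuple[List[List[int]], List[int]]:
    """Responses u_k of all customers and the induced traffic vector N = sum_k u_k."""
    responses = [best_response(rho[k], y, R[k], J[k]) for k in range(len(rho))]
    N = [sum(u[i] for u in responses) for i in range(len(y))]
    return responses, N
```

Any vector $N$ returned by `traffic` for some $y$ is feasible in the sense of Definition 1; the converse direction (which $N$ can be generated at all) is the subject of the lemmas that follow.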

So, we will characterize the feasible vectors $N$ in order to optimize directly the satisfaction function on the set of feasible $N$ . We define the relaxation of Problem 4 to the case $y\in\mathbb{R}^{n}$ .

Problem 5 (Bilevel problem with real discounts).

[TABLE]

with $N=\sum_{k=1}^{K}u_{k}^{*}$ and for all $k\in\left[K\right]$ , $u_{k}^{*}$ solution of:

[TABLE]

According to Corollary 1, Problem 4 has the same value as the relaxed problem 5. Moreover, according to Lemma 2, if $(y^{*},N^{*})$ is an optimal solution of Problem 5, then $(y^{*}+\beta e,N^{*})$ is also an optimal solution of Problem 5 for every $\beta\in\mathbb{R}$ . We recall that $e\in\mathbb{R}^{n}$ is the vector defined by $e^{T}=(1,\dots,1)$ . Then, if we find an optimal solution $(y^{*},N^{*})$ of Problem 5, then $(y^{*}+\beta e,N^{*})$ with $\beta=-\min_{i\in\left[n\right]}y^{*}_{i}$ is a solution of Problem 5 such that $y^{*}+\beta e\in\mathbb{R}_{+}^{n}$ . Consequently, $(y^{*}+\beta e,N^{*})$ is a solution of Problem 4. Hence, a solution of Problem 5 (with real discounts) provides a solution of Problem 4 (with nonnegative discounts). In the sequel, we will study the bilevel problem 5.
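The passage from real to nonnegative discounts amounts to a translation by $\beta e$ with $\beta=-\min_{i}y^{*}_{i}$. As a small sketch (the helper names `shift_to_nonnegative` and `ranking` are hypothetical), one can check that the translation leaves the ordering of the values $\rho_{k}(i)+y_{i}$, and hence the customers' choices, unchanged:

```python
def shift_to_nonnegative(y):
    """Translate y by beta * e with beta = -min_i y_i, so that the minimum entry becomes 0."""
    beta = -min(y)
    return [yi + beta for yi in y]

def ranking(rho_k, y):
    """Indices sorted by decreasing rho_k(i) + y_i (ties broken by smallest index)."""
    return sorted(range(len(y)), key=lambda i: (-(rho_k[i] + y[i]), i))

# Illustrative values only.
y_real = [-2, 1, 0]
y_nonneg = shift_to_nonnegative(y_real)
same_order = ranking([1, 0, 2], y_real) == ranking([1, 0, 2], y_nonneg)
```

Since every vector of $F_{k}$ has the same number $R_{k}$ of ones, adding $\beta e$ shifts each value $\langle\rho_{k}+y,u_{k}\rangle$ by the same constant $\beta R_{k}$, which is why the maximizers do not change.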

Most of the following results are applications of classical notions of convex analysis which can be found in rockafellar1970convex . It is convenient to introduce the convex characteristic function $\chi_{A}$ of a set $A\subset\mathbb{R}^{n}$ , defined by $\chi_{A}(x)=0$ if $x\in A$ , and $\chi_{A}(x)=+\infty$ otherwise. If $A$ is a convex set, then $\chi_{A}$ is a convex function. We define also for every $k$ the polytope $\Delta_{k}$ as the convex hull of $F_{k}$ , together with the convex function $\varphi_{k}$ defined by $\varphi_{k}(u)=-\langle\rho_{k},u\rangle+\chi_{\Delta_{k}}(u)$ .

Lemma 4.

$\Delta_{k}=\{u\in\left[0,1\right]^{n}\mid\sum_{i=1}^{n}u(i)=R_{k}$ and $\forall i\in\mathcal{J}_{k},u(i)=0\}$ and $F_{k}$ is exactly the set of vertices of $\Delta_{k}$ .

Proof.

Let us define the polytope $\Delta_{k}^{\prime}=\{u\in\left[0,1\right]^{n}\mid\sum_{i=1}^{n}u(i)=R_{k}$ and $\forall i\in\mathcal{J}_{k},u(i)=0\}$ . Clearly, $F_{k}\subset\Delta_{k}^{\prime}$ . Then, $\Delta_{k}\subset\Delta_{k}^{\prime}$ since $\Delta_{k}^{\prime}$ is convex.

Consider a point $u$ of $\Delta_{k}^{\prime}$ which is not in $F_{k}$ . There exists an index $i$ such that $0<u(i)<1$ . In particular $u(i)\notin\mathbb{N}$ . However, $\sum_{i}u(i)=R_{k}\in\mathbb{N}$ . So, there exists another index $j$ such that $0<u(j)<1$ . Hence, there exists $\varepsilon>0$ such that the points $u^{-}$ and $u^{+}$ defined by:

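$u^{\pm}(l)=u(l)\pm\varepsilon$ if $l=i$ , $u^{\pm}(l)=u(l)\mp\varepsilon$ if $l=j$ , and $u^{\pm}(l)=u(l)$ otherwise,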

are in $\Delta_{k}^{\prime}$ . Because $u=\frac{u^{-}+u^{+}}{2}$ with $u\neq u^{-}$ and $u\neq u^{+}$ , $u$ is not a vertex of $\Delta_{k}^{\prime}$ . Consequently, the set of vertices of $\Delta_{k}^{\prime}$ is included in $F_{k}$ . Because $\Delta_{k}^{\prime}$ is the convex hull of its vertices, we have $\Delta_{k}^{\prime}\subset\Delta_{k}$ .

The polytope $\Delta_{k}$ is such that $\Delta_{k}=\{u\in\mathbb{R}^{n}\mid 0\leq u\leq a\;\text{and}\;e^{T}u=R_{k}\}$ , with $a(i)=0$ if $i\in\mathcal{J}_{k}$ and $a(i)=1$ otherwise, and $e^{T}=(1,\dots,1)\in\mathcal{M}_{1,n}(\mathbb{R})$ . Then, because $e^{T}$ is a totally unimodular matrix, the vertices of $\Delta_{k}$ are exactly its integer points, that is $F_{k}$ . ∎

Corollary 2.

The value of each low level problem 3 is the value of the Legendre-Fenchel transform of $\varphi_{k}$ at point $y$ , i.e. $\varphi_{k}^{*}(y)=\sup_{u_{k}\in\Delta_{k}}\left[\langle y,u_{k}\rangle-\varphi_{k}(u_{k})\right]$ .

Proof.

The vertices of $\Delta_{k}$ are $F_{k}$ . Hence:

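$\varphi_{k}^{*}(y)=\sup_{u_{k}\in\Delta_{k}}\langle\rho_{k}+y,u_{k}\rangle=\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle$ .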

∎

We want to characterize the feasible vectors. We have first the following result.

Lemma 5.

Let $N$ be a real vector. Then, there exists $y\in\mathbb{R}^{n}$ and $u_{1}^{*},\dots,u_{K}^{*}$ such that $N=\sum_{k\in\left[K\right]}u_{k}^{*}$ and for every $k\in\left[K\right]$ , $u_{k}^{*}\in\arg\max_{u_{k}\in\Delta_{k}}\langle\rho_{k}+y,u_{k}\rangle$ if and only if $N\in\sum_{k\in\left[K\right]}\Delta_{k}$ .

Proof.

Such vectors $u_{k}^{*}$ belong to $\Delta_{k}$ , so $N\in\sum_{k\in\left[K\right]}\Delta_{k}$ .

Let $k\in\left[K\right]$ and $y\in\mathbb{R}^{n}$ . A vector $u_{k}^{*}\in\Delta_{k}$ satisfies $u_{k}^{*}\in\arg\max_{u_{k}\in\Delta_{k}}\langle\rho_{k}+y,u_{k}\rangle$ if and only if $u_{k}^{*}\in\partial\varphi_{k}^{*}(y)$ , where $\partial\varphi_{k}^{*}$ denotes the subdifferential of the convex function $\varphi_{k}^{*}$ . Then, a vector $N$ can be written as $N=\sum_{k}u_{k}^{*}$ with such vectors $u_{k}^{*}$ if and only if $N\in\sum_{k}\partial\varphi_{k}^{*}(y)$ . By (rockafellar1970convex, Th. 23.8), $\sum_{k}\partial\varphi_{k}^{*}(y)=\partial\left(\sum_{k}{\varphi_{k}^{*}}\right)(y)=\partial\psi^{*}(y)$ , where $\psi=\underset{k}{\square}\varphi_{k}$ is the inf-convolution of the functions $\varphi_{k}$ .

Let $N$ be a real vector. Then, there exist $y\in\mathbb{R}^{n}$ and $u_{1}^{*},\dots,u_{K}^{*}$ such that $N=\sum_{k\in\left[K\right]}u_{k}^{*}$ and for every $k\in\left[K\right]$ , $u_{k}^{*}\in\arg\max_{u_{k}\in\Delta_{k}}\langle\rho_{k}+y,u_{k}\rangle$ if and only if there exists $y$ such that $N\in\partial\psi^{*}(y)$ , or equivalently $y\in\partial\psi(N)$ (because $\psi$ is convex), that is if and only if $\partial\psi(N)\neq\emptyset$ . The function $\psi$ is polyhedral (as the inf-convolution of polyhedral convex functions) and it is finite at every point in $\sum_{k}{\Delta_{k}}$ . So, $\forall N^{\prime}\in\sum_{k}{\Delta_{k}},\partial\psi(N^{\prime})$ is a non-empty polyhedral convex set (rockafellar1970convex, Th. 23.10). The result follows. ∎

It is now possible to characterize the feasible vectors.

Lemma 6.

A vector $N\in\mathbb{Z}^{n}$ is feasible if and only if $N\in\sum_{k}F_{k}$ .

Proof.

According to Definition 1, a vector $N\in\mathbb{Z}^{n}$ is feasible if and only if there exists $y\in\mathbb{R}^{n}$ and $K$ vectors $(u_{k}^{*})_{k\in\left[K\right]}$ such that $N=\sum_{k}u_{k}^{*}$ and $u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle$ . As a consequence of Lemma 4, $\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle=\arg\max_{u_{k}\in\Delta_{k}}\langle\rho_{k}+y,u_{k}\rangle$ . Then, by Lemma 5, a vector $N\in\mathbb{Z}^{n}$ is feasible if and only if $N\in(\sum_{k\in\left[K\right]}\Delta_{k})\cap\mathbb{Z}^{n}$ . We have now to prove $\sum_{k\in\left[K\right]}F_{k}=(\sum_{k\in\left[K\right]}\Delta_{k})\cap\mathbb{Z}^{n}$ . Because $F_{k}=\Delta_{k}\cap\mathbb{Z}^{n}$ , the inclusion $\sum_{k\in\left[K\right]}F_{k}\subset(\sum_{k\in\left[K\right]}\Delta_{k})\cap\mathbb{Z}^{n}$ is obvious. Conversely, consider $N\in(\sum_{k\in\left[K\right]}\Delta_{k})\cap\mathbb{Z}^{n}$ . Then, the set $\Delta_{N}=\{(u_{1},\dots,u_{K})\in\Delta_{1}\times\dots\times\Delta_{K}\mid\sum_{k=1}^{K}u_{k}=N\}$ is a non-empty polytope. A vector $u=(u_{1},\dots,u_{K})$ belongs to $\Delta_{N}$ if it satisfies the following constraints:

[TABLE]

that is, $\Delta_{N}=\{u\in\mathbb{R}^{Kn}\mid 0\leq u\leq a,Au=b\}$ , with $a\in\mathbb{R}^{Kn}$ such that $a_{k}(i)=\mathds{1}_{i\notin\mathcal{J}_{k}}$ for every $i\in\left[n\right],k\in\left[K\right]$ , and $A\in\mathcal{M}_{K+n,Kn}(\mathbb{Z})$ and $b\in\mathbb{Z}^{K+n}$ defined by:

[TABLE]

By Poincaré’s lemma, $A$ is totally unimodular. In particular, the extreme points of $\Delta_{N}$ are integer. Then, there exists $(u_{1}^{*},\dots,u_{K}^{*})$ with for every $k\in\left[K\right]$ , $u_{k}^{*}\in\Delta_{k}\cap\mathbb{Z}^{n}=F_{k}$ such that $N=\sum_{k}u_{k}^{*}$ . ∎
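The integrality argument in the proof of Lemma 6 says that membership of an integer vector $N$ in $\sum_{k}F_{k}$ is a transportation-type feasibility question. The following sketch decides it with augmenting paths on the bipartite customer/slot graph instead of a linear program; this substitution, the name `decompose` and the 0-based indexing are assumptions made for illustration only.

```python
from typing import List, Optional, Set

def decompose(N: List[int], R: List[int], J: List[Set[int]]) -> Optional[List[List[int]]]:
    """Return u_1..u_K with u_k in F_k and sum_k u_k = N, or None if no such vectors exist.

    Customer k must receive exactly R[k] distinct slots outside J[k];
    slot i must receive exactly N[i] customers.
    """
    K, n = len(R), len(N)
    if any(x < 0 for x in N) or sum(N) != sum(R):
        return None
    u = [[0] * n for _ in range(K)]
    load = [0] * n  # number of customers currently assigned to each slot

    def augment(k: int, seen: Set[int]) -> bool:
        # Try to give customer k one more slot, re-routing other customers if needed.
        for i in range(n):
            if i in J[k] or u[k][i] == 1 or i in seen:
                continue
            seen.add(i)
            if load[i] < N[i]:
                u[k][i] = 1
                load[i] += 1
                return True
            # Slot i is full: try to move one of its customers to another slot.
            for k2 in range(K):
                if u[k2][i] == 1 and augment(k2, seen):
                    u[k2][i] = 0
                    u[k][i] = 1
                    return True
        return False

    for k in range(K):
        for _ in range(R[k]):
            if not augment(k, set()):
                return None
    return u
```

The returned vectors form some decomposition of $N$, not necessarily the one attaining $\psi(N)$; selecting an optimal decomposition is the purpose of Lemma 7.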

Each vector $N\in\sum_{k}F_{k}$ can be written as sum of vectors $u_{k}^{*}\in F_{k}$ for $k\in\left[K\right]$ such that there exists $y\in\mathbb{R}^{n}$ with $u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y,u_{k}\rangle$ . In order to determine such vectors $u_{k}^{*}$ , we have the following lemma:

Lemma 7.

Let $N=\sum_{k}u_{k}^{*}$ with $u_{k}^{*}\in\Delta_{k}\;\forall k$ . The following assertions are equivalent:

1. There exists $y\in\mathbb{R}^{n}$ such that for each $k\in\left[K\right]$ , $u_{k}^{*}\in\arg\max_{u_{k}\in\Delta_{k}}\langle\rho_{k}+y,u_{k}\rangle$ .

2. The vectors $u_{1}^{*},\dots,u_{K}^{*}$ realize the minimum in the inf-convolution $\psi$ , i.e.

[TABLE]

Proof.

(1) $\Rightarrow$ (2) : We have for every $k$ :

[TABLE]

By summing those equalities, we have:

[TABLE]

By considering only the vectors $u_{1}\in\Delta_{1},\dots,u_{K}\in\Delta_{K}$ such that $\sum_{k}u_{k}=N$ , we can write $\sum_{k}\langle\rho_{k},u_{k}^{*}\rangle=\underset{\begin{subarray}{c}u_{1}\in\Delta_{1},\dots,u_{K}\in\Delta_{K}\\ \sum_{k}{u_{k}}=N\end{subarray}}{\sup}\sum_{k}\langle\rho_{k},u_{k}\rangle$ which is exactly the second assertion.

(2) $\Rightarrow$ (1): The set $\partial\psi(N)$ is non-empty. Consider $y\in\partial\psi(N)$ , that is $N\in\partial\psi^{*}(y)$ . We can write:

[TABLE]

So:

[TABLE]

Consequently, if one $u_{k}^{*}$ is not an optimal solution of the low-level problem, the previous equality cannot be true. ∎

The high-level problem of Problem 5 consists in maximizing a function depending only on a vector $N$ which has to be a feasible vector. It is now possible to write the main theorem of this section, which establishes a decomposition method for solving Problem 5.

Theorem 3.1.

(Decomposition) The bilevel problem 5 can be solved as follows:

1. Find an optimal solution $N^{*}$ to the high level problem with unknown $N$ :

[TABLE]

2. Find vectors $(u_{1}^{*},\dots,u_{K}^{*})$ solutions of the following problem:

[TABLE]

3. Find a vector $y^{*}$ such that $\forall k$ , $u_{k}^{*}$ is a solution of the low level problem.

Proof.

The bilevel programming problem 5 can be rewritten $\max_{N\text{ feasible}}\sum_{i}f_{i}(N_{i})$ subject to $\forall i\in\left[n\right],\;N_{i}\leq N_{i}^{C}$ . According to Lemma 6, $N$ is feasible if and only if $N\in\sum_{k}{F_{k}}$ . So, a necessary condition for a vector $y^{*}$ to be an optimal solution of the bilevel problem is that for every $k$ , there exists $u_{k}^{*}\in\partial\varphi_{k}^{*}(y^{*})$ such that $N^{*}=\sum_{k}{u_{k}^{*}}$ is an optimal solution of the problem:

[TABLE]

After finding $N^{*}$ , it is possible to find $u_{k}^{*}\in\partial\varphi_{k}^{*}(y^{*})$ by solving the inf-convolution problem as a consequence of Lemma 7. Because $u_{k}^{*}\in\partial\varphi_{k}^{*}(y^{*})$ is equivalent to $y^{*}\in\partial\varphi_{k}(u_{k}^{*})$ , each point of $\bigcap_{k}\;\partial\varphi_{k}(u_{k}^{*})$ is an optimal solution of the bilevel problem.

∎

The second step of this theorem consists in solving a linear program. We next show that the third step reduces to a linear feasibility problem.

Lemma 8.

Let $N\in\mathbb{Z}^{n}$ be a feasible vector and $u_{k}^{*}\in F_{k}$ ( $k\in\left[K\right]$ ) be vectors such that $N=\sum_{k}u_{k}^{*}$ and $\psi(N)=-\sum_{k}\langle\rho_{k},u_{k}^{*}\rangle$ . Then, the set of vectors $y^{*}\in\mathbb{R}^{n}$ such that $\forall k\in\left[K\right]$ , $u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y^{*},u_{k}\rangle$ is non-empty and is the polytope defined by the following inequalities:

[TABLE]

Proof.

According to Lemma 7, there exists $y^{*}\in\bigcap_{k}\partial\varphi_{k}(u_{k}^{*})$ . Hence, we have $\forall u_{k}\in F_{k}$ , $\langle\rho_{k}+y^{*},u_{k}^{*}\rangle\geq\langle\rho_{k}+y^{*},u_{k}\rangle$ .

Consider indices $i,j\notin\mathcal{J}_{k}$ with $u_{k}^{*}(i)=1$ , $u_{k}^{*}(j)=0$ , and the vector $u_{k}$ defined by $u_{k}(i)=0$ , $u_{k}(j)=1$ and $\forall l\neq i,j,\;u_{k}(l)=u_{k}^{*}(l)$ . We verify easily $u_{k}\in F_{k}$ , so that the condition $\langle\rho_{k}+y^{*},u_{k}^{*}\rangle\geq\langle\rho_{k}+y^{*},u_{k}\rangle$ , which can be rewritten $\rho_{k}(i)+y^{*}_{i}\geq\rho_{k}(j)+y^{*}_{j}$ , is satisfied.

Moreover, this condition is sufficient. Consider $y^{*}$ such that $\forall i,j\notin\mathcal{J}_{k}$ with $u_{k}^{*}(i)=1$ , $u_{k}^{*}(j)=0$ , we have $\rho_{k}(i)+y^{*}(i)\geq\rho_{k}(j)+y^{*}(j)$ . Consider $u_{k}\in F_{k}$ . By definition of $F_{k}$ , the quantity $\langle\rho_{k}+y^{*},u_{k}\rangle$ corresponds to the sum of $R_{k}$ coordinates of $\rho_{k}+y^{*}$ for which the index is not in $\mathcal{J}_{k}$ . Hence,

[TABLE]

because of the lemma hypothesis and because $\#\{j|u_{k}(j)=1,u_{k}^{*}(j)=0\}=\#\{j|u_{k}(j)=0,u_{k}^{*}(j)=1\}$

∎

For every $k$ , the latter inequalities define a polytope, and we have to find $y^{*}$ in the intersection of all these polytopes.
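The pairwise inequalities of Lemma 8 are easy to test for a candidate $y^{*}$. The sketch below (hypothetical helper names, 0-based indices, integer data to avoid rounding issues) checks the inequalities $\rho_{k}(i)+y^{*}_{i}\geq\rho_{k}(j)+y^{*}_{j}$ and, for comparison, verifies optimality of a single $u_{k}^{*}$ by brute-force enumeration of $F_{k}$.

```python
from itertools import combinations

def satisfies_exchange_inequalities(y, rho, u_star, J):
    """Check rho_k(i) + y_i >= rho_k(j) + y_j for all k and all i, j outside J_k
    with u_k*(i) = 1 and u_k*(j) = 0 (the inequalities of Lemma 8)."""
    n = len(y)
    for k, u in enumerate(u_star):
        for i in range(n):
            for j in range(n):
                if i in J[k] or j in J[k]:
                    continue
                if u[i] == 1 and u[j] == 0 and rho[k][i] + y[i] < rho[k][j] + y[j]:
                    return False
    return True

def is_best_response_bruteforce(y, rho_k, u_k, J_k):
    """Compare <rho_k + y, u_k> with the maximum over all of F_k (exponential, toy sizes only)."""
    n = len(y)
    allowed = [i for i in range(n) if i not in J_k]

    def value(indices):
        return sum(rho_k[i] + y[i] for i in indices)

    best = max(value(c) for c in combinations(allowed, sum(u_k)))
    return value([i for i in range(n) if u_k[i] == 1]) == best
```

On small instances the two checks agree, which is the content of the necessity and sufficiency parts of the proof above.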

4 A first algorithm

In this section, we explain how the decomposition method provided by Theorem 3.1 leads to a polynomial time algorithm for solving Problem 5. We will use some elements of discrete convexity developed by Danilov, Koshevoy danilov2004discrete and Murota murota2003discrete , that we recall first. We next explain how to solve Problem 5.

An integer set $B\subset\mathbb{Z}^{n}$ is $M$ -convex (murota2003discrete, Ch. 4, p.101) if $\forall x,y\in B,\forall i\in[n]$ such that $x_{i}>y_{i},\exists j\in[n]$ such that $x_{j}<y_{j}$ , $x-e_{i}+e_{j}\in B$ and $y+e_{i}-e_{j}\in B$ , where $e_{i}$ is the $i$ -th vector of the canonical basis in $\mathbb{R}^{n}$ .

Lemma 9.

The feasible domain of the high-level program

[TABLE]

is an $M$ -convex set of $\mathbb{Z}^{n}$ .

Proof.

We can check easily that $\forall k$ , the set $F_{k}$ is $M$ -convex. Taking two different vectors $u_{k}$ and $v_{k}$ in $F_{k}$ , there exist $i,j$ such that $u_{k}(i)=1,v_{k}(i)=0$ and $u_{k}(j)=0,v_{k}(j)=1$ . These indices $i,j$ do not belong to $\mathcal{J}_{k}$ . The vectors $u_{k}-e_{i}+e_{j}$ and $v_{k}+e_{i}-e_{j}$ have coordinates in $\{0,1\}$ with a sum equal to $R_{k}$ and all coordinates in $\mathcal{J}_{k}$ equal to 0.

It is known that a Minkowski sum of $M$ -convex sets is $M$ -convex (murota2003discrete, Th. 4.23, p.115), and so the set $\sum_{k}{F_{k}}$ is $M$ -convex.

Finally, denote by $B$ the feasible domain of the high-level program and consider two vectors $N$ and $N^{\prime}$ of $B$ . They belong to $\sum_{k}{F_{k}}$ , so for each $i$ with $N_{i}>N^{\prime}_{i}$ , we can find $j$ with $N_{j}<N^{\prime}_{j}$ such that $N-e_{i}+e_{j}$ and $N^{\prime}+e_{i}-e_{j}$ are in $\sum_{k}{F_{k}}$ . The $i$ -th coordinate of $N-e_{i}+e_{j}$ is $N_{i}-1<N_{i}\leq N_{i}^{C}$ and the $j$ -th coordinate of $N-e_{i}+e_{j}$ is $N_{j}+1\leq N^{\prime}_{j}\leq N_{j}^{C}$ . So $N-e_{i}+e_{j}\in B$ and similarly $N^{\prime}+e_{i}-e_{j}\in B$ , which proves the $M$ -convexity of $B$ .

∎
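On small instances, the exchange property used in Lemma 9 can be checked by enumeration. The following is an illustrative sketch only (exponential in $n$ and $K$, with hypothetical helper names `F_set`, `minkowski_sum` and `is_M_convex`); vectors are represented as tuples.

```python
from itertools import combinations, product

def F_set(n, R_k, J_k):
    """All 0/1 vectors of length n with R_k ones and zeros on J_k, as tuples."""
    allowed = [i for i in range(n) if i not in J_k]
    return {tuple(1 if i in c else 0 for i in range(n)) for c in combinations(allowed, R_k)}

def minkowski_sum(sets):
    """Minkowski sum of finite sets of integer tuples of equal length."""
    n = len(next(iter(sets[0])))
    total = {tuple([0] * n)}
    for S in sets:
        total = {tuple(a + b for a, b in zip(x, u)) for x in total for u in S}
    return total

def is_M_convex(B):
    """Brute-force test of the exchange axiom defining M-convex sets."""
    B = set(B)
    for x, y in product(B, repeat=2):
        n = len(x)
        for i in range(n):
            if x[i] <= y[i]:
                continue
            found = False
            for j in range(n):
                if x[j] < y[j]:
                    xs = list(x); xs[i] -= 1; xs[j] += 1
                    ys = list(y); ys[i] += 1; ys[j] -= 1
                    if tuple(xs) in B and tuple(ys) in B:
                        found = True
                        break
            if not found:
                return False
    return True
```

For instance, with five customers, $R=(1,2,1,2,1)$, no forbidden slots and hypothetical capacities $N^{C}=(3,3,3)$, one can build $\sum_{k}F_{k}$ with `minkowski_sum`, filter by the capacities, and pass the result to `is_M_convex`.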

A function $g:\mathbb{Z}^{n}\to\mathbb{R}\cup\{+\infty\}$ is $M$ -convex (murota2003discrete, Ch. 6.1, p.133) if $\forall x,y\in\mathbb{Z}^{n}$ such that $g(x)$ and $g(y)$ are finite real values, $\forall i\in\left[n\right]$ such that $x_{i}>y_{i}$ , $\exists j\in\left[n\right]$ such that $x_{j}<y_{j}$ and the following condition holds true:

[TABLE]

A function $g$ is $M$ -concave if $-g$ is $M$ -convex. It follows from this definition that if $B$ is an $M$ -convex set, then $\chi_{B}$ is an $M$ -convex function (we recall that $\chi_{B}:\mathbb{Z}^{n}\to\mathbb{R}\cup\{+\infty\}$ is defined by $\chi_{B}(x)=0$ if $x\in B$ and $\chi_{B}(x)=+\infty$ otherwise). An important property of $M$ -convex functions is that local optimality guarantees global optimality (murota2003discrete, Th. 6.26, p.148) in the following sense. Let $g$ be an $M$ -convex function and $x\in\mathbb{Z}^{n}$ such that $g(x)$ is finite. Then $g(x)=\min_{y\in\mathbb{Z}^{n}}g(y)$ if and only if $\forall i,j\in\left[n\right],\;g(x)\leq g(x-e_{i}+e_{j})$ .

According to Theorem 3.1, we have to solve $\max_{N\in\mathbb{Z}^{n}}f(N)-\chi_{B}(N)$ , where $f:N\mapsto\sum_{i}f_{i}(N_{i})$ is a separable concave function, and $B$ is the $M$ -convex set introduced in Lemma 9. The function $f-\chi_{B}$ is $M$ -concave (murota2003discrete, Th. 6.13.(4), p.143). Then, we have the following result as a direct consequence of (murota2003discrete, Th. 6.26, p.148):

Theorem 4.1.

Let $N^{*}\in B$ . Then, $N^{*}$ is a maximum point of $f$ over $B$ if and only if $\forall i,j\in\left[n\right]$ such that $N^{*}-e_{i}+e_{j}\in B,f(N^{*}-e_{i}+e_{j})\leq f(N^{*})$ .

Moreover, Murota (murota2003discrete , ch.10, p.281) gives an algorithm which runs in pseudo-polynomial time to minimize $M$ -convex functions (see Algorithm 1).

By adding a priority rule in Step 2 of Algorithm 1 in the case where $\operatorname*{\arg\,\min}_{k,l\in\left[n\right]}f(x-e_{k}+e_{l})$ is not reduced to a single point, a global minimizer of $f$ is obtained by Algorithm 1 in pseudo-polynomial time.
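The local optimality criterion of Theorem 4.1 suggests a simple exchange search. The sketch below is written in the spirit of such a steepest-ascent scheme and is not a transcription of Algorithm 1; the names `exchange_ascent` and `in_B`, the membership oracle, and the toy data are assumptions for illustration.

```python
def exchange_ascent(N0, f, in_B):
    """Local search over exchanges N -> N - e_i + e_j, restricted to B.

    N0   : starting point, assumed to belong to B
    f    : objective to maximize (separable concave in the setting of Theorem 4.1)
    in_B : membership oracle for the feasible set B
    Stops when no exchange strictly improves f; by Theorem 4.1 such a point
    is a global maximizer when B is M-convex and f is separable concave.
    """
    N = tuple(N0)
    n = len(N)
    while True:
        best_val, best_N = f(N), None
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                cand = list(N)
                cand[i] -= 1
                cand[j] += 1
                cand = tuple(cand)
                if in_B(cand):
                    val = f(cand)
                    if val > best_val:
                        best_val, best_N = val, cand
        if best_N is None:
            return N
        N = best_N

# Toy instance (illustrative): integer points with sum 7 in the box [0, 3]^3.
B = {(a, b, 7 - a - b) for a in range(4) for b in range(4) if 0 <= 7 - a - b <= 3}
f = lambda N: -sum(x * x for x in N)  # a separable concave "balance" measure
N_star = exchange_ascent((3, 3, 1), f, B.__contains__)
```

Each iteration evaluates $O(n^{2})$ neighbours and strictly increases $f$, so the search terminates on a finite set $B$; the number of iterations is what the complexity statements below control.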

Proposition 1 (murota2003discrete , Prop.10.2).

Assume that $\operatorname{dom}f$ is bounded. Let $F$ be the number of arithmetic operations needed to evaluate $f$ and $K_{1}=\max(||x-y||_{1}\mid x,y\in\operatorname{dom}f)$ . Then, if a vector in $\operatorname{dom}f$ is given, Algorithm 1 finds a global minimizer of $f$ in $O(Fn^{2}K_{1})$ time.

However, the minimization of an $M$ -convex function can be achieved in polynomial time.

Proposition 2 (murota2003discrete , Prop.10.4).

Assume that $\operatorname{dom}f$ is bounded. Let $F$ be the number of arithmetic operations needed to evaluate $f$ and $K_{\infty}=\max(||x-y||_{\infty}\mid x,y\in\operatorname{dom}f)$ . Then, if a vector in $\operatorname{dom}f$ is given, a global minimizer of $f$ can be found in $O(Fn^{3}\log^{2}(K_{\infty}/n))$ time.

The different algorithms developed by Murota (murota2003discrete, Section 10.1) provide a minimizer of an $M$ -convex function in polynomial time, if an initial point is given and if the domain of the function is bounded. Whereas it is trivial to find a vector of $\mathbb{Z}^{n}$ such that $\forall i,N_{i}\leq N_{i}^{C}$ or a vector $N$ belonging to $\sum_{k}F_{k}$ , it is not obvious to find one satisfying both conditions. In fact, such a point can be obtained by solving the minimization problem:

[TABLE]

The condition $N\in B$ is equivalent to $N\in\operatorname*{\arg\,\min}_{N\in\sum_{k}F_{k}}\sum_{i}\max(N_{i}-N_{i}^{C},0)$ if $B$ is non-empty. The function $N\mapsto\sum_{i}\max(N_{i}-N_{i}^{C},0)$ is separable convex. Then, the function $N\mapsto\sum_{i}\max(N_{i}-N_{i}^{C},0)+\chi_{\sum_{k}F_{k}}$ is $M$ -convex according to (murota2003discrete, Th. 6.13.(4), p.143). Because $\sum_{k}F_{k}$ is bounded and a point in $\sum_{k}F_{k}$ can be obtained in $O(Kn)$ operations by summing vectors taken in each set $F_{k}$ , it is possible to find a point $N^{0}\in\operatorname*{\arg\,\min}_{N\in\sum_{k}F_{k}}\sum_{i}\max(N_{i}-N_{i}^{C},0)=B$ in polynomial time, by Proposition 2.

We can finally write the following result about the complexity of the decomposition method given by Theorem 3.1.

Theorem 4.2.

Let $R=\sum_{k}R_{k}$ , for every $k\in\left[K\right]$ , $n_{k}=n-\#\mathcal{J}_{k}$ and $\overline{R}=\sum_{k}R_{k}(n_{k}-R_{k})$ . An optimal solution of Problem 5 can be obtained in $O((Kn)^{3.5}Ln^{3}\log^{2}(K/n)+(n+\overline{R})^{3.5}L)$ arithmetic operations, where $L$ is the input size of the bilevel problem.

Proof.

The first step of Theorem 3.1 is a maximization of an $M$ -concave function over a bounded domain $B$ . Finding a point in $B$ can be done by solving the $M$ -convex minimization problem:

[TABLE]

The domain of the function $N\mapsto\sum_{i}\max(N_{i}-N_{i}^{C},0)+\chi_{\sum_{k}F_{k}}$ is $\sum_{k}F_{k}$ . We define $K^{1}_{\infty}$ by:

[TABLE]

For every $N\in\sum_{k}F_{k}$ , the entries of $N$ are sums of $K$ binary values. Then, $K^{1}_{\infty}\leq K$ . We have to estimate the number of operations $F^{1}$ needed to evaluate the function $N\mapsto\sum_{i}\max(N_{i}-N_{i}^{C},0)+\chi_{\sum_{k}F_{k}}$ . The function $N\mapsto\sum_{i}\max(N_{i}-N_{i}^{C},0)$ can be evaluated in $O(n)$ operations. As shown in the proof of Lemma 6, $\sum_{k}F_{k}=(\sum_{k}\Delta_{k})\cap\mathbb{Z}^{n}$ . Hence, for any vector $N$ , the condition $N\in\sum_{k}F_{k}$ is equivalent to $N\in(\sum_{k}\Delta_{k})\cap\mathbb{Z}^{n}$ . A vector $N$ belongs to $\sum_{k}\Delta_{k}$ if and only if there exists for every $k\in\left[K\right]$ a vector $u_{k}\in\Delta_{k}$ such that $\sum_{k}u_{k}=N$ . Hence, to know whether $N$ belongs to $\sum_{k}\Delta_{k}$ or not is a linear feasibility problem in dimension $Kn$ . It can be solved in $O((Kn)^{3.5}L)$ arithmetic operations by an interior point method (renegar1988polynomial ). Here $L$ is the input size of the linear program. Consequently, $F^{1}=O((Kn)^{3.5}L)$ , and a point in $B$ can be obtained in $O((Kn)^{3.5}Ln^{3}\log^{2}(K/n))$ by Proposition 2.

After obtaining a point in $B$ , the first step of Theorem 3.1 consists in solving the $M$ -concave maximization problem:

[TABLE]

The domain of the function $N\mapsto\sum_{i}f_{i}(N_{i})-\chi_{B}(N)$ is bounded and equal to $B$ . We define $K^{2}_{\infty}$ by:

[TABLE]

For every $N\in\sum_{k}F_{k}$ , the entries of $N$ are sums of $K$ binary values. Then, for every $N\in B$ and every $i\in\left[n\right]$ , we have $N_{i}\leq\min(K,N_{i}^{C})$ . Then, $K^{2}_{\infty}\leq\min(K,\overline{N}^{C})$ , with $\overline{N}^{C}=\max_{i\in\left[n\right]}N_{i}^{C}$ . The number of operations $F^{2}$ needed to evaluate the function $N\mapsto\sum_{i}f_{i}(N_{i})-\chi_{B}(N)$ is $O((Kn)^{3.5}L)$ as previously. Hence, a point $N^{*}\in\operatorname*{\arg\,\max}_{N\in B}\sum_{i}f_{i}(N_{i})$ can be obtained in $O((Kn)^{3.5}Ln^{3}\log^{2}(\min(K,\overline{N}^{C})/n))$ by Proposition 2.

According to the proof of Lemma 6, the second step of Theorem 3.1 is a linear program in dimension $Kn$ . In fact, we have:

[TABLE]

and the extreme points of the polyhedron $\Delta_{N}$ defined by:

[TABLE]

are integer. Hence, the second step of Theorem 3.1 can be solved in $O((Kn)^{3.5}L)$ arithmetic operations.

The third step of Theorem 3.1 is a linear program in $n$ variables. For some $u_{k}^{*}\in F_{k}$ , the constraints of this program are:

[TABLE]

For every $k\in\left[K\right]$ , the number of entries of $u_{k}^{*}$ equal to $1$ is $R_{k}$ , and the number of entries of $u_{k}^{*}$ equal to $0$ and which do not belong to $\mathcal{J}_{k}$ is $n_{k}-R_{k}$ . Hence, the number of inequality constraints of this linear program is $\sum_{k}R_{k}(n_{k}-R_{k})=\overline{R}$ . Hence, a solution of this linear program can be found in $O((n+\overline{R})^{3.5}L)$ by interior-point methods. ∎

5 A faster algorithm for solving the bilevel problem

5.1 A polynomial time algorithm for the bilevel problem

Algorithm 1 can be applied to solve problem (6) of Theorem 3.1, that is maximizing the $M$ -concave function $f-\chi_{B}$ , or equivalently minimizing the $M$ -convex function $-f+\chi_{B}$ .

Step 1 consists in finding an initial vector $N\in B$ . As explained in Section 4, this can be done by solving an $M$ -convex minimization problem. Another approach consists in replacing the function $f-\chi_{B}$ by $g:N\mapsto f(N)-\chi_{\sum_{k}F_{k}}(N)-M\sum_{i}\max(N_{i}-N_{i}^{C},0)$ , where $M>0$ is an integer. If $N\in B$ , then $g(N)=f(N)$ . If $N\in\sum_{k}F_{k}$ and $N\notin B$ , then $M\sum_{i}\max(N_{i}-N_{i}^{C},0)\geq M$ , so that, if $M$ is sufficiently large, the maximum of the function $g$ is attained for $N\in B$ . Moreover $N\mapsto M\sum_{i}\max(N_{i}-N_{i}^{C},0)$ is separable convex, so $g$ is $M$ -concave according to (murota2003discrete, Th. 6.13.(4), p.143). Then, both problems $\max_{N\in B}f(N)$ and $\max_{N\in\mathbb{Z}^{n}}g(N)$ are equivalent, and we can apply Algorithm 1 to solve the problem $\max_{N\in\mathbb{Z}^{n}}g(N)$ . An initial point is obtained by taking any point in $\sum_{k}F_{k}$ .

We first need to determine the number $F$ of operations needed to evaluate $g$ . Because the different functions $f_{i}$ are known, we have to determine the number of operations needed to decide whether a vector $N$ belongs to $\sum_{k}F_{k}$ or not. More precisely, the different evaluations of $f-\chi_{B}$ are done in Step 2. Hence, the question is the following: given a vector $N\in\sum_{k}{F_{k}}$ , how many operations are needed to check whether $N-e_{i}+e_{j}$ (for $i,j\in\left[n\right]$ ) belongs to $\sum_{k}{F_{k}}$ ? We next show that this problem can be studied as a shortest path problem in a graph. Consider $N\in\sum_{k}F_{k}$ and let us define $u_{k}^{*}\in F_{k}$ for $k\in\left[K\right]$ such that $\psi(N)=-\sum_{k}\langle\rho_{k},u_{k}^{*}\rangle$ , that is, an optimal decomposition of $N$ in the sense of Theorem 3.1. For each $k\in\left[K\right]$ and $\alpha,\beta\in\left[n\right]$ , we denote by $w^{k}_{\alpha\beta}$ the following quantity: $w^{k}_{\alpha\beta}=\rho_{k}(\alpha)-\rho_{k}(\beta)$ if $u_{k}^{*}(\alpha)=1$ , $u_{k}^{*}(\beta)=0$ and $\beta\notin\mathcal{J}_{k}$ , and $w^{k}_{\alpha\beta}=+\infty$ otherwise. Then, we define for each $\alpha,\beta\in\left[n\right]$ , $w_{\alpha\beta}=\min_{k\in\left[K\right]}w^{k}_{\alpha\beta}$ . We consider the oriented weighted graph $G=(V,E)$ with set of vertices $V=\left[n\right]$ , in which there is an oriented edge from $\alpha$ to $\beta$ of weight $w_{\alpha\beta}$ for each pair of vertices $\alpha,\beta\in V$ .
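The construction of the weights $w_{\alpha\beta}$ can be sketched as follows. The function name `exchange_graph`, the 0-based indices and the extra matrix `arg` (recording one customer attaining each minimum) are illustrative assumptions; the exclusion of targets $\beta\in\mathcal{J}_{k}$ is made explicit so that every exchange stays in $F_{k}$.

```python
import math

def exchange_graph(rho, u_star, J):
    """Build w[a][b] = min_k w^k_{ab} and arg[a][b] = a customer k attaining the minimum.

    w^k_{ab} = rho_k(a) - rho_k(b) if u_k*(a) = 1, u_k*(b) = 0 and b is not forbidden
    for customer k; it is +infinity otherwise.
    """
    K, n = len(u_star), len(u_star[0])
    w = [[math.inf] * n for _ in range(n)]
    arg = [[None] * n for _ in range(n)]
    for k in range(K):
        for a in range(n):
            if u_star[k][a] != 1:
                continue
            for b in range(n):
                if u_star[k][b] == 0 and b not in J[k]:
                    cost = rho[k][a] - rho[k][b]
                    if cost < w[a][b]:
                        w[a][b] = cost
                        arg[a][b] = k
    return w, arg
```

The triple loop costs $O(Kn^{2})$ operations, and only the rows and columns touched by a modified $u_{k}^{*}$ need to be recomputed after an exchange.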

Theorem 5.1.

Let $i,j\in\left[n\right]$ . Suppose that there exists a path in $G$ with finite valuation between the vertices $i,j\in V$ . Then $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ . Moreover, there are no negative cycles and there is a shortest path between $i$ and $j$ . Let $\alpha_{0}\rightarrow\alpha_{1}\dots\alpha_{p-1}\rightarrow\alpha_{p}$ , with $\alpha_{0}=i$ and $\alpha_{p}=j$ , be a shortest path between $i$ and $j$ . Let also $(k_{u})_{0\leq u\leq p-1}$ be any sequence such that $w^{k_{u}}_{\alpha_{u}\alpha_{u+1}}=w_{\alpha_{u}\alpha_{u+1}}$ for all $0\leq u\leq p-1$ . Let us finally define the vectors $v_{k}^{*}$ , $k\in\left[K\right]$ such that $v_{k_{u}}^{*}=u^{*}_{k_{u}}-e_{\alpha_{u}}+e_{\alpha_{u+1}}$ for each $0\leq u\leq p-1$ and $v_{k}^{*}=u_{k}^{*}$ for each $k\notin\{k_{0},\dots,k_{p-1}\}$ . Then, $\psi(N-e_{i}+e_{j})=-\sum_{k}\langle\rho_{k},v_{k}^{*}\rangle$ .

Proof.

By Lemma 6 and 7, we know that $N-e_{i}+e_{j}\in\sum_{k}{F_{k}}$ if and only if there exist $K$ vectors $v_{k}^{*}$ such that $N-e_{i}+e_{j}=\sum_{k}{v_{k}^{*}}$ with each $v_{k}^{*}\in F_{k}$ and $\psi(N-e_{i}+e_{j})=-\sum_{k}{\langle\rho_{k},v_{k}^{*}\rangle}$ . We consider $\psi(N)=-\sum_{k}{\langle\rho_{k},u_{k}^{*}\rangle}$ with each $u_{k}^{*}\in F_{k}$ . Hence, $\psi(N-e_{i}+e_{j})-\psi(N)$ is equal to:

[TABLE]

We have $\sum_{k}(u_{k}^{*}-v_{k})=e_{i}-e_{j}$ . When $v_{k}$ describes $F_{k}$ , the possible $u_{k}^{*}-v_{k}$ are the vectors $x_{k}$ with the following properties:

[TABLE]

Hence, $\psi(N-e_{i}+e_{j})-\psi(N)=\sum_{k}\langle\rho_{k},x_{k}^{*}\rangle$ , where $x_{k}^{*}$ is such that $\#\{\alpha\mid x_{k}^{*}(\alpha)=1\}=\#\{\alpha\mid x_{k}^{*}(\alpha)=-1\}$ . Consequently, $\psi(N-e_{i}+e_{j})-\psi(N)$ can be written as a sum of $w^{k}_{\alpha\beta}$ for certain $\alpha,\beta$ . Because of the condition $\sum_{k}(u_{k}^{*}-v_{k})=e_{i}-e_{j}$ , we have $\psi(N-e_{i}+e_{j})-\psi(N)=w_{\alpha_{0}\alpha_{1}}^{k_{0}}+w_{\alpha_{1}\alpha_{2}}^{k_{1}}+\dots+w_{\alpha_{p-1}\alpha_{p}}^{k_{p-1}}$ , with the notations introduced in Theorem 5.1.

Consider now the graph $G$ used in Theorem 5.1. If there exists a path between $i$ and $j$ , then its value can be written $w_{\beta_{0}\beta_{1}}^{l_{0}}+w_{\beta_{1}\beta_{2}}^{l_{1}}+\dots+w_{\beta_{q-1}\beta_{q}}^{l_{q-1}}$ (with the convention $\beta_{0}=i$ and $\beta_{q}=j$ ). By defining $v_{k}=u_{k}^{*}$ if $k\notin\{l_{0},\dots,l_{q-1}\}$ and $v_{l_{u}}=u_{l_{u}}^{*}-e_{\beta_{u}}+e_{\beta_{u+1}}$ for $0\leq u\leq q-1$ , the value of the path is equal to $\sum_{k}\langle\rho_{k},u_{k}^{*}-v_{k}\rangle$ . Because $w_{\beta_{u}\beta_{u+1}}^{l_{u}}<+\infty$ , we have $u^{*}_{l_{u}}(\beta_{u})=1$ and $u^{*}_{l_{u}}(\beta_{u+1})=0$ . Then, each $v_{k}\in F_{k}$ . Consequently, the value $\min_{v_{k}\in F_{k}\text{ and }\sum_{k}{v_{k}=N-e_{i}+e_{j}}}\sum_{k}{\langle\rho_{k},u_{k}^{*}-v_{k}\rangle}$ is finite and $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ . Moreover, the value $\psi(N-e_{i}+e_{j})-\psi(N)$ corresponds to the minimal value of a path between $i$ and $j$ in $G$ , that is, to the length of the shortest path. Hence, if the value of the shortest path is $\sum_{u=0}^{p-1}w^{k_{u}}_{\alpha_{u}\alpha_{u+1}}$ , we have $\psi(N-e_{i}+e_{j})-\psi(N)=\sum_{k}\langle\rho_{k},u_{k}^{*}-v_{k}^{*}\rangle$ , with $v_{k}^{*}$ defined as in the statement of Theorem 5.1. Moreover, we can prove that there exists no cycle with negative weight in this graph. Suppose that such a cycle exists. Its weight can be written $w_{\gamma_{0}\gamma_{1}}^{l_{0}}+w_{\gamma_{1}\gamma_{2}}^{l_{1}}+\dots+w_{\gamma_{r}\gamma_{0}}^{l_{r}}<0$ . For all $m\in\{0,\dots,r\}$ , with the convention $\gamma_{r+1}=\gamma_{0}$ , we have $u^{*}_{l_{m}}(\gamma_{m})=1$ and $u^{*}_{l_{m}}(\gamma_{m+1})=0$ . We consider for $k\in\left[K\right]$ the vectors $v_{k}$ defined by $v_{l_{m}}=u^{*}_{l_{m}}-e_{\gamma_{m}}+e_{\gamma_{m+1}}$ , and $v_{k}=u^{*}_{k}$ for $k\notin\{l_{0},\dots,l_{r}\}$ . We have $\sum_{k}(u^{*}_{k}-v_{k})=0$ and so $\sum_{k}{\langle\rho_{k},u_{k}^{*}\rangle}=\sum_{k}{\langle\rho_{k},v_{k}\rangle}+w_{\gamma_{0}\gamma_{1}}^{l_{0}}+w_{\gamma_{1}\gamma_{2}}^{l_{1}}+\dots+w_{\gamma_{r}\gamma_{0}}^{l_{r}}<\sum_{k}{\langle\rho_{k},v_{k}\rangle}$ , which contradicts the optimality of the vectors $u^{*}_{k}$ in the definition of $\psi(N)$ . ∎

Example 2.

We consider the cell (a) of Figure 3. We build the graph associated to $N=(3,3,1)$ (see Figure 4).

Consider $N^{\prime}=N-e_{1}+e_{2}=(2,4,1)$ . The shortest path in $G$ is $1\rightarrow 2$ with $w_{12}=0=w_{12}^{1}$ . Then, according to Theorem 5.1, the optimal decomposition of $(2,4,1)$ is $v_{1}^{*}=(0,1,0)$ , $v_{2}^{*}=(1,0,1)$ , $v_{3}^{*}=(0,1,0)$ , $v_{4}^{*}=(1,1,0)$ and $v_{5}^{*}=(0,1,0)$ .
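The update of Theorem 5.1 can be sketched with a Bellman-Ford pass followed by one exchange per edge of the shortest path. The names `shortest_path` and `move_one_unit` are hypothetical, indices are 0-based, and `w`, `arg` are assumed to hold the weights $w_{\alpha\beta}$ and one minimizing customer per edge. If the same customer labels several edges of the path, the exchanges are applied one after the other, which is an assumption of this sketch.

```python
import math

def shortest_path(w, src, dst):
    """Bellman-Ford on the dense weight matrix w (math.inf = no edge).

    Assumes there is no negative cycle. Returns (length, [src, ..., dst]) or None.
    """
    n = len(w)
    dist = [math.inf] * n
    pred = [None] * n
    dist[src] = 0
    for _ in range(n - 1):
        changed = False
        for a in range(n):
            if dist[a] == math.inf:
                continue
            for b in range(n):
                if a != b and dist[a] + w[a][b] < dist[b]:
                    dist[b] = dist[a] + w[a][b]
                    pred[b] = a
                    changed = True
        if not changed:
            break
    if dist[dst] == math.inf:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(pred[path[-1]])
    return dist[dst], path[::-1]

def move_one_unit(u_star, w, arg, i, j):
    """Decomposition of N - e_i + e_j obtained along a shortest path from i to j.

    Returns (psi(N - e_i + e_j) - psi(N), new decomposition), or None if j is unreachable.
    """
    result = shortest_path(w, i, j)
    if result is None:
        return None
    cost, path = result
    v = [list(u) for u in u_star]
    for a, b in zip(path, path[1:]):
        k = arg[a][b]   # a customer realizing w[a][b]
        v[k][a] -= 1    # customer k leaves slot a ...
        v[k][b] += 1    # ... and moves to slot b
    return cost, v
```

With weights built from hypothetical preferences compatible with Example 2 (where $w_{12}=w^{1}_{12}=0$), calling `move_one_unit` with `i=0`, `j=1` only changes the first customer, in line with the decomposition of $(2,4,1)$ given above.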

Thanks to Theorem 5.1, if we know that a vector $N$ belongs to $\sum_{k}F_{k}$ , it is possible to check whether a vector $N-e_{i}+e_{j}$ belongs to $\sum_{k}F_{k}$ by checking if there exists a path between $i$ and $j$ in the graph $G=(V,E)$ . Generally, $G$ has $n$ vertices and $n^{2}$ edges. From each vertex $i\in V$ , it is possible to find all the vertices $j$ such that there exists a path between $i$ and $j$ by using a depth-first or breadth-first search algorithm in $O(n^{2})$ operations. Consequently, the number of operations needed to evaluate $g$ is $O(n^{3})$ .
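The reachability test described above can be sketched with a breadth-first search on the edges of finite weight. The function name `reachable_from` and the dense-matrix representation (with `math.inf` for missing edges) are assumptions for illustration.

```python
import math
from collections import deque

def reachable_from(w, i):
    """Set of vertices j reachable from i through edges of finite weight in w.

    For every j != i in the returned set, the text's criterion gives
    N - e_i + e_j in sum_k F_k; the vertex i itself is always included.
    """
    n = len(w)
    seen = {i}
    queue = deque([i])
    while queue:
        a = queue.popleft()
        for b in range(n):
            if b not in seen and w[a][b] < math.inf:
                seen.add(b)
                queue.append(b)
    return seen
```

One search costs $O(n^{2})$ operations on a dense matrix, so running it from each of the $n$ vertices matches the $O(n^{3})$ bound stated above.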

According to Theorem 5.1, by checking if $N-e_{i}+e_{j}\in B$ , we obtain the optimal decomposition of $N-e_{i}+e_{j}=\sum_{k}v_{k}^{*}$ such that $\psi(N-e_{i}+e_{j})=-\sum_{k}\langle\rho_{k},v_{k}^{*}\rangle$ by solving a shortest path problem between two vertices. This can be done in $O(n^{3})$ operations thanks to the Bellman-Ford algorithm (bellman1958routing , ford1956network ), because the graph $G$ has $n$ vertices and at most $n^{2}$ edges. Hence, according to Theorem 3.1, to solve the bilevel problem 5, it remains to solve the linear feasibility problem of Lemma 8. Moreover, this problem can also be viewed as a shortest path problem in $G$ , according to the following result.

Theorem 5.2.

Consider $K$ vectors $u_{k}^{*}\in F_{k}$ for each $k\in\left[K\right]$ such that, if we define $N=\sum_{k}u_{k}^{*}$ , we have $\psi(N)=-\sum_{k}\langle\rho_{k},u_{k}^{*}\rangle$ . Consider the graph $G$ associated to $N$ . Consider an index $s\in\left[n\right]$ . Let $M>0$ be any real scalar such that $M\geq n\max_{i,j\in\left[n\right]}w_{ij}$ and let us modify $G$ such that for all $t\in\left[n\right]$ with $t\neq s$ and $w_{st}=+\infty$ , we have $w_{st}=M$ . Let us define a vector $y^{*}\in\mathbb{R}^{n}$ by $y^{*}_{s}=0$ and for each $t\in\left[n\right]$ with $t\neq s$ , $y^{*}_{t}$ is the length of the shortest path between $s$ and $t$ in $G$ . Then, for $M$ sufficiently large and for each $k\in\left[K\right]$ , $u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y^{*},u_{k}\rangle$ .

Proof.

According to Lemma 8, a vector $y\in\mathbb{R}^{n}$ is such that for every $k\in\left[K\right]$ ,

[TABLE]

if and only if the following inequalities are satisfied:

[TABLE]

Consider such a vector $y$ . Consider also the graph $G$ associated to $N$ . The previous inequalities can be rewritten $\forall k\in\left[K\right],\forall i,j\in\left[n\right],\;y_{j}-y_{i}\leq w^{k}_{ij}$ , or equivalently: $\forall i,j\in\left[n\right],\;y_{j}-y_{i}\leq w_{ij}$ . For each $\delta\in\mathbb{R}$ , $y+\delta e$ is also a solution. Consequently, it is possible to fix a coordinate to $0$ . Take a coordinate $s$ such that $y_{s}=0$ . Consider $M>0$ such that $M\geq n\max_{i,j}w_{ij}$ and modify the graph $G$ as in the statement of the theorem. Consider an elementary cycle (that is, a cycle containing no smaller cycle) of the modified graph. The cycle has no more than $n-1$ edges. Suppose that exactly $q$ edges have a modified weight, with $0\leq q\leq n-1$ . If $q=0$ , then no edge has a modified weight, and this cycle is a cycle of $G$ . So, its weight is nonnegative. If $q\geq 1$ , then the total weight of the cycle is bigger than $qM+(n-1-q)\min_{i,j}w_{ij}\geq n(\max_{i,j}w_{ij}-\min_{i,j}w_{ij})\geq 0$ . Consequently, the modified graph has no negative cycles.

For each $t\in\left[n\right]$ , with $t\neq s$ , there exists a path between $s$ and $t$ . Let us define $y^{*}$ such that $y^{*}_{s}=0$ and for each $t\in\left[n\right]$ with $t\neq s$ , $y^{*}_{t}$ corresponds to the length of the shortest path between $s$ and $t$ . Consider $i,j\in\left[n\right]$ . Then $y^{*}_{i}+w_{ij}$ is the length of a path between $s$ and $j$ defined as the concatenation of the shortest path between $s$ and $i$ and the edge $i\rightarrow j$ . So $y^{*}_{i}+w_{ij}\geq y^{*}_{j}$ . Hence, according to Lemma 8, we have for each $k\in\left[K\right]$ , $u_{k}^{*}\in\arg\max_{u_{k}\in F_{k}}\langle\rho_{k}+y^{*},u_{k}\rangle$ . ∎
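The construction of $y^{*}$ in Theorem 5.2 can be illustrated by a small Ford-Bellman sketch in Python. It is a minimal sketch, not the authors' implementation: the function name `shortest_path_potentials`, the dense-matrix representation, and the choice $M=\max(1, n\max_{i,j}w_{ij})$ (maximum taken over finite weights) are assumptions made for the illustration. The weights in the demo are illustrative; they encode the inequalities stated in Example 3 in the form $y_{j}-y_{i}\leq w_{ij}$.

```python
import math


def shortest_path_potentials(w, s):
    """Return y with y[s] = 0 and y[t] = shortest path length from s to t.

    w[i][j] is the weight of edge i -> j, math.inf if the edge is absent.
    Missing edges leaving s receive the weight M, as in Theorem 5.2.
    Assumes the graph has no negative cycle.
    """
    n = len(w)
    finite = [w[i][j] for i in range(n) for j in range(n)
              if i != j and w[i][j] < math.inf]
    # Assumed choice of M: positive and at least n * (largest finite weight).
    M = max(1.0, n * max(finite, default=0.0))
    w = [row[:] for row in w]  # work on a copy
    for t in range(n):
        if t != s and w[s][t] == math.inf:
            w[s][t] = M
    y = [math.inf] * n
    y[s] = 0.0
    for _ in range(n - 1):  # Ford-Bellman relaxation passes
        for i in range(n):
            if y[i] == math.inf:
                continue
            for j in range(n):
                if i != j and y[i] + w[i][j] < y[j]:
                    y[j] = y[i] + w[i][j]
    return y


INF = math.inf
# y1 - y2 <= 3/2, y3 - y1 <= 0, y3 - y2 <= 1, y2 - y3 <= -1/2 (0-based indices below)
w_demo = [[INF, INF, 0.0], [1.5, INF, 1.0], [INF, -0.5, INF]]
y_demo = shortest_path_potentials(w_demo, 0)
```

On this instance the sketch returns `[0.0, -0.5, 0.0]`, which satisfies every inequality $y_{j}-y_{i}\leq w_{ij}$ of the original graph. Since solutions are invariant under adding a constant to all coordinates, this vector is one admissible choice among many.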

These different results lead to Algorithm 2 to solve the bilevel problem 5. First, we have to find an initial point $N$ in $\sum_{k}F_{k}$ , with its optimal decomposition $\sum_{k}u_{k}^{*}$ . We can calculate for each $k\in\left[K\right]$ and for each $i,j\notin\mathcal{J}_{k}$ the value $w^{k}_{ij}$ , store them, and then define the graph $G$ associated to $N$ . Hence, with a graph search algorithm, we know for each $i,j\in\left[n\right]$ whether $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ or not, and can calculate $g(N-e_{k}+e_{l})$ for each $k,l\in\left[n\right]$ and find $i,j\in\operatorname*{\arg\,\max}_{k,l}g(N-e_{k}+e_{l})$ . By finding the shortest path between $i$ and $j$ in $G$ , we obtain the optimal decomposition $N-e_{i}+e_{j}=\sum_{k}v_{k}^{*}$ . As in Algorithm 1, if $g(N-e_{i}+e_{j})\leq g(N)$ , then $N^{*}=N$ is a maximizer of $g$ over $\sum_{k}F_{k}$ . Otherwise, we take $N:=N-e_{i}+e_{j}$ . For all the indices $k$ such that $u_{k}^{*}\neq v_{k}^{*}$ , we evaluate the new value of $w_{ij}^{k}$ , we define the graph $G$ associated to $N-e_{i}+e_{j}$ , and we restart the algorithm. Notice that the number of indices $k$ such that $u_{k}^{*}\neq v_{k}^{*}$ is bounded by the length of the shortest path in $G$ ; this means that this number is less than $n$ . After finding the optimal $N^{*}$ and its optimal decomposition $N^{*}=\sum_{k}u_{k}^{*}$ , we can redefine the graph associated to $N^{*}$ and return an optimal $y^{*}$ defined as in the statement of Theorem 5.2.

Algorithm 2 can be written as follows. We take as input a function GraphSearch, which associates to a graph $G$ (defined by the weight vector $w$ of its edges) a Boolean vector $b$ such that $b_{ij}=1$ if there is a path between $i$ and $j$ and $0$ otherwise. We also take a function ShortestPath, which associates to a graph $G$ (also defined by the weight vector $w$ ) and two vertices $i$ and $j$ , the value $v$ of the shortest path and a vector $path$ with the indices of this shortest path. Finally, we consider the function ShortestPath2, which associates to $w$ and a vertex $s$ a vector corresponding to the values of the shortest path between $s$ and all other vertices in $G$ . For convenience, we denote by $f^{*}$ the function $f^{*}:N\mapsto f(N)+M\sum_{i=1}^{n}\max(N_{i}-N^{C}_{i},0)$ .

Note that the pseudo-polynomial time bound for Murota’s greedy algorithm 1 given by Proposition 1 leads in this special case to a polynomial time bound, as explained in the following result.

Theorem 5.3.

Let us define $R=\sum_{k}R_{k}$ , for each $k\in\left[K\right]$ , $n_{k}=n-\#\mathcal{J}_{k}$ (that is, the number of possible non-zero entries of the vectors of $F_{k}$ ) and $\overline{R}=\sum_{k}R_{k}(n_{k}-R_{k})$ . Algorithm 2 returns a global optimizer with a time complexity of $O(R(n^{3}+\overline{R}))$ and a space complexity of $O(\overline{R})$ .

Proof.

The vector returned by the algorithm is a global optimizer according to Algorithm 1 and Theorem 5.1. The initialization consists in taking vectors in each $F_{k}$ and in adding them; it can be done in $O(K)$ operations. Then, to define the graph $G$ , we have to calculate $w_{ij}^{k}$ for each $i,j\notin\mathcal{J}_{k}$ and each $k\in\left[K\right]$ , and to store the values. Let us define for each $k\in\left[K\right]$ , $n_{k}=n-\#\mathcal{J}_{k}$ . For each $k\in\left[K\right]$ , we have $R_{k}\leq n_{k}$ , and there are precisely $R_{k}$ coordinates of $u_{k}^{*}$ equal to 1 for each $u_{k}^{*}\in F_{k}$ . Then, for each $k\in\left[K\right]$ , there are exactly $R_{k}(n_{k}-R_{k})$ finite values of $w_{ij}^{k}$ to store. Then, by defining $\overline{R}=\sum_{k}R_{k}(n_{k}-R_{k})$ , we need $O(\overline{R})$ operations to define $w_{ij}$ and $k_{ij}$ . The function $GraphSearch$ needs $O(n^{3})$ operations by a depth-first or breadth-first algorithm to know if there is a path between $i$ and $j$ . The function $ShortestPath$ also needs $O(n^{3})$ operations to calculate the shortest path between $i$ and $j$ with the Ford-Bellman algorithm. The length of the path is bounded by $n$ . Consequently, there are fewer than $n$ vectors $u_{k}^{*}$ which have to be updated, and then fewer than $2nn_{k}$ values $w_{\alpha\beta}^{k}$ to update. At most $\overline{R}$ operations are needed to calculate the new values of $w_{ij}$ and $k_{ij}$ . So, the number of operations in each step of the "while" loop is $O(n^{3}+n\overline{R})$ . The number of iterations of the loop is the same as in Algorithm 1, and is bounded by $K_{1}$ where $K_{1}=\max\{||x-y||_{1}\mid x,y\in\sum_{k}F_{k}\}$ . For each $x,y\in\sum_{k}F_{k}$ , we have:

[TABLE]

by defining $R=\sum_{k=1}^{K}R_{k}$ . Finally, to find the optimal $y^{*}$ , $n^{2}$ operations are needed to find $M$ , and $O(n^{3})$ operations are needed to evaluate the function $ShortestPath2$ by using again the Ford-Bellman algorithm. Indeed, Step 7 consists in calculating the shortest path between a vertex $s$ and the other ones in a graph with $n$ vertices and $n^{2}$ edges, which can be done in $O(n^{3})$ thanks to the Ford-Bellman algorithm. Hence, the global time complexity of Algorithm 2 is $O(R(n^{3}+\overline{R}))$ and the space complexity is $O(\overline{R})$ . ∎

Notice that for each $k\in\left[K\right]$ , $n_{k}\leq n$ and $1\leq R_{k}\leq n_{k}$ . Then $K\leq R\leq nK$ and $0\leq\overline{R}\leq Kn^{2}$ . Therefore, the time complexity of Algorithm 2 is $O(Kn^{3}(K+n))$ in the worst case, whereas the space complexity is $O(Kn^{2})$ .
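As a small worked illustration of the quantities $R$ and $\overline{R}$ appearing in Theorem 5.3, the following Python sketch computes them for a given instance. The helper name `complexity_parameters` is hypothetical, and the demo values $R_{k}=(1,2,1,2,1)$ with $n=3$ and $\mathcal{J}_{k}=\emptyset$ are read off from Examples 3 and 4.

```python
def complexity_parameters(R, forbidden_counts, n):
    """Return (R_total, R_bar) as defined in Theorem 5.3.

    R[k] is R_k, forbidden_counts[k] is #J_k, and n is the dimension.
    """
    n_k = [n - c for c in forbidden_counts]
    r_total = sum(R)
    r_bar = sum(r * (m - r) for r, m in zip(R, n_k))
    return r_total, r_bar


R_demo = [1, 2, 1, 2, 1]
r_total, r_bar = complexity_parameters(R_demo, [0] * 5, 3)
```

For this instance the sketch gives $R=7$ and $\overline{R}=10$, which indeed lie within the bounds $K\leq R\leq nK$ (here $5\leq 7\leq 15$) and $0\leq\overline{R}\leq Kn^{2}$ (here $0\leq 10\leq 45$).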

Example 3.

Consider again Example 1 together with the concave function $f$ defined by

[TABLE]

We suppose that $\forall k,\mathcal{J}_{k}=\emptyset$ . Hence, we can prove that $\sum_{k}{F_{k}}=\{N\in\mathbb{N}^{3}|\sum_{i=1}^{3}{N_{i}}=7\text{ and }\max(N_{i})\leq 5\}$ . First, we want to solve $\max_{N\in\sum_{k}{F_{k}}}{-(N_{1}^{2}+N_{2}^{2}+N_{3}^{2})}$ . We start from $N^{(0)}=(5,2,0)$ , a feasible point. Following Algorithm 1, we compute $N^{(1)}=(4,2,1)$ and $N^{(2)}=(3,2,2)$ , which is a maximizer. We take $N^{*}=(3,2,2)$ . Now, we solve $\max_{\scriptstyle u_{1}\in F_{1},\dots,u_{5}\in F_{5},\sum_{k=1}^{5}{u_{k}}=N^{*}}{\sum_{k}{\langle\rho_{k},u_{k}\rangle}}$ . We obtain $u_{1}^{*}=\left[1,0,0\right]$ , $u_{2}^{*}=\left[1,0,1\right]$ , $u_{3}^{*}=\left[0,1,0\right]$ , $u_{4}^{*}=\left[1,0,1\right]$ , $u_{5}^{*}=\left[0,1,0\right]$ . Applying Lemma 8, we obtain the linear inequalities $y_{1}^{*}-y_{2}^{*}\leq 3/2$ , $0\leq y_{1}^{*}-y_{3}^{*}$ and $-1\leq y_{2}^{*}-y_{3}^{*}\leq-1/2$ . In particular, $y^{*}=(3/4,0,3/4)$ is an optimal solution.
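The first step of this example can be reproduced with a short Python sketch of the exchange scheme of Algorithm 1. It is an illustration only: the feasible set is hard-coded from the description above (sum equal to 7, each coordinate at most 5), and the names `is_feasible`, `g` and `greedy_ascent` are hypothetical.

```python
from itertools import permutations


def is_feasible(N, total=7, cap=5):
    """Membership in the feasible set of Example 3, as described in the text."""
    return all(x >= 0 for x in N) and sum(N) == total and max(N) <= cap


def g(N):
    """Objective of the example: minus the sum of squares."""
    return -sum(x * x for x in N)


def greedy_ascent(N):
    """Repeatedly apply the best strictly improving move N - e_i + e_j."""
    N = tuple(N)
    trajectory = [N]
    while True:
        best, best_val = None, g(N)
        for i, j in permutations(range(len(N)), 2):
            cand = list(N)
            cand[i] -= 1
            cand[j] += 1
            cand = tuple(cand)
            if is_feasible(cand) and g(cand) > best_val:
                best, best_val = cand, g(cand)
        if best is None:  # no improving exchange: local, hence global, optimum
            return trajectory
        N = best
        trajectory.append(N)


trajectory = greedy_ascent((5, 2, 0))
```

Starting from $(5,2,0)$, the sketch visits $(4,2,1)$ and then stops at $(3,2,2)$, matching the iterates $N^{(1)}$ and $N^{(2)}$ given above.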

5.2 A particular case: theory of majorization

Algorithm 2 can be accelerated in the particular case $\forall k\in\left[K\right],\;\mathcal{J}_{k}=\emptyset$ , that is $F_{k}=\left\{u_{k}\in\{0;1\}^{n}|\sum_{i=1}^{n}u_{k}(i)=R_{k}\right\}$ .

As previously, an important step of the maximization of the function $g$ consists in being able to know whether a point belongs to $\sum_{k}F_{k}$ or not. In this particular case, we can use the majorization order olkin1979inequalities . For every $x\in\mathbb{R}^{n}$ , denote by $x_{[1]}\geq\cdots\geq x_{[n]}$ the coordinates of $x$ arranged in nonincreasing order. A vector $x\in\mathbb{R}^{n}$ is said to be majorized by another vector $y\in\mathbb{R}^{n}$ , denoted $x\prec y$ , if $\sum_{i=1}^{n}{x_{i}}=\sum_{i=1}^{n}{y_{i}}$ and $\forall 1\leq k\leq n-1$ , $\sum_{i=1}^{k}{x_{[i]}}\leq\sum_{i=1}^{k}{y_{[i]}}$ .

We have the following result.

Theorem 5.4 (Gale-Ryser, see (olkin1979inequalities, Th. 7.C.1)).

Let $a\in\mathbb{N}^{k}$ and $b\in\mathbb{N}^{n}$ be two integer vectors with nonnegative values. Let $a^{*}\in\mathbb{N}^{n}$ be defined by $a^{*}_{i}=\#\{j\mid a_{j}\geq i\}$ . Then, the following assertions are equivalent:

1. $b\prec a^{*}$

2. There exists a matrix $U\in\mathcal{M}_{k,n}(\mathbb{Z})$ such that for each $i,j$ , $u_{ij}\in\{0;1\}$ , $\forall 1\leq i\leq k,\;\sum_{j=1}^{n}u_{ij}=a_{i}$ and $\forall 1\leq j\leq n,\;\sum_{i=1}^{k}u_{ij}=b_{j}$

Corollary 3.

Denoting by $f_{r}=(1,\dots,1,0,\dots,0)$ the vector with exactly $r$ entries equal to $1$ and by $p_{r}=\#\{k|R_{k}=r\}$ , for $1\leq r\leq n$ , we have $\sum_{k}{F_{k}}=\{N\in\mathbb{N}^{n}|N\prec\sum_{r=1}^{n}{p_{r}f_{r}}\}$ .

Proof.

A vector $N$ belongs to $\sum_{k}F_{k}$ if and only if for each $i\in\left[n\right]$ , $N_{i}$ corresponds to the sum of the coefficients of the $i$ -th column of a matrix of size $K\times n$ with coefficients in $\{0;1\}$ and such that the sum of the coefficients of the $k$ -th row is $R_{k}$ . We conclude by Theorem 5.4. ∎

Example 4.

Consider Example 1. We have $p_{1}=3$ , $p_{2}=2$ and $p_{3}=0$ . So $N$ is feasible if and only if $N\prec(5,2,0)$ .
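The membership test of Corollary 3 is easy to sketch in Python. The code below is a direct, unoptimized illustration that sorts the vector at each call; the function names are hypothetical, and the values $R_{k}=(1,2,1,2,1)$ are inferred from $p_{1}=3$ , $p_{2}=2$ and from the decomposition given in Example 3.

```python
def n_max(R, n):
    """Return sum_r p_r f_r, i.e. the vector whose i-th entry is #{k : R_k >= i}."""
    return [sum(1 for r in R if r >= i) for i in range(1, n + 1)]


def is_majorized(x, y):
    """Return True when x is majorized by y (x < y in the majorization order)."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    if sum(xs) != sum(ys):
        return False
    partial_x = partial_y = 0
    for a, b in zip(xs[:-1], ys[:-1]):  # partial sums for k = 1, ..., n-1
        partial_x += a
        partial_y += b
        if partial_x > partial_y:
            return False
    return True


def in_sum_of_F(N, R):
    """Membership in sum_k F_k when every J_k is empty (Corollary 3)."""
    return all(v >= 0 for v in N) and is_majorized(N, n_max(R, len(N)))


R_demo = [1, 2, 1, 2, 1]
bound = n_max(R_demo, 3)
```

Here `bound` equals $(5,2,0)$ , so $(3,2,2)$ is accepted whereas $(6,1,0)$ is rejected, in agreement with the description of the feasible set in Example 3.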

As for Algorithm 2, we need to know for a given $N\in\sum_{k}F_{k}$ whether $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ for each $i,j\in\left[n\right]$ . It is possible to answer this question in polynomial time in $n$ by sorting $N-e_{i}+e_{j}$ for each $i,j$ and by checking the condition $N-e_{i}+e_{j}\prec N^{\max}$ , where $N^{\max}=\sum_{r=1}^{n}p_{r}f_{r}$ . The time complexity of such a procedure is $O(n^{3}\log(n))$ . However, it can be accelerated thanks to the following result.

Lemma 10.

Let $N\in\sum_{k}F_{k}$ , and $i,j\in\left[n\right]$ . Let $S$ be the function defined on $\mathbb{R}^{n}\times\left[n\right]$ such that $\forall x\in\mathbb{R}^{n},\forall k\in\left[n\right]$ , $S(x,k)$ is the sum of the $k$ largest values of the coordinates of $x$ . Suppose finally that $N_{j}$ is the $k_{j}$ -th largest value of the coordinates of $N$ (if $k_{j}>1$ , then we suppose that the $k_{j}-1$ -th largest value of $N$ is strictly bigger than $N_{j}$ ), and that $N_{i}$ is the $k_{i}$ -th largest value of the coordinates of $N$ (if $k_{i}<n$ , then we suppose that the $k_{i}+1$ -th largest value of $N$ is strictly smaller than $N_{i}$ ). Then $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ if and only if $N_{i}>0$ and, either $N_{i}>N_{j}$ or $\forall k_{j}\leq k\leq k_{i},S(N,k)<S(N^{\max},k)$ .

Proof.

Suppose $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ . Then $N_{i}-1\geq 0$ and $N_{i}>0$ . Moreover, suppose $N_{i}\leq N_{j}$ . Then, $N_{i}-1<N_{j}+1$ and $S(N-e_{i}+e_{j},k)=S(N,k)+1$ . Then, $S(N,k)<S(N,k)+1=S(N-e_{i}+e_{j},k)\leq S(N^{\max},k)$ .

Conversely, if $N_{i}>0$ , then all the coordinates of $N-e_{i}+e_{j}$ are nonnegative integers. If $N_{i}>N_{j}$ , then we easily see that $N-e_{i}+e_{j}\prec N$ . So $N-e_{i}+e_{j}\prec N^{\max}$ and $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ . Suppose that $N_{i}\leq N_{j}$ . Because we suppose that the $k_{j}-1$ -th largest value of $N$ is strictly bigger than $N_{j}$ , then $k_{i}>k_{j}$ . We also suppose that $\forall k_{j}\leq k\leq k_{i},S(N,k)<S(N^{\max},k)$ . The $k_{j}-1$ -th largest value of $N$ is strictly bigger than $N_{j}$ , so it is at least $N_{j}+1$ . Consequently, we have for all $1\leq l\leq k_{j}-1$ , $S(N-e_{i}+e_{j},l)=S(N,l)\leq S(N^{\max},l)$ (because $N\prec N^{\max}$ ). Moreover, $\forall k_{j}\leq k\leq k_{i}-1,S(N,k)<S(N^{\max},k)$ , hence $S(N-e_{i}+e_{j},k)\leq S(N,k)+1\leq S(N^{\max},k)$ since these sums are integers. Because the $k_{i}+1$ -th largest coordinate of $N$ is strictly smaller than $N_{i}$ , it is at most $N_{i}-1$ and we have $S(N-e_{i}+e_{j},k_{i})=S(N,k_{i})\leq S(N^{\max},k_{i})$ and $\forall l\geq k_{i}+1$ , $S(N-e_{i}+e_{j},l)=S(N,l)\leq S(N^{\max},l)$ . Hence, $N-e_{i}+e_{j}\prec N^{\max}$ and $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ . ∎

To solve the bilevel problem 5 in this specific case, we need to find $u_{1}^{*}\in F_{1},\dots,u_{K}^{*}\in F_{K}$ such that $\psi(N^{*})=-\sum_{k}\langle\rho_{k},u_{k}^{*}\rangle$ . In Algorithm 2, such vectors $(u_{k})^{*}$ are found at the same time as $N^{*}$ . Then, to accelerate Algorithm 2, we need to be able to solve this problem rapidly. In particular, a classical linear programming approach leads to a $O((Kn)^{3.5})$ time complexity, which is not acceptable. The problem to solve can be written:

Problem 6.

[TABLE]

We already mentioned in the proof of Theorem 5.4 that the constraints of this linear program can be written $0\leq u\leq 1,\;Au=b$ , where $A$ is a totally unimodular matrix. Therefore, the value of this problem is equal to the value of its continuous relaxation. Moreover, it can be interpreted as a minimum cost flow problem (see (schrijver2003combinatorial, Ch. 12) for background). We define a bipartite graph with vertices $i\in\left[n\right]$ and $k\in\left[K\right]$ , and edges between each $i\in\left[n\right]$ and each $k\in\left[K\right]$ . Each vertex $i\in\left[n\right]$ has an incoming flow equal to $N_{i}$ , whereas each vertex $k\in\left[K\right]$ has an outgoing flow equal to $R_{k}$ . Moreover, the capacity of each edge is $1$ , meaning that each flow $u_{k}(i)$ satisfies $0\leq u_{k}(i)\leq 1$ , and a cost $-\rho_{k}(i)$ is associated to each edge. Hence, the problem consists in finding the flow $u$ minimizing the total cost in this graph. Many algorithms exist to solve such a problem. In our case, we have $K\gg n$ . According to Theorem 5.3, Algorithm 2 needs $O(Rn^{2}(K+n))$ operations to solve Problem 5. Notice that $K\leq R\leq nK$ . Therefore, in order to accelerate Algorithm 2 in the studied case, we need an algorithm solving the flow problem whose complexity grows as $K^{\alpha}$ with $\alpha<2$ .

We can interpret the minimum cost flow problem as a minimum cost circulation problem, as presented in (schrijver2003combinatorial, Ch. 12). We introduce a sink $t$ . We define an edge between each $k\in\left[K\right]$ and $t$ of cost equal to $0$ , with a lower bound for the flow equal to $R_{k}$ and a capacity of $R_{k}$ . We also define an edge between $t$ and each $i\in\left[n\right]$ of cost equal to $0$ , with a lower bound for the flow equal to $N^{*}_{i}$ and a capacity of $N^{*}_{i}$ . Such a graph is represented in Figure 5.

Such a graph has $|V|=K+n+1$ vertices and $|E|=Kn+K+n$ edges. The sum of the capacities of the different edges is $2R+Kn$ . In (gabow1989faster, , Sec. 3.3), an algorithm is proposed to solve such a problem. Different complexity bounds of such an algorithm are given in (gabow1989faster, , Th. 3.5). In the case $K\gg n$ , the optimal vectors $u_{1}^{*},\dots,u_{K}^{*}$ can be found in $O((Kn)^{3/2}\log((K+n)||\rho||_{\infty}))$ .
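To make the flow formulation concrete, here is a minimal Python sketch that solves a tiny instance of Problem 6 in the case $\mathcal{J}_{k}=\emptyset$ . It is not the algorithm of gabow1989faster: it uses the much simpler successive shortest path method with Ford-Bellman searches, and it replaces the circulation with a separate source and sink. The function names, the node numbering and the preference values `rho_demo` are all assumptions made for the illustration.

```python
import math


def min_cost_flow(num_nodes, arcs, source, sink, required):
    """Successive shortest paths; arcs are (tail, head, capacity, cost)."""
    graph = [[] for _ in range(num_nodes)]
    forward = []
    for u, v, cap, cost in arcs:
        fwd = [v, cap, cost, len(graph[v])]  # [head, residual cap, cost, index of reverse arc]
        graph[u].append(fwd)
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
        forward.append(fwd)
    flow, total_cost = 0, 0
    while flow < required:
        dist = [math.inf] * num_nodes
        prev = [None] * num_nodes
        dist[source] = 0
        for _ in range(num_nodes - 1):  # Ford-Bellman on the residual graph
            for u in range(num_nodes):
                if dist[u] == math.inf:
                    continue
                for idx, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, idx)
        if dist[sink] == math.inf:
            raise ValueError("no feasible flow")
        push, v = required - flow, sink
        while v != source:  # bottleneck capacity along the path
            u, idx = prev[v]
            push = min(push, graph[u][idx][1])
            v = u
        v = sink
        while v != source:  # augment along the path
            u, idx = prev[v]
            graph[u][idx][1] -= push
            graph[v][graph[u][idx][3]][1] += push
            v = u
        flow += push
        total_cost += push * dist[sink]
    return total_cost, forward


def optimal_decomposition(N, R, rho):
    """Return (u, value) maximizing sum_k <rho_k, u_k> with sum_k u_k = N."""
    n, K = len(N), len(R)
    source, sink = 0, n + K + 1
    arcs = [(source, 1 + i, N[i], 0) for i in range(n)]
    arcs += [(1 + i, 1 + n + k, 1, -rho[k][i]) for k in range(K) for i in range(n)]
    arcs += [(1 + n + k, sink, R[k], 0) for k in range(K)]
    cost, forward = min_cost_flow(n + K + 2, arcs, source, sink, sum(N))
    u = [[1 - forward[n + k * n + i][1] for i in range(n)] for k in range(K)]
    return u, -cost


rho_demo = [[3, 0], [2, 1]]  # hypothetical preferences for two customers
u_demo, value_demo = optimal_decomposition([1, 1], [1, 1], rho_demo)
```

On this toy instance, the first customer is assigned to the first slot and the second customer to the second slot, for a total value of 4. The running time of this sketch is far from the bound quoted above; it only illustrates the structure of the graph.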

We can now write an algorithm for solving the bilevel problem in this specific case. We need first to calculate $N^{\max}=\sum_{r=1}^{n}p_{r}f_{r}$ , where $p_{r}$ is defined as in the statement of Corollary 3, and to find an initial point $N\in\sum_{k}F_{k}$ . We apply the same method as in Algorithm 1. In order to calculate $g(N-e_{i}+e_{j})$ for each $i,j\in\left[n\right]$ , we sort the coordinates of $N$ in decreasing order, and we use Lemma 10 to decide whether $N-e_{i}+e_{j}\in\sum_{k}F_{k}$ for all $i,j$ . We use the same loop as in Algorithm 1 to compute an $N^{*}$ such that $g(N^{*})$ is the maximal value of $g$ over $\sum_{k}F_{k}$ . Then, we solve the minimum cost flow problem 6, as described previously, to find the optimal $u_{k}^{*}$ and then we use Theorem 5.2 to determine an optimal $y^{*}$ . This leads to Algorithm 3. The function $Sort$ associates to a vector $x\in\mathbb{R}^{n}$ a pair $(y,ind)$ , where $y$ is a permutation of $x$ such that $y_{1}\geq\dots\geq y_{n}$ and $ind$ is such that $x_{i}=y_{ind(i)}$ for each $i\in\left[n\right]$ . The function $S$ is defined by $S(x,k)=\sum_{i=1}^{k}x_{i}$ . The function $MinCostFlow$ associates to the different vectors $(\rho_{k})_{k\in\left[K\right]}$ the vectors $(u_{k}^{*})_{k\in\left[K\right]}$ solving the minimum cost flow problem 6. The functions $f^{*}$ and $ShortestPath2$ are defined as for Algorithm 2.

Theorem 5.5.

Let us define $||\rho||_{\infty}=\max_{k\in\left[K\right],i\in\left[n\right]}|\rho_{k}(i)|$ , $R=\sum_{k}R_{k}$ , for each $k\in\left[K\right]$ $n_{k}=n-\#\mathcal{J}_{k}$ and $\overline{R}=\sum_{k}R_{k}(n_{k}-R_{k})$ . Algorithm 3 is correct and returns a global optimizer in $O(Rn^{2}+(Kn)^{3/2}\log((K+n)||\rho||_{\infty})+\overline{R}+n^{3})$ time and $O(Kn+n^{2})$ space.

Proof.

According to Theorem 3.1, Theorem 5.4, Lemma 10 and Algorithm 1, this algorithm returns an optimal solution $N^{*}$ of the high-level problem and an optimal discount vector $y^{*}$ . As in the proof of Theorem 5.3, the number of iterations of the "while" loop is bounded by $R$ . The function $Sort$ needs $O(n\log(n))$ time and space. $O(n^{2})$ operations are needed to evaluate the vector $b$ , so the global time complexity of the "while" loop is $O(Rn^{2})$ whereas the space complexity is $O(n^{2})$ . Then, the optimal vectors $u_{1}^{*},\dots,u_{K}^{*}$ can be obtained in $O((Kn)^{3/2}\log((K+n)||\rho||_{\infty}))$ time and $O(Kn)$ space. By calculating only the finite values of $w_{ij}^{k}$ (which are not necessarily stored here), the number of operations needed to determine each $w_{ij}$ and $k_{ij}$ is $O(\overline{R})$ , with $\overline{R}=\sum_{k}R_{k}(n_{k}-R_{k})$ and for each $k\in\left[K\right]$ , $n_{k}=n-\#\mathcal{J}_{k}$ . We need only $O(n^{2})$ space to store the values $w_{ij}$ and $k_{ij}$ . Finally, the vector $y^{*}$ can be found by using the Ford-Bellman algorithm in a graph of $n$ vertices and $n^{2}$ edges, that is, with a time complexity of $O(n^{3})$ . ∎

In the worst case, we have $R=Kn$ and $\overline{R}=Kn^{2}$ . Then, the time complexity of Algorithm 3 is $O(Kn^{3}+(Kn)^{3/2}\log((K+n)||\rho||_{\infty}))$ . If the number of bits needed to write $||\rho||_{\infty}$ is polynomial in $n$ and if $K\gg n$ , then Algorithm 3 is faster than Algorithm 2. We finally notice that a minimum cost flow problem is solvable in strongly polynomial time, and it is then possible to adapt Algorithm 3 to return an optimal $y^{*}$ in strongly polynomial time. However, Algorithm 3 is not faster than Algorithm 2 in this case.

6 The general algorithm

In this section, we come back to the general bilevel problem 2 proposed in Section 2, and extend Algorithm 2 to it. In the low-level problem of each customer, the consumptions for different contents satisfy the constraints $\forall a\in\left[A\right],\sum_{t=1}^{T}{u_{k}^{a}(t)}=R_{k}^{a}$ , $\forall t\in\mathcal{I}_{k}^{a},a\in\left[A\right],u_{k}^{a}(t)=0$ and $\forall t\in\left[T\right],\sum_{a\in\left[A\right]}{u_{k}^{a}(t)}\leq 1$ . We make the assumption that for each customer $k$ , the sets of possible instants at which this customer makes a request for the different applications are disjoint, meaning that for any two applications $a\neq a^{\prime}$ , the complements of $\mathcal{I}_{k}^{a}$ and $\mathcal{I}_{k}^{a^{\prime}}$ in $\left[T\right]$ have an empty intersection. Then the constraint $\forall t\in\left[T\right],\sum_{a\in\left[A\right]}{u_{k}^{a}(t)}\leq 1$ is automatically satisfied and the low-level problem of each customer can be separated into different optimization problems corresponding to the consumption vector $u_{k}^{a}$ of each customer $k$ for each application $a$ . Each of these problems takes the following form:

Problem 7.

[TABLE]

We denote by $F_{k}^{a}$ the feasible set of this problem. The above assumption (that the complements of $\mathcal{I}_{k}^{a}$ and $\mathcal{I}_{k}^{a^{\prime}}$ have an empty intersection) is relevant in particular if only one kind of application is sensitive to price incentives. For instance, requests for downloading data can be anticipated (see tadrous2013pricing ) and it makes sense to assume that customers are only sensitive to incentives for this kind of contents. In this case, the assumption means that customers wanting to download data can shift their consumption only at instants when they do not request another kind of content.

Moreover, under this assumption, the decomposition theorem is still valid and Problem 2 can be solved with the following method:

Theorem 6.1 (Decomposition (general case)).

The bilevel problem 2 can be solved as follows:

1. Find an optimal solution $(N^{a,b})^{*}$ to the high level problem with unknown $N^{a,b}$ for each $a\in\left[A\right]$ , $b\in\left[B\right]$ :

Problem 8.

[TABLE]

2. For each $a\in\left[A\right]$ and $b\in\left[B\right]$ , find vectors $((u_{k}^{a})^{*})_{k\in\mathcal{K}_{b}}$ solutions of the following problem:

[TABLE]

3. Find for each $a\in\left[A\right]$ and $b\in\left[B\right]$ a vector $y_{a,b}^{*}$ such that $\forall k\in\mathcal{K}_{b}$ ,

[TABLE]

Proof.

The different problems corresponding for each $a\in\left[A\right]$ , for each $b\in\left[B\right]$ and for each $k\in\mathcal{K}_{b}$ to Problem 7 are independent. Thus, according to Lemma 6, the global bilevel program consists in solving Problem 8. Moreover, the optimal decomposition of $(N^{a,b})^{*}$ and the optimal price vector $(y^{a,b})^{*}$ are totally independent for each $a\in\left[A\right]$ and $b\in\left[B\right]$ . Then, the proof of the last two parts in the theorem is the same as in Theorem 3.1. ∎

The last two parts of Theorem 6.1 are independent for each $a\in\left[A\right]$ and $b\in\left[B\right]$ . Thus, they can be solved similarly as in the case of one kind of application and one kind of contract, studied in Section 3. We need to solve Problem 8. The function to optimize is separable (it can be written as a sum of functions, each depending on only one coordinate), but these functions are not concave in $(N^{1,1},\dots,N^{A,B})\in\mathbb{R}^{nAB}$ . However, because each function $s_{l}^{a,b}$ is concave nonincreasing and each $N^{a,b}(t,l)$ is positive, we notice that $\forall a^{\prime}\in\left[A\right],b^{\prime}\in\left[B\right]$ , the function which sends $N^{a^{\prime},b^{\prime}}(t,l)$ to $\sum_{a\in\left[A\right]}{\sum_{b\in\left[B\right]}{\gamma_{b}N^{a,b}(t,l)s_{l}^{a,b}(N(t,l))}}$ is still concave. Consequently, the function to optimize in Problem 8 is $M$ -concave in each vector $N^{a,b}\in\mathbb{Z}^{T\times L}$ considered separately, the others being fixed. This leads to a block descent method, in which we use the same scheme as in Algorithm 1, successively, to maximize the objective function over every vector $N^{a,b}$ . We denote by $f(N^{1,1},\dots,N^{A,B})$ the objective function of the high-level problem. We consider for each $a,b$ a vector $N^{a,b}\in\sum_{k\in\mathcal{K}_{b}}F_{k}^{a}$ . For each pair $(a,b)$ taken successively, we find $(i^{a,b},j^{a,b})$ belonging to:

[TABLE]

If $f(N^{1,1}-e_{i^{1,1}}+e_{j^{1,1}},\dots,N^{A,B}-e_{i^{A,B}}+e_{j^{A,B}})\leq f(N^{1,1},\dots,N^{A,B})$ , then the algorithm stops and returns $(N^{1,1},\dots,N^{A,B})$ . Otherwise, we take for each $a,b$ , $N^{a,b}:=N^{a,b}-e_{i^{a,b}}+e_{j^{a,b}}$ and begin again. Consequently, Algorithm 2 can be modified to solve the bilevel problem 2 in the general case. This leads to Algorithm 4. The functions $GraphSearch$ , $ShortestPath$ and $ShortestPath2$ are the same as for Algorithm 2. The function $f^{*}$ is here defined by:

[TABLE]

with $N(t,l)=\sum_{a\in\left[A\right]}\sum_{b\in\left[B\right]}N^{a,b}(t,l)$ .

Because the objective function of Problem 8 is not $M$ -concave in $(N^{1,1},\dots,N^{A,B})$ , we have no guarantee of convergence of Algorithm 4 to a global optimum of the function $f^{*}$ . However, we can characterize the nature of the optimum returned by Algorithm 4. In order to estimate the complexity of Algorithm 4, we define the function $\Delta f^{*}$ by:

[TABLE]

If for each $a,b$ we have $u^{a,b}=v^{a,b}$ , then $\Delta f^{*}(N^{1,1},\dots,N^{A,B})=0$ . Thus, we have

[TABLE]

Because the set $\prod_{a,b}(\sum_{k\in\mathcal{K}_{b}}F_{k}^{a})$ is finite, we can define the value $\delta g$ by:

[TABLE]

because $f^{*}$ is not constant.

Theorem 6.2.

Let us define $\gamma_{\max}=\max_{b\in\left[B\right]}\gamma_{b}$ . Let us also define $R=\sum_{a\in\left[A\right]}\sum_{k\in\left[K\right]}R_{k}^{a}$ , for each $a\in\left[A\right]$ and $k\in\left[K\right]$ , $n_{k}^{a}=TL-\#\mathcal{J}_{k}^{a}$ (that is, the number of possible non-zero coordinates of the vectors of $F_{k}^{a}$ ) and $\overline{R}=\sum_{a}\sum_{k}R_{k}^{a}(n_{k}^{a}-R_{k}^{a})$ . Algorithm 4 terminates in $O(\frac{\gamma_{\max}R}{\delta g}(AB(TL)^{3}+\overline{R}))$ time and $O(\overline{R})$ space, and returns vectors $(y^{a,b})^{*}_{a\in\left[A\right],b\in\left[B\right]}$ and $(N^{a,b})^{*}_{a\in\left[A\right],b\in\left[B\right]}$ such that $\forall a\in\left[A\right],b\in\left[B\right],\;\forall N^{a,b}\in\sum_{k\in\mathcal{K}_{b}}F_{k}^{a}$ :

[TABLE]

Proof.

Algorithm 4 continues while the value $g^{*}$ is strictly larger than $g_{N}$ . Because the set $\prod_{a,b}(\sum_{k\in\mathcal{K}_{b}}F_{k}^{a})$ is finite, the algorithm terminates. When it stops, the vector $(N^{a,b})_{a\in\left[A\right],b\in\left[B\right]}$ is such that $\forall a\in\left[A\right],b\in\left[B\right],\;\forall u,v\in\left[T\right]\times\left[L\right]$ :

[TABLE]

For each $a,b$ , the function $N^{a,b}\mapsto f(N^{1,1},\dots,N^{a,b},\dots,N^{A,B})$ is $M$ -concave. The statement of the theorem comes straightforwardly from the equivalence between local and global optimality for $M$ -concave functions.

Algorithm 4 differs from Algorithm 2 by the different applications and kinds of contracts and by the number of iterations of the loop. The set $\left[K\right]$ of customers is split according to the different kinds of contracts $b\in\left[B\right]$ . Thus, we have to define the parameters $w_{ij}^{k,a}$ for each $k\in\left[K\right]$ and $a\in\left[A\right]$ and the global space complexity becomes $\sum_{a}\sum_{k}R_{k}^{a}(n_{k}^{a}-R_{k}^{a})=\overline{R}$ . The number of iterations of the loop can be estimated with a pseudo-polynomial bound. The algorithm continues while $g^{*}>g_{N}$ . Then, the new value of $g^{*}$ is $f^{*}(N^{1,1}-e_{i^{1,1}}+e_{j^{1,1}},\dots,N^{A,B}-e_{i^{A,B}}+e_{j^{A,B}})$ . Consequently, at each iteration of the loop, the value of $g^{*}$ increases by at least $\delta g$ until the algorithm stops. The finite values of $f^{*}$ are nonnegative, and an upper bound is $(\max_{b\in\left[B\right]}\gamma_{b})(\sum_{a}\sum_{k\in\left[K\right]}R_{k}^{a})=\gamma_{\max}R$ because each function $s_{l}$ takes values between $0$ and $1$ . In each loop, the number of operations is $O(\overline{R}+AB(TL)^{3})$ to calculate the new values of $w_{ij}^{a,b}$ and to solve a shortest path problem for each $a$ and $b$ in the graph $G^{a,b}$ with nodes corresponding to all pairs in $\left[T\right]\times\left[L\right]$ and edges with values $w^{a,b}_{ij}$ between vertices $i,j\in\left[T\right]\times\left[L\right]$ . ∎

7 Experimental results

We consider an application based on real data provided by Orange. It involves the data consumption in an area of $L=43$ cells, during one day divided into time slots of one hour, that is $T=24$ time slots. We focus here on price incentives for download contents only. During this day, a number $K$ of more than $2500$ customers make some requests for downloading data in this area and we are interested in balancing the number of active customers in the network. Even though they are insensitive to price incentives, other kinds of requests (web, mail, etc.) have to be satisfied and they are taken into account in the high-level optimization problem. We consider two classes of users: standard and premium customers. The premium ones demand a better quality of service. Hence, they are less satisfied than the standard customers if they share their cell with a given number of active customers. We therefore define the satisfaction function as in Section 2. The provider wants to favor the premium customers. Hence, we take $\gamma_{b}=2$ for the latter ones and $\gamma_{b}=1$ for the standard customers, in the high-level optimization problem. We also assume that the premium customers are less sensitive to the incentives, and thus take $\alpha_{k}^{a}=1/2$ for all standard customers and $\alpha_{k}^{a}=1$ for all premium customers in the low-level problem 1. We estimate the parameters $\rho_{k}$ very simply. We take $\rho_{k}(t)=1$ when the customer $k$ consumes download at time $t$ without incentives, $\rho_{k}(t)=0$ when he does not make any request without incentives but makes a request for download at times $t-1$ or $t+1$ (we assume he could shift his consumption by one hour) and $\rho_{k}(t)=-\infty$ otherwise.
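The estimation rule for $\rho_{k}$ described above can be written as a short Python sketch. The function name `estimate_rho` and the input format (a 0/1 list indicating the slots in which the customer downloads without incentives) are assumptions; the sketch also assumes that the day does not wrap around, so the first and last slots have a single neighbour, which the text does not specify.

```python
import math


def estimate_rho(active):
    """Build rho_k from the observed download slots of one customer.

    active[t] is truthy when the customer downloads at slot t without incentives.
    Assumption: no wrap-around between the last and the first slot of the day.
    """
    T = len(active)
    rho = []
    for t in range(T):
        if active[t]:
            rho.append(1.0)  # observed consumption slot
        elif (t > 0 and active[t - 1]) or (t + 1 < T and active[t + 1]):
            rho.append(0.0)  # reachable by shifting the request by one hour
        else:
            rho.append(-math.inf)  # slot not acceptable for this customer
    return rho


rho_demo = estimate_rho([0, 0, 1, 0, 0])
```

For a customer observed only in the third of five slots, the sketch yields $(-\infty,0,1,0,-\infty)$ , so only the observed slot and its two neighbours remain admissible.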

We solve the bilevel problem using Algorithm 4, implemented in Scilab. The computation took 9526 seconds on a single core of an Intel i5-4690 processor @ 3.5 GHz.

In Figures 6–9, we show the evolution of the satisfaction of different kinds of customers for different kinds of contents without and with incentives. These results show that price incentives have an effective influence on the load, especially in the most loaded cells (the number of black regions in the space-time coordinates, in which the dissatisfaction of the users is critical, is considerably reduced). Moreover, Figure 10 reveals that the consumption of users is not only moved in time, but also in space: not only is some consumption moved from the peak hour to the night (off peak), but the surface of the dark grey region, representing the total download consumption in the cell over the whole day, is decreased, indicating that some part of the consumption has been shifted to other cells.

8 Conclusion

We presented here a bilevel model for price incentives in mobile data networks. We solved this problem by a decomposition method based on discrete convexity and tropical geometry. We finally applied our results to real data. In further work, we shall consider more general models: unfixed number of requests, nonlinear preferences of the customers, satisfaction functions of the provider taking into account the profit. Stochastic models shall also be considered, in particular to take into account the partial information of the provider about the customers' preferences and trajectories.

9 Acknowledgments

We thank the reviewers of our earlier work eytard2017bilevel for their remarks and comments. We also thank Orange for providing us real data for our experimental results.

Bibliography

(1) Altman, E., Barman, D., El Azouzi, R., Ros, D., Tuffin, B.: Pricing differentiated services: A game-theoretic approach. Computer Networks 50(7), 982–1002 (2006)
(2) Baccelli, F., Cohen, G., Olsder, G., Quadrat, J.: Synchronization and Linearity. Wiley (1992)
(3) Baldwin, E., Klemperer, P.: Tropical geometry to analyse demand. Tech. rep., Working paper, Oxford University (2012)
(4) Bellman, R.: On a routing problem. Quarterly of Applied Mathematics 16(1), 87–90 (1958)
(5) Bonald, T., Feuillet, M.: Network performance analysis. John Wiley & Sons (2013)
(6) Brotcorne, L., Labbé, M., Marcotte, P., Savard, G.: A bilevel model and solution algorithm for a freight tariff-setting problem. Transportation Science 34(3), 289–302 (2000)
(7) Butkovič, P.: Max-linear systems: theory and algorithms. Springer Monographs in Mathematics. Springer (2010)
(8) Colson, B., Marcotte, P., Savard, G.: An overview of bilevel optimization. Annals of Operations Research 153(1), 235–256 (2007)