Groups of Repairmen and Repair-based Load Balancing in Supermarket Models with Repairable Servers
Na Li, Quan-Lin Li, Zhe George Zhang

TL;DR
This paper applies mean-field theory to analyze supermarket queueing models with repairable servers, revealing how different repair groups impact system performance and providing a new approach for complex system analysis.
Contribution
It introduces a novel mean-field approach to analyze repairable server supermarket models, including fixed point analysis and numerical performance evaluation.
Findings
Impact of repair groups on system performance
Asymptotic independence of models established
Fixed points satisfy nonlinear equations
Groups of Repairmen and Repair-based Load Balancing in Supermarket Models with Repairable Servers
Na Li
Department of Industrial Engineering and Management
Shanghai Jiaotong University, Shanghai 200240, China
Quan-Lin Li
School of Economics and Management Sciences
Yanshan University, Qinhuangdao 066004, China
Zhe George Zhang
Department of Decision Sciences, Western Washington University,
Beedie School of Business, Simon Fraser University
Abstract
Supermarket models are a class of interesting parallel queueing networks with dynamic randomized load balancing and real-time resource management. When the parallel servers are subject to breakdowns and repairs, analysis of such a supermarket model becomes more difficult and challenging. In this paper, we apply the mean-field theory to studying four interrelated supermarket models with repairable servers, and numerically illustrate the impact of the different repairman groups on the performance of the systems. First, we set up the systems of mean-field equations for the supermarket models with repairable servers. Then we prove the asymptotic independence of the supermarket models through the operator semigroup and the mean-field limit. Furthermore, we show that the fixed points of the supermarket models satisfy the systems of nonlinear equations. Finally, we use the fixed points to carry out numerical computations for performance analysis, and provide valuable observations on model improvement. Therefore, this paper provides a new and effective method for the study of complex supermarket models.
Keywords: Supermarket model; mean-field theory; fixed point; performance analysis; repairable server; reliability.
1 Introduction
In the last two decades, considerable research attention has been paid to the study of supermarket models. Supermarket models are a class of interesting parallel queueing networks with dynamic and real-time adaptive control, for example, size-based routine selection and information-based resource scheduling. Such a supermarket model can be applied to, for example, computer networks, manufacturing systems, transportation networks and healthcare systems. Since a simple supermarket model was discussed by Mitzenmacher [43], Vvedenskaya et al. [55] and Turner [53, 54], further studies have been carried out by, for instance, Vvedenskaya and Suhov [56], Graham [21, 22], Luczak and McDiarmid [37, 38], Bramson et al. [5, 9, 10], Li [28], Li et al. [30, 31], Li and Lui [32], Gast et al. [19], Gast and Gaujal [20] and Mukhopadhyay et al. [45]. For the fast Jackson networks (or supermarket networks), readers may refer to Martin and Suhov [40], Martin [39] and Suhov and Vvedenskaya [51].
In many stochastic networks, servers subject to breakdowns and repairs are frequently encountered in practical areas such as computer systems, communication networks, manufacturing systems and transportation networks. Because system performance deteriorates quickly due to servers’ breakdowns and limited repair capacity, analyzing such stochastic systems with repairable servers is not only important from a theoretical perspective but also necessary in engineering practice. Along this research line, important examples include Mitrany and Avi-Itzhak [42], Neuts and Lucantoni [46], Kulkarni and Choi [26], Li et al. [36], Aissani and Artalejo [2], Núñez-Queija [47], Li et al. [34], Economou and Kanta [15], Fiems et al. [17], Kamoun [23], and Krishnamoorthy et al. [25] for a survey.
It is interesting but difficult to discuss stochastic systems of parallel queues with unreliable servers, e.g., see an excellent survey by Adan et al. [1]. Up to now, the available results on parallel-queue systems with repairable servers are still very few. Andradottir et al. [3] applied a Markov decision process to compensate for failures with flexible servers. Martonosi [41] studied dynamic server allocation at parallel queues with unreliable servers. Saghafian et al. [49] analyzed the dynamic control of unreliable flexible servers in a “W” network. Ravid et al. [48] considered repair systems with exchangeable items and the longest queue mechanism. Stimulated by the practical needs of many distributed parallel systems, the study of supermarket models and work stealing models has received considerable attention in computer systems and communication networks. This motivates us to apply the mean-field theory to analyzing supermarket models with servers subject to breakdowns and repairs, which are a class of important complex reliability networks; specifically, the different groups of repairmen make the analysis of such a reliability network more difficult and challenging.
We first give a brief survey of the mean-field theory. The mean-field equations and the asymptotic independence (or propagation of chaos) play an important role in the study of interacting particle systems, e.g., see Liggett [35] and Kipnis and Landim [24]. For the mean-field theory of complex stochastic systems, readers may refer to, for example, interacting Markov processes by Spitzer [50], Dawson [12], Sznitman [52], Chen [11] and Li [29]; queueing networks by Baccelli et al. [4], Borovkov [7] and Mitzenmacher et al. [44]; work stealing models by Gast and Gaujal [18] and Li and Yang [33]; communication networks by Duffield [13], Benaim and Le Boudec [6], Duffy [14] and Bordenave et al. [8].
The main contributions of this paper are threefold. The first one is to describe and analyze a class of important complex reliability networks: supermarket models with repairable servers, which play a key role in performance evaluation of computer systems and of communication networks. Notice that a supermarket model contains multiple repairable servers, thus the different groups of repairmen make the analysis of the supermarket model more complicated. In this setting, this paper considers four interrelated supermarket models with repairable servers, obtained from two different arrival dispatched schemes and two different groups of repairmen. The second contribution is to apply the mean-field theory to studying the four interrelated supermarket models with repairable servers. This paper demonstrates such a mean-field analysis through the following three steps: (a) providing a probability computation for setting up the systems of mean-field equations, (b) calculating the fixed points through the systems of nonlinear equations, and (c) giving performance analysis of the supermarket models with repairable servers and developing numerical computation for useful observations on model improvement. The third contribution is to provide an example that demonstrates how to develop numerical solutions in the study of complex supermarket models. Since the nonlinear structure of the mean-field equations makes it almost impossible to find an analytic solution to the system of mean-field equations, it is key to develop numerical computation for the performance evaluation of supermarket models. Based on this, numerical examples are used to provide valuable observations on how to improve the performance of supermarket models, either through system parameter optimization or through resource deployment (e.g., arrival dispatched schemes, allocated service ability, and groups of repairmen).
Finally, note that this paper discusses a special class of supermarket models with unreliable servers, in which the failed states and the groups of repairmen influence the arrival joining schemes. To analyze such a supermarket model, the most relevant references to this paper are Li et al. [30, 31] and Li and Lui [32] from two points of view: (1) The environment invariant factors were proposed for setting up systems of mean-field equations for complex supermarket models. (2) As studied in Li et al. [31], this paper also analyzes a double dynamic routine selection scheme, both for the arrival dispatched schemes and for the groups of repairmen. It is worthwhile to note that such a multiple dynamic routine selection scheme is a new and interesting topic in the study of supermarket models and of work stealing models.
The remainder of this paper is organized as follows. In Section 2, we first describe four interrelated supermarket models with repairable servers, where customer arrivals make use of system information and repair ability is grouped in different structures. Then we use the fraction vector to describe an infinite-dimensional Markov process for each supermarket model with repairable servers. In Section 3, we provide two types of probability representations, both for the arrival dispatched schemes by means of system information and for the repair ability grouped in different ways. In Section 4, for each of the four interrelated supermarket models with repairable servers, we set up an infinite-dimensional system of mean-field equations. In Section 5, we discuss the fixed points for the systems of mean-field equations, and show that the fixed points can be determined by the systems of nonlinear equations. In Section 6, we first provide useful performance measures of the supermarket models with repairable servers. Then we use some numerical examples to make valuable observations on model improvement by means of numerical performance comparison. Section 7 concludes with a summary. The proofs of some key results are provided in Appendix A.
2 Supermarket Models with Repairable Servers
In this section, we first describe four interrelated supermarket models with repairable servers, where the arrival dispatched schemes make use of system information and the repair ability is grouped in different ways. Then we use the fraction vector (or empirical measure) to describe an infinite-dimensional Markov process for each supermarket model with repairable servers.
2.1 Model description
The arrival processes
Customers arrive at the system as a Poisson process with arrival rate for . Upon arrival, an arriving customer chooses servers from the servers independently and randomly. Then the customer will select one server (or queue) to join. Such a server selection is based on two different information observations as follows:
(A.1) Observing only the shortest queue. The arriving customer joins the shortest queue among the queues. If there is a tie, the customer chooses among the shortest queues of the same length with equal probability.
(A.2) Observing both the shortest queue and the status (working or repairing) of the selected servers. The arriving customer joins the shortest queue among the selected servers, giving a working server higher priority than a server in repair.
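The two observation rules can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the `scheme` labels and the symbol `d` for the number of sampled servers are assumptions made here, and (A.2) is read as "shortest queue first, then prefer a working server among the tied shortest queues", which matches the description given later in Section 3.1.

```python
import random

def sample_servers(n, d, rng):
    """Draw d server indices independently and uniformly from n servers."""
    return [rng.randrange(n) for _ in range(d)]

def join_target(sampled, lengths, working, scheme, rng):
    """Return the index of the server that the arriving customer joins.

    sampled    -- indices of the servers the customer looked at
    lengths[i] -- current queue length of server i
    working[i] -- True if server i is up, False if it is in repair
    scheme     -- "A1" (queue length only) or "A2" (length, then status)
    """
    shortest = min(lengths[i] for i in sampled)
    tied = [i for i in sampled if lengths[i] == shortest]
    if scheme == "A2":
        # Among the tied shortest queues, working servers take priority.
        up = [i for i in tied if working[i]]
        if up:
            tied = up
    # Remaining ties are broken uniformly at random.
    return rng.choice(tied)
```

For example, with `lengths = [2, 1, 1]` and `working = [True, False, True]`, a customer who samples all three servers joins server 2 under "A2", while under "A1" it joins server 1 or server 2 with equal probability.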
The service processes
The service times at each server are i.i.d. and exponentially distributed with service rate .
The repair processes
Each server has an exponential life time with failure rate . When the server fails, it enters a failure state and undergoes the repair process immediately. The service of a customer interrupted by a server’s failure is resumed as soon as the server is repaired. We assume that the repaired server is as good as new and the service time is cumulative. To deploy the repair resource effectively, we consider three types of repair schemes as follows:
(R.1) Each server has one repairman. There are repairmen corresponding to the servers, and thus each server has its own repairman. The repair times are i.i.d. exponential random variables with repair rate .
(R.2) A super large repairman. There is only one fast repairman whose repair time is exponentially distributed with repair rate and . This super repairman chooses servers from the servers randomly. If all the selected servers are working, then the repairman is idle; if at least one of the selected servers is failed, then the repairman repairs the failed server with the longest queue. If there is a tie, the repairman selects one randomly.
(R.3) A large repairman and small repairmen for . There is a large repairman together with small repairmen, where the repair time of the large repairman is exponentially distributed with the repair rate , and the repair time of each small repairman is exponentially distributed with repair rate .
Each of the small repairmen can repair one failed server at a time, if any, while the large repairman chooses servers from the servers independently and randomly. If all the selected servers are working, then the large repairman is idle; if at least one of the selected servers is failed but not yet being repaired by a small repairman, then the large repairman repairs the failed server with the longest queue. If there is a tie, the repairman selects one of the tied failed servers randomly.
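The selection rule of the super repairman in (R.2) can be written down directly. The sketch below is illustrative; the function name and argument layout are assumptions, and "longest queue" is read as the longest queue among the failed servers in the sample.

```python
import random

def repair_target(sampled, lengths, working, rng):
    """Return the server the super repairman attends, or None if it stays idle.

    sampled    -- indices of the servers the repairman looked at
    lengths[i] -- current queue length of server i
    working[i] -- True if server i is up, False if it has failed
    """
    failed = [i for i in sampled if not working[i]]
    if not failed:
        return None  # every sampled server is working, so the repairman idles
    longest = max(lengths[i] for i in failed)
    tied = [i for i in failed if lengths[i] == longest]
    return rng.choice(tied)  # ties are broken uniformly at random
```

With `lengths = [5, 2, 4, 4]` and `working = [True, False, False, True]`, a repairman who samples all four servers attends server 2: servers 1 and 2 are failed, and server 2 has the longer queue.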
We assume that all the random variables defined above are independent of each other. Figure 1 shows a supermarket model with repairable servers and a large repairman.
Now, we construct four interrelated supermarket models with repairable servers by combining (A.1) or (A.2) with (R.1) or (R.2) as follows:
Model I ((A.1) and (R.1)): In this model, an arriving customer only needs to observe the queue lengths of the selected servers and joins the shortest queue. There are repairmen corresponding to the servers, hence each server has its own repairman.
Model II ((A.1) and (R.2)): In this model, the queue selection rule is the same as in Model I. However, there is only one super repairman, who chooses servers from the servers independently and uniformly at random. If all the selected servers are working, then the repairman is idle; otherwise, the repairman repairs the failed server with the longest queue length.
Model III ((A.2) and (R.1)): In this model, an arriving customer observes not only the queue lengths of the selected servers, but also the states (working or repairing) of the selected servers. The customer then joins the shortest queue, with working servers having higher priority than failed servers. There are repairmen corresponding to the servers, hence each server has its own repairman.
Model IV ((A.2) and (R.2)): In this model, the customer’s queue selection rule is the same as in Model III. However, there is only one super repairman, who chooses servers from the servers independently and uniformly at random. If all the selected servers are working, then the repairman is idle; otherwise, the repairman repairs the failed server with the longest queue.
Remark 1
Actually, (R.3) is a more general scheme of repair resource allocation, and its analysis can be completed by modifying the mean-field equations for (R.1) and (R.2). Here, we do not consider (R.3), which will be investigated in another paper.
Next, we shall provide a complete mathematical analysis for the four interrelated supermarket models, and present some numerical examples to show how the system information in (A.1) and (A.2) and the repair resource allocation in (R.1) and (R.2) affect the performance of the supermarket models with repairable servers. Some insightful observations are made for designing and controlling the arrival, service and repair processes to improve the supermarket models.
2.2 An infinite-dimensional Markov process
Now, we use the empirical measure to provide an infinite-dimensional Markov process for studying each of the four interrelated supermarket models with repairable servers.
For , we denote by the numbers of working (or idle) servers with at least customers at time , and the numbers of failed servers with at least customers at time . Clearly, and for and .
We write that for ,
[TABLE]
and for
[TABLE]
which are the fractions of working (or idle) servers with at least customers and of failed servers with at least customers at time , respectively. Let
[TABLE]
[TABLE]
and
[TABLE]
Clearly, the state of the supermarket model of identical repairable servers is described as a stochastic process . Since the arrival process is Poisson, and the distributions of the service, life and repair times are all exponential, is an infinite-dimensional Markov process whose state space is given by
[TABLE]
For a fixed pair array with and , it is easy to see from the stochastic order that for and for . This gives
[TABLE]
and
[TABLE]
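As a concrete illustration of the fraction vector, the following sketch computes the two tail-fraction sequences from a snapshot of a finite system. The names `tail_fractions`, `u` and `v` are assumptions introduced for this example; `u[k]` is the fraction of working (or idle) servers with at least `k` customers and `v[k]` the fraction of failed servers with at least `k` customers.

```python
def tail_fractions(lengths, working, k_max):
    """Empirical tail fractions of a snapshot of n servers.

    lengths[i] -- queue length of server i
    working[i] -- True if server i is up, False if it has failed
    Returns (u, v) with u[k], v[k] for k = 0..k_max.
    """
    n = len(lengths)
    u, v = [], []
    for k in range(k_max + 1):
        up = sum(1 for q, w in zip(lengths, working) if w and q >= k)
        down = sum(1 for q, w in zip(lengths, working) if not w and q >= k)
        u.append(up / n)
        v.append(down / n)
    return u, v
```

For `lengths = [0, 2, 1, 3]` and `working = [True, True, False, False]` this gives `u = [0.5, 0.25, 0.25, 0.0]` and `v = [0.5, 0.5, 0.25, 0.25]`. Both sequences are nonincreasing in `k`, and `u[0] + v[0] = 1` because every server is either working or failed.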
To study the infinite-dimensional Markov process , we write the expected fractions as follows
[TABLE]
and
[TABLE]
It is easy to see from (1) and (2) that
[TABLE]
and
[TABLE]
Let
[TABLE]
[TABLE]
and
[TABLE]
3 Two Types of Probability Representations
In this section, we provide two types of probability representations for customer arrivals by means of system information and for repair ability grouped in different ways. For notational simplicity, the two types of probability representations are denoted as the four paired control schemes obtained by combining (A.1) or (A.2) with (R.1) or (R.2). The probability representations are useful for establishing the systems of mean-field equations later.
For the supermarket models of identical repairable servers, to set up the probability representations, we only need to determine the expected change in the number of servers with at least customers over a small time period .
3.1 The arrival processes
This subsection provides the probability representations for the arrival processes, in which the two cases (A.1) and (A.2) are discussed. Note that the analysis of (A.1) is similar to that of Li et al. [30]. To keep the paper self-contained, we still present some computational details for (A.1) and (R.); for (A.2) and (R.), we provide only the main results.
(A.1): Observing the Queue Length Only
To give the probability representations, we need to compute the rate that any arriving customer selects servers from the servers independently and uniformly at random, and joins the selected server with the shortest queue. Note that the arriving customer does not have the server status information (working or repair). Thus our computation for such a rate contains two steps as follows:
Step I: Entering one working server
In this step, the rate that any arriving customer joins a working server with the shortest queue length is given by
[TABLE]
where
[TABLE]
and for
[TABLE]
To derive the probabilities and for , Figure 2 shows the set decomposition of all possible events, and the probabilities are derived from the following three parts, that is,
[TABLE]
Part I: None of the selected servers is in repair. All selected servers are working for serving customers. In this case, the probability that any arriving customer joins a working server with the shortest queue length and the queue lengths of the other selected working servers are not shorter than is given by
[TABLE]
where is a binomial coefficient, is the probability that any arriving customer who can only choose one queue makes independent selections during the selected working servers with the queue length at time , and is the probability that any arriving customer who can only choose one queue makes independent selections during the selected working servers whose queue lengths are not shorter than at time .
Part II: For the selected servers, there is at least one working server with the shortest queue length , and there exists at least one server in repair while the queue length of each server in repair is more than customers. In this case, the probability that any arriving customer joins a working server with the shortest queue length ; and for the other selected servers, the queue lengths of the selected working servers are not shorter than , and there exists at least one server in repair while the queue length of each server in repair is more than customers, is given by
[TABLE]
Part III: For the selected servers, there is at least one working server with the shortest queue length and there is at least one server in repair with the shortest queue length . In this case, if among the selected servers with the shortest queue length there are m_{1} working servers and servers in repair, then the probability that any arriving customer joins a working server is equal to . Therefore, the probability that any arriving customer joins a working server with the shortest queue length , the queue lengths of the other selected servers are not shorter than , and there is at least one working server with customers and at least one server in repair with customers is given by
[TABLE]
Step II: Entering one server in repair
This step can be dealt with similarly to Step I. The rate that any arriving customer joins one server in repair with the shortest queue length and the queue lengths of the other selected servers are not shorter than is given by
[TABLE]
where
[TABLE]
The following theorem simplifies expressions for the probabilities ; and for , while its proof is similar to that in Theorem 1 of Li et al. [30] and is omitted here. Note that the simplified expressions will be a key in our later study, for example, the system of mean-field equations can be simplified significantly and the fixed point can be computed effectively.
Theorem 1
[TABLE]
and for
[TABLE]
Using Theorem 1, we set
[TABLE]
and for
[TABLE]
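The kind of simplification that a result such as Theorem 1 provides can be checked numerically. The sketch below does not reproduce the paper's formulas; it works under assumptions stated here: the `d` sampled servers are independent draws, `p_w` and `p_f` are the fractions of working and failed servers with exactly the target queue length, and `tail` is the fraction of servers with a strictly longer queue. Under (A.1), summing over all tie configurations (as in Parts I to III) collapses to a product of the working share `p_w / (p_w + p_f)` and the probability that the shortest sampled queue has the target length.

```python
from math import comb

def p_join_working_enum(p_w, p_f, tail, d):
    """Sum over tie configurations: m servers tie at the minimum, m1 of them work."""
    total = 0.0
    for m in range(1, d + 1):
        for m1 in range(0, m + 1):
            weight = (comb(d, m) * comb(m, m1)
                      * p_w**m1 * p_f**(m - m1) * tail**(d - m))
            total += weight * m1 / m  # uniform tie-break picks a working server
    return total

def p_join_working_closed(p_w, p_f, tail, d):
    """Closed form: working share times P(shortest sampled queue is at the target)."""
    s = p_w + p_f
    if s == 0.0:
        return 0.0
    return p_w / s * ((s + tail)**d - tail**d)
```

For instance, with `p_w = 0.2`, `p_f = 0.1`, `tail = 0.3` and `d = 3`, both functions return the same value, namely `(0.2 / 0.3) * (0.6**3 - 0.3**3)`.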
(A.2): Observing Both the Queue Length and the States of the Chosen Servers
In this case, the arriving customer gives priority to joining a working server with the shortest queue length. Upon arrival, each customer chooses servers from the servers independently and uniformly at random, and joins the one whose queue length is the shortest among the servers. If the servers with the shortest queue length contain at least one working server and at least one server in repair, then the arriving customer must randomly join one of the working servers with the shortest queue length. If there is a tie, one of the working servers with the shortest queue length is chosen randomly.
It is seen that the only difference from (A.1) is that the arriving customer cannot join one of the repairing servers with the shortest queue length when there exists at least one working server with the shortest queue length. Based on this, we have
a) The probabilities and for are the same as those in (A.1).
b) Compared with the probabilities in (A.1), for (A.2) we obtain that for
[TABLE]
Note that Part III of computing in (A.1) is omitted by utilizing the information of (A.2).
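The effect of the status information can be made explicit with a short calculation. The notation is an assumption introduced for illustration and is not taken from the displayed equations: let $u_k$ and $v_k$ denote the limiting fractions of working and failed servers with at least $k$ customers, let $d$ be the number of servers an arriving customer samples, and treat the $d$ samples as independent draws. Under (A.2) the customer joins a working server with $k-1$ customers exactly when all sampled queues have at least $k-1$ customers and at least one working server with exactly $k-1$ customers is sampled, which suggests:

```latex
\begin{align*}
P\{\text{join a working server with } k-1 \text{ customers}\}
  &= (u_{k-1}+v_{k-1})^{d} - (u_{k}+v_{k-1})^{d},\\
P\{\text{join a server in repair with } k-1 \text{ customers}\}
  &= (u_{k}+v_{k-1})^{d} - (u_{k}+v_{k})^{d}.
\end{align*}
```

The two terms add up to $(u_{k-1}+v_{k-1})^{d}-(u_{k}+v_{k})^{d}$, the probability that the shortest sampled queue has $k-1$ customers. In this reading, status information does not change which queue length receives the arrival; it only shifts arrivals from failed to working servers at that length.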
3.2 The repair processes
Now we provide the probability representations for the repair processes in two cases: (R.1) and (R.2).
(R.1): Each Server Has One Repairman
In this case, there are repairmen corresponding to the servers, hence each server has one repairman. Since the repair time is exponentially distributed with repair rate , it is seen from Li et al. [34] that if the service time of each server is of phase type with irreducible matrix representation , where
[TABLE]
then the repairable supermarket model is equivalent to a supermarket model with Poisson inputs and PH service times, as discussed in Li and Lui [28].
(R.2): A Super Repairman
In this case, there is a single super repairman whose repair time is exponentially distributed with repair rate . The repairman chooses servers from the servers independently and uniformly at random. If all the selected servers are in working condition, the repairman is idle; if at least one of the selected servers is failed, then the repairman attends one failed server with the longest queue. If there is a tie, the repairman selects a server randomly.
The rate that the repairman randomly chooses one of the failed servers with the longest queue length and the queue lengths of the other selected servers are not longer than is given by
[TABLE]
and for
[TABLE]
Using , we can further simplify
[TABLE]
and for
[TABLE]
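A parallel calculation applies to the super repairman. It is a sketch under assumptions made here rather than a restatement of the displayed formulas: $v_k$ is the limiting fraction of failed servers with at least $k$ customers, $d_r$ is a hypothetical symbol for the number of servers the repairman samples, the samples are independent draws, and "longest queue" is compared among the failed servers in the sample only.

```latex
\begin{align*}
P\{\text{the attended server has at least } k \text{ customers}\}
  &= 1 - (1 - v_{k})^{d_r},\\
P\{\text{the attended server has exactly } k \text{ customers}\}
  &= (1 - v_{k+1})^{d_r} - (1 - v_{k})^{d_r}.
\end{align*}
```

The first line is the probability that the sample contains at least one failed server with $k$ or more customers. The second is the difference of two such probabilities at levels $k$ and $k+1$. Setting $k=0$ in the first line gives $1-(1-v_0)^{d_r}$, the probability that the repairman is not idle.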
4 The Mean-Field Equations
In this section, for each of the four interrelated supermarket models with repairable servers, we set up an infinite-dimensional system of mean-field equations. To this end, we present a detailed analysis only for the first model; the other three models can be treated along similar lines.
4.1 Model I ((A.1) and (R.1))
For (A.1) and (R.1), the probabilities , and for are given in (6), (7) and (12), which are further simplified in Theorem 1.
Now, we consider the service and repair processes. The rate that a customer leaves one server queued by customers is given by
[TABLE]
The rate that one working server with at least customers fails is given by
[TABLE]
The rate that one failed server with at least customers is repaired is given by
[TABLE]
Based on Equation (5), and Equations (15) to (17), we obtain
[TABLE]
In addition, it follows from (11) that
[TABLE]
Based on the similar analysis to (18) and (19), we can set up an infinite-dimensional system of mean-field equations satisfied by the expected fraction vector as follows:
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
and the initial conditions
[TABLE]
where
[TABLE]
[TABLE]
with
[TABLE]
It follows from Theorem 1 that
[TABLE]
and for
[TABLE]
Using and for , Equations (20) to (25) can further be simplified as
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
and the initial conditions
[TABLE]
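The structure of such a system can be explored numerically. The sketch below is not the paper's system (27) to (32); it is one plausible reading of Model I written under explicit assumptions: `u[k]` and `v[k]` are the fractions of working and failed servers with at least `k` customers, `lam`, `mu`, `alpha` and `beta` are hypothetical names for the per-server arrival, service, failure and repair rates, idle servers can fail, the arrival term uses the closed form for (A.1) with `d` independent samples, and levels above `K` are treated as empty. Integration is by the explicit Euler method.

```python
def mean_field_rhs(u, v, lam, mu, alpha, beta, d):
    """Time derivatives of the tail fractions u[k], v[k] for k = 0..K."""
    K = len(u) - 1
    du = [0.0] * (K + 1)
    dv = [0.0] * (K + 1)
    # Level 0 only tracks whether a server is up or down.
    du[0] = -alpha * u[0] + beta * v[0]
    dv[0] = alpha * u[0] - beta * v[0]
    for k in range(1, K + 1):
        s_prev, s_cur = u[k - 1] + v[k - 1], u[k] + v[k]
        gap = s_prev - s_cur              # servers with exactly k-1 customers
        join = s_prev**d - s_cur**d       # P(shortest sampled queue has k-1)
        share_w = (u[k - 1] - u[k]) / gap if gap > 0 else 0.0
        u_next = u[k + 1] if k < K else 0.0
        du[k] = (lam * join * share_w - mu * (u[k] - u_next)
                 - alpha * u[k] + beta * v[k])
        dv[k] = lam * join * (1.0 - share_w) + alpha * u[k] - beta * v[k]
    return du, dv

def integrate(lam, mu, alpha, beta, d, K=30, dt=0.02, t_end=200.0):
    """Explicit Euler integration starting from all servers up and empty."""
    u = [1.0] + [0.0] * K
    v = [0.0] * (K + 1)
    for _ in range(round(t_end / dt)):
        du, dv = mean_field_rhs(u, v, lam, mu, alpha, beta, d)
        u = [x + dt * y for x, y in zip(u, du)]
        v = [x + dt * y for x, y in zip(v, dv)]
    return u, v
```

Calling `integrate(0.5, 1.0, 0.1, 1.0, 2)` and inspecting `u[0]` and `mu * u[1]` gives a quick sanity check: in this reading the long-run fraction of working servers should approach `beta / (alpha + beta)`, and the departure rate `mu * u[1]` should approach the arrival rate `lam`.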
4.2 Model II ((A.1) and (R.2))
In this model, for (R.2) it follows from (13) and (14) that for
[TABLE]
Hence the dynamic routine selection scheme (R.2) shows that for , will take the place of in the systems of mean-field equations (27) to (32). Based on this, we obtain
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
and the initial conditions
[TABLE]
4.3 Model III ((A.2) and (R.1))
In this model, the only difference is that an arriving customer cannot join the server in repair with the shortest queue length when there exists at least one working server with the shortest queue length. Thus we obtain
[TABLE]
and
[TABLE]
It is easy to see from (12), (39) and Theorem 1 that
[TABLE]
Thus (A.2) indicates that needs to replace , in the systems of mean-field equations (27) to (32).
On the other hand, except for (40), we still have
[TABLE]
and for
[TABLE]
By an analysis similar to that for the systems of mean-field equations (27) to (32), we obtain
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
and the initial conditions
[TABLE]
4.4 Model IV ((A.2) and (R.2))
Since (A.2) needs replacing , and (R.2) needs taking the place of , we obtain
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
and the initial conditions
[TABLE]
Remark 2
From the four systems of mean-field equations, we find that two key rules must be followed in setting up the systems of mean-field equations:
(1) If (A.1) is changed to (A.2), then takes the place of , , and
(2) if (R.1) is changed to (R.2), then replaces .
5 The Fixed Point
In this section, we discuss the fixed points for the systems of mean-field equations, and show that the fixed points can be determined by the systems of nonlinear equations. Specifically, we indicate that the nonlinear structure makes the analytical solution of the fixed points too complicated and even impossible. Since such a fixed point plays a key role in performance analysis of the supermarket models with repairable servers, it is interesting to develop numerical computation in the study of complex supermarket models.
5.1 A double limit
We discuss a double limit of the expected fraction vector function as and .
The following lemma provides a sufficient condition under which each of the four interrelated supermarket models with identical repairable servers is stable.
Lemma 1
Each of the four supermarket models with identical and repairable servers and two choice numbers is stable if , where .
Proof: If , then each of the four supermarket models of identical repairable servers is equivalent to a system of independent M/M/1 queues with repairable servers. From Li et al. [34], it is easy to see that such a repairable M/M/1 queue is stable if . Using a coupling method, as given in Theorems 4 and 5 of Martin and Suhov [40], it is clear that for a fixed number , each of the four supermarket models with identical repairable servers is stable if . This completes the proof.
The following theorem provides a useful property of the double limit of the expected fraction vector function , which is key to establishing the systems of nonlinear equations satisfied by the fixed point.
Theorem 2
If , then for each of the four interrelated repairable supermarket models, there exists a unique double limit
[TABLE]
Proof: This proof is given in Appendix A.
In fact, Theorem 2 also gives
[TABLE]
which justifies the interchange of the limit of the expected fraction vector function as and . This is necessary in many practical applications when using the stationary probabilities to give the effective approximation for performance of the supermarket models.
Let , where and . The row vector is called a fixed point of the expected fraction vector function if . Based on Theorem 2, we denote by for and for .
It is well-known that if is the fixed point of the expected fraction vector function , then
[TABLE]
this gives
[TABLE]
To set up a system of nonlinear equations, we write
[TABLE]
and for
[TABLE]
[TABLE]
[TABLE]
and for
[TABLE]
It is easy to check from Theorem 1 that
[TABLE]
and for
[TABLE]
[TABLE]
[TABLE]
[TABLE]
and for
[TABLE]
5.2 Model I ((A.1) and (R.1))
Taking and in both sides of the mean-field equations (27) to (32), it is easy to see that the fixed point satisfies the following system of nonlinear equations
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
To solve the system of nonlinear equations (53) to (57), the following lemma determines the boundary values , and , which are a key in our computation of the fixed point later.
Lemma 2
If , then
[TABLE]
[TABLE]
and
[TABLE]
Proof: It follows from (53) and (57) that
[TABLE]
and
[TABLE]
It follows from (54) to (56) that
[TABLE]
since . This gives
[TABLE]
and
[TABLE]
This completes the proof.
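The kind of argument behind such boundary values can be illustrated with a short calculation. It is a plausible reading under assumptions stated here, not a restatement of Lemma 2: $u_k$ and $v_k$ denote the fixed-point fractions of working and failed servers with at least $k$ customers, $\lambda$, $\mu$, $\alpha$ and $\beta$ are assumed symbols for the per-server arrival, service, failure and repair rates, idle servers can fail, and the tails $u_k$, $v_k$ vanish as $k\to\infty$.

```latex
\begin{align*}
&\text{Level } 0: && \alpha u_0 = \beta v_0, \quad u_0 + v_0 = 1
  \;\Longrightarrow\; u_0 = \frac{\beta}{\alpha+\beta}, \quad v_0 = \frac{\alpha}{\alpha+\beta},\\
&\text{Summing levels } k \ge 1: && \lambda\,(u_0+v_0)^{d} - \mu u_1 = 0
  \;\Longrightarrow\; u_1 = \frac{\lambda}{\mu}.
\end{align*}
```

The first line is the stationary up/down balance of a single server with its own repairman. The second is a flow balance: the failure and repair terms cancel when the working and failed equations are added, the arrival terms telescope, and the service terms leave only $\mu u_1$. Since $u_1 \le u_0$, this reading requires $\lambda/\mu \le \beta/(\alpha+\beta)$.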
Let , and . Using (53) and (57), we take that , and . Let
[TABLE]
and the unique solution in to the nonlinear equation
[TABLE]
For , we set
[TABLE]
We assume that for , the pairs have been given iteratively, where is the unique solution in to the nonlinear equation . For , we write
[TABLE]
and is the unique solution in to the nonlinear equation . It is clear that and
The following theorem provides an expression for the fixed point by means of the system of nonlinear equations (53) to (57).
Theorem 3
If then the fixed point is given by
[TABLE]
and
[TABLE]
**Proof:** Lemma 2 shows that , and .
We assume that for and , where and Then for , it follows from Equation (54) that
[TABLE]
It follows from Equation (56) that
[TABLE]
Let
[TABLE]
Then
[TABLE]
[TABLE]
by means of (56), and
[TABLE]
by means of . Since is a continuous function for , there exists a unique positive solution in to the nonlinear equation . Hence,
By induction, this completes the proof.
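The induction step above locates a root of a continuous function inside a bounded interval. Numerically, such a root can be found by bisection. The following is a minimal sketch only: the function `f_demo` is a hypothetical stand-in (the actual nonlinear function of Theorem 3 is not reproduced here), and the names `bisect_root`, `lo`, `hi` and `tol` are illustrative assumptions. It relies only on continuity and a sign change at the two end points.

```python
from typing import Callable


def bisect_root(f: Callable[[float], float], lo: float, hi: float,
                tol: float = 1e-12, max_iter: int = 200) -> float:
    """Find a root of a continuous f on [lo, hi], assuming a sign change."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        raise ValueError("f must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        # Stop when the bracket is small enough or the root is hit exactly.
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0.0:
            hi = mid                  # root lies in the left half
        else:
            lo, f_lo = mid, f_mid     # root lies in the right half
    return 0.5 * (lo + hi)


def f_demo(x: float) -> float:
    # Hypothetical stand-in: continuous, increasing, f(0) < 0 < f(1).
    return x ** 3 + x - 1.0


root = bisect_root(f_demo, 0.0, 1.0)
```

In an iterative computation of the fixed point, the right end point `hi` would be replaced at each level by the root found at the previous level, so that the computed sequence stays ordered; this usage is an assumption about how the recursion would be implemented, not a statement of the authors' algorithm.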
Note that for the other three models, which have more complex nonlinear structures, we provide some discussion of the boundary conditions and .
5.3 Model II ((A.) and (R.))
Taking and on both sides of the mean-field equations (33) to (37), it is easy to see that the fixed point satisfies the following system of nonlinear equations
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
It follows from (65) and (66) that
[TABLE]
and from (67) and (68) that for
[TABLE]
which, together with (69), gives
[TABLE]
It follows from (65), (66) and (70) that
[TABLE]
From (70), (71) and (66), we find that is the minimal nonnegative solution to the following nonlinear equation
[TABLE]
Also, is given.
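A minimal nonnegative solution of a scalar nonlinear equation is often computed by monotone iteration from zero. The sketch below assumes, purely for illustration, that the equation can be rewritten as x = g(x) with g continuous and nondecreasing on the nonnegative half-line and g(0) >= 0; whether the equation above admits such a form is not verified here. The function `g_demo` and the name `minimal_fixed_point` are hypothetical.

```python
from typing import Callable


def minimal_fixed_point(g: Callable[[float], float],
                        tol: float = 1e-13,
                        max_iter: int = 100_000) -> float:
    """Iterate x <- g(x) from x = 0.

    If g is continuous and nondecreasing with g(0) >= 0, the iterates
    increase monotonically and, when bounded, converge to the minimal
    nonnegative solution of x = g(x).
    """
    x = 0.0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("monotone iteration did not converge")


def g_demo(x: float) -> float:
    # Hypothetical stand-in with two nonnegative fixed points, 1/3 and 1.
    return 0.25 + 0.75 * x * x


x_min = minimal_fixed_point(g_demo)
```

Starting from zero matters: the stand-in has a second fixed point at 1, and the iteration from zero selects the smaller one.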
5.4 Model III ((A.) and (R.))
Taking and on both sides of the mean-field equations (41) to (45), we obtain that the fixed point satisfies the following system of nonlinear equations
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
Now, we discuss the boundary conditions of the fixed point. It follows from (72) and (76) that
[TABLE]
and
[TABLE]
It follows from (73) to (75) that
[TABLE]
Let
[TABLE]
This gives
[TABLE]
It is easy to check that , and is increasing in . Therefore if , and if . At the same time, is decreasing in .
5.5 Model IV ((A.) and (R.))
Taking and on both sides of the mean-field equations (47) to (51), it is easy to see that the fixed point satisfies the following system of nonlinear equations
[TABLE]
[TABLE]
for
[TABLE]
and
[TABLE]
with the boundary condition
[TABLE]
By an analysis similar to that of the boundary conditions in Model III, we obtain
[TABLE]
and an analysis similar to that in Model II leads to
[TABLE]
and is the minimal nonnegative solution to the following nonlinear equation
[TABLE]
Also, we get that .
6 Performance Analysis and Numerical Observations
In this section, we first provide useful performance measures of the four interrelated supermarket models with repairable servers. Then we use some numerical examples to make observations on model improvement by comparing the performance measures numerically.
6.1 Performance measures
(a) The mean of the stationary queue length
Let be the stationary queue length of any server in each of the four supermarket models. Then
[TABLE]
(b) The variance of the stationary queue length
It is easy to check that
[TABLE]
since
[TABLE]
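The mean and the variance of a nonnegative integer-valued queue length can be computed from its tail probabilities through the standard identities E[Q] = sum over k >= 1 of P(Q >= k) and E[Q^2] = sum over k >= 1 of (2k - 1) P(Q >= k). The following minimal sketch assumes that the fixed point has already been reduced to a single truncated tail sequence, where `tail[k-1]` holds the stationary probability that a server has at least k customers; how that sequence is assembled from the working-state and repair-state components is not shown, and the names `queue_length_moments` and `tail` are illustrative. The geometric tail used in the demonstration is a stand-in, not output of any of the four models.

```python
from typing import Sequence, Tuple


def queue_length_moments(tail: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, variance) of Q from tail[k-1] = P(Q >= k), k = 1, 2, ...

    The sequence is assumed truncated where the remaining mass is negligible.
    """
    mean = sum(tail)  # E[Q] = sum_k P(Q >= k)
    second_moment = sum((2 * k - 1) * s
                        for k, s in enumerate(tail, start=1))  # E[Q^2]
    return mean, second_moment - mean * mean


# Stand-in tail: geometric, P(Q >= k) = rho**k (an illustrative assumption).
rho = 0.5
tail = [rho ** k for k in range(1, 200)]
mean_q, var_q = queue_length_moments(tail)
```

For the geometric stand-in, the closed forms are rho / (1 - rho) for the mean and rho / (1 - rho)^2 for the variance, which gives a quick check of the truncation level.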
(c) The steady-state availability and failure frequency
Let and be the steady-state availability and failure frequency of any repairable server, respectively. Then
[TABLE]
and
[TABLE]
(d) The steady-state mean-field flow balancing
Since the mean-field theory plays an important role in the study of supermarket models, a flow balancing in the supermarket models is called a mean-field flow balancing. In every supermarket model, the steady-state mean-field throughput is given by
[TABLE]
Thus the steady-state mean-field input-output difference is defined as
[TABLE]
Clearly, this supermarket model can have a steady-state mean-field flow balancing if .
6.2 Numerical observations
Now, we use some numerical examples to show how the major performance measures depend on some crucial parameters of the systems. These numerical examples are organized in three groups for different purposes: (1) examining the mean and the variance ; (2) observing the availability and the failure frequency ; and (3) discussing the mean-field flow balancing . Note that in the numerical examples, (A.) and (A.) represent the routing of customers, and (R.) and (R.) represent the organization of repair resources.
The following seven numerical examples are based on a set of system parameters of , , and . It is easy to check that .
**Example 1:** Show in Models III and IV for a comparison of deployment of repair resource.
In this example with (A.), we consider Models III and IV and observe how the choice numbers and affect , the stationary queue length. Figure 3 shows how changes with when the arrival rate . It is observed that while increases with , it decreases with either or . Also, is more effective than in terms of reducing .
**Example 2:** Focus on to compare (A.) with (A.)
In this example, we demonstrate how (A.) improves the performance under (A.) in terms of . Figure 4 shows the as a function of the arrival rate with . It is observed that (A.) can effectively reduce compared with (A.). This implies that using more system information can improve the system performance.
**Example 3:** Show in Models III and IV for comparing repair resource deployment.
In this example with (A.), we focus on how and affect , the variance of stationary queue length. Figure 5 illustrates how changes with the arrival rate with . We observe that the mean queue length decreases with either or . Also, is more effective than in terms of reducing .
**Example 4:** Observe under (A.) or (A.)
In this example, Figure 6 shows how changes with the arrival rate with . It shows that (A.) can effectively reduce under (A.).
**Example 5:** Examine the steady-state availability in Models III and IV
In this example with (A.), Figure 7 shows that while the steady-state availability decreases with , it increases with either or . Thus, and can help increase the steady-state availability.
**Example 6:** Investigate the steady-state failure frequency in Models III and IV
In this example with (A.), Figure 8 shows that increases with both and or .
Finally, we provide a numerical example to show the steady-state flow balancing in the study of supermarket models.
**Example 7:** Observe the steady-state mean-field flow balancing
In this example with (A.), we show how the steady-state mean-field input-output difference depends on the arrival rate with and .
Figure 9 indicates that if and , the steady-state mean-field input-output difference , and it increases with . However, , which implies that the repairable supermarket model has the steady-state mean-field flow balancing for .
From the numerical analysis above, we may conclude that both the system information available to an arriving customer (i.e., whether each server is working or under repair, and its queue length) and the deployment of the repair resource can effectively improve the performance of the supermarket models.
7 Concluding Remarks
In this paper, we apply mean-field theory to study the effects of a double dynamic routine selection scheme (for the arrival dispatched schemes and for the groups of repairmen) on the performance of the four interrelated supermarket models with repairable servers. We first provide a probability method for setting up the infinite-dimensional systems of mean-field equations. Then we prove the asymptotic independence of the supermarket models with repairable servers. Based on this, we discuss the fixed points, which are computed by means of the systems of nonlinear equations. Finally, we provide useful performance measures of the supermarket models, and use some numerical examples to make observations on model improvement through using system information and deploying repair resources. Our results reveal how utilizing system information for customers' joining decisions, as well as reorganizing repair resources, affects the performance of the supermarket models. Along this line, there are a number of interesting directions for future research, for example:
- analyzing non-Poisson inputs, such as Markovian arrival processes (MAPs) and renewal processes;
- studying non-exponential service time distributions, for example, general distributions, matrix-exponential distributions and heavy-tailed distributions;
- discussing the bulk arrival processes and the bulk service processes; and
- developing effective algorithms for computing the fixed points in the study of complex supermarket models.
Acknowledgements
The first two authors were supported by the National Natural Science Foundation of China under grant No. 71471160, No. 71671158 and No. 71471114, and the Fostering Plan of Innovation Team and Leading Talent in Hebei Universities under grant No. LJRC027.
Appendix A: The Proof of Theorem 2
In this appendix, for the four interrelated supermarket models with repairable servers, we provide a simple outline of the proof of Theorem 2. To that end, the key is to use the operator semigroup to provide a mean-field limit for the sequence of Markov processes, which asymptotically approaches a single trajectory identified by the unique and global solution to an infinite-dimensional system of mean-field equations. Readers may refer to Li et al. [30] for more details on the proof of such a mean-field limit.
For the vector where and , we write
[TABLE]
and
[TABLE]
At the same time, for the vector where and , we set
[TABLE]
and
[TABLE]
Obviously, and .
In the infinite-dimensional vector space , we take a metric
[TABLE]
for . Note that under the metric the infinite-dimensional vector space is complete, separable and compact.
For simplicity of description, here we only study the sequence of Markov processes in the first supermarket model with repairable servers, while the other three models can be analyzed similarly without any difficulty.
For the first supermarket model with repairable servers, the Markov process is described as
[TABLE]
where acting on functions is the generating operator of the Markov process , and
[TABLE]
for , and
[TABLE]
[TABLE]
and
[TABLE]
where stands for a row vector whose th entry is one and all other entries are zero, and
[TABLE]
for
[TABLE]
Therefore, for and the function we obtain
[TABLE]
The operator semigroup of the Markov process is defined as , where if , then for and
[TABLE]
Since is the generating operator of the operator semigroup , it is easy to see that for .
To analyze the limiting behavior of the sequence , of the Markov processes, two formal limits for the sequence of the generating operators and for the sequence of the semigroups are expressed as and for , respectively. It follows from (86) that as
[TABLE]
The following theorem applies the operator semigroup to provide the mean-field limiting process for the sequence of Markov processes, and indicates that this sequence of Markov processes asymptotically approaches a single trajectory identified by the unique and global solution to the system of mean-field equations. This proof is omitted here. Readers may refer to Li et al. [30] for more details.
Theorem 4
Let be continuous functions . Then for any
[TABLE]
The convergence is uniform in for any
Finally, we provide some interpretation of Theorem 4. If in probability, then Theorem 4 shows that is concentrated on the trajectory . This indicates the functional strong law of large numbers for the time evolution of the fraction of each state of this supermarket model. Thus the sequence of Markov processes converges weakly to the expected fraction vector as , that is, for any
[TABLE]
Note that the limits are necessary for using the stationary probabilities of the limiting process to approximate the performance of this supermarket model effectively.
The Proof of Theorem 2
In the remainder of this appendix, we discuss some useful limits of the fraction vector as and , with the aim of proving Theorem 2.
The following theorem gives the limit of the vector as , that is,
[TABLE]
This proof is omitted here. Readers may refer to Li et al. [30] for more details.
Theorem 5
If , then for any
[TABLE]
Furthermore, there exists a unique probability measure on , which is invariant under the map , that is, for any continuous function and
[TABLE]
Also, is the probability measure concentrated at the fixed point .
The following theorem indicates the weak convergence of the sequence of stationary probability distributions for the sequence of Markov processes to the probability measure concentrated at the fixed point . This proof is omitted here. Readers may refer to Li et al. [30] for more details.
Theorem 6
(1) If , then for a fixed number , the Markov process is positive recurrent, and has a unique invariant distribution .
(2) weakly converges to , that is, for any continuous function
[TABLE]
Based on Theorems 5 and 6, we obtain a useful relation as follows
[TABLE]
Therefore, we have
[TABLE]
Clearly, the above analysis completes the proof of Theorem 2.
- [1] I.J.B.F. Adan, O.J. Boxma and J.A.C. Resing (2001). Queueing models with multiple waiting lines. Queueing Systems 37, 65–98.
- [2] A. Aissani and J.R. Artalejo (1998). On the single server retrial queue subject to breakdowns. Queueing Systems 30, 309–321.
- [3] S. Andradottir, H. Ayhan and D.G. Down (2007). Compensating for failures with flexible servers. Operations Research 55, 753–768.
- [4] F. Baccelli, F.I. Karpelevich, M.Y. Kelbert, A.A. Puhalskii, A.N. Rybko and Y.M. Suhov (1992). A mean-field limit for a class of queueing networks. Journal of Statistical Physics 66, 803–825.
- [5] M. Bramson, Y. Lu and B. Prabhakar (2010). Randomized load balancing with general service time distributions. In: Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 275–286.
- [6] M. Benaim and J.Y. Le Boudec (2008). A class of mean-field interaction models for computer and communication systems. Performance Evaluation 65, 823–838.
- [7] K.A. Borovkov (1998). Propagation of chaos for queueing networks. Theory of Probability & Its Applications 42 (No. 3), 385–394.
- [8] C. Bordenave, D. Mc Donald and A. Proutiere (2010). A particle system in interaction with a rapidly varying environment: mean-field limits and applications. Networks and Heterogeneous Media 5, 31–62.
