Dynamic scheduling in a partially fluid, partially lossy queueing system

Kiran Chaudhary; Veeraruna Kavitha; Jayakrishnan Nair

arXiv:1904.06480·cs.PF·December 24, 2021


TL;DR

This paper analyzes a single server queue with two job classes under dynamic scheduling policies in a fluid limit, revealing a pseudo-conservation law and characterizing Pareto-optimal performance trade-offs.

Contribution

It introduces a fluid limit analysis for a partially fluid, partially lossy queueing system, establishing a pseudo-conservation law and identifying Pareto-complete scheduling policies.

Findings

1. Performance of each class is characterized under dynamic policies.
2. A pseudo-conservation law links class performances to eager class blocking probabilities.
3. The Pareto frontier of performance vectors is characterized, with a class of Pareto-complete policies identified.



Table 1: Summary of Notations

Basic Notations
$λ_{ϵ} :$ Poisson arrival rate of eager customers.
$λ_{τ} :$ Poisson arrival rate of tolerant customers.
$B_{ϵ} :$ generic $ϵ$ job size.
$B_{ϵ}^{μ_{ϵ}} \overset{d}{=} \frac{B_{ϵ}^{1}}{μ_{ϵ}} :$ $ϵ$ job-size at scale $μ_{ϵ}$ .
$ρ_{ϵ} :$ load factor of the eager class.
$A_{τ} :$ inter-arrival time of tolerant customers.
$B_{τ} :$ Exponential $τ$ job-size.
$ν_{i} :$ tolerant service rate when sub-policy $i$ is used for the eager class.
Embedded chain
$p_{i}^{μ_{ϵ}}$ : probability that tolerant arrival is before departure, in $μ_{ϵ}$ -system at $τ$ occupancy $i$ .
$q_{i}^{μ_{ϵ}} = 1 - p_{i}^{μ_{ϵ}}$ : probability that $τ$ -departure is before arrival.
$p_{i}^{\infty}$ : probability that $τ$ -arrival is before departure, under limit system at $τ$ occupancy $i$ .
$q_{i}^{\infty} = 1 - p_{i}^{\infty}$ : probability that $τ$ -departure is before arrival, in limit system.
${\tilde{π}}_{i}^{μ_{ϵ}} :$ steady state probability that embedded $τ$ -chain is in state $i$ , in $μ_{ϵ}$ -system.
${\tilde{π}}_{i} :$ steady state probability that embedded $τ$ -chain is in state $i$ , in limit system.
Continuous time process
$π_{i}^{μ_{ϵ}} :$ steady state probability that $τ$ -process is in state $i$ , in $μ_{ϵ}$ system.
$π_{i} :$ steady state probability that $τ$ -process is in state $i$ , in limit system.
$Ω_{i}^{μ_{ϵ}} (\cdot) :$ unused service process by $ϵ$ class.
$ρ_{i} = ρ_{i}^{ϕ} := \frac{λ_{τ}}{μ_{τ} ν_{i}} :$ tolerant load when sub-policy $i$ (of policy $ϕ$ ) is used for the eager class.
$Υ_{i}^{μ_{ϵ}} :$ time required to finish $B_{τ}$ amount of work using $Ω_{i}^{μ_{ϵ}} (\cdot)$ .
$P_{B_{i}} :$ standalone ( $τ$ -static) $ϵ$ blocking probability when sub-policy $i$ is used for the eager class.
$E^{μ_{ϵ}} [N] :$ stationary expected number of $τ$ -customers in the $μ_{ϵ}$ system.
$P_{B}^{μ_{ϵ}} :$ overall $ϵ$ blocking probability at $μ_{ϵ}$ system.
$P_{B}^{\infty} :$ overall $ϵ$ blocking probability in limit system.
$ℬ_{i}^{μ_{ϵ}} :$ $ϵ$ -busy cycle when sub-policy $i$ is used.
$𝒮_{i}^{μ_{ϵ}} :$ total service available for $τ$ -class during $ℬ_{i}^{μ_{ϵ}}$ .
$ℬ_{i, k}^{μ_{ϵ}} :$ $k$ -th $ϵ$ -busy cycle when sub-policy $i$ is used.
$𝒮_{i, k}^{μ_{ϵ}} :$ total service available for $τ$ -class during $ℬ_{i, k}^{μ_{ϵ}}$ .

Equations

$$\lim_{t\to\infty}\frac{\Omega^{\mu_{\epsilon}}_{j}(t)}{t}=\nu_{j}:=1-\rho_{\epsilon}(1-{P_{B}}_{j})\quad\mbox{almost surely.}$$

$$\sup_{t\leq W}\left|\Omega^{\mu_{\epsilon}}_{j}(t)-\nu_{j}t\right|\to 0\ \mbox{a.s. for any finite $W,$ and}$$

$$\Upsilon_{j}^{\mu_{\epsilon}}\ \stackrel{a.s.}{\to}\ \frac{B_{\tau}}{\nu_{j}}\mbox{, both for any initial $\epsilon$-state,}$$

$$\pi_{i}$$

$$\nu_{j}:=(1-\rho_{\epsilon}(1-{P_{B}}_{j})),\mbox{ for }j\geq 0.$$

$$\pi_{i}^{\mu_{\epsilon}}\stackrel{\mu_{\epsilon}\to\infty}{\longrightarrow}\pi_{i}\mbox{ for all }i\geq 0,$$

$$E^{\mu_{\epsilon}}[N]\stackrel{\mu_{\epsilon}\to\infty}{\longrightarrow}\sum_{i=1}^{\infty}i\,\pi_{i},$$

$$P_{B}^{\mu_{\epsilon}}:=\lim_{t\to\infty}\frac{N_{B}^{\mu_{\epsilon}}(t)}{N_{A}^{\mu_{\epsilon}}(t)}.$$

$$P_{B}^{\mu_{\epsilon}}\stackrel{\mu_{\epsilon}\to\infty}{\longrightarrow}\sum_{j=1}^{\infty}{P_{B}}_{j}\,\pi_{j}=:P_{B}^{\infty},$$

$$q_{j}^{\mu_{\epsilon}}=1-p_{j}^{\mu_{\epsilon}}:=P\left(A_{\tau}>\Upsilon_{j}^{\mu_{\epsilon}}(B_{\tau})\right).$$

$$x_{j}^{\mu_{\epsilon}}-x_{j}<\epsilon\mbox{ for all }\mu_{\epsilon}>\bar{\mu}\mbox{ and for all }j\geq 0.$$

$$p_{j}^{\mu_{\epsilon}}\stackrel{\mu_{\epsilon}\to\infty}{\to}p_{j}^{\infty}:=\frac{\lambda_{\tau}}{\lambda_{\tau}+\nu_{j}\mu_{\tau}}\mbox{ and }q_{j}^{\mu_{\epsilon}}:=1-p_{j}^{\mu_{\epsilon}}\stackrel{\mu_{\epsilon}\to\infty}{\to}q_{j}^{\infty}:=\frac{\nu_{j}\mu_{\tau}}{\lambda_{\tau}+\nu_{j}\mu_{\tau}}.$$

$$\bigg|q_{j}^{\mu_{\epsilon}}-\frac{\nu_{j}\mu_{\tau}}{\lambda_{\tau}+\nu_{j}\mu_{\tau}}\bigg|<\epsilon.$$

$$\tilde{\pi}_{i}^{\mu_{\epsilon}}=\frac{p_{0}^{\mu_{\epsilon}}p_{1}^{\mu_{\epsilon}}\dots p_{i-1}^{\mu_{\epsilon}}}{q_{1}^{\mu_{\epsilon}}q_{2}^{\mu_{\epsilon}}\dots q_{i}^{\mu_{\epsilon}}}\tilde{\pi}_{0}^{\mu_{\epsilon}},\mbox{ for }i\geq 1,$$

$$\tilde{\pi}_{0}^{\mu_{\epsilon}}=\frac{1}{1+\sum_{k=1}^{\infty}\frac{p_{0}^{\mu_{\epsilon}}p_{1}^{\mu_{\epsilon}}\dots p_{k-1}^{\mu_{\epsilon}}}{q_{1}^{\mu_{\epsilon}}q_{2}^{\mu_{\epsilon}}\dots q_{k}^{\mu_{\epsilon}}}}.$$

$$\tilde{\pi}_{i}^{\mu_{\epsilon}}\stackrel{\mu_{\epsilon}\to\infty}{\longrightarrow}\tilde{\pi}_{i}\mbox{ for all }i\geq 0$$

$$A^{\infty}=\left\{(P_{B,\phi}^{\infty},E_{\phi}^{\infty}[N]):\phi\mbox{ is an SM policy}\right\}.$$

$${P_{B}}_{i}$$

$$E_{\phi}^{\mu_{\epsilon}}[N]\stackrel{\mu_{\epsilon}\to\infty}{\longrightarrow}E_{\phi}^{\infty}[N]$$

$$P_{B,\phi}^{\mu_{\epsilon}}\stackrel{\mu_{\epsilon}\to\infty}{\longrightarrow}P_{B,\phi}^{\infty}$$

$$h_{j}^{\phi}$$

$$\min_{\phi}P_{B,\phi}\mbox{ such that }E_{\phi}[N]\leq C\mbox{, i.e., equivalently}$$

$$\min_{\phi}\sum_{i=0}^{\infty}d_{i}\frac{h_{i}^{\phi}}{1+\sum_{l\geq 1}h_{l}^{\phi}}\mbox{ such that }\sum_{i=0}^{\infty}i\,\frac{h_{i}^{\phi}}{1+\sum_{l\geq 1}h_{l}^{\phi}}\leq C.$$

$$\bar{\rho}:=\frac{\lambda_{\tau}}{\mu_{\tau}(1-\rho_{\epsilon}(1-\underline{d}))},\mbox{ and }\underline{\rho}=\frac{\lambda_{\tau}}{\mu_{\tau}(1-\rho_{\epsilon}(1-\bar{d}))}.$$

$$d_{i}^{*}=1_{\{i<L^{*}\}}\underline{d}+1_{\{i=L^{*}\}}d^{*}+1_{\{i>L^{*}\}}\bar{d}$$

$$L^{*}=$$

$$d^{*}=$$

$$E_{(L,d)}[N]$$

$$P_{B,(L,d)}$$

$$A_{\epsilon,k}^{\mu_{\epsilon}}=\frac{A_{\epsilon,k}^{1}}{\mu_{\epsilon}}\mbox{ and }B_{\epsilon,k}^{\mu_{\epsilon}}=\frac{B_{\epsilon,k}^{1}}{\mu_{\epsilon}}.$$

$$\mathcal{B}_{j,k}^{\mu_{\epsilon}}=\mathcal{B}_{j,k}^{1}/\mu_{\epsilon},$$

$$\mathcal{S}_{j,k}^{\mu_{\epsilon}}=\mathcal{S}_{j,k}^{1}/\mu_{\epsilon}.$$

$$N_{A}^{\mu_{\epsilon}}(\mathcal{B}_{j,1}^{\mu_{\epsilon}})=N_{A}^{1}(\mathcal{B}_{j,1}^{1})\mbox{ and }N_{B_{j}}^{\mu_{\epsilon}}(\mathcal{B}_{j,1}^{\mu_{\epsilon}})=N_{B_{j}}^{1}(\mathcal{B}_{j,1}^{1}),$$

$$E[O^{2}]\leq e^{\theta}E[X^{2}]\left(e^{\theta}-1+\frac{e^{2}-1}{2}\right),$$


Taxonomy

Topics: Advanced Queuing Theory Analysis · Advanced Wireless Network Optimization · Scheduling and Optimization Algorithms

Full text

Dynamic scheduling in a partially fluid, partially lossy queueing system

A preliminary version of this work was presented at the WiOpt conference, 2019 [1].

Kiran Chaudhary, Veeraruna Kavitha, Jayakrishnan Nair

Kiran Chaudhary and Veeraruna Kavitha are with the Industrial Engineering and Operations Research (IEOR) Department at IIT Bombay. Jayakrishnan Nair is with the Electrical Engineering Department at IIT Bombay; he acknowledges support from DST and CEFIPRA.

Abstract

We consider a single server queueing system with two classes of jobs: eager jobs with small sizes that require service to begin almost immediately upon arrival, and tolerant jobs with larger sizes that can wait for service. While blocking probability is the relevant performance metric for the eager class, the tolerant class seeks to minimize its mean sojourn time. In this paper, we analyse the performance of each class under dynamic scheduling policies, where the scheduling of both classes depends on the instantaneous state of the system. This analysis is carried out under a certain fluid limit, where the arrival rate and service rate of the eager class are scaled to infinity, holding the offered load constant. Our performance characterizations reveal a (dynamic) pseudo-conservation law that ties the performance of both the classes to the standalone blocking probabilities associated with the scheduling policies for the eager class. Further, the performance is robust to other specifics of the scheduling policies. We also characterize the Pareto frontier of the achievable region of performance vectors under the same fluid limit, and identify a (two-parameter) class of Pareto-complete scheduling policies.

1 Introduction

In this paper, we analyse a single server queueing system with two heterogeneous customer classes. One class of customers is eager—they require service to commence (almost) immediately upon arrival. The performance of the eager class is captured by the blocking probability, i.e., the long run fraction of eager customers that are blocked. The second class of customers is tolerant—these customers can tolerate delays and may be queued. The performance of this class is captured via the mean response time of the tolerant customers.

Service systems of this kind are motivated by modern cellular networks, which handle voice calls (which must be either admitted or dropped upon arrival) as well as data traffic (which can be queued). Another motivation comes from supermarkets, where it is common practice to provide prioritized service to customers with fewer items via dedicated express counters; these customers have limited patience, and may balk/renege if service is not provided almost immediately. However, such part-loss, part-queueing multi-class service systems are analytically intractable even under the simplest scheduling disciplines (see [2] and the references therein). In this paper, we derive tractable approximations of the performance experienced by each class using a certain fluid limit, referred to as the short-frequent-jobs (SFJ) limit.

The SFJ limit corresponds to scaling the arrival rate as well as the service rate of the eager class to infinity, such that the offered load is held constant. This gives rise to a time-scale separation between the two classes, the eager class operating at a faster time-scale. Under the SFJ limit, we obtain a closed form characterization of the performance of both classes under a broad class of dynamic policies that allow the admission control of the eager class and the scheduling of both classes to depend on the current state of the system. (From here on, we follow the convention that admission control, if used, is included in the eager scheduling policy.) Interestingly, a dynamic pseudo-conservation law follows from this characterization (in the SFJ limit): the performance of both classes depends only on the standalone blocking probabilities (those that result when a single eager scheduling policy is used, oblivious to the tolerant state) associated with the eager scheduling schemes employed for each occupancy level of the tolerant queue. In particular, the performance depends neither on the specific scheduling policies that generate those blocking probabilities, nor on the details of the tolerant scheduler (subject to work conservation and serial, non-anticipative processing). Conservation laws typically allow one to compute the performance of a complex system in terms of the performance of simpler ones. In our case, once the relevant standalone blocking probabilities are known (these can usually be computed easily, as they result from the analysis of a single-class loss system), one can compute the performance of both classes.

We further analyse the Pareto frontier of performance vectors achievable under the class of dynamic schedulers, which defines the set of efficient operating points for the system. Remarkably, we are able to identify a Pareto-complete family of scheduling policies (we call a family of schedulers Pareto-complete if it spans the entire Pareto frontier over its parameter space). This family, parametrized by $(L,d),$ where $L\in\mathbb{N}$ and $d\in(0,1),$ blocks eager customers with the minimum blocking probability when the tolerant occupancy is less than $L,$ with probability $d$ when the occupancy equals $L,$ and with the maximum blocking probability when it exceeds $L.$
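The threshold structure of the $(L,d)$ family can be sketched in a few lines of Python. This is only an illustration: the function name and the arguments `d_min` and `d_max` (standing for the minimum and maximum blocking probabilities available to the scheduler) are assumptions introduced here, not notation from the paper.

```python
def blocking_level(i: int, L: int, d: float, d_min: float, d_max: float) -> float:
    """Eager blocking probability prescribed at tolerant occupancy i.

    Hypothetical helper: d_min / d_max denote the smallest / largest
    blocking probabilities the scheduler can impose (assumed inputs).
    """
    if i < L:
        return d_min   # tolerant queue short: block eager jobs as little as possible
    if i == L:
        return d       # boundary level: intermediate blocking probability d
    return d_max       # tolerant queue long: block eager jobs as much as possible


# Blocking levels for occupancies 0..4 under (L, d) = (2, 0.5)
levels = [blocking_level(i, L=2, d=0.5, d_min=0.1, d_max=0.9) for i in range(5)]
```

Under these assumed inputs, `levels` switches from `d_min` to `d_max` around occupancy `L`, which is the two-parameter shape described above.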

Finally, via numerical experiments, we show that our performance characterizations under the SFJ limit are extremely accurate in the pre-limit (i.e., for moderate values of arrival and service rates of the eager class). This shows that our approximations, which are provably accurate under the SFJ fluid limit, are also applicable in practice.

The remainder of this paper is organised as follows. After a review of the related literature below, we describe our system model and state some preliminary results in Section 2. We also give some examples of dynamic schedulers in Section 2. Under the SFJ limit, we characterize the performance of the tolerant class and the eager class in Section 3. We formally define the dynamic achievable region in Section 4, and demonstrate the Pareto-complete family of dynamic schedulers in Section 5. We conclude the paper in Section 6. A summary of our notation can be found in Table 1 in Appendix A.

Related Literature

The present paper is a follow-up of our prior work [3, 4], which analyses the same heterogeneous queueing system under the SFJ limit for a class of (partially) static scheduling policies. Under this class of policies, the scheduling of the eager class is oblivious to the state of the tolerant queue, with the tolerant queue simply utilizing the service capacity left unused by the eager class. Clearly, this class of schedulers is restrictive. In the present paper, we consider general dynamic policies, where eager scheduling depends on the occupancy of the tolerant queue. This generalization, which requires a non-trivial analysis, results in a substantial expansion of the achievable region of feasible performance vectors (as is shown in Sections 4 & 5). Moreover, the generalization to dynamic policies necessitates the identification of a Pareto-complete family of schedulers (which is the goal of Section 5); within the restricted class of static schedulers analysed in [3, 4], it turns out that all policies are efficient.

Aside from [3, 4], the only prior work we are aware of that analyses a part-queueing, part-loss service system is [5]. In that paper, the authors obtain the performance metrics for all classes in closed form, assuming exponential inter-arrival and service times for all classes, under a certain static priority scheduling discipline. However, we note that [5] does not attempt to address the tradeoff between the performance of the two classes, which is central to the present work.

From an application standpoint, this paper is also related to the considerable literature on sharing the capacity of a cellular system between voice and data traffic; for example, see [6, 7, 8]. In this line of work, both voice and data classes are treated as lossy, the focus being on characterizing the blocking probability of each class under different (static and dynamic) admission rules. However, to the best of our knowledge, these papers do not analyse the achievable region of performance vectors, or characterize its Pareto frontier.

We also note that there is a well-developed literature on multiclass queueing systems with multiple tolerant classes on a single server (e.g., conservation laws, pioneered by [9]). The achievable region is well understood in such a ‘homogeneous’ multi-class setting [10, 11]. Interestingly, in this case, it is known that the static and dynamic achievable regions coincide (see [3]), in contrast with the ‘heterogeneous’ multi-class setting considered here, where we see that the static achievable region is a strict subset of the dynamic achievable region. Moreover, the achievable region in the homogeneous setting is its own Pareto frontier (i.e., all points of the achievable region are efficient) under work conserving policies, also in contrast with the heterogeneous setting considered here.

This paper is an extended version of [1], which analyses a restricted class of dynamic scheduling policies, wherein the number of distinct eager sub-policies is assumed to be finite. In this paper, we extend the analysis to general dynamic policies, and further provide complete proofs of all results.

2 System Description

We consider a single server queueing system with two job (a.k.a. customer) classes: eager customers (also denoted as $\epsilon$-customers) demand service immediately upon arrival, whereas tolerant customers (also denoted as $\tau$-customers) can wait in a queue (of infinite capacity) to be served. (We discuss the generalization where eager customers have limited patience and abandon/renege after a short wait time in Section 6; see also [4].) The $\tau$-customers can be interrupted either partially (i.e., their service rate may be reduced) or completely by $\epsilon$-customers, but not by other $\tau$-customers. Without loss of generality, we assume a unit server speed. We assume that $\epsilon$-customers (respectively, $\tau$-customers) arrive according to a Poisson process with rate $\lambda_{\epsilon}$ (respectively, $\lambda_{\tau}$). The sequence of job sizes (a.k.a. service requirements) for both classes is i.i.d., with $B_{\epsilon}$ denoting a generic $\epsilon$ job size, and $B_{\tau}$ denoting a generic $\tau$ job size. Throughout, we assume that $B_{\tau}$ is exponentially distributed with mean $1/\mu_{\tau},$ and that $E\left[B_{\epsilon}^{2}\right]<\infty.$ Let $\mu_{\epsilon}:=1/E\left[B_{\epsilon}\right].$

2.1 Dynamic schedulers

We consider dynamic scheduling, wherein the scheduling policy for each class depends upon the state (occupancy) of both the classes. An eager scheduling (sub)policy (a set of rules that assign fractions of server capacity to various eager customers, which may also include the admission control rules) is chosen depending upon the tolerant state. The overall dynamic scheduler is characterized by a sequence of such eager sub-policies, one corresponding to each value of tolerant occupancy—the tolerant queue in turn utilizes the service capacity left unused by the eager class in a work-conserving manner. Our dynamic schedulers can thus be viewed to be of nested type: a top-level policy chooses the sub-policy used for scheduling the ${\epsilon}$ -class based on the occupancy (state) of the ${\tau}$ -class. The sub-policy in turn determines, based on the number of eager customers in the system, the admission rule for subsequent arrivals, and the service rate to allocate to each existing eager job. It is important to note that these nested schedulers are not restrictive; rather this is simply a convenient representation of dynamic schedulers; details and examples follow. As a result, the service processes of the two classes are interdependent (unlike in the case of static scheduling as considered in [4, 3]).

Specifically, let $(X_{\tau}(t),X_{\epsilon}(t))$ represent the number of tolerant and eager customers in the system at time $t$. We consider stationary Markov scheduling policies that determine the service plan for both classes (including, possibly, admission control of eager customers), depending on the tuple $(X_{\tau}(t),X_{\epsilon}(t)).$ Such Markov policies are known to be sufficient for a wide range of sequential decision problems (see [16]); in the context of our model, they are sufficient if one additionally assumes that eager customers have exponentially distributed service requirements. As an example, consider the following policy parameterized by $(L,N)\in\mathbb{N}^{2}$: a) when $(X_{\tau}(t),X_{\epsilon}(t))=(j,c)$ with $c\leq N$ and $j\leq L$, each ${\epsilon}$-customer is served at rate (speed) $1/N$, while the longest-waiting ${\tau}$-customer is served at rate $(N-c)/N$; and b) if $j>L$, the longest-waiting ${\tau}$-customer is served with the full service capacity (i.e., at rate 1), and no further ${\epsilon}$-customers are admitted. It is convenient to view such policies as being nested; for the above example, when $X_{\tau}(t)\leq L$ the eager class is served according to one sub-policy, while it is served according to a second sub-policy when the tolerant occupancy is more than $L$. Under the former sub-policy, the eager class is served as an $M/G/N/N$ queue (with each ‘server’ taken to have speed $1/N$), while under the latter sub-policy, no eager customers are admitted. There is of course the issue of how to deal with the existing eager customers in the system when there is an arrival/departure in the ${\tau}$ queue, leading to a change in the eager sub-policy. In the above example, this corresponds to the handling of any eager customers that remain when the ${\tau}$ occupancy increases from $L$ to $L+1.$ This issue is addressed later in this section (see Assumption A.1 and the discussion in Section 2.3).
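To make the nested view concrete, the $(L,N)$ example can be written as a function from the state $(j,c)$ to a service plan. The sketch below is illustrative only: the names `ServicePlan` and `ln_policy` are hypothetical, admission when $c<N$ is inferred from the $M/G/N/N$ description, and the eager rate in case b) is set to zero on the assumption that eager jobs have been flushed when the tolerant occupancy changed (Assumption A.1).

```python
from typing import NamedTuple


class ServicePlan(NamedTuple):
    eager_rate_each: float   # service rate given to each eager job in service
    tolerant_rate: float     # service rate given to the tolerant job in service
    admit_eager: bool        # whether a new eager arrival would be admitted


def ln_policy(j: int, c: int, L: int, N: int) -> ServicePlan:
    """Service plan in state (j, c): j tolerant jobs, c eager jobs."""
    if j > L:
        # Sub-policy b): tolerant job gets the full capacity, eager arrivals blocked.
        # Assumption: no eager jobs remain here (flushed under A.1), so their rate is 0.
        return ServicePlan(0.0, 1.0, False)
    if not 0 <= c <= N:
        raise ValueError("eager occupancy must lie in 0..N under sub-policy a)")
    # Sub-policy a): each eager job runs at speed 1/N (an M/G/N/N loss system);
    # the tolerant job in service, if any, takes the leftover capacity (N - c)/N.
    eager_rate = 1.0 / N if c > 0 else 0.0
    tolerant_rate = (N - c) / N if j > 0 else 0.0
    return ServicePlan(eager_rate, tolerant_rate, c < N)


plan = ln_policy(j=1, c=2, L=3, N=4)
```

Here `plan` describes a state in which two of the four eager 'servers' are busy, so the tolerant job in service receives half of the capacity.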

Note that the eager sub-policy switches whenever there is an arrival/departure in the tolerant queue, though the further details of each sub-policy depend only on the eager occupancy. We typically refer to the ${\epsilon}$ -sub-policy implemented when there are $j$ tolerant jobs in the system as sub-policy $j.$

${\epsilon}$ -schedulers: Note that while the occupancy of the tolerant queue dictates the selection of the eager sub-policy, the sub-policies themselves are oblivious to the state of the tolerant queue. We make the following additional assumptions. Some examples of schedulers that satisfy these are presented in Section 2.3:

A.1 To simplify the transition from one ${\epsilon}$-sub-policy to the next, we assume that all ${\epsilon}$-customers are dropped when there is an arrival/departure in the ${\tau}$-queue.

A.2 The scheduling of each sub-policy depends only on the number of ${\epsilon}$-jobs present in the system.

A.3 On arrival, an eager job is either admitted into service or gets dropped/blocked. If admitted, an eager job receives service at a minimum rate/speed of $c_{\min}>0$ until its departure under any sub-policy.

Assumption A.1 is a technical condition required in our proofs. In Section 5, we show that this assumption has negligible influence on the performance of either class under our partial fluid limit. (Under the SFJ limit, this ‘flushing’ of the ${\epsilon}$-system is only performed at a bounded rate, since arrivals/departures in the ${\tau}$-queue occur at a bounded rate, while the arrival rate of the ${\epsilon}$-system scales to infinity. Thus, we expect that this assumption will not impact the blocking probability of the eager class; this is also evident from the Monte Carlo simulation based study presented in Section 5.) We are also able to relax Assumption A.1 for the special case of exponentially distributed eager jobs (see Theorem 4). As discussed before, Assumption A.2 is not restrictive; it allows all (non-size based) Markov policies. We provide examples of such schedulers in Section 2.3. Assumption A.3 implies that the eager class operates as a loss system. Indeed, the requirement of a minimum service rate/speed $c_{\min}>0$ captures the ‘eagerness’ of eager customers. This assumption also implies that there exists a uniform upper bound $\mathcal{K}$ on the number of ${\epsilon}$-jobs in the system at any time across sub-policies (since the server is assumed to have a unit speed).

The above assumptions imply the following condition (proved as Lemma 3 in Appendix B.1); the same is stated as a (redundant) assumption to emphasize that the uniform convergence required in our analysis is predominantly due to the following uniform bound on the eager busy cycles:

A.4 Across sub-policies, the second moment of the ${\epsilon}$-busy cycle, defined as the interval between the start of two successive ${\epsilon}$ busy periods, is uniformly bounded.

${\tau}$ -schedulers: Next, we state our assumptions on the scheduling policy of the tolerant class.

B.1 The $\tau$-scheduler is work conserving, i.e., it utilizes all the service capacity left unused by $\epsilon$-jobs, so long as the $\tau$-queue is non-empty.

B.2 The $\tau$-jobs are served in a serial fashion, i.e., $\tau$-jobs cannot pre-empt one another.

B.3 The $\tau$-scheduler is blind to the size of $\tau$-jobs.

Assumption B.1 implies that the tolerant class experiences a time varying service process, which depends on both the $\tau$ -state as well as the $\epsilon$ -state. Assumptions B.2-3 imply that we consider $\tau$ -schedulers which are non-pre-emptive and non-anticipative, for instance, first come first served (FCFS), last come first served (LCFS), and random order of service (see [12, Chapter 29]).

We require another assumption (B.4) regarding the stability of the ${\tau}$ -queue under the SFJ limit, when ${\epsilon}$ -customers employ a single sub-policy (irrespective of the ${\tau}$ -state). We provide the required background on these schedulers, define formally the SFJ scaling, and state the resulting pseudo-conservation law (see [4] for more details), after which we state Assumption B.4.

2.2 ${\tau}$ -static schedulers and background

We now consider the special case of ${\tau}$ -static scheduling, where a single ${\epsilon}$ -sub-policy is used at all times (irrespective of ${\tau}$ -state); this case was analysed in [4]. Let ${P_{B}}_{j}$ represent the blocking probability (long run fraction of losses) of the ${\epsilon}$ -class, if sub-policy- $j$ is used in a ${\tau}$ -static manner ( ${P_{B}}_{j}$ was referred to as the standalone blocking probability of sub-policy $j$ in Section 1); we also refer to these as ${\tau}$ -static blocking probabilities.

Short-Frequent Jobs (SFJ) Scaling: Under the SFJ scaling (as in [4]), we let $\lambda_{\epsilon}\rightarrow\infty$ and $\mu_{\epsilon}\rightarrow\infty,$ such that $\rho_{\epsilon}:={\lambda_{\epsilon}}/{\mu_{\epsilon}}$ remains constant. This corresponds to scaling the arrival as well as the service rate of the eager class to infinity proportionately, so that the offered load (the long term rate at which work arrives into the system) is held constant. We use $\mu_{\epsilon}$ as the scale parameter for this partial scaling. Specifically, we scale the job size distribution of the eager class as $B^{\mu_{\epsilon}}_{\epsilon}\stackrel{d}{=}B^{1}_{\epsilon}/\mu_{\epsilon}$, where $B^{\mu_{\epsilon}}_{\epsilon}$ denotes a generic eager job size at scale $\mu_{\epsilon}$ and $\stackrel{d}{=}$ is equality in distribution. This scaling (plus Poisson arrivals) under A.2 ensures that at scale ${\mu_{\epsilon}},$ the occupancy process of the ${\epsilon}$-class gets time-scaled (fast-forwarded) by ${\mu_{\epsilon}};$ the details can be found in Appendix A. Note that the tolerant workload remains unscaled. Thus, the SFJ scaling may be viewed as a time-scale separation, with the eager class operating at a faster time-scale.
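The scaling can be checked numerically with a small sketch. The helper names `sfj_scale` and `offered_load` are hypothetical, and the list of unit-scale job sizes is an arbitrary stand-in (assumed to have mean 1, as required at scale $\mu_{\epsilon}=1$) rather than data from the paper.

```python
from statistics import mean
from typing import List, Tuple


def sfj_scale(rho_eps: float, mu_eps: float,
              unit_job_sizes: List[float]) -> Tuple[float, List[float]]:
    """Return the eager arrival rate and eager job sizes at scale mu_eps.

    unit_job_sizes are samples of B^1 (scale 1), assumed to have mean 1.
    """
    lam_eps = rho_eps * mu_eps                       # arrival rate grows with the scale
    scaled = [b / mu_eps for b in unit_job_sizes]    # B^mu = B^1 / mu
    return lam_eps, scaled


def offered_load(lam_eps: float, job_sizes: List[float]) -> float:
    """Long-run rate at which eager work arrives: arrival rate times mean size."""
    return lam_eps * mean(job_sizes)


unit_sizes = [0.5, 1.5]                              # illustrative samples with mean 1
lam_10, sizes_10 = sfj_scale(0.5, 10.0, unit_sizes)
lam_100, sizes_100 = sfj_scale(0.5, 100.0, unit_sizes)
```

Both `offered_load(lam_10, sizes_10)` and `offered_load(lam_100, sizes_100)` come out at (approximately) `rho_eps`, while the individual jobs become shorter and more frequent as the scale grows.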

Static Pseudo Conservation: Let $\Omega^{\mu_{\epsilon}}_{j}(t)$ represent the total amount of server capacity left unused by the $\epsilon$ -customers in time interval $[0,t]$ , under sub-policy $j$ operating in a ${\tau}$ -static manner. Note that $\Omega^{\mu_{\epsilon}}_{j}(t)$ is the (cumulative) service process seen by the ${\tau}$ -system. Then by [4, Lemma 1], for all ${\mu_{\epsilon}},$ the asymptotic (in time) growth rate of $\Omega^{\mu_{\epsilon}}_{j}(t)$ satisfies:

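$$\lim_{t\to\infty}\frac{\Omega^{\mu_{\epsilon}}_{j}(t)}{t}=\nu_{j}:=1-\rho_{\epsilon}(1-{P_{B}}_{j})\quad\mbox{almost surely.}$$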

In other words, the long run time average service rate seen by the ${\tau}$ -queue equals $\nu_{j},$ which depends only on the blocking probability ${P_{B}}_{j}$ of the eager class, and not on the specific ${\epsilon}$ -sub-policy that produced that blocking probability. Further under the SFJ limit, the service process seen by the ${\tau}$ -class becomes uniform. Specifically it follows from [4, Theorem 1] that, as ${\mu_{\epsilon}}\to\infty$ in the SFJ limit,

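$$\sup_{t\leq W}\left|\Omega^{\mu_{\epsilon}}_{j}(t)-\nu_{j}t\right|\to 0\ \mbox{a.s. for any finite $W,$ and}$$

$$\Upsilon_{j}^{\mu_{\epsilon}}\ \stackrel{a.s.}{\to}\ \frac{B_{\tau}}{\nu_{j}}\mbox{, both for any initial $\epsilon$-state,}$$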

where $\Upsilon_{j}^{\mu_{\epsilon}}$ denotes the time required to finish $B_{\tau}$ amount of work using the service process $\Omega^{\mu_{\epsilon}}_{j}(\cdot)$. This uniformity of the service process under the SFJ limit enables a closed form characterization of the performance of the ${\tau}$-class (see [4, Theorems 2,3]). A key feature of the above results is a pseudo-conservation law that expresses the performance of the tolerant class purely in terms of the blocking probability of the eager class, independent of the underlying ${\epsilon}$-policy that produced the blocking probability. *We show an analogous pseudo-conservation law for dynamic scheduling policies in this paper.*
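The static pseudo-conservation law reduces to two one-line formulas: the leftover service rate $\nu_{j}=1-\rho_{\epsilon}(1-{P_{B}}_{j})$ and the limiting completion time $B_{\tau}/\nu_{j}$. The following sketch evaluates them; the function names and the numerical inputs are illustrative assumptions, not values from the paper.

```python
def unused_capacity_rate(rho_eps: float, p_block: float) -> float:
    """nu_j = 1 - rho_eps * (1 - P_B_j): long-run service rate left for the tolerant class."""
    return 1.0 - rho_eps * (1.0 - p_block)


def limiting_completion_time(job_size: float, rho_eps: float, p_block: float) -> float:
    """SFJ-limit time to finish job_size units of tolerant work: B_tau / nu_j."""
    return job_size / unused_capacity_rate(rho_eps, p_block)


# Illustrative inputs: eager load 0.5, standalone blocking probability 0.2.
nu = unused_capacity_rate(rho_eps=0.5, p_block=0.2)
t_done = limiting_completion_time(job_size=3.0, rho_eps=0.5, p_block=0.2)
```

Only `p_block` enters the computation, so any two eager sub-policies with the same standalone blocking probability yield the same `nu` and `t_done`; this is the sense in which the law is independent of the underlying sub-policy.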

Finally, we state the following assumption, which ensures that the ${\tau}$ -queue remains stable under each eager sub-policy, when they are applied in a ${\tau}$ -static manner.

B.4

There exists $\delta_{\tau}>0$ such that $\rho_{j}:=\frac{\lambda_{\tau}}{\mu_{\tau}\nu_{j}}<1-\delta_{\tau}\mbox{ for all }j.$

Assumption B.4 guarantees that the ${\tau}$ -system is stable in the dynamic setting as well, as is shown by Lemma 2.

2.3 Some example models

We begin with the description of one example system that satisfies our assumptions, in which the system capacity is not completely transferred to one class at any time. Instead, a fraction of it is used by each ${\epsilon}$ -customer, while the rest is utilized by one ${\tau}$ -customer.

Capacity Division (CD- $(p,K)$ ) policy: Each ${\epsilon}$ -customer uses a fraction $1/K$ of the service capacity. If $\ell$ ${\epsilon}$ -customers (with $0\leq\ell\leq K$ ) are receiving service, then the ${\epsilon}$ -customers are served at a net service rate of $({\ell}/{K})$ , while the ${\tau}$ -customer in service (if any) is served at rate $((K-\ell)/K)$ . At most $K$ ${\epsilon}$ -customers are served at a time, and any further ${\epsilon}$ -arrival departs without service. Note that whenever an existing ${\epsilon}$ -customer departs, the service rate of the ${\tau}$ -customer increases by $1/K$ . We require $1/K\geq c_{\min}$ to satisfy Assumption A.3. Further, there is a prior admission control on $\epsilon$ -arrivals: each is admitted with probability $p$ , independently of all other events. This is the description of the ${\epsilon}$ -sub-policy (as in [3, 4]). The top-level policy then varies the admission probability $p$ (and possibly the maximum ${\epsilon}$ -occupancy $K$ ) based on the occupancy of the ${\tau}$ -queue.

We refer to the above ${\epsilon}$ -sub-policy as Capacity Division or briefly as the CD- $(p,K)$ policy. Note that the CD- $(p,K)$ policy captures a multi-server setting for the eager class. While the ${\epsilon}$ -scheduler need not be work conserving, the ${\tau}$ -class uses all the left over capacity. Another example of an ${\epsilon}$ -sub-policy is the following.

Limited Processor Sharing (LPS- $(p,K)$ ) policy: This sub-policy, denoted by LPS- $(p,K)$ (as in [3, 4]), admits an incoming eager job into the system with probability $p,$ so long as the number of eager jobs already in service is strictly less than $K$ (so that at most $K$ eager jobs are served at a time). The entire service capacity of the server is shared equally between the eager jobs in service (i.e., each eager job gets served at rate $1/\ell$ when there are $\ell$ jobs in service). As before, we require $1/K\geq c_{\min}.$ Note that under the LPS- $(p,K)$ policy, the tolerant class receives service only when there are no eager jobs in the system.

Top-level policies: The top-level policy can choose any one of these sub-policies for any ${\tau}$ -state. For example, when the ${\tau}$ -occupancy is greater than a certain threshold $L,$ one may allocate fewer individual servers to ${\epsilon}$ -customers (using, for example, the CD policy), while one may allocate the entire capacity to the ${\epsilon}$ -class and serve them in LPS mode when the ${\tau}$ -occupancy is smaller.

Switching between sub-policies: Note that under Assumption A.1, the transition between sub-policies, triggered by an arrival/departure in the ${\tau}$ queue, is simplified—any left over eager jobs at the time of the transition are simply ‘flushed’. This simplifies our analysis, and as argued before, would have a negligible impact on the performance of both classes in the SFJ scaling regime. In practice, a more natural implementation strategy would be to simply let the new sub-policy dictate the scheduling of the existing eager jobs in the system after the change of the ${\tau}$ occupancy. This implies a possible re-assignment of the service rates of eager customers at the time of the ${\tau}$ -transition. We prove formally that doing this does not affect our main results under the SFJ limit, for the special case of exponentially distributed eager job sizes (see Theorem 4).

3 Performance Characterization under the SFJ limit

In this section, we characterize the performance of the ${\tau}$ -class and the ${\epsilon}$ -class under the SFJ limit.

Recall that in our system, both classes are scheduled in a dynamic manner, with the ${\epsilon}$ -sub-policy being selected based on current ${\tau}$ -occupancy, and the ${\tau}$ -queue in turn being served (in a work conserving fashion) using the unused service process of the ${\epsilon}$ -sub-policy. Thus, the two classes experience random, time varying, state-dependent, and also inter-dependent service processes. This makes a precise performance evaluation intractable; indeed, no closed form characterization of the performance of the tolerant class is possible even in the simplified setting where the eager class is scheduled by a policy that is oblivious to the tolerant state; see [2, 4]. In this section, we show that under the (partially fluid) SFJ scaling, tractable performance characterizations are possible.

To provide intuition for the form of our results, recall that the timescale separation resulting from the SFJ scaling results in the tolerant queue obtaining, in the limit, service at a ‘steady’ rate of $\nu_{j}:=1-\rho_{\epsilon}\left(1-P_{B_{j}}\right)$ when there are $j$ tolerant jobs in the system. Here, $P_{B_{j}}$ is the standalone, or ${\tau}$ -static blocking probability associated with sub-policy $j.$ One might then anticipate that the limiting performance of the tolerant class is described in terms of a system with a state-dependent service rate, i.e., where the service rate is a (deterministic) function of the queue occupancy. We prove that this is indeed the case. Before we state our main results, we first describe this limit system, which we refer to as a state-dependent service rate M/M/1 (SDSR-M/M/1) queue.

3.1 State-dependent service rate M/M/1 queue

An SDSR-M/M/1 queue sees the same arrival process as an M/M/1 queue: job arrivals are according to a Poisson process (of rate $\lambda$ ). Further the job sizes are independent and exponentially distributed with mean $1/\mu$ . However, unlike the standard M/M/1 queue, the SDSR-M/M/1 queue has a state dependent service rate (a.k.a. server speed). Specifically, the server operates with service rate $\nu_{j}$ if the number of jobs in the queue (including the job in service) equals $j.$ Thus, the SDSR-M/M/1 queue is parametrized by $(\lambda,\mu,\bm{\nu})$ , where $\bm{\nu}=\{\nu_{j}\}_{j\geq 0}$ is the vector of service rates.

The number of jobs in the SDSR-M/M/1 queue evolves as a continuous time Markov process with birth-death structure (see Figure 1).

Moreover, Assumption B.4 ensures that this Markov process is positive recurrent. Thus, its steady state distribution can be obtained by elementary techniques. In particular, the stationary distribution ${\bm{\pi}}:=\{\pi_{i}\}_{i=0}^{\infty}$ is given by (see [13]):

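$$\pi_{i}=\pi_{0}\prod_{j=1}^{i}\rho_{j}\ \text{ for }i\geq 1,\qquad\pi_{0}=\Big(1+\sum_{i=1}^{\infty}\prod_{j=1}^{i}\rho_{j}\Big)^{-1},\qquad\text{with }\rho_{j}:=\frac{\lambda}{\mu\nu_{j}}.\qquad(3)$$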
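The stationary distribution of this birth-death process can be evaluated numerically in a few lines. The Python sketch below is an illustration only, not the authors' code: it relies on the standard birth-death balance relation $\pi_{i}\propto\prod_{j=1}^{i}\lambda/(\mu\nu_{j})$, truncates the state space at an assumed level `n_max`, and the names `sdsr_stationary` and `mean_occupancy` are hypothetical.

```python
from typing import Callable, List


def sdsr_stationary(lam: float, mu: float, nu: Callable[[int], float],
                    n_max: int = 2000) -> List[float]:
    """Truncated stationary distribution of an SDSR-M/M/1 queue.

    lam: Poisson arrival rate; mu: 1 / mean job size;
    nu(j): server speed when j jobs are present (j >= 1).
    n_max is an assumed truncation level, not part of the model.
    """
    weights = [1.0]  # unnormalized pi_0
    for j in range(1, n_max + 1):
        # birth rate lam, death rate mu * nu(j) in state j
        weights.append(weights[-1] * lam / (mu * nu(j)))
    total = sum(weights)
    return [w / total for w in weights]


def mean_occupancy(pi: List[float]) -> float:
    """Expected number of jobs under the distribution pi."""
    return sum(i * p for i, p in enumerate(pi))


# Sanity check: constant speed 1 recovers the standard M/M/1 queue.
pi_mm1 = sdsr_stationary(lam=4.0, mu=8.0, nu=lambda j: 1.0)
print(pi_mm1[0], mean_occupancy(pi_mm1))
```

With a constant speed the sketch reduces to the familiar geometric distribution of an M/M/1 queue, which gives a quick check before plugging in state-dependent speeds.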

3.2 Main results

Recall that the scheduler is specified by a sequence of eager sub-policies, one for each tolerant occupancy, with $P_{B_{j}}$ being the ${\tau}$ -static blocking probability of the eager sub-policy used under tolerant occupancy $j$ . The performance of the tolerant class under the SFJ limit, and under any scheduler, is characterized as follows. The steady state occupancy of the tolerant queue converges, in distribution and in expectation, to the corresponding quantities of an SDSR-M/M/1 system parameterized by $(\lambda_{\tau},\mu_{\tau},\bm{\nu}).$

Theorem 1.

[Stationary distribution of ${\tau}$ -occupancy]

Assume A.1-4 and B.1-4. Under the SFJ limit, the steady state ${\tau}$ -occupancy converges in distribution to the steady state occupancy in an SDSR-M/M/1 $(\lambda_{\tau},\mu_{\tau},(\nu_{1},\nu_{2},\cdots))$ queue, with

$$\nu_{j}:=1-\rho_{\epsilon}\left(1-P_{B_{j}}\right).$$

That is,

[TABLE]

where ${\pi}_{i}$ is given by (3).

Theorem 2.

[Stationary expected ${\tau}$ -occupancy]

Assume A.1-4 and B.1-4. Under the SFJ limit, the stationary expected number of ${\tau}$ -customers converges to that of the limit system described in Theorem 1. That is,

[TABLE]

where ${\pi}_{i}$ is given by (3).

Given that ${\bm{\pi}}:=\{\pi_{i}\}_{i=0}^{\infty}$ is the stationary distribution of a simple birth-death CTMC, Theorems 1 and 2 provide closed form characterizations of the limiting performance of the tolerant class under the SFJ scaling. It is important to note that while the statements of Theorems 1 and 2 might seem intuitive, their proofs are rather involved. In particular, they rely crucially on a uniformity in the convergence of the service process seen by the tolerant queue under the SFJ limit, across ${\epsilon}$ -sub-policies; see Subsection 3.3.

Next, we describe the performance of the eager class under the SFJ limit, which is captured by its blocking probability, i.e., the long run fraction of eager customers blocked. Formally, the blocking probability is defined as follows. Let $N_{B}^{\mu_{\epsilon}}(t)$ denote the total number of eager customers that are blocked (i.e., returned without service) before time $t,$ and let $N_{A}^{\mu_{\epsilon}}(t)$ be the total number of eager customers that arrived in the same time interval, for the system at scale $\mu_{\epsilon}.$ Then the blocking probability of the eager class is defined as the long run fraction of customers blocked, i.e.,

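$$P_{B}^{\mu_{\epsilon}}:=\lim_{t\to\infty}\frac{N_{B}^{\mu_{\epsilon}}(t)}{N_{A}^{\mu_{\epsilon}}(t)}.\qquad(4)$$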

Note that eager jobs are blocked by different sub-policies, which operate dynamically based on the occupancy of the tolerant queue. Thus, it is not a priori even clear that the limit in (4) exists almost surely. That it does, and that the eager blocking probability converges under the SFJ limit to a convex combination of the ${\tau}$ -static blocking probabilities $\{P_{B_{j}}\},$ weighted by the (limiting) long run fractions of time the ${\tau}$ -system spends in each state, is established by the following theorem.

Theorem 3.

[Blocking probability of eager class]

Assume A.1-4 and B.1-4. Then the steady state blocking probability of $\epsilon$ -jobs in the SFJ limit is given by:

$$\lim_{\mu_{\epsilon}\to\infty}P_{B}^{\mu_{\epsilon}}=\sum_{i=0}^{\infty}\pi_{i}P_{B_{i}},$$

where ${\pi}_{i}$ is given by (3).
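The mixture form of the limiting blocking probability can be made concrete with a short numerical sketch. The Python snippet below is a hedged illustration under stated assumptions: it takes the limiting service rates as $\nu_{j}=1-\rho_{\epsilon}(1-P_{B_{j}})$ (as recalled in the introduction to this section), weights the ${\tau}$ -static blocking probabilities by the stationary distribution of the limit SDSR-M/M/1 queue, and truncates the state space at an assumed level `n_max`. The function name `limit_performance` and the example policies are hypothetical.

```python
from typing import Callable, Tuple


def limit_performance(lam_tau: float, mu_tau: float, rho_eps: float,
                      d: Callable[[int], float],
                      n_max: int = 5000) -> Tuple[float, float]:
    """Return (limiting eager blocking probability, limiting E[N]).

    d(j) is the tau-static blocking probability of the eager
    sub-policy used when the tolerant occupancy equals j.
    """
    weights = [1.0]
    for j in range(1, n_max + 1):
        nu_j = 1.0 - rho_eps * (1.0 - d(j))  # capacity left for the tau-class
        weights.append(weights[-1] * lam_tau / (mu_tau * nu_j))
    total = sum(weights)
    # Convex combination of the static blocking probabilities.
    p_block = sum(w * d(j) for j, w in enumerate(weights)) / total
    mean_n = sum(j * w for j, w in enumerate(weights)) / total
    return p_block, mean_n


# A tau-static scheduler: the same blocking probability in every state.
static_pb, static_n = limit_performance(4.0, 8.0, 0.4, lambda j: 0.2)

# A dynamic scheduler: block more eager jobs once 3 tolerant jobs are present.
dyn_pb, dyn_n = limit_performance(4.0, 8.0, 0.4,
                                  lambda j: 0.1 if j < 3 else 0.9)
print(static_pb, static_n, dyn_pb, dyn_n)
```

Only the values `d(j)` enter the computation, which mirrors the statement that the specifics of the eager sub-policies do not matter in the limit.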

A key takeaway from Theorems 1–3 is that the performance of both classes under the SFJ limit can be characterized in terms of the ${\tau}$ -static blocking probabilities $\{P_{B_{j}}\}_{j\geq 0}$ . The probabilities $\{P_{B_{j}}\}_{j\geq 0}$ themselves are typically easy to compute, since they involve the analysis of a single-class (stationary) loss system. Thus, by virtue of Theorems 1–3, we have the following conservation law.

Dynamic Pseudo Conservation: Under the SFJ limit, the performance of both classes depends only on the ${\tau}$ -static blocking probabilities $\{P_{B_{j}}\}_{j\geq 0}$ and not the specifics of the ${\epsilon}$ -sub-policies that produced these blocking probabilities.

Finally, since Assumption A.1 may seem unreasonable, we show below that the conclusions of Theorems 1-3 hold even without this assumption, if eager job sizes are taken to be exponentially distributed.

Theorem 4.

Assume A.2-4 and B.1-4. Also assume that the ${\epsilon}$ -service times are exponentially distributed. The conclusions of Theorems 1-3 hold, once the transitions between eager sub-policies are handled as described in Section 2.3 (i.e., following an arrival/departure in the tolerant queue, the new eager sub-policy dictates the scheduling of the eager queue from then on).

Theorem 4 shows that Assumption A.1 is benign, i.e., it does not affect the performance of either class under the SFJ limit. As expected, the proof of Theorem 4 is considerably more involved as compared to the proofs of Theorems 1-3. Specifically, Assumption A.1 allows the evolution of the ${\tau}$ queue (across arrival/departure epochs) to be described by a one-dimensional birth-death Markov chain (see Section 3.3). This is no longer possible in the absence of Assumption A.1; the system evolution has to be captured via a certain two-dimensional Markov chain, whose long-run time averages must be shown to match those corresponding to the former birth-death chain as ${\mu_{\epsilon}}\rightarrow\infty$ (see Appendix D).

We note that while our performance characterizations are derived under the SFJ limit, we show in Section 5 that they provide accurate approximations of the performance experienced in the pre-limit. The remainder of this section is devoted to highlighting the main steps in the proofs of Theorems 1 and 2. Most details are relegated to Appendix B. The proof of Theorem 3 is presented in Appendix C. The proof of Theorem 4 can be found in Appendix D.

3.3 Analysing tolerant performance under the SFJ limit

We now sketch the main steps in the proofs of Theorems 1 and 2 (i.e., under Assumption A.1). We characterize the ${\tau}$ -performance by first analysing the ${\tau}$ -queue at its arrival/departure epochs. Let $X_{n}$ denote the occupancy of the ${\tau}$ -queue immediately following the $n$ th arrival/departure. Under Assumption A.1 (dropping of existing ${\epsilon}$ -customers at ${\tau}$ -transitions) and because of exponentially distributed tolerant job sizes, $\{X_{n}\}$ is a discrete-time Markov chain with birth-death (BD) structure; see Figure 2. Let $p_{j}^{\mu_{\epsilon}}$ and $q_{j}^{\mu_{\epsilon}}$ , respectively, denote the forward and backward transition probabilities of this (pre-limit) tolerant BD chain when there are $j$ tolerant customers in the system. Let $A_{\tau}$ and $B_{\tau}$ represent a generic inter-arrival time associated with the tolerant queue and a generic tolerant job size, respectively. Also, recall that $\Upsilon_{j}^{\mu_{\epsilon}}(B_{\tau})$ denotes the time required to finish $B_{\tau}$ amount of work using the service process $\Omega^{\mu_{\epsilon}}_{j}(\cdot)$ (see Section 2.2). Then we have:

[TABLE]

Observe by A.1 that the job completion times, distributed as $\Upsilon_{j}^{\mu_{\epsilon}}(B_{{\tau}})$ , are i.i.d. across all those customers that were served when the tolerant occupancy equals $j$ . Moreover, if such a tolerant service is interrupted by a ${\tau}$ -arrival, the remaining completion time is distributed as $\Upsilon_{j+1}^{\mu_{\epsilon}}(B_{{\tau}}),$ thanks to the memoryless property of tolerant job sizes and A.1.

Our first step is to prove that the above transition probabilities converge to the transition probabilities associated with the embedded BD chain corresponding to the $(\lambda_{\tau},\mu_{\tau},(\nu_{1},\nu_{2},\cdots))$ SDSR-M/M/1 system. Moreover, we show that the above convergence takes place uniformly over sub-policies $j.$ Before stating this result, we recall the definition of uniform convergence (which we denote by $u.f.$ ):

Definition [Uniform (u.f.) convergence]: A parameterized family of sequences $\{x_{j}^{{\mu_{\epsilon}}}\}_{j\geq 0}$ is said to be uniformly convergent to the limit $\{x_{j}\}_{j\geq 0}$ as ${\mu_{\epsilon}}\rightarrow\infty$ if, for every $\epsilon>0,$ there exists $\bar{\mu}$ such that

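$$\left|x_{j}^{\mu_{\epsilon}}-x_{j}\right|<\epsilon\quad\text{for all }\mu_{\epsilon}\geq\bar{\mu}\text{ and all }j\geq 0.$$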

Lemma 1.

[Uniform convergence of transition probabilities] i) If $p_{j}^{\mu_{\epsilon}}$ denotes the probability of a ${\tau}$ -arrival before a ${\tau}$ -departure in the (pre-limit) system when the number of tolerant customers in the system equals $j$ , then

[TABLE]

ii) The convergence in (5) occurs u.f. over the sub-policies $j$ . Precisely, for every $\epsilon>0$ there exists $\bar{\mu}>0$ , such that for all $\mu_{\epsilon}\geq\bar{\mu}$ and for all $j\geq 0,$ we have:

[TABLE]

The proof of Lemma 1 is provided in Appendix B.2.

The next step is to show that the embedded BD chain characterizing the evolution of the occupancy of tolerant queue is positive recurrent for large enough ${\mu_{\epsilon}}.$

Lemma 2.

There exists $\bar{\mu}>0$ such that for ${\mu_{\epsilon}}>\bar{\mu},$ the (embedded) Markov chain $\{X_{n}\}$ is positive recurrent.

The proof of Lemma 2 is provided in Appendix B.3. Lemma 2 also implies that for large enough ${\mu_{\epsilon}},$ the $\tau$ -queue is stable and has a well defined stationary behaviour. (Footnote 4: The occupancy of the tolerant queue evolves as a semi-Markov process in continuous time. The regularity of this process, i.e., each state is visited only finitely often in any finite interval of time with probability 1, follows by noting that the number of state transitions is lower bounded by those in a standard M/M/1 queue with arrival rate $\lambda_{\tau}$ and service rate $\mu_{\tau}$ . The positive recurrence of this process follows from Theorem 5.9 in [13], noting that the mean residence time in any state is uniformly upper bounded by $1/\lambda_{\tau}.$ ) The stationary distribution for ${\mu_{\epsilon}}>\bar{\mu}$ can be calculated using standard techniques and is given by (see, for example, [14]),

[TABLE]

with $\tilde{\pi}_{0}^{\mu_{\epsilon}}$ defined as:

[TABLE]

Now, given that the embedded BD chain capturing the evolution of the tolerant queue is positive recurrent for large enough ${\mu_{\epsilon}}$ (Lemma 2), and has transition probabilities that converge (under the SFJ limit) uniformly to those of the embedded chain of the $(\lambda_{\tau},\mu_{\tau},(\nu_{1},\nu_{2},\cdots))$ SDSR-M/M/1 queue (Lemma 1), we can now establish convergence of the stationary distribution of this embedded process to that of the SDSR-M/M/1 system as well.

Theorem 5.

[Stationary number in the embedded tolerant chain]

Under the SFJ limit, the stationary distribution of the embedded (BD) chain corresponding to the tolerant Markov process, given by equations (7) and (8), converges to that of the SDSR-M/M/1 queue with variable service rates. That is,

[TABLE]

where $\tilde{\pi}_{i}$ is given by (3).

Note that Theorem 5 deals with the limiting behavior of the embedded chain corresponding to the (continuous time) occupancy process of the tolerant queue. Our main results Theorems 1 and 2 can be proved from Theorem 5 by invoking a relationship between the stationary distribution of an embedded Markov chain and that of the corresponding continuous time process (see Lemma 8 in Appendix B.5); the details can be found in Appendices B.5 and B.6.
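The passage from the embedded chain to the continuous time process can be illustrated on the limit SDSR-M/M/1 queue itself. The Python sketch below uses the standard relation for Markov jump processes (the continuous time stationary probability of a state is proportional to the embedded-chain probability divided by the total exit rate of that state); this is an assumption about the kind of relation involved, and the precise statement of Lemma 8 may differ. The function name `embedded_and_ct` and the truncation level `n_max` are hypothetical.

```python
from typing import Callable, List, Tuple


def embedded_and_ct(lam: float, mu: float, nu: Callable[[int], float],
                    n_max: int = 400) -> Tuple[List[float], List[float]]:
    """Stationary laws of the jump chain and of the continuous time process."""
    # Total exit rate of each state: only arrivals leave state 0.
    rate = [lam] + [lam + mu * nu(j) for j in range(1, n_max + 1)]
    # Forward (arrival-first) probabilities of the embedded BD chain.
    p = [1.0] + [lam / rate[j] for j in range(1, n_max + 1)]

    # Detailed balance of the embedded chain: emb_i p_i = emb_{i+1} q_{i+1}.
    w = [1.0]
    for i in range(n_max):
        q_next = 1.0 - p[i + 1]
        w.append(w[-1] * p[i] / q_next)
    total = sum(w)
    emb = [x / total for x in w]

    # Weight each state by its mean holding time 1 / rate and renormalize.
    ct_w = [e / r for e, r in zip(emb, rate)]
    total = sum(ct_w)
    ct = [x / total for x in ct_w]
    return emb, ct


emb, ct = embedded_and_ct(lam=4.0, mu=8.0, nu=lambda j: 1.0)
print(emb[:3], ct[:3])
```

For the constant-speed case the continuous time law is geometric with ratio $\lambda/\mu$, while the embedded chain puts less mass on the empty state because that state has a longer mean holding time.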

4 Dynamic Achievable Region

A queueing system can be analysed using several performance metrics; for example, number of customers in the system, sojourn time (the total time spent by the customer), waiting time (of the customer before the service starts), fraction of the customers blocked (in a loss system), etc. The achievable region of a multi-class system is defined as the region of all possible vectors (one component for each class) of the relevant performance metrics. (Footnote 5: In this paper we consider the achievable region corresponding to schedulers which satisfy Assumptions A.1-4 and B.1-4.) In our model, the eager class corresponds to a lossy system, so we consider blocking probability as its performance metric. For the tolerant class, one can consider the steady state expected number of customers in the system as the performance metric.

By Lemma 2, the system is stable for all $\mu_{\epsilon}\geq\bar{\mu}$ , for some $\bar{\mu}<\infty$ . Thus for all such $\mu_{\epsilon}$ , by Little’s Law, $E^{\mu_{\epsilon}}[S],$ the stationary expected sojourn time of a typical $\tau$ -customer, and $E^{\mu_{\epsilon}}[N],$ the stationary expected number of $\tau$ -customers in the system are related as $E^{\mu_{\epsilon}}[N]=\lambda_{\tau}E^{\mu_{\epsilon}}[S].$ Thus it is sufficient to consider any one of these metrics.

Stationary Markov top-level policies: In any general sequential decision problem, a Stationary Markov (SM) policy is a sequence of decisions, in which one decision is chosen for each value of the state and the same decision is applicable in any time slot. In our case we consider the top-level policies among the Stationary Markov (SM) family. This means, a top level policy $\phi$ is a sequence of ${\epsilon}$ -sub-policies, and that if the ${\tau}$ -state equals $j$ at any time slot, then the $j$ -th sub-policy of $\phi$ is used for scheduling the ${\epsilon}$ -class.

As understood from Theorems 2-3, the only characteristic of the ${\epsilon}$ -sub-policies that influences the (dynamic) system performance is the set of ${\tau}$ -static blocking probabilities $\{{P_{B}}_{j}\}_{j}$ , obtained when the respective sub-policies are used in a ${\tau}$ -static manner. Thus, to define an efficient dynamic system, one effectively needs to choose (based on the ${\tau}$ -state) one among these blocking probabilities; no further details of the sub-policy are important. This is a consequence of the ‘dynamic pseudo-conservation’ mentioned in the previous section.

Any Stationary Markov (SM) top-level policy is generally given by a sequence of $\epsilon$ -sub-policies, one for each $\tau$ -state. However, in view of the above observation, a stationary Markov policy can be thought of as a sequence of $\epsilon$ -blocking probabilities (derived when the corresponding sub-policies are applied in a ${\tau}$ -static manner). In other words, an SM top-level policy is defined by $\phi=(d_{0},d_{1},\cdots)$ , where decision $d_{j}$ specifies a ‘ ${\tau}$ -static blocking probability’ to be chosen when the number of ${\tau}$ -customers equals $j$ . Towards this, we implicitly require the existence of at least one sub-policy that achieves any given value of ‘ ${\tau}$ -static blocking probability’ between the system specified limits ${\underline{d}}:={\underline{P_{B}}}$ (the minimum possible blocking probability) and ${\bar{d}}:={\overline{P_{B}}}$ (the maximum possible blocking probability). This, for example, is achieved by the CD- $(p,K)$ /LPS- $(p,K)$ policies mentioned in Section 2, when one considers all possible values of $\{(p,K)\}$ (see [3, 4] for more details). In the rest of the paper, we refer to the top-level policies simply as policies for brevity.

Limit achievable region: Our focus from here on will be the dynamic achievable region $\mathcal{A}^{\infty}$ of performance vectors under the SFJ limit. Recall that for the tolerant class, the limit is an SDSR-M/M/1 queue. The $\epsilon$ -limit can be seen as a mixture model made up of many lossy systems, each described by its ${\tau}$ -static blocking probability, and mixed independently according to the stationary distribution of the limit SDSR-M/M/1 queue. Thus, we define the limit achievable region as follows:

[TABLE]

Note that $\mathcal{A}^{\infty}$ is the set of limiting performance vectors under SM policies. In this sense, one may view $\mathcal{A}^{\infty}$ as the limit of achievable region of our multi-class system as ${\mu_{\epsilon}}\rightarrow\infty.$

For simplicity of notations we avoid the super-script $\infty$ when the discussion is clearly about the limit system. At times we also drop $\phi$ , the SM policy, when there is no ambiguity.

A Numerical Example: To visualize the limit achievable region, we consider a system with a top-level policy parametrized by $(p_{1},p_{2},L,K)$ . In this system, the CD- $(p_{1},K)$ policy is employed when the ${\tau}$ -occupancy is less than $L,$ and the CD- $(p_{2},K)$ policy is employed when the ${\tau}$ -occupancy is greater than or equal to $L.$ The tolerant customers are served serially with the total capacity of all the leftover servers. Using the Erlang-B formula, the two ${\tau}$ -static blocking probabilities of ${\epsilon}$ -customers equal

[TABLE]

where ${P_{B}}_{1}$ is the ${\tau}$ -static blocking probability when the ${\tau}$ -occupancy is less than $L$ , while ${P_{B}}_{2}$ is the ${\tau}$ -static blocking probability for the rest of the tolerant states. The performance of such a system in the limit can be obtained using the results of Theorems 2-3. We set $K=5$ , $\rho_{\epsilon}=0.4$ , $\lambda_{\tau}=4$ and $\mu_{\tau}=8$ , and generate the three parameters $(p_{1},p_{2},L)$ randomly. The scatter plot of the corresponding values of $E^{\infty}[N]$ and $P^{\infty}_{B}$ is shown in Figure 3. The resulting figure is a part of the limit achievable region. As seen from the figure, the achievable region is a set of non-zero measure; thus we need to identify the ‘efficient’ Pareto frontier. Further, the plot indicates that the achievable region is bounded. We will now address the Pareto frontier associated with this system.
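The sampling procedure behind such a scatter plot can be sketched as follows. This Python snippet is a hedged illustration rather than the authors' code. It assumes that admission with probability $p$ thins the eager arrivals into a Poisson stream of rate $p\lambda_{\epsilon}$, that CD- $(p,K)$ then behaves as a $K$-server loss system with offered load $p\rho_{\epsilon}K$ Erlangs, and hence that the ${\tau}$ -static blocking probability is $(1-p)+p\,\mathrm{ErlangB}(K,p\rho_{\epsilon}K)$; the exact expressions used in the paper may be written differently. The names `erlang_b`, `cd_blocking` and `limit_point`, the range of $L$, and the truncation level `n_max` are hypothetical.

```python
import random
from typing import Tuple


def erlang_b(servers: int, load: float) -> float:
    """Erlang-B blocking probability via the standard stable recursion."""
    b = 1.0
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)
    return b


def cd_blocking(p: float, K: int, rho_eps: float) -> float:
    """Assumed tau-static blocking of CD-(p, K): rejected at admission
    control with probability 1 - p, else blocked when all K slots are busy."""
    return (1.0 - p) + p * erlang_b(K, p * rho_eps * K)


def limit_point(p1: float, p2: float, L: int, K: int = 5,
                rho_eps: float = 0.4, lam_tau: float = 4.0,
                mu_tau: float = 8.0, n_max: int = 3000) -> Tuple[float, float]:
    """Limit (blocking probability, E[N]) for the policy (p1, p2, L, K)."""
    d_below, d_above = cd_blocking(p1, K, rho_eps), cd_blocking(p2, K, rho_eps)

    def d(j: int) -> float:
        return d_below if j < L else d_above

    w = [1.0]
    for j in range(1, n_max + 1):
        nu_j = 1.0 - rho_eps * (1.0 - d(j))
        w.append(w[-1] * lam_tau / (mu_tau * nu_j))
    total = sum(w)
    p_block = sum(x * d(j) for j, x in enumerate(w)) / total
    mean_n = sum(j * x for j, x in enumerate(w)) / total
    return p_block, mean_n


rng = random.Random(0)  # fixed seed so the sample is reproducible
samples = [limit_point(rng.random(), rng.random(), rng.randint(1, 30))
           for _ in range(200)]
print(samples[:3])
```

Each entry of `samples` is one point of the limit achievable region; plotting the pairs would give a scatter of the kind described above.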

5 Limit Pareto frontier

The Pareto frontier is the efficient sub-region of an achievable region, consisting of the dominating (non-dominated) performance vectors. A pair $(P_{B,\phi^{*}},E_{\phi^{*}}[N])$ (produced by a policy $\phi^{*}$ ) is on the Pareto frontier of the limit system if there exists no other SM policy $\phi$ that achieves a better performance pair $(P_{B,\phi},E_{\phi}[N])$ (in the limit system), i.e., one with $P_{B,\phi}\leq P_{B,\phi^{*}}\mbox{ and }E_{\phi}[N]\leq E_{\phi^{*}}[N],$ with at least one of the inequalities being strict.
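The dominance relation in this definition translates directly into a filter over a finite set of performance pairs. The Python sketch below is a generic illustration (quadratic in the number of points, which is adequate for small samples); the function names and the example data are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (blocking probability, expected tau-occupancy)


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b in both metrics and strictly
    better in at least one (both metrics are to be minimized)."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up performance pairs, for illustration only.
candidates = [(0.1, 5.0), (0.2, 3.0), (0.2, 4.0), (0.3, 3.0), (0.05, 6.0)]
front = pareto_front(candidates)
print(front)
```

Here `(0.2, 4.0)` and `(0.3, 3.0)` are discarded because `(0.2, 3.0)` is at least as good in both coordinates and strictly better in one.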

The Pareto frontier of the limit system is obtained by solving an appropriate set of parametrized optimization problems. Prior to that, we discuss the limit performance, under any given SM policy. Invoking Theorems 2 and 3, under any $\phi$ ,

[TABLE]

5.1 Pareto-complete family

We now derive a family of Pareto-complete policies, i.e., a parametrized family of policies that span the entire Pareto-frontier of our system. One can obtain all the points on the Pareto frontier by considering the following parametrized (by $C$ ) constrained optimization problems (with $h_{i}^{\phi}$ defined in (11)).

[TABLE]

Recall ${\underline{d}}$ , ${\bar{d}}$ respectively represent the best and worst sub-policy (with respect to ${\epsilon}$ -customers), in that these represent the minimum and maximum possible blocking probabilities. Define:

[TABLE]

The terms $({\bar{\rho}},{\underline{\rho}})$ represent the (worst and best) load factor of the ${\tau}$ -customers in the limit system, when ${\epsilon}$ -customers are scheduled respectively with the best (blocking probability ${\underline{d}}$ ) and worst (blocking probability ${\bar{d}}$ ) sub-policies, in ${\tau}$ -static manner.

Suppose that the constraint $C$ on the expected ${\tau}$ -number satisfies $C\geq{\bar{\rho}}/(1-{\bar{\rho}})$ (observe ${\bar{\rho}}/(1-{\bar{\rho}})$ is the expected number in M/M/1 queue with maximum load factor ${\bar{\rho}}$ ); then the problem (12) becomes an unconstrained problem and the optimal policy clearly equals $\phi^{*}=({\underline{d}},{\underline{d}},\cdots)$ . We show that the optimal policy for any given $C<{\bar{\rho}}/(1-{\bar{\rho}})$ , is monotone (but not strictly monotone) in $\tau$ -state and further derive its closed form expression (proof in Appendix E):

Theorem 6.

The policy $\phi^{*}=\{d_{0}^{*},d_{1}^{*},\cdots\}$ that optimizes the problem defined in (12) is monotone and is given by:

[TABLE]

which is parametrized by two parameters $(L^{*},d^{*})$ . The expressions for $(L^{*},d^{*})$ are given by (see (13)):

[TABLE]

In the above, $L^{*}\geq 1$ is set to $\infty$ when the inequality defining the $\sup$ is satisfied for all $i$ (i.e., when $C\geq{\bar{\rho}}/(1-{\bar{\rho}})$ ).

An immediate consequence of the above theorem is the following:

Corollary 1 (Pareto-Complete family).

The family of schedulers given by (14), parametrized by $(L^{*},d^{*})$ with $1\leq L^{*}\leq\infty$ and ${\underline{d}}\leq d^{*}\leq{\bar{d}}$ , is Pareto-complete.

The policies in this family choose the ‘worst’ $\epsilon$ -sub-policy (i.e., with $d={\bar{d}}$ ) when the $\tau$ -number is greater than or equal to $L+1$ , choose a sub-policy with intermediate blocking $d$ when the $\tau$ -number equals $L,$ and choose the ‘best’ sub-policy (i.e., with $d={\underline{d}}$ ) for the rest (see (14)). One can easily compute the performance under these policies, as below (see (13)):

[TABLE]

Thus, *we have derived a Pareto-complete family as well as the performance under this family, which can readily be used for any relevant optimization problem.*

Numerical example: We continue with the numerical example of Figure 3. For this example, one can easily compute that ${\bar{\rho}}=0.8134$ (no admission control on eager class, i.e., with $p_{i}=1$ for all $i$ ) and ${\underline{\rho}}=0.5$ (eager class is completely blocked with ${\bar{d}}=1$ and hence ${\underline{\rho}}=\lambda_{\tau}/\mu_{\tau}$ ), when the system can at maximum serve 5 eager customers in parallel. By substituting these values into (5.1), one can obtain the Pareto frontier. The circles in the figure represent this Pareto frontier, and are obtained by varying $(L,d)$ appropriately. It is clear from the figure that the derived set of points are indeed dominating and are on the Pareto frontier.
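The two load factors quoted above can be reproduced with a few lines of arithmetic, and the same code can trace points generated by the threshold family. The Python sketch below is an illustration under stated assumptions: the minimum blocking probability is taken to be the Erlang-B value for $K=5$ servers and offered load $\rho_{\epsilon}K=2$ Erlangs, the maximum is $\bar{d}=1$, the policy shape follows the verbal description of the family given after Corollary 1, and performance is evaluated by truncating the limit birth-death chain at an assumed level `n_max` rather than by the paper's closed-form expressions. All names are hypothetical.

```python
from math import factorial
from typing import Tuple

K, RHO_EPS, LAM_TAU, MU_TAU = 5, 0.4, 4.0, 8.0

# Assumed extreme blocking probabilities for this example.
a = RHO_EPS * K  # offered eager load in Erlangs without admission control
d_low = (a ** K / factorial(K)) / sum(a ** k / factorial(k)
                                      for k in range(K + 1))
d_high = 1.0     # eager class completely blocked


def nu(d: float) -> float:
    """Limiting service rate left for the tolerant class."""
    return 1.0 - RHO_EPS * (1.0 - d)


rho_bar = LAM_TAU / (MU_TAU * nu(d_low))   # tolerant load, least eager blocking
rho_low = LAM_TAU / (MU_TAU * nu(d_high))  # tolerant load, eager class blocked


def threshold_policy_point(L: int, d: float,
                           n_max: int = 3000) -> Tuple[float, float]:
    """Limit (P_B, E[N]) when d_j = d_low for j < L, d at j = L,
    and d_high for j > L."""
    def d_of(j: int) -> float:
        return d_low if j < L else (d if j == L else d_high)

    w = [1.0]
    for j in range(1, n_max + 1):
        w.append(w[-1] * LAM_TAU / (MU_TAU * nu(d_of(j))))
    total = sum(w)
    p_block = sum(x * d_of(j) for j, x in enumerate(w)) / total
    mean_n = sum(j * x for j, x in enumerate(w)) / total
    return p_block, mean_n


frontier = [threshold_policy_point(L, d_high) for L in range(1, 21)]
print(round(rho_bar, 4), rho_low, frontier[0])
```

Stepping `L` (and, if desired, the intermediate value `d`) moves along the candidate frontier: small `L` protects the tolerant class at the cost of eager blocking, and large `L` does the opposite.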

5.2 Monte Carlo based case study in pre-limit

We consider an example case-study with the CD- $(p,K)$ sub-policies of Subsection 2.3. Specifically, for fixed $K,$ we consider top-level policies that perform CD- $(1,K)$ when the ${\tau}$ -occupancy is less than a certain $L$ (with $L\geq 1$ ) and perform CD- $(0,K)$ (thus blocking all ${\epsilon}$ -jobs) when the ${\tau}$ -occupancy is greater than or equal to $L.$ In view of Theorem 6, by stepping over $L$ as above, we sample performance vectors from the limit Pareto-frontier of the system. (Footnote 6: Here again, ${\underline{d}}$ (respectively ${\bar{d}}$ ) equals the blocking probability without eager admission control (respectively if eager class is admitted only when ${\tau}$ -queue is empty).)

It is very complicated to obtain an exact analysis of this heterogeneous system. However by Theorems 2-3, one can obtain an approximate analysis for this system. In this section, we validate these approximations against Monte Carlo (MC) simulations of the actual system. Importantly, the Monte Carlo simulations do not even drop ${\epsilon}$ -customers at ${\tau}$ -transitions as required by A.1.
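A pre-limit simulation of this kind can be sketched compactly when all job sizes are exponential, since the pair (tolerant occupancy, eager occupancy) is then a continuous time Markov chain. The Python snippet below is a minimal sketch under that assumption, not the authors' simulator: it uses the threshold policy described above (CD- $(1,K)$ below $L$, CD- $(0,K)$ otherwise), does not flush eager jobs at ${\tau}$ -transitions, and all names, default parameters and the horizon are hypothetical.

```python
import random
from typing import Tuple


def simulate_cd_threshold(mu_eps: float, L: int, K: int = 5,
                          rho_eps: float = 0.4, lam_tau: float = 4.0,
                          mu_tau: float = 8.0, horizon: float = 500.0,
                          seed: int = 0) -> Tuple[float, float]:
    """Return (time-average tau-occupancy, fraction of eager jobs blocked)."""
    rng = random.Random(seed)
    lam_eps = rho_eps * mu_eps          # offered eager load held constant
    t, n_tau, n_eps = 0.0, 0, 0
    area, arrivals, blocked = 0.0, 0, 0
    while t < horizon:
        r_tau_arr = lam_tau
        r_eps_arr = lam_eps
        r_eps_dep = n_eps * mu_eps / K  # each eager job is served at speed 1/K
        r_tau_dep = mu_tau * (K - n_eps) / K if n_tau > 0 else 0.0
        total = r_tau_arr + r_eps_arr + r_eps_dep + r_tau_dep
        dt = rng.expovariate(total)
        area += n_tau * min(dt, horizon - t)
        t += dt
        if t >= horizon:
            break
        u = rng.random() * total
        if u < r_tau_arr:
            n_tau += 1
        elif u < r_tau_arr + r_eps_arr:
            arrivals += 1
            # Admit only below the tau-threshold and if a slot is free.
            if n_tau < L and n_eps < K:
                n_eps += 1
            else:
                blocked += 1
        elif u < r_tau_arr + r_eps_arr + r_eps_dep:
            n_eps -= 1                  # existing eager jobs are never flushed
        else:
            n_tau -= 1
    return area / horizon, blocked / max(arrivals, 1)


# With a very large threshold the eager class never sees admission control,
# so its blocking should be close to the Erlang-B value (about 0.037 here).
mean_n_sim, pb_sim = simulate_cd_threshold(mu_eps=50.0, L=10 ** 9)
print(mean_n_sim, pb_sim)
```

Repeating the run for increasing `mu_eps` and several values of `L`, and comparing against the limit expressions, gives a comparison of the type reported in this section; longer horizons or multiple seeds would be needed for tight confidence intervals.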

In all the case studies presented in this section, we consider exponentially distributed job sizes and Poisson arrivals for both the classes. The parameters used for a particular case study are described in the corresponding figure itself.

Our first case studies are presented in Figures 4 and 5. We plot two performance measures, the mean number of tolerant jobs and the blocking probability of eager jobs, versus $\mu_{\epsilon}$ for two different tolerant load factors; Figure 4 corresponds to a light tolerant load, while Figure 5 corresponds to a moderate tolerant load. (The system parameters used are mentioned in the figures directly.) Note that, as expected, the simulated system performance metrics approach their SFJ approximations as ${\mu_{\epsilon}}$ increases. Interestingly, our approximations tend to under-estimate both metrics (perhaps because the partial fluid limit we consider ‘washes away’ the stochasticity of the eager workload). Moreover, note that in the case of moderate tolerant load, the approximation is quite accurate for all values of ${\mu_{\epsilon}}\geq 1$ ; the normalized difference between the theoretical approximation and the corresponding MC estimate is within $6\%$ for ${\mu_{\epsilon}}\geq 1$ with $L=17$ and for ${\mu_{\epsilon}}\geq 5$ with $L=1,$ and the error is negligible for higher values of ${\mu_{\epsilon}}$ (see Figure 5). On the other hand, with light tolerant traffic ( $\rho_{\tau}=0.0862$ ) as in Figure 4, the normalized difference is within $6\%$ only for ${\mu_{\epsilon}}\geq 5$ for $L=17$ and only for ${\mu_{\epsilon}}\geq 20$ for $L=1$ . In general, we observe that the approximation error is smaller when the system operates closer to its stability limit. But even for the case with light traffic, the normalized difference is within $10\%$ in most cases once ${\mu_{\epsilon}}\geq 10.$

Next, we consider the Pareto frontier of performance vectors, and compare our SFJ approximation with the frontier obtained via MC simulations; see Figure 7. We observe that our analytical results corresponding to the SFJ limit provide a very accurate approximation of the Pareto frontier, even for $\mu_{\epsilon}$ as small as 1, when $\mu_{\tau}=0.54$ . We reiterate that the theory approximates the system performance well even when the system does not ‘flush’ ${\epsilon}$ -customers at ${\tau}$ -transitions.

Another example is plotted in Figure 7, where the ${\tau}$-static policies of [4] are considered along with dynamic policies. We again observe good agreement between the theory and the MC estimates for dynamic policies. Interestingly, the approximation error is larger in the static case. One possible explanation is the following. Under ${\tau}$-static policies, the approximation error clearly gets smaller as the ${\epsilon}$-load factor reduces. Under the Pareto-optimal family of schedulers, the ${\epsilon}$-load equals 0 for all ${\tau}$-states greater than $L$. Thus the approximation error is almost zero towards the right of the two figures (as $P_{B}$ gets smaller, $L$ gets smaller). We also observe that the dynamic policies perform far better than the ${\tau}$-static policies.

6 Concluding remarks

In this paper, we analyse a multi-class, single server queueing system with an eager (lossy) class and a tolerant (queueing) class, under dynamic scheduling. While the inter-dependence between the service processes of the two classes makes an exact analysis of this system difficult, we obtain tractable performance approximations under a certain (partial) fluid scaling regime. A key feature of our approximations, proved to be accurate under the fluid limit, is a pseudo-conservation law: the approximate performance of both classes is expressed in terms of the standalone blocking probabilities of the eager schedulers, which are themselves easy to compute in several cases. Further, the accuracy of our approximations in the pre-limit is validated via Monte Carlo simulations.

Finally, we focus on the achievable region of the limiting performance vectors for our system. Remarkably, we are able to obtain an explicit family of Pareto-optimal policies (these resemble threshold policies).

There are natural extensions of our results to models where the eager class exhibits limited patience, such as models with balking and/or reneging. For example, the system might include (limited) waiting room for eager customers. Alternatively, eager customers might respond to the resources allocated based on their patience levels: a) an ${\epsilon}$-customer may decline to enter the system, according to some probabilistic rule that depends on the number of ${\epsilon}$-customers already in the system, as in balking models; or b) an ${\epsilon}$-customer may leave the system after waiting for a random ‘patience time’, as in reneging models. At their core, our proofs require the property that the occupancy process of the eager class gets time-scaled (fast-forwarded) with increasing ${\mu_{\epsilon}}$ under the SFJ scaling. The analyses apply to the above-mentioned generalizations so long as this scaling property holds. For example, in the case of reneging, we would require that the patience time distribution is scaled suitably with ${\mu_{\epsilon}}$ (as is discussed in the context of static scheduling in [4]).

This work also motivates extensions in other directions. One interesting extension would be to the multi-server setting, where the tolerant class is no longer work conserving. Another promising direction is to consider static/dynamic pricing for such heterogeneous service systems. Finally, specializing our models to particular application scenarios, including supermarkets, cognitive radio, and cloud computing environments, would be of independent interest.

Appendix A Sample path coupling and SFJ limit

In this appendix, we detail the sample path coupling that results from the SFJ limit, which is central to our arguments.

We consider a family of queueing systems, parametrized by ${\mu_{\epsilon}}\geq 1$ (the service rate of the ${\epsilon}$-class). Under the SFJ limit, the eager service rate ${\mu_{\epsilon}}\to\infty$ and the eager arrival rate $\lambda_{\epsilon}\rightarrow\infty,$ while the eager load $\rho_{\epsilon}=\lambda_{\epsilon}/\mu_{\epsilon}$ is held constant. Specifically, we scale the job size distribution of the eager class as $B^{\mu_{\epsilon}}_{\epsilon}\stackrel{d}{=}B^{1}_{\epsilon}/\mu_{\epsilon}$, where $B^{\mu_{\epsilon}}_{\epsilon}$ denotes a generic eager job size at scale $\mu_{\epsilon}$ and $\stackrel{d}{=}$ denotes equality in distribution.

Consider now this family of queueing systems, immediately following an arrival/departure from the tolerant queue. Specifically, suppose that the arrival/departure resulted in $j$ jobs remaining in the tolerant queue. We couple the sample paths across different values of the scale parameter ${\mu_{\epsilon}}$ as follows. Let $A_{{\epsilon},k}^{\mu_{\epsilon}}$ and $B_{{\epsilon},k}^{\mu_{\epsilon}}$ be, respectively, the $k$th inter-arrival time and the $k$th job size of the eager class under sub-policy $j$ (for notational convenience, the dependence on $j$ is suppressed here). We relate these quantities sample-path-wise to the $\mu_{\epsilon}=1$ system as below:

[TABLE]

Note that this coupling is consistent with the SFJ scaling: $A_{{\epsilon},k}^{\mu_{\epsilon}}$ is indeed exponentially distributed with mean $1/\rho_{{\epsilon}}{\mu_{\epsilon}}$ given that $A_{{\epsilon},k}^{1}$ is exponentially distributed with mean $1/\rho_{{\epsilon}}.$ Similarly, note that we have $B_{{\epsilon},k}^{\mu_{\epsilon}}\stackrel{{\scriptstyle d}}{{=}}B_{{\epsilon}}^{\mu_{\epsilon}}\stackrel{{\scriptstyle d}}{{=}}\frac{B_{{\epsilon}}^{1}}{\mu_{\epsilon}},$ as required.
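The coupling can be illustrated with a minimal sketch, assuming (as the consistency check above suggests) that the scale-$\mu_{\epsilon}$ quantities are obtained by dividing the scale-1 inter-arrival times and job sizes by $\mu_{\epsilon}$ on the same sample path. The function names `base_path`, `couple_to_scale` and `empirical_load` are hypothetical and introduced only for this illustration.

```python
import random


def base_path(rho_eps, n_jobs, size_sampler, seed=0):
    """Sample the scale-1 system: inter-arrival times are Exp(rate=rho_eps),
    job sizes are drawn from the user-supplied size_sampler(rng)."""
    rng = random.Random(seed)
    inter_arrivals = [rng.expovariate(rho_eps) for _ in range(n_jobs)]
    sizes = [size_sampler(rng) for _ in range(n_jobs)]
    return inter_arrivals, sizes


def couple_to_scale(inter_arrivals, sizes, mu_eps):
    """Assumed coupling: reuse the scale-1 sample path, dividing every
    inter-arrival time and every job size by mu_eps."""
    return ([a / mu_eps for a in inter_arrivals],
            [b / mu_eps for b in sizes])


def empirical_load(inter_arrivals, sizes):
    """Total work brought in divided by total elapsed time."""
    return sum(sizes) / sum(inter_arrivals)


if __name__ == "__main__":
    a1, b1 = base_path(0.8, 10_000, lambda rng: rng.expovariate(1.0))
    a10, b10 = couple_to_scale(a1, b1, 10.0)
    # The offered load is unchanged, while time runs 10 times faster.
    print(empirical_load(a1, b1), empirical_load(a10, b10))
```

Because both sequences are divided by the same factor, the empirical load of the coupled path is identical at every scale, which is the sense in which the scaling holds the offered load constant.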

An immediate consequence of (16) is that the occupancy process of the eager class at scale ${\mu_{\epsilon}}$ can be viewed as a ${\mu_{\epsilon}}$-time-scaled (or fast-forwarded) version of the occupancy process at scale 1; see [4] for more details. In particular, let $\mathcal{B}_{j,k}^{\mu_{\epsilon}}$ and $\mathcal{S}_{j,k}^{\mu_{\epsilon}}$ denote the length of the $k$th ${\epsilon}$-busy cycle (the interval between the starts of two successive ${\epsilon}$-busy periods) and the total service available for the ${\tau}$-class during the $k$th ${\epsilon}$-busy cycle, respectively, when the tolerant queue occupancy equals $j.$ It now follows that

[TABLE]

We conclude by describing another consequence of the sample path coupling (16) described above. Let $N_{A}^{\mu_{\epsilon}}([a,b])$ represent the number of ${\epsilon}$-arrivals in the time interval $[a,b]$ and let $N_{B_{j}}^{\mu_{\epsilon}}([a,b])$ represent the number of drops (i.e., the ${\epsilon}$-customers that exit the system without service) in the interval $[a,b],$ when sub-policy $j$ is used in a ${\tau}$-static manner, at scale $\mu_{\epsilon}$. More compactly, let $N_{B_{j}}^{{\mu_{\epsilon}}}(t)$ and $N_{A}^{\mu_{\epsilon}}(t)$ represent $N_{B_{j}}^{{\mu_{\epsilon}}}([0,t])$ and $N_{A}^{\mu_{\epsilon}}([0,t]),$ respectively. Then, by construction:

[TABLE]

and a similar equality holds for subsequent ${\epsilon}$-busy cycles as well. Consequences of this kind play a central role in deriving the results of the next two appendices.

Appendix B Tolerant Performance

In this appendix, we provide proofs of the results stated in Section 3, corresponding to the performance evaluation of the tolerant class.

This appendix is organized as follows. In Section B.1, we prove a technical result that is used to prove Lemma 1. The proofs of Lemmas 1 and 2 are provided in Sections B.2 and B.3, respectively. The proofs of Theorems 5, 1 and 2 are provided in Sections B.4, B.5, and B.6, respectively.

B.1 Bounding the second moment of ${\epsilon}$ -busy cycles

To prove Lemma 1, we need the following technical lemma, which states that the second moments of the eager busy cycles $\mathcal{B}_{j},$ corresponding to ${\tau}$-occupancy $j$ and scale ${\mu_{\epsilon}}=1,$ are uniformly bounded from above over sub-policies $j.$

Lemma 3.

Under Assumptions A.1-3, Assumption A.4 holds, i.e., there exists a constant $\mathcal{M}$ such that $E\left[\mathcal{B}_{j}^{2}\right]\leq\mathcal{M}$ for any ${\epsilon}$ sub-policy $j.$

Proof.

Note that busy cycle $\mathcal{B}_{j}\stackrel{{\scriptstyle d}}{{=}}O_{j}+I_{j},$ where $O_{j}$ denotes the busy period, and $I_{j}$ the idle period between two successive busy periods. Moreover, $I_{j}$ is exponentially distributed with mean $1/\lambda_{\epsilon}.$ Thus, to prove the statement of the lemma, it suffices to prove that $E\left[O_{j}^{2}\right]$ is bounded from above uniformly over sub-policies $j.$ This is proved as follows. The busy period $O_{j}$ under any sub-policy $j$ can be upper bounded by the busy period of an $M/G/\infty$ system with arrival rate $\lambda_{\epsilon}$ and job size $X:\stackrel{{\scriptstyle d}}{{=}}B_{\epsilon}/c_{\min}.$ Indeed, note that since any admitted eager job of size $B_{\epsilon}$ must be served at a minimum rate of $c_{\min},$ $X=B_{\epsilon}/c_{\min}$ is an upper bound on the residence time of the job in the system. The statement of the lemma now follows from Lemma 4 below, which provides an upper bound on the second moment of an $M/G/\infty$ busy period. ∎

Lemma 4.

Consider an $M/G/\infty$ queue with arrival rate $\lambda,$ with a generic job size denoted by $X$ . Let $O$ denote a generic busy period of this system. Then

[TABLE]

where $\theta:=\lambda E\left[X\right].$

Proof.

Given a non-negative random variable $Y$ having finite expectation, let $Y^{*}$ denote its forward excess, i.e., $Y^{*}$ is a non-negative random variable satisfying

[TABLE]

Note that $E\left[Y^{*}\right]=\frac{E\left[Y^{2}\right]}{2E\left[Y\right]}.$
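For completeness, the identity can be checked in one line, assuming the standard definition of the forward excess, $P\left(Y^{*}>y\right)=\frac{1}{E\left[Y\right]}\int_{y}^{\infty}P\left(Y>x\right)dx$ (this is our reading of the defining display above). The derivation below only uses Fubini's theorem and the tail formula $E\left[Y^{2}\right]=2\int_{0}^{\infty}xP\left(Y>x\right)dx$.

```latex
\begin{align*}
E\left[Y^{*}\right]
  &= \int_{0}^{\infty} P\left(Y^{*}>y\right)\,dy
   = \frac{1}{E\left[Y\right]}\int_{0}^{\infty}\int_{y}^{\infty} P\left(Y>x\right)\,dx\,dy \\
  &= \frac{1}{E\left[Y\right]}\int_{0}^{\infty} x\,P\left(Y>x\right)\,dx
   = \frac{E\left[Y^{2}\right]}{2E\left[Y\right]}.
\end{align*}
```

As a sanity check, if $Y$ is exponentially distributed with rate $\gamma$, then $E\left[Y^{2}\right]=2/\gamma^{2}$ and $E\left[Y\right]=1/\gamma$, so $E\left[Y^{*}\right]=1/\gamma$, as memorylessness requires.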

Now, returning to the $M/G/\infty$ queue, let $K:=1-e^{-\theta}.$ Define a non-negative random variable $U$ , which is distributed as below:

[TABLE]

Finally, let $M$ denote a geometric random variable such that

[TABLE]

The following representation for the forward excess of the $M/G/\infty$ busy period was established by Makowski [15]:

[TABLE]

where $\{U_{i}\}_{i\geq 1}$ is an i.i.d. sequence of random variables distributed as $U$ independent of $M.$

From (20), we have

[TABLE]

which yields

[TABLE]

Here, we have used $E\left[O\right]=\frac{K}{\lambda(1-K)},$ which is also proved in [15].
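The mean busy period formula can be checked numerically. The following is a minimal sketch, not part of the original analysis: it simulates $M/G/\infty$ busy periods directly and compares the sample mean with $K/(\lambda(1-K))=(e^{\theta}-1)/\lambda$. The function names and the choice of exponential job sizes in the usage example are assumptions made for illustration.

```python
import math
import random


def mean_busy_period(lam, mean_size):
    """Closed form E[O] = K / (lam * (1 - K)) with K = 1 - exp(-theta)."""
    theta = lam * mean_size
    K = 1.0 - math.exp(-theta)
    return K / (lam * (1.0 - K))


def simulate_busy_period(lam, size_sampler, rng):
    """One M/G/infinity busy period: it starts with an arrival at time 0 and
    ends when the system empties, i.e. when the latest departure time of all
    jobs admitted so far passes before the next arrival."""
    end = size_sampler(rng)   # departure time of the first job
    t = 0.0
    while True:
        t += rng.expovariate(lam)          # next arrival epoch
        if t >= end:                       # system emptied before it
            return end
        end = max(end, t + size_sampler(rng))


def estimate_mean_busy_period(lam, size_sampler, n=50_000, seed=1):
    rng = random.Random(seed)
    total = sum(simulate_busy_period(lam, size_sampler, rng) for _ in range(n))
    return total / n


if __name__ == "__main__":
    lam = 1.0
    est = estimate_mean_busy_period(lam, lambda rng: rng.expovariate(1.0))
    print(est, mean_busy_period(lam, 1.0))
```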

We upper bound $E\left[U\right]$ as follows. Using the Markov inequality, $P\left(X^{*}>x\right)\leq\frac{E\left[X^{2}\right]}{xE\left[X\right]}.$ It then follows that $\theta P\left(X^{*}>x\right)\leq 2$ for $x\geq\beta:=\frac{\lambda E\left[X^{2}\right]}{2}.$ Now,

[TABLE]

The bounding in $(a)$ uses the inequality $e^{x}-1\leq\gamma x$ for $x\in[0,2],$ with $\gamma:=\frac{e^{2}-1}{2}$ (by convexity of $e^{x}$ ). Finally, combining (21) and (22) gives us the statement of the lemma. ∎
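The chord inequality used in step $(a)$ is easy to verify numerically. The sketch below (with hypothetical helper names) evaluates the gap $\gamma x-(e^{x}-1)$ on a grid over $[0,2]$; it is non-negative there, vanishes at both endpoints, and becomes negative just outside the interval.

```python
import math

# Slope of the chord of e^x - 1 between x = 0 and x = 2.
GAMMA = (math.exp(2.0) - 1.0) / 2.0


def chord_bound_gap(x):
    """gamma * x - (e^x - 1); non-negative exactly when the bound holds."""
    return GAMMA * x - (math.exp(x) - 1.0)


def min_gap_on_grid(n=2001):
    """Smallest gap over an evenly spaced grid on [0, 2]."""
    return min(chord_bound_gap(2.0 * k / (n - 1)) for k in range(n))


if __name__ == "__main__":
    print(min_gap_on_grid())       # non-negative up to rounding
    print(chord_bound_gap(2.5))    # negative: the bound fails beyond x = 2
```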

B.2 Proof of Lemma 1

Recall that, when ${\tau}$ -occupancy equals $j$ (for any given $j$ ), the backward transition probability of the tolerant birth-death chain (at scale ${\mu_{\epsilon}}$ ) is given by $q_{j}^{\mu_{\epsilon}}$ . We aim to show the convergence of these transition probabilities, uniformly over $j$ , under the SFJ limit. The idea is to develop (for each $j$ ) a lower and an upper bound on $q_{j}^{\mu_{\epsilon}}$ which uniformly converge to the same constant given by Equation (5).

Consider the evolution of the tolerant queue, starting from an instant when the ${\tau}$-occupancy changed, via an arrival or a departure, to $j.$ Let $A_{\tau}$ denote the time until the next ${\tau}$-arrival, let $\mathcal{B}_{j,k}^{{\mu_{\epsilon}}}$ denote the $k$th ${\epsilon}$-busy cycle and let $\mathcal{S}_{j,k}^{{\mu_{\epsilon}}}$ denote the total amount of service available for the tolerant class during the $k$th ${\epsilon}$-busy cycle, at scale ${\mu_{\epsilon}}$.

Lower bound: Consider any eager sub-policy $j$. Define $N:=\min\{n\geq 1\ |\ \sum_{k=1}^{n}\mathcal{B}_{j,k}^{{\mu_{\epsilon}}}>A_{\tau}\}.$ Note that $N$ is the index of the ${\epsilon}$-busy cycle in which the next ${\tau}$-arrival occurs. Since $A_{\tau}$ is exponentially distributed, $N$ is a geometric random variable with parameter $P\left(\mathcal{B}_{j}^{{\mu_{\epsilon}}}>A_{\tau}\right).$ Further, define $\bar{q}_{j}:=P\left(\sum_{k=1}^{N-1}\mathcal{S}_{j,k}^{{\mu_{\epsilon}}}>B_{{\tau}}\right).$ Note that $\sum_{k=1}^{N-1}\mathcal{S}_{j,k}^{{\mu_{\epsilon}}},$ which is the service received by the ${\tau}$-queue in the first $N-1$ ${\epsilon}$-busy cycles, is a lower bound on the total service received by the ${\tau}$-queue until the next ${\tau}$-arrival. Thus, the event $\sum_{k=1}^{N-1}\mathcal{S}_{j,k}^{{\mu_{\epsilon}}}>B_{{\tau}}$ implies that a ${\tau}$-departure would precede the next ${\tau}$-arrival; this yields the bound $q^{{\mu_{\epsilon}}}_{j}\geq\bar{q}_{j}.$ Moreover, the lower bound $\bar{q}_{j}$ on $q_{j}^{\mu_{\epsilon}}$ can be represented as in (23) below.

Footnote 7: Note that ${\bar{q}}_{j}=E\left[{\cal E}\right]$, where ${\cal E}:=1_{\left\{\sum_{k=1}^{N-1}\mathcal{S}_{j,k}^{{\mu_{\epsilon}}}>B_{{\tau}}\right\}}$ is the indicator of the event defining $\bar{q}_{j}$. Condition on the events of the first ${\epsilon}$-busy cycle:

•

if $\mathcal{S}_{j,1}^{{\mu_{\epsilon}}}>B_{\tau}$ and $\ \mathcal{B}_{j,1}^{{\mu_{\epsilon}}}<A_{\tau}$ then ${\cal E}=1$ ; otherwise

•

if $\ \mathcal{B}_{j,1}^{{\mu_{\epsilon}}}>A_{\tau}$ then ${\cal E}=0$ ; or

•

if $\ \mathcal{B}_{j,1}^{{\mu_{\epsilon}}}<A_{\tau}$ and $\mathcal{S}_{j,1}^{{\mu_{\epsilon}}}<B_{\tau}$ , then the conditional probability of ${\cal E}$ equals its unconditional probability by the memorylessness of $A_{\tau}$ and $B_{\tau}$ .

And thus,

$\bar{q}_{j}=P\left(\mathcal{S}^{{\mu_{\epsilon}}}_{j,1}>B_{\tau},\ \mathcal{B}^{{\mu_{\epsilon}}}_{j,1}<A_{\tau}\right)+\bar{q}_{j}P(\mathcal{B}_{j,1}^{{\mu_{\epsilon}}}<A_{\tau},\ \mathcal{S}_{j,1}^{{\mu_{\epsilon}}}<B_{\tau}),$

which implies (23) given that

$1-P(\mathcal{B}_{j,1}^{{\mu_{\epsilon}}}<A_{\tau},\ \mathcal{S}_{j,1}^{{\mu_{\epsilon}}}<B_{\tau})=P\left(\mathcal{S}^{{\mu_{\epsilon}}}_{j,1}>B_{\tau},\ \mathcal{B}^{{\mu_{\epsilon}}}_{j,1}<A_{\tau}\right)+P\left(\mathcal{B}^{{\mu_{\epsilon}}}_{j,1}>A_{\tau}\right).$

[TABLE]
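The one-step conditioning argument of the footnote can be checked on a toy instance. The sketch below is an illustration under a simplifying assumption that is not made in the paper: every ${\epsilon}$-busy cycle has a deterministic length `b` and leaves a deterministic amount of service `s` (with `s <= b`) for the tolerant class. In that case $N-1=\lfloor A_{\tau}/b\rfloor$ is geometric, so $\bar{q}_{j}$ can also be computed directly as a series and compared with the fixed-point expression $P(\mathcal{S}>B_{\tau},\mathcal{B}<A_{\tau})/\left(1-P(\mathcal{B}<A_{\tau},\mathcal{S}<B_{\tau})\right)$. All function names are hypothetical.

```python
import math


def cycle_probabilities(b, s, lam_tau, mu_tau):
    """Probabilities of the three first-cycle outcomes for deterministic
    cycle length b and per-cycle tolerant service s."""
    x = math.exp(-lam_tau * b)       # P(A_tau > b): no tau-arrival in the cycle
    y = math.exp(-mu_tau * s)        # P(B_tau > s): tau-job not finished
    p_depart = (1.0 - y) * x         # P(S > B_tau, B < A_tau)
    p_arrive = 1.0 - x               # P(B > A_tau)
    p_neither = x * y                # P(B < A_tau, S < B_tau)
    return p_depart, p_arrive, p_neither


def q_bar_fixed_point(b, s, lam_tau, mu_tau):
    """Solve q = p_depart + q * p_neither for q."""
    p_depart, _, p_neither = cycle_probabilities(b, s, lam_tau, mu_tau)
    return p_depart / (1.0 - p_neither)


def q_bar_direct(b, s, lam_tau, mu_tau, terms=2000):
    """Sum over m = N - 1 completed cycles: P(m) = x^m (1 - x), and the
    event needs m * s > B_tau, which has probability 1 - y^m."""
    x = math.exp(-lam_tau * b)
    y = math.exp(-mu_tau * s)
    return sum((x ** m) * (1.0 - x) * (1.0 - y ** m) for m in range(terms))


if __name__ == "__main__":
    args = (0.5, 0.3, 1.0, 2.0)
    print(q_bar_fixed_point(*args), q_bar_direct(*args))
```

Since the three outcome probabilities sum to one, the denominator $1-P(\mathcal{B}<A_{\tau},\mathcal{S}<B_{\tau})$ equals $P(\mathcal{S}>B_{\tau},\mathcal{B}<A_{\tau})+P(\mathcal{B}>A_{\tau})$, which is the form used in the footnote.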

By Lemma 5 below, we have uniform convergence (here, $\varpi({\mu_{\epsilon}})\to 0$ as ${\mu_{\epsilon}}\to\infty$ u.f. in $j$ ):

[TABLE]

where $\mathcal{B}_{j}$ is a typical ${\epsilon}$-busy cycle and $\mathcal{S}_{j}$ is the (typical) unused service left behind by the eager class in one ${\epsilon}$-busy cycle, both with ${\mu_{\epsilon}}=1$ and under sub-policy $j$. Using the above, and because $\nu_{j}=\frac{E\left[\mathcal{S}_{j}\right]}{E\left[\mathcal{B}_{j}\right]}$ (by the renewal reward theorem, applied with the ${\epsilon}$-busy cycles as renewal cycles and the amount of service available to the ${\tau}$-class in each busy cycle as the reward), we get from equation (23) that the lower bound on $q_{j}^{\mu_{\epsilon}}$ satisfies

[TABLE]
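To see the convergence concretely, the sketch below evaluates the lower bound at increasing scales for a toy instance. It assumes deterministic scale-1 cycles of length `b` with tolerant service `s` per cycle (so that $\nu_{j}=s/b$), and it assumes that the limit takes the form $\mu_{\tau}\nu_{j}/(\lambda_{\tau}+\mu_{\tau}\nu_{j})$, which is what the two limits of Lemma 5 imply when substituted into the ratio; neither assumption is taken from the paper's displays.

```python
import math


def q_bar_at_scale(mu_eps, b, s, lam_tau, mu_tau):
    """Lower bound at scale mu_eps for deterministic cycles b / mu_eps that
    each leave s / mu_eps units of service to the tolerant class."""
    p_arrive = -math.expm1(-lam_tau * b / mu_eps)      # P(B^mu > A_tau)
    p_finish = -math.expm1(-mu_tau * s / mu_eps)       # P(S^mu > B_tau)
    p_depart = p_finish * (1.0 - p_arrive)             # P(S^mu > B_tau, B^mu < A_tau)
    return p_depart / (p_depart + p_arrive)


def q_limit(nu, lam_tau, mu_tau):
    """Assumed limiting backward transition probability."""
    return mu_tau * nu / (lam_tau + mu_tau * nu)


if __name__ == "__main__":
    b, s, lam_tau, mu_tau = 0.5, 0.3, 1.0, 2.0
    target = q_limit(s / b, lam_tau, mu_tau)
    for mu_eps in (1.0, 10.0, 100.0, 1e6):
        print(mu_eps, q_bar_at_scale(mu_eps, b, s, lam_tau, mu_tau), target)
```

In this toy instance the gap to the limit shrinks roughly like $1/\mu_{\epsilon}$, in line with the $O(1/\mu_{\epsilon})$ error terms of Lemma 5.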

Upper bound: Next, we will prove the u.f. convergence of an upper bound on the transition probabilities $q^{{\mu_{\epsilon}}}_{j}$ , to the same limit. Since $\mathcal{B}^{\mu_{\epsilon}}_{j}\geq\mathcal{S}^{\mu_{\epsilon}}_{j}$ , one can upper bound $q_{j}^{\mu_{\epsilon}}$ as follows:

[TABLE]

In view of Lemma 6 below, we have as ${\mu_{\epsilon}}\to\infty$ ,

[TABLE]

Therefore the upper bound on $q_{j}^{\mu_{\epsilon}}$ ,

[TABLE]

This proves that the lower and upper bounds converge to the same limit, uniformly across all $j$, which establishes the statement of the lemma. ∎

Lemma 5.

*We have:

i) $|{\mu_{\epsilon}}P\left(\mathcal{B}^{{\mu_{\epsilon}}}_{j}>A_{\tau}\right)-\lambda_{\tau}E\left[\mathcal{B}_{j}\right]|\leq\frac{\lambda_{{\tau}}^{2}E\left[\mathcal{B}^{2}_{j}\right]}{2{\mu_{\epsilon}}},$ and

ii) $|{\mu_{\epsilon}}P\left(\mathcal{S}^{{\mu_{\epsilon}}}_{j}>B_{\tau},\ \mathcal{B}^{{\mu_{\epsilon}}}_{j}<A_{\tau}\right)-\mu_{\tau}E\left[\mathcal{S}_{j}\right]|\leq\frac{(\lambda_{\tau}+\mu_{\tau})^{2}E\left[\mathcal{B}^{2}_{j}\right]}{{\mu_{\epsilon}}}.$

That is, there exists a function $\varpi({\mu_{\epsilon}})$ (which is independent of $j,$ and is finite by Lemma 3), such that $\varpi({\mu_{\epsilon}})\to 0$ as ${\mu_{\epsilon}}\to\infty$ and for each $j$ ,*

[TABLE]

Proof.

Proof of $(i)$: Note that, for any random variable $Y,$ the moment generating function (MGF) is given by $M_{Y}(s):=E\left[e^{sY}\right].$ We use the following bounds repeatedly in our proof. Suppose that the random variable $X$ is exponentially distributed with mean $1/\gamma,$ and the random variable $Y$ is non-negative and independent of $X.$ Then

[TABLE]

The equality above follows by conditioning on $Y,$ and the inequalities are a consequence of the following bounds on the MGF: for $s\geq 0,$

[TABLE]

Now, since $A_{\tau}$ is exponentially distributed and because $\mathcal{B}_{j}^{\mu_{\epsilon}}\stackrel{{\scriptstyle d}}{{=}}\mathcal{B}_{j}^{1}/\mu_{{\epsilon}}=\mathcal{B}_{j}/\mu_{{\epsilon}}$ (see (17)), we have, using (24),

[TABLE]

Using this inequality, and invoking (24) again,

[TABLE]

This completes the proof of Part $(i)$ .
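Part $(i)$ can be sanity-checked in closed form on a toy instance. The sketch below assumes, purely for illustration, that the scale-1 busy cycle $\mathcal{B}_{j}$ is exponentially distributed with rate `beta` (real busy cycles are not exponential); then $P(\mathcal{B}_{j}/\mu_{\epsilon}>A_{\tau})=1-E\left[e^{-\lambda_{\tau}\mathcal{B}_{j}/\mu_{\epsilon}}\right]$ has an explicit form and both sides of the bound can be compared. The function names are hypothetical.

```python
def scaled_crossing_prob(mu_eps, lam_tau, beta):
    """mu_eps * P(B / mu_eps > A_tau) for B ~ Exp(beta), A_tau ~ Exp(lam_tau),
    using 1 - E[exp(-s B)] = s / (beta + s) with s = lam_tau / mu_eps."""
    s = lam_tau / mu_eps
    return mu_eps * s / (beta + s)


def gap_and_bound(mu_eps, lam_tau, beta):
    """Left-hand side and right-hand side of the bound in part (i)."""
    mean_b = 1.0 / beta
    second_moment_b = 2.0 / beta ** 2
    gap = abs(scaled_crossing_prob(mu_eps, lam_tau, beta) - lam_tau * mean_b)
    bound = lam_tau ** 2 * second_moment_b / (2.0 * mu_eps)
    return gap, bound


if __name__ == "__main__":
    for mu_eps in (1.0, 10.0, 100.0, 1000.0):
        print(mu_eps, *gap_and_bound(mu_eps, lam_tau=0.7, beta=1.5))
```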

Proof of (ii): Again, since $A_{\tau}$ is exponentially distributed, using (24), we have

[TABLE]

Since $\mathcal{B}^{\mu_{\epsilon}}_{j}\geq\mathcal{S}^{\mu_{\epsilon}}_{j}$, we have

[TABLE]

Because $A_{\tau}$ and $B_{\tau}$ are exponentially distributed, the second term of the above inequality becomes (using the lower and upper bounds on $M_{\mathcal{B}_{j}}$ as before):

[TABLE]

The result follows from (25), (26), and (27):

[TABLE]

This completes the proof of the lemma. ∎

Lemma 6.

The probability $P\left(\mathcal{B}^{{\mu_{\epsilon}}}_{j}>B_{\tau}\ |\ \mathcal{B}^{{\mu_{\epsilon}}}_{j}>A_{\tau}\right)$ converges to 0 u.f. over $j$ , under the SFJ limit.

Proof.

Using arguments similar to those in the proof of Lemma 5 (see equation (24)), we can prove that

[TABLE]

From this and (27) we will have,

[TABLE]

The statement of the lemma now follows by the uniform bound on $E\left[\mathcal{B}^{2}_{j}\right]$ (Lemma 3). ∎

B.3 Proof of Lemma 2

By Lemma 7 $(iii)$ below, for any $\varepsilon>0$ , there exists a $\bar{\mu}>0$ such that for ${\mu_{\epsilon}}>\bar{\mu},$

[TABLE]

where the last inequality is shown in the proof of Lemma 7 $(iv)$. The result then follows by standard textbook arguments (e.g., [14]). ∎

Lemma 7.

*Assume the hypothesis of Lemma 1. Then the following hold under the SFJ limit:

(i) $\frac{1}{q_{i}^{\mu_{\epsilon}}}\stackrel{{\scriptstyle{\mu_{\epsilon}}\to\infty}}{{\longrightarrow}}\frac{1}{q_{i}^{\infty}}$ , u.f. over $i$ ,

(ii) $\frac{p_{i}^{\mu_{\epsilon}}}{q_{i}^{\mu_{\epsilon}}}\stackrel{{\scriptstyle{\mu_{\epsilon}}\to\infty}}{{\longrightarrow}}\frac{p_{i}^{\infty}}{q_{i}^{\infty}}$ , u.f. over $i$ ,

(iii) $\sum_{k=1}^{\infty}\frac{p_{0}^{\mu_{\epsilon}}p_{1}^{\mu_{\epsilon}}\cdots p_{k-1}^{\mu_{\epsilon}}}{q_{1}^{\mu_{\epsilon}}q_{2}^{\mu_{\epsilon}}\cdots q_{k}^{\mu_{\epsilon}}}\stackrel{{\scriptstyle{\mu_{\epsilon}}\to\infty}}{{\longrightarrow}}\sum_{k=1}^{\infty}\frac{p_{0}^{\infty}p_{1}^{\infty}\cdots p_{k-1}^{\infty}}{q_{1}^{\infty}q_{2}^{\infty}\cdots q_{k}^{\infty}}$ .

(iv) $\left(\sum_{k=1}^{\infty}\frac{p_{0}^{\mu_{\epsilon}}p_{1}^{\mu_{\epsilon}}\cdots p_{k-1}^{\mu_{\epsilon}}}{q_{1}^{\mu_{\epsilon}}q_{2}^{\mu_{\epsilon}}\cdots q_{k}^{\mu_{\epsilon}}}\right)^{-1}\stackrel{{\scriptstyle{\mu_{\epsilon}}\to\infty}}{{\longrightarrow}}\left(\sum_{k=1}^{\infty}\frac{p_{0}^{\infty}p_{1}^{\infty}\cdots p_{k-1}^{\infty}}{q_{1}^{\infty}q_{2}^{\infty}\cdots q_{k}^{\infty}}\right)^{-1}$ .*

Proof.

By Lemma 1,

[TABLE]

uniformly $(u.f.)$ over all the sub-policies $i$ . Also, note that $0<p_{i}^{\mu_{\epsilon}},q_{i}^{\mu_{\epsilon}}<1$ .

Therefore, for every $\varepsilon>0,\ \exists\ \bar{\mu}$ such that, for all sub-policies $i$ , whenever $\mu_{\epsilon}>\bar{\mu}$ , we have:

[TABLE]

Proof of (i): Using equation (28), for every $\varepsilon>0$ , there exists a $\bar{\mu}$ such that,

[TABLE]

Since the ${\tau}$-system is stable for all $i$, i.e., $\frac{\lambda_{\tau}}{\nu_{i}\mu_{\tau}}<1-\delta_{\tau}$ for all $i$ (Assumption B.4), the following is true:

[TABLE]

Using equation (30) in (29) we get, for every $\varepsilon>0$ , there exists a $\bar{\mu}$ such that:

[TABLE]

This implies, $\frac{1}{q_{i}^{\mu_{\epsilon}}}\stackrel{{\scriptstyle{\mu_{\epsilon}}\to\infty}}{{\longrightarrow}}\frac{1}{q_{i}^{\infty}}$ uniformly over sub-policies $i$ .

Proof of (ii): Using equation (28), for every $\varepsilon>0$ , there exists a $\bar{\mu}$ such that, for all ${\mu_{\epsilon}}\geq\bar{\mu}$ ,

[TABLE]

Above, the last inequality follows from equation (30). This implies $\frac{p_{i}^{\mu_{\epsilon}}}{q_{i}^{\mu_{\epsilon}}}\stackrel{{\scriptstyle{\mu_{\epsilon}}\to\infty}}{{\longrightarrow}}\frac{p_{i}^{\infty}}{q_{i}^{\infty}}$ $u.f.$ over $i$.

Proof of (iii): From part $(ii)$ and Assumption B.4, for every $\varepsilon>0$, there exists a $\bar{\mu}$ such that, for each ${\mu_{\epsilon}}>\bar{\mu}$,

[TABLE]

By appropriate choice of $\varepsilon>0$ , such that $\varepsilon+1-\delta_{\tau}<1$ , we have (note that $p_{0}^{\mu_{\epsilon}}=1$ ),

[TABLE]

Consider the following summation,

[TABLE]

Therefore, by the dominated convergence theorem (DCT) for series (note that each of the terms inside the series converges by part $(ii)$), we have

[TABLE]

Proof of (iv): It suffices to show that the limit in part $(iii)$ lies in $\left(0,\infty\right).$ The limit is clearly positive as it is the summation of strictly positive terms. The finiteness of the limit follows from the DCT as in the proof of part $(iii).$ ∎

B.4 Proof of Theorem 5

For any $i\geq 0$, $p_{i}^{\mu_{\epsilon}}=P(A_{\tau}\leq\Upsilon_{i}^{\mu_{\epsilon}}(B_{\tau}))$ denotes the probability that a ${\tau}$-arrival occurs before a ${\tau}$-departure in the corresponding ${\mu_{\epsilon}}$-system, when the ${\tau}$-occupancy equals $i$.

From equation (7), the stationary distribution of the tolerant embedded BD chain is given by,

[TABLE]

By Lemma 7,

[TABLE]

and

[TABLE]

Thus as ${\mu_{\epsilon}}\to\infty$ , the stationary distribution of the embedded tolerant BD chain converges to the stationary distribution of the embedded chain corresponding to the limit SDSR-M/M/1 queue. Thus we have weak convergence of the stationary distributions. ∎
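The limiting embedded chain is an ordinary birth-death chain, so its stationary distribution can be computed from the product form appearing in Lemma 7 $(iii)$. The sketch below is an illustration under stated assumptions: it takes $p_{0}=1$, assumes the limiting probabilities have the form $p_{i}^{\infty}=\lambda_{\tau}/(\lambda_{\tau}+\mu_{\tau}\nu_{i})$ and $q_{i}^{\infty}=1-p_{i}^{\infty}$ for $i\geq 1$, and truncates the state space at a finite level. The function names and the truncation are not from the paper.

```python
def limit_transition_probs(lam_tau, mu_tau, nu, n_states):
    """Assumed limiting up/down probabilities for states 0..n_states.
    nu(i) is the fraction of capacity left to the tolerant class in state i."""
    p = [1.0]            # from an empty queue the next event is an arrival
    q = [0.0]
    for i in range(1, n_states + 1):
        rate_down = mu_tau * nu(i)
        p.append(lam_tau / (lam_tau + rate_down))
        q.append(rate_down / (lam_tau + rate_down))
    return p, q


def embedded_stationary(p, q):
    """Stationary distribution of the (truncated) birth-death chain via
    detailed balance: pi[k] * q[k] = pi[k-1] * p[k-1]."""
    weights = [1.0]
    for k in range(1, len(q)):
        weights.append(weights[-1] * p[k - 1] / q[k])
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # With nu(i) = 1 for all i this is the embedded chain of a plain M/M/1.
    p, q = limit_transition_probs(1.0, 2.0, lambda i: 1.0, 200)
    pi = embedded_stationary(p, q)
    print(pi[:4])
```

Stability (the ratio $p_{i}/q_{i}$ staying below $1-\delta_{\tau}$) is what makes the weights summable, so the truncation level only needs to be large enough for the geometric tail to be negligible.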

B.5 Proof of Theorem 1

The statement of Theorem 1 follows directly from Theorem 5 invoking the following lemma, which relates the steady state distribution of a continuous time queue with the stationary distribution of the discrete-time embedded Markov chain obtained by sampling the queue-occupancy at arrival and departure epochs.

Lemma 8.

Consider a stable queueing system with Poisson job arrivals and no simultaneous departures (i.e., jobs depart one at a time with probability 1), such that the time-average distribution of queue occupancy $\pi=\{\pi_{i}\}_{i\geq 0}$ is well defined, i.e.,

[TABLE]

where $X(t)$ denotes the queue occupancy at time $t.$ Let $\tilde{\pi}=\{\tilde{\pi}_{i}\}_{i\geq 1}$ denote the (discrete-time) time-average distribution of the queue occupancy sampled just following arrival/departure epochs. Then

[TABLE]

Proof: Let $X_{n}$ denote the queue occupancy just following the $n$ th arrival/departure epoch. Let $X^{A}_{n}$ (respectively, $X^{D}_{n}$ ) denote the queue occupancy just before the $n$ th arrival (respectively, just following the $n$ th departure). By PASTA,

[TABLE]

Moreover, since the time-average seen by arrivals matches the time average seen by departures (see, for example, [12, Chapter 26]),

[TABLE]

Define, for $n\geq 1,$

[TABLE]

For $i\geq 1,$

[TABLE]

Also,

[TABLE]

∎
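The relation between the two distributions can be illustrated on an M/M/1 queue. The sketch below assumes the relation that the PASTA argument yields, namely $\tilde{\pi}_{0}=\pi_{0}/2$ and $\tilde{\pi}_{i}=(\pi_{i}+\pi_{i-1})/2$ for $i\geq 1$ (half of the sampling epochs follow arrivals, which see $\pi$ and then add one job, and half follow departures, which leave behind $\pi$); this is our reading of the lemma, since its displayed statement is not reproduced here. The helper names are hypothetical.

```python
def mm1_time_average(rho, n_states):
    """Time-average occupancy distribution of an M/M/1 queue, truncated."""
    return [(1.0 - rho) * rho ** i for i in range(n_states)]


def embedded_from_time_average(pi):
    """Distribution seen just after arrival/departure epochs, assuming
    tilde_pi[0] = pi[0] / 2 and tilde_pi[i] = (pi[i] + pi[i-1]) / 2."""
    tilde = [0.5 * pi[0]]
    for i in range(1, len(pi)):
        tilde.append(0.5 * (pi[i] + pi[i - 1]))
    return tilde


if __name__ == "__main__":
    pi = mm1_time_average(0.5, 60)
    tilde = embedded_from_time_average(pi)
    # Embedded-chain balance for lam = 1, mu = 2: up prob 1/3, down prob 2/3.
    print(tilde[1] * (2.0 / 3.0), tilde[0] * 1.0)
```

In particular $\pi_{i}\leq 2\tilde{\pi}_{i}$ for every $i$, which is the form in which the relation is used in the proof of Theorem 2.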

B.6 Proof of Theorem 2

Note that, by Lemma 7, for every $\varepsilon>0$ , there exists a $\bar{\mu}$ such that, for all sub-policies $i$ and ${\mu_{\epsilon}}>\bar{\mu}$ ,

[TABLE]

As, for all $i\geq 0$ , $\pi_{i}^{\mu_{\epsilon}}\leq 2\tilde{\pi}_{i}^{\mu_{\epsilon}}$ , we will have, for all ${\mu_{\epsilon}}>\bar{\mu}$ and $i$ ,

[TABLE]

Note that the following summation is finite, as:

[TABLE]

By the above upper bound, DCT can be applied (note by Theorem 1, $\pi_{i}^{\mu_{\epsilon}}\to\pi_{i}$ for all $i$ ), and we have:

[TABLE]

This completes the proof.∎

Appendix C Eager Performance

In this appendix we provide the proofs related to eager performance. These proofs depend extensively upon the coupling details provided in Appendix A.

Proof of Theorem 3: Let $N_{B}^{\mu_{\epsilon}}(t)$ denote the total number of eager customers that are blocked, in the pre-limit, during the time interval $\left[0,t\right]$, and let $N_{A}^{\mu_{\epsilon}}(t)$ be the total number of eager customers that arrive in the same time interval. Moreover, let $N_{B_{j}}^{\mu_{\epsilon}}(t)$ and $N_{A_{j}}^{\mu_{\epsilon}}(t)$ denote, respectively, the total number of eager customers blocked and arrived before time $t$, in the pre-limit, during the time periods in which sub-policy $j$ is in use. Then the overall pre-limit blocking probability of the eager class is given by the following expression (the time limits exist, as shown by Theorem 7 and the arguments given below), which can be split as below:

[TABLE]

Consider the following difference,

[TABLE]

and we are done if we can show that the above difference tends to zero as $\mu_{\epsilon}\to\infty$ . We first consider the second term (number of losses are upper bounded by number of arrivals, $N_{B_{j}}^{\mu_{\epsilon}}(t)\leq N^{\mu_{\epsilon}}_{A_{j}}(t)$ ):

[TABLE]

In the above, the inequality ‘a’ follows from PASTA (Poisson Arrivals See Time Averages). Further, for the limit system

[TABLE]

Therefore, for every $\delta>0$ , there exists $\bar{g}$ , such that for all $j\geq\bar{g}$ ,

[TABLE]

Fix this ${\bar{g}}$ . From Theorem 7 we have,

[TABLE]

and hence there exists a $\bar{\mu_{1}}$ , such that

[TABLE]

Further from Theorem 2,

[TABLE]

and thus there exists a $\bar{\mu_{2}}$ , such that for all ${\mu_{\epsilon}}\geq\bar{\mu_{2}}$ ,

[TABLE]

Choose $\bar{\mu}=\max\left(\bar{\mu_{1}},\bar{\mu_{2}}\right)$. Using first (38) and (40) in (37), and then (37) and (39) in (36), we get that, for every $\delta>0$, there exists $\bar{\mu}$ such that

[TABLE]

Theorem 7.

Fix a sub-policy $j$. Then the time limit $\lim_{t\to\infty}\frac{N_{B_{j}}^{\mu_{\epsilon}}(t)}{N_{A}^{\mu_{\epsilon}}(t)}$ exists almost surely for all $\mu_{\epsilon}\geq{\bar{\mu}}$, where ${\bar{\mu}}$ is given in Lemma 2. Further, we have

[TABLE]

Proof.

For the purpose of almost sure comparison we construct the ${\epsilon}$ -process influenced by dynamic decisions of the scheduling policy depending upon the ${\tau}$ -state in the following manner. This procedure is the natural extension of the procedure used in [4], which is also summarized in Appendix A.

•

For every sub-policy $j$, construct one sample path of the $\epsilon$-evolution (essentially, a sequence of ${\epsilon}$-inter-arrival times and ${\epsilon}$-service times) that spans the entire time axis for $\mu_{\epsilon}=1$ and $\lambda_{\epsilon}=\rho_{\epsilon}$, and define the process for any general $\mu_{\epsilon}<\infty$ and the same sub-policy using the constructed sample path, as explained in Appendix A.

•

Say we begin with the system in ${\tau}$ -state equal to $j$ and ${\epsilon}$ -state equal to 0 (by assumption A.1, ${\epsilon}$ -state is reset to 0 at any ${\tau}$ -transition epoch). Then start ${\epsilon}$ -process of the system with the initial part of ${\epsilon}$ -process corresponding to sub-policy $j$ .

•

We refer to the time duration from the start of an ${\epsilon}$-idle period to the end of the subsequent $\epsilon$-busy period as one $\epsilon$-cycle (note that this ${\epsilon}$-cycle is stochastically the same as the ${\epsilon}$-busy cycle defined in Appendix A). The ${\epsilon}$-process of the system starts with the first full ${\epsilon}$-cycle corresponding to sub-policy $j$, uses some initial number of full cycles (possibly zero), and may also partially use the cycle following those full cycles (until the instant of the first ${\tau}$-transition).

•

Once the sub-policy has to switch due to a ${\tau}$-transition (to state $j-1$ or $j+1$), start using the required part of the ${\epsilon}$-process corresponding to the new sub-policy, always starting from the first ${\epsilon}$-cycle that has not previously been used. This procedure is appropriate in view of Assumption A.1: the existing ${\epsilon}$-customers are dropped at every ${\tau}$-transition and the ${\epsilon}$-process starts afresh.

•

Note that the end part of any interrupted ${\epsilon}$ -cycle (a partial ${\epsilon}$ -cycle), interrupted by a ${\tau}$ -transition, is not used for any further construction. Thus this full ${\epsilon}$ -cycle can be used to upper bound the partial ${\epsilon}$ -cycle just before the ${\tau}$ -transition, when required. Further, this upper bound (cycle) is independent of the other ${\epsilon}$ -cycles.

Recall that, by construction, when a full ${\epsilon}$-cycle is used (notation as in Appendix A and in Theorem 3):

[TABLE]

We consider positive recurrent systems, i.e., systems with $\mu_{\epsilon}\geq{\bar{\mu}}$ , the threshold beyond which the system is stable as given by Lemma 2. For such systems, all the time limits mentioned in this proof exist and this existence is proved along with the rest of the arguments, in the following.

We refer to the time interval from the start of state $j$ until the next forward or backward transition (to state $j+1$ or $j-1$) of the $\tau$-state process $X_{\tau}(\cdot)$ as a $j$-cycle. Let $\Lambda_{j}(t)$ represent the total time consisting of full $\epsilon$-cycles during the time period $[0,t]$ in which the $\tau$-state is $j$. This is composed of the portions of all $j$-cycles that intersect with $[0,t]$, obtained after removing the partial $\epsilon$-cycles at the end of each ${\tau}$-state transition (occurring in those $j$-cycles). Note that, by Assumption A.1, a new ${\epsilon}$-cycle starts at every ${\tau}$-transition epoch. For $j\geq 1$, let

[TABLE]

represent the portion of time spent in $j$-cycles before $t$ and define $\Psi_{j}(t):=F_{j}(t)-\Lambda_{j}(t)$. The portion $\Psi_{j}(t)$ is made up of the residual ${\epsilon}$-cycles that are ongoing at the ends of the ${\tau}$-state transitions in all $j$-cycles before $t$. All the limits mentioned below exist for all $\mu_{\epsilon}\geq{\bar{\mu}}$ (the threshold of Lemma 2), because the process for any such $\mu_{\epsilon}$ is positive recurrent: all the time averages can be written as time averages in an appropriate renewal process, and the renewal reward theorem (RRT) can be applied, thanks to Assumption A.4. For example, the time limit of $N_{\partial_{j}}(t)/{t}$ exists almost surely, using RRT with $j$-cycles as renewal periods and with the reward in a cycle as the expected number of ${\epsilon}$-customers dropped at ${\tau}$-transition epochs with ${\tau}$-state equal to $j$. Thus,

[TABLE]

where $N_{\partial_{j}}(t)$ is the number of ${\epsilon}$-jobs dropped at ${\tau}$-transition epochs with ${\tau}$-state equal to $j$ (see Assumption A.1). We evaluate each of these components in turn. Note that we are considering the case $j\geq 1$. The proof follows in exactly the same way for the $j=0$ state, with the obvious changes (${\tau}$-job sizes need not be considered).

First fraction in the RHS (right hand side) of (41). By the memoryless property of the Poisson process representing the $\epsilon$-arrival process,

[TABLE]

Observe that $\Lambda_{j}(t)$ is made up of i.i.d. $\epsilon$-cycles, and hence is a renewal process. Thus one can analyze it using the renewal reward theorem (RRT). These ${\epsilon}$-cycles are special in that, by the memoryless property of the exponential service times and arrival processes, they are the cycles in which the $\tau$-state has not changed, i.e., during which the $\tau$-service is not completed and a $\tau$-arrival does not occur. Let $U$ represent one such special ${\epsilon}$-cycle, i.e.:

[TABLE]

where ${\mathcal{B}}_{j}^{\mu_{\epsilon}}={\mathcal{B}}_{j}^{1}/\mu_{\epsilon}$ represents (see (17)) an ${\epsilon}$ -cycle, $\mathcal{S}^{\mu_{\epsilon}}_{j}=\mathcal{S}^{1}_{j}/{\mu_{\epsilon}}$ (see (18)) represents the server time available to the $\tau$ -customers during this period and $A_{\tau}$ , $B_{\tau}$ respectively represent the inter-arrival time before next $\tau$ -arrival and the service time of current $\tau$ -job (the residual ones equal the actual ones by memoryless property). Let ${\mathcal{I}}^{\mu_{\epsilon}}$ represent the following indicator random variable:

[TABLE]

It is clear that the expected length of the renewal cycle (by conditioning on $({\mathcal{B}}_{j}^{\mu_{\epsilon}},\mathcal{S}^{\mu_{\epsilon}}_{j})$ and using (19), (17)),

[TABLE]

for any given $j$ and for any ${\mu_{\epsilon}}<\infty$. By A.4, we also have $E[U]<\infty.$ Thus, applying RRT twice, we have, as $t\to\infty$,

[TABLE]

Therefore,

[TABLE]

The last equality follows by way of construction (details in (19) and more details are in [4]). It is clear from (44) that the indicators ${\mathcal{I}}^{\mu_{\epsilon}}\to 1$ almost surely as $\mu_{\epsilon}\to\infty$ ,

[TABLE]

and $N_{B_{j}}^{1}({\mathcal{B}}_{j}^{1})$ is integrable (by A.4). Similar conclusions follow for number of arrivals $N_{A}^{1}$ . Thus by Dominated convergence theorem as $\mu_{\epsilon}\to\infty$ ,

[TABLE]

the last equality follows by applying RRT to a system that implements $j$ -th sub-policy for ${\epsilon}$ -class in ${\tau}$ -static manner.
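The convergence of the indicators to 1 used above can be illustrated numerically. The sketch below assumes (our reading of the description of special cycles, since the displayed definition is not reproduced here) that ${\mathcal{I}}^{\mu_{\epsilon}}$ indicates the event $\{A_{\tau}>{\mathcal{B}}_{j}^{\mu_{\epsilon}},\ B_{\tau}>\mathcal{S}_{j}^{\mu_{\epsilon}}\}$ with $A_{\tau}$, $B_{\tau}$ independent exponentials. Conditioned on $({\mathcal{B}}_{j}^{1},\mathcal{S}_{j}^{1})=(b_1,s_1)$, the probability of this event is then $\exp(-(\lambda_{\tau}b_1+\mu_{\tau}s_1)/\mu_{\epsilon})$. The function name and the numerical values are hypothetical.

```python
import math


def prob_uninterrupted(b1, s1, lam_tau, mu_tau, mu_eps):
    """Conditional probability that an eps-cycle is not interrupted.

    Assumption (not taken verbatim from the paper): the cycle is uninterrupted
    iff no tau-arrival occurs during B = b1/mu_eps and the current tau-job
    does not finish within the tau-service time S = s1/mu_eps.
    """
    if not (0.0 <= s1 <= b1):
        raise ValueError("need 0 <= s1 <= b1 (tau-service time is part of the cycle)")
    b = b1 / mu_eps  # scaled cycle length, B^mu = B^1 / mu_eps
    s = s1 / mu_eps  # scaled tau-service time, S^mu = S^1 / mu_eps
    # Independent exponentials: P(A > b) * P(B > s)
    return math.exp(-(lam_tau * b + mu_tau * s))


if __name__ == "__main__":
    for mu_eps in (1.0, 10.0, 100.0, 1000.0):
        print(mu_eps, prob_uninterrupted(2.0, 0.5, 1.0, 3.0, mu_eps))
```

For fixed $(b_1,s_1)$ the printed probabilities increase towards 1 as $\mu_{\epsilon}$ grows, which is the pointwise behaviour that the dominated convergence argument relies on.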

Now we consider the second and last/fourth term of the RHS of (42). Using similar logic (RRT, DCT) as before and by the way of construction as in equation (19):

[TABLE]

The last equality follows because the rate of arrivals in $\mu_{\epsilon}=1$ system equals $\rho_{\epsilon}$ .

Consider the remaining term (third term) of the RHS of (42). The fraction of time spent by $\tau$ -customer in $j$ -cycle is $F_{j}(t)=\Lambda_{j}(t)+\Psi_{j}(t)$ , therefore

[TABLE]

For all $\mu_{\epsilon}\geq{\bar{\mu}}$ we have:

[TABLE]

where $\pi_{j}^{\mu_{\epsilon}}$ is the stationary probability that ${\tau}$ -queue is in state $j$ with $\mu_{\epsilon}$ system.

By Lemma 9, as ${\mu_{\epsilon}}\to\infty$

[TABLE]

Therefore, from Equation (45),

[TABLE]

Overall, we get from Equation (42),

[TABLE]

Second fraction of Equation (41): One can split the second fraction [footnote 11: the time limit of the LHS exists, because the time limit of $\Psi_{j}(t)/t$ exists, as discussed in the proof of Lemma 9] as below:

[TABLE]

In the above, the end residual ${\epsilon}$-cycles (which form $\Psi_{j}(t)$) belonging to the same $j$-group are i.i.d., and can be upper bounded by the corresponding full i.i.d. ${\epsilon}$-cycles (which form ${\tilde{\Psi}}_{j}(t)$, as explained in the proof of Lemma 9). We refer to any typical (upper bounding) full ${\epsilon}$-cycle as ${\tilde{\mathcal{B}}}_{j}$. Then by RRT, we obtain the following for the first and last terms of the RHS of equation (46):

[TABLE]

By construction we have: $N^{{\mu_{\epsilon}}}_{A}(\mathcal{B}_{j}^{\mu_{\epsilon}})=N^{1}_{A}({\mathcal{B}}_{j}^{1})$ (see (19)) and hence:

[TABLE]

As $\mu_{\epsilon}\to\infty$, using L'Hôpital's rule,

[TABLE]

as the limit is a finite constant, by A.4 (note $\mathcal{S}_{j}^{1}\leq\mathcal{B}_{j}^{1}$ a.s., $N^{1}_{B_{j}}\leq N^{1}_{A}$ etc). Using above and equation (46) and Lemma 9, we get,

[TABLE]

Third fraction of RHS of (41): By assumption A.3, one can serve (say) at maximum $\mathcal{K}$ number of ${\epsilon}$ -customers at any time. Let $M(t)$ represent the total number of $\tau$ -transitions during time $t$ . Then one can upper bound the number of losses, $N_{\partial_{j}}(t)$ , during ${\tau}$ -state transition as:

[TABLE]

since one serves at maximum $\mathcal{K}$ ${\epsilon}$ -customers at any time and hence the maximum drops per transition equals $\mathcal{K}$ . As in the proof of Lemma 9, $\lim_{t\to\infty}M(t)/t\leq 2\lambda_{\tau}$ . Thus

[TABLE]
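The bound on drops at ${\tau}$-transitions can be made concrete with a small calculation. The sketch below assumes that the third fraction is normalised by the number of ${\epsilon}$-arrivals, whose long-run rate under the scaling is $\rho_{\epsilon}\mu_{\epsilon}$; this normalisation is our assumption, and the function name is hypothetical. With at most $\mathcal{K}$ drops per transition and at most $2\lambda_{\tau}$ transitions per unit time, the long-run fraction of dropped jobs is then at most $2\mathcal{K}\lambda_{\tau}/(\rho_{\epsilon}\mu_{\epsilon})$.

```python
def drop_fraction_bound(K, lam_tau, rho_eps, mu_eps):
    """Upper bound on long-run drops at tau-transitions per eager arrival.

    Assumptions: at most K drops per tau-transition, tau-transition rate at
    most 2 * lam_tau, and eager arrival rate rho_eps * mu_eps.
    """
    if min(K, lam_tau, rho_eps, mu_eps) <= 0:
        raise ValueError("all parameters must be positive")
    max_drop_rate = K * 2.0 * lam_tau      # drops per unit time
    eager_arrival_rate = rho_eps * mu_eps  # eager arrivals per unit time
    return max_drop_rate / eager_arrival_rate


if __name__ == "__main__":
    for mu_eps in (10.0, 100.0, 1000.0):
        print(mu_eps, drop_fraction_bound(5, 1.0, 0.5, mu_eps))
```

The bound decays like $1/\mu_{\epsilon}$, so under these assumptions the contribution of transition-epoch drops vanishes in the limit.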

Combining the three fractions of RHS of (41) we have the result:

[TABLE]

Hence the proof of the lemma.

∎

Lemma 9.

Fix any $j$ . Then

[TABLE]

Further, as $\mu_{\epsilon}\to\infty,$

[TABLE]

Proof: Let $M_{j}(t)$ represent the total number of transitions to the state $j$ during time $[0,t]$ . Then,

[TABLE]

with ${\check{\mathcal{B}}}_{j,i}$ being the residual $\epsilon$ -cycle before the end of the $i$ -th ${\tau}$ -transition, occurred during $j$ -cycles. By Lemma 2 there exists ${\bar{\mu}}<\infty$ and the process is stable for all $\mu_{\epsilon}\geq{\bar{\mu}}$ and it is easy to see that limit $t\to\infty$ of the above term exists ( $\{{\check{\mathcal{B}}}_{j,i}\}_{i}$ are i.i.d. for any given $j$ ) almost surely for all such $\mu_{\epsilon}.$ Further one can upper bound as below:

[TABLE]

with $\mathcal{B}_{j,i}$ being the full ${\epsilon}$-cycle upper-bounding the residual $\epsilon$-cycle before the end of the $i$-th ${\tau}$-transition that occurred during $j$-cycles (as discussed in the beginning of the proof of Theorem 7). It suffices to prove that the upper bound converges to zero, so we rename $\Psi_{j}(t)$ to represent the upper bound itself. Note that the $\{\mathcal{B}_{j,i}\}_{i}$ belonging to a particular $j$-cycle are i.i.d. If $M_{j}(t)$ converges to a finite constant as $t\to\infty$, then the proof of the lemma is clearly complete for such sample paths.

Now consider the sample paths in which, as $t\to\infty$ , $M_{j}(t)\to\infty$ . Then by using LLN, as $t\to\infty$ , we get,

[TABLE]

where $M(t)$ represents the total number of ${\tau}$-transitions during time $t$, i.e., $M(t)=\sum_{j}M_{j}(t)$. With $N_{A}^{\tau}(t)$ representing the number of ${\tau}$-arrivals during the interval $[0,t]$, and noting that the number of ${\tau}$-departures is at most the number of arrivals, we clearly have:

[TABLE]

Any $\mathcal{B}_{j,i}$ in equation (48) is a special ${\epsilon}$ -cycle, which is interrupted by a ${\tau}$ -transition, either arrival or departure. Thus (with the rest of the notations as in (43)):

[TABLE]

Using the tower property of conditional expectation (e.g., $E[X]=E[E[X|Y]]$ ), by conditioning on $\mathcal{B}_{j}^{\mu_{\epsilon}}$ and $\mathcal{S}_{j}^{\mu_{\epsilon}}$ , and because $A_{\tau}$ , $B_{\tau}$ are exponential random variables with respective parameters $\lambda_{\tau}$ and $\mu_{\tau}$ :

[TABLE]

By substituting the above inequality in (49) and further referring to (50), we have the proof of the first inequality of Lemma. For the last equality, recall $\mathcal{B}_{j}^{\mu_{\epsilon}}=\mathcal{B}_{j}^{1}/\mu_{\epsilon}$ and $\mathcal{S}_{j}^{\mu_{\epsilon}}=\mathcal{S}_{j}^{1}/\mu_{\epsilon}$ .

By L'Hôpital's rule, and using the differentiability properties of the moment generating function, we get:

[TABLE]
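For orientation, the generic limit that steps of this kind rely on can be written out as follows. This is a sketch for an arbitrary nonnegative random variable $X$ with $E[X]<\infty$ and a constant $\theta>0$ (both are placeholders and not the paper's exact expression), using the substitution $s=1/\mu_{\epsilon}$. The second equality is L'Hôpital's rule together with differentiation of the transform for $s>0$, and the last one follows by monotone convergence:

```latex
\lim_{\mu_{\epsilon}\to\infty}\mu_{\epsilon}\left(1-E\left[e^{-\theta X/\mu_{\epsilon}}\right]\right)
=\lim_{s\downarrow 0}\frac{1-E\left[e^{-\theta sX}\right]}{s}
=\lim_{s\downarrow 0}\theta\,E\left[Xe^{-\theta sX}\right]
=\theta\,E[X].
```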

Thus

[TABLE]

Thus, we get from Equation (49),

[TABLE]

$\blacksquare$

Appendix D Proof of Theorem 4

In this appendix, we describe the changes required in the proofs of Theorems 1-3 to establish Theorem 4. Note that we no longer make Assumption A.1, but simply assume that eager job sizes are exponentially distributed.

Without Assumption A.1, one cannot describe the pre-limit system using a one-dimensional Markov process. However, with exponential service times for both eager and tolerant customers, one can describe the system evolution via a two-dimensional Markov chain $(X_{\tau}(t),X_{\epsilon}(t)),$ capturing the number of tolerant and eager customers in the system. Note that this Markov chain undergoes transitions due to ${\epsilon}$ as well as ${\tau}$ customers. As noted before, when the tolerant occupancy changes, a new eager sub-policy comes into effect immediately, potentially during an ongoing eager busy period. This new sub-policy might result in scheduling rearrangements, based on the number of eager customers in the system.

The main difference without Assumption A.1 is that once the $\tau$ occupancy changes, the first ${\epsilon}$-busy cycle can start with a (random) number $c$ of customers, where $0\leq c\leq{\cal K}:=\lfloor 1/c_{min}\rfloor<\infty.$ Let the duration of this partial first ${\epsilon}$-busy cycle be represented by $\mathcal{B}_{j}^{{\mu_{\epsilon}},c}$, when the system is operating under the $j$-th sub-policy; basically, this is the time period that starts immediately after the ${\tau}$-system changes to state $j$, and lasts till the ${\epsilon}$-system becomes empty (or till the ${\tau}$-state changes again, whichever happens first). [Footnote 12: Recall that complete ${\epsilon}$-busy cycles start when the ${\epsilon}$-system becomes empty and last till the end of the next ${\epsilon}$-busy period.] Aside from this first partial ${\epsilon}$-busy cycle, subsequent ${\epsilon}$-busy cycles are referred to using the same notation as before.

While the two-dimensional Markovian description $(X_{\tau}(t),X_{\epsilon}(t)),$ with a random number $c$ of eager customers present at times of ${\tau}$ -transitions is more challenging to analyse, we show below that the first dimension of this process can be upper and lower bounded by one-dimensional Markov processes (as in [4]). These bounding processes are described next.

Upper and Lower bounding systems: Now we construct two systems for each ${\mu_{\epsilon}}$ , which sandwich the actual ${\tau}$ -system pre-limit. In both the bounding systems we consider that the eager system, immediately after each tolerant change, starts with maximum possible (partial) eager busy cycle: generate one busy cycle $\mathcal{B}_{j}^{{\mu_{\epsilon}},c}$ starting with each $c$ and with coupled eager arrivals, job-sizes etc., for the given sub-policy (say sub-policy $j$ ) and let $\mathcal{B}_{j}^{{\mu_{\epsilon}},*}:=\max_{0\leq c\leq{\cal K}}\mathcal{B}_{j}^{{\mu_{\epsilon}},c}$ be duration for which the first (hypothetical) eager busy cycle lasts in the two systems. It is immediate that this max-busy cycle upper bounds those starting with any $c$ -number (as in original system), when they are served with the same sub-policy.
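The construction of the max busy cycle can be sketched in code. The example below is only an illustration under a stand-in sub-policy that we chose for concreteness (an Erlang-loss system with $\mathcal{K}$ servers, which is not the paper's general sub-policy): it generates one coupled set of eager arrivals and job sizes, computes the time to empty starting from each $c\in\{0,\dots,\mathcal{K}\}$, and takes the maximum. All function names and parameter values are hypothetical.

```python
import heapq
import random


def time_to_empty(c, initial_services, arrivals, K):
    """Time until the system first empties, starting with c jobs at time 0."""
    departures = list(initial_services[:c])  # departure epochs of initial jobs
    heapq.heapify(departures)
    if not departures:
        return 0.0
    for t_arr, service in arrivals:
        last = None
        while departures and departures[0] <= t_arr:
            last = heapq.heappop(departures)
        if not departures:
            return last  # system emptied before this arrival
        if len(departures) < K:
            heapq.heappush(departures, t_arr + service)
        # otherwise the arrival is blocked (loss system)
    # Arrival stream exhausted: a truncation artefact of the finite horizon.
    return max(departures)


def max_busy_cycle(K, lam, mu, horizon, rng):
    """Coupled partial busy cycles for c = 0..K and their maximum."""
    arrivals, t = [], 0.0
    while True:
        t += rng.expovariate(lam)
        if t > horizon:
            break
        arrivals.append((t, rng.expovariate(mu)))
    initial_services = [rng.expovariate(mu) for _ in range(K)]
    cycles = [time_to_empty(c, initial_services, arrivals, K) for c in range(K + 1)]
    return cycles, max(cycles)


if __name__ == "__main__":
    cycles, b_star = max_busy_cycle(3, 1.0, 2.0, 50.0, random.Random(0))
    print(cycles, b_star)
```

Taking the maximum over $c$ avoids having to argue that the coupled cycles are monotone in $c$. Dividing the resulting durations by $\mu_{\epsilon}$ mirrors the time scaling used earlier for ${\epsilon}$-cycles.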

In the upper bounding system, immediately after the ${\tau}$-occupancy changes to $j$, the tolerant queue receives zero service for duration $\mathcal{B}_{j}^{{\mu_{\epsilon}},*}$. After this period, service of the tolerant queue in subsequent ${\epsilon}$-busy cycles continues exactly as in the original system. The lower bounding system is identical to the upper bounding system, except that the tolerant queue is served at the maximum possible rate (1, given the unit server speed assumed) over the first hypothetical partial busy cycle $\mathcal{B}_{j}^{{\mu_{\epsilon}},*}.$ We then use appropriate coupling rules as in [4]:

$\bullet$ Tolerant inter-arrival times and job-sizes are always coupled in all three systems. However, the eager quantities are coupled in the original and any bounding system, only when the two systems are in the same ${\tau}$ -state, as explained below.

$\bullet$ The original and the two bounding systems start with the same initial system state;

$\bullet$ the number of ${\tau}$ -customers in the original system is less than or equal to that in upper system, and greater than or equal to that in the lower system, as time progresses (because of the service differences for the tolerant queue in the first partial eager busy-cycles across the three systems);

$\bullet$ if at any time point the upper system meets the original system [footnote 13: the tolerant number in the upper system can become strictly bigger than that in the original system, but further increase may be slower in the upper system due to Markov policies, and then at some tolerant departure the ${\tau}$-number in the upper system can equal that in the original system], we couple the two systems by using the eager random variables (inter-arrival times, service times, etc.) that define the original system in that ${\tau}$-cycle to define the upper system ${\tau}$-cycle;

$\bullet$ specifically, we start with the maximum partial busy cycle in upper system corresponding to the sub-policy dictated by the new tolerant state, after coupling (also) the eager inter-arrival times and job sizes; observe that under this construction, the first ${\epsilon}$ -busy cycle in the original system (in the ${\tau}$ -cycles in which upper system meets the original system) is smaller than or equal to the corresponding maximum partial busy cycle in upper system.

These details are similar to those in [4], where one can also find more details. Similar coupling ideas can be used to construct the lower bounding system.

It is easy to observe that the tolerant queue evolution in the two bounding systems can be modeled as M/G/1 systems with state dependent birth-death transition probabilities, as in the system considered in the paper under Assumption A.1. That is, they can be represented by one-dimensional Markov processes. All the results can be extended to the two bounding systems and we would have equivalents of Theorems 1-2 and Lemma 2 for the two systems, if we can show that the impact of the first maximum partial busy cycle at the start of every tolerant change vanishes as ${\mu_{\epsilon}}\rightarrow\infty$ under the SFJ limit. We show below that this is the case.

Convergence of the bounding systems: This proof proceeds along the same lines as with A.1, except for a few important differences. We only mention the differences here. We first show that the max busy cycles $\mathcal{B}_{j}^{{\mu_{\epsilon}},*}$ converge to zero as $\mu_{\epsilon}\to\infty$, that the moments are upper bounded by a common uniform bound, etc. For extending Lemmas 3-4, we observe that the max busy cycles can be uniformly upper-bounded (uniformly over sub-policy $j$) by that in the M/G/$\infty$ queue of Lemma 4, which starts with ${\cal K}$ customers.

For extending Lemma 1, the transition probabilities also depend upon max busy cycles. Towards this, first define $q_{j}^{{\mu_{\epsilon}},*}$ to represent the backward transition probability of tolerant birth-death process when one starts it with corresponding max busy cycle. Then one can lower bound the probability $q_{j}^{{\mu_{\epsilon}},*}$ using

[TABLE]

with ${\bar{q}}_{j}$ as defined previously and where $\mathcal{S}_{j,1}^{{\mu_{\epsilon}},*}$ equals the service available to the tolerant queue during the first ${\epsilon}$ max busy cycle $\mathcal{B}_{j}^{{\mu_{\epsilon}},*}$ (which exactly equals 0 for upper system and $\mathcal{B}_{j}^{{\mu_{\epsilon}},*}$ in the lower system for any ${\mu_{\epsilon}}$ ). It is clear that $r_{*}$ converges to zero (uniformly in $j$ ) and the remaining proof goes through, as before. The proof for the upper bound of Lemma 1 can be handled similarly. Observe here that Lemmas 5-6 are used by Lemma 1 only for the subsequent ${\epsilon}$ -busy cycles, which are same as that with Assumption A.1, and hence would not require any changes. The proof of Lemma 7 holds also for the two bounding systems, because the convergence in Lemma 1 is ensured and same is the case with Lemma 2.

Thus, by the extended Lemma 2 there exists ${\bar{\mu}}$ such that the upper and lower bounding systems are stable for all ${\mu_{\epsilon}}\geq{\bar{\mu}}$ . Further the upper and lower stationary distributions converge to the same quantity as ${\mu_{\epsilon}}\to\infty$ (by extended versions of Theorems 1-2).

Existence of stationary distribution of original system: We derive the existence using embedded chains, the discrete time Markov chains obtained by considering values of the continuous time Markov process, immediately after the transition epochs.

By extended Lemma 2 there exists a ${\bar{\mu}}$ such that the upper bounding system is stable for all ${\mu_{\epsilon}}\geq{\bar{\mu}}$. For all such ${\mu_{\epsilon}}$, the embedded tolerant chain in the upper system visits the $0$ state infinitely often (with probability one) and within an integrable number of epochs (by positive recurrence and existence of stationary distribution). This implies (by the bounding of the tolerant occupancy between the original and upper system) that the embedded tolerant chain in the original system also visits the $0$ state infinitely often (with probability one) and within an integrable number of epochs. This in turn implies the original (two-dimensional) embedded Markov chain visits the set ${\cal O}:=\{(0,c),\mbox{ for some }c\leq{\cal K}\}$ infinitely often (w.p.1) and within an integrable number of epochs. By irreducibility, this implies the same for the $(0,0)$ state, as every time the chain visits any state of the form $(0,c),$ it has a positive probability of visiting state $(0,0)$ (uniformly bounded away from 0 across all $c$) and hence visits state $(0,0)$ after at most a geometric number of visits to the set ${\cal O}.$ This establishes the existence of a unique stationary distribution for the original two-dimensional embedded Markov chain, which in turn implies the existence of a stationary distribution for the two-dimensional continuous time Markov process; observe that the transition rates between all the state pairs can be uniformly upper bounded by a finite value for any given ${\mu_{\epsilon}}.$

By regular renewal arguments, this implies almost sure existence of the time limits:

[TABLE]

where $\pi_{j}^{\mu_{\epsilon}}$ is the marginal of the stationary distribution corresponding to tolerant queue in original (two-dimensional) system.

Convergence of stationary distribution as ${\mu_{\epsilon}}\to\infty$ : The limit of the ${\tau}$ -performance of the two bounding systems, as ${\mu_{\epsilon}}\to\infty$ is exactly the same, because by extended Lemma 1 both the sets of transition probabilities converge to the same limit transition probabilities uniformly. Furthermore, time averages corresponding to the ${\tau}$ -occupancy in the original system can be bounded by the corresponding time averages in the upper system (occupancy denoted by $X_{\tau}^{U}(\cdot)$ ) and the lower system (occupancy denoted by $X_{\tau}^{L}(\cdot)$ ) as follows:

[TABLE]

and hence the marginal stationary distribution (corresponding to tolerant occupancy) of the original system, converges to the common limit of the two bounding systems. This implies the statement related to the tolerant performance in Theorem 1, because (for each $j$ ):

[TABLE]

Using similar bounding arguments, and common limit of the two bounding systems, the statement of Theorem 2 also gets extended.

Changes required for results related to eager performance: There are no changes in the main proof of Theorem 3, we only need to discuss the changes required in the proof of Theorem 7.

For extending the proof of Theorem 7, one needs to construct additional sample paths for coupling arguments. One first needs to construct ${\cal K}+1$ additional sample paths of ${\epsilon}$-busy cycles, with ${\mu_{\epsilon}}=1$, one for each sub-policy $j,$ and one for each $0\leq c\leq{\cal K}$. Using these ${\cal K}+1$ sample paths, we need to construct a $({\cal K}+2)$th sample path of max busy cycles, one for each $j$. Note again that each such max busy cycle upper bounds the corresponding busy cycle starting with $c$ ${\epsilon}$-customers, for any $0\leq c\leq{\cal K}$. Whenever the ${\tau}$-state changes, the first ${\epsilon}$-busy cycle is coupled using one of the above ${\cal K}+1$ sample paths, depending upon the random $c$ that represents the number of eager customers at the time of the ${\tau}$-transition. Each such first partial busy cycle is upper bounded by the corresponding max busy cycle, irrespective of the random $c$; this ensures the required coupling (and uniform upper bounding across various $c$) across various ${\mu_{\epsilon}}$, even without Assumption A.1.

The process $\Psi_{j}(t)$ for each $j$ -sub-policy now includes the starting as well as the ending partial busy cycles. But all the starting partial ${\epsilon}$ -busy cycles can be upper bounded by the corresponding max busy cycles and one can show the influence of them to converge to zero, in the same manner as is done with the ending busy cycles; this is possible because we have already extended the proofs of the required uniform bounds (converging to zero or second moments bounded as required) on all the required moments of the max busy cycles, as already discussed in the initial parts of this section. All the bounds that were shown to converge to zero, should now be made up of the corresponding max busy cycles, to handle the part of $\Psi_{j}(t)$ coming from the starting partial ${\epsilon}$ -busy cycles. The same is true for the bounds and arguments considered in the proof of Lemma 9.

If there are any drops due to a change in the eager sub-policy at tolerant transition epochs, they can be upper bounded by $N_{\partial_{j}}(t)$ and can be handled as in the original proof. $\blacksquare$

Appendix E Proofs of Limit Pareto Frontier

Proof of Theorem 6: We prove this theorem by proving the following steps:

(a) The optimizing problem defined by (12) is equivalent to the following modified optimizing problem in terms of blocking probabilities $\{d_{i}\}_{i\geq 0}$ :

[TABLE]

(b) Let $\phi^{*}$ be an optimizing policy. Then the constraint is satisfied with equality at $\phi^{*}$.

(c) Consider any policy $\phi=(d_{0},d_{1}\cdots)$ . For any $i$ , if $d_{i}>{\underline{d}}$ and $d_{i+1}<{\bar{d}}$ , then there exists a policy ${\tilde{\phi}}$ which strictly improves upon $\phi$ , i.e.,

[TABLE]

(d) From part (c), it is clear that the optimal policy $\phi^{*}$ is monotone, i.e., $d_{i}^{*}\leq d_{i+1}^{*}$ for any $i$. In fact, from part (c), the optimal policy $\phi^{*}=\{d_{1}^{*},d_{2}^{*}\cdots\}$ is of the type given below:

[TABLE]

for some $1\leq L<\infty$ and ${\underline{d}}\leq d\leq{\bar{d}}.$ We then identify $(L,d)$ for the given $C$ in the last step.
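To make the role of $(L,d)$ concrete, the following sketch evaluates the tolerant queue under a monotone threshold-type policy. It assumes (our reading, since the displayed form is not reproduced here) that states below $L$ use ${\underline{d}}$, state $L$ uses $d$, and states above $L$ use ${\bar{d}}$, and it uses the relations $\rho_{i}=\lambda_{\tau}/(c+\mu_{\tau}\rho_{\epsilon}d_{i})$ and $h_{i}=\prod_{1\leq j\leq i}\rho_{j}$ that appear later in this appendix, with stationary probabilities proportional to $h_{i}$. The truncation level `n_states` and all names are hypothetical.

```python
def threshold_policy_metrics(L, d, d_low, d_high, lam_tau, mu_tau, rho_eps,
                             c_const, n_states=2000):
    """Stationary distribution and mean occupancy of the tolerant queue.

    Assumed policy shape: d_i = d_low for i < L, d for i == L, d_high for i > L.
    The infinite state space is truncated at n_states (an approximation).
    """
    def blocking(i):
        if i < L:
            return d_low
        if i == L:
            return d
        return d_high

    def rho(i):
        # Load in state i: arrival rate over state-dependent service rate.
        return lam_tau / (c_const + mu_tau * rho_eps * blocking(i))

    if rho(L + 1) >= 1.0:
        raise ValueError("tail load must be below 1 for stability")

    h = [1.0]  # h_0 = 1
    for i in range(1, n_states):
        h.append(h[-1] * rho(i))
    total = sum(h)
    pi = [x / total for x in h]
    mean_n = sum(i * p for i, p in enumerate(pi))
    return pi, mean_n


if __name__ == "__main__":
    pi, mean_n = threshold_policy_metrics(3, 0.5, 0.2, 0.8, 1.0, 4.0, 0.5, 2.0)
    print(mean_n)
```

Scanning over $L$ and $d$ with such a routine gives a rough numerical way to see which pair meets a target mean occupancy $C$; the closed-form identification is the one carried out in the last step of the proof.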

We now provide the proof of all the steps, one after the other.

Proof of Part (a): Consider the SM policy $\phi=\left(d_{0},d_{1},\cdots,d_{n},\cdots\right)$ . We rewrite the optimization problem (12), here, for convenience:

[TABLE]

In other words, we have to optimize:

[TABLE]

As $\rho_{\epsilon},$ ${\mu_{\tau}}$ and $c$ are constants, optimizing above is equivalent to

[TABLE]

For all $i\geq 1$, as $c+{\mu_{\tau}}\rho_{\epsilon}d_{i}=\lambda_{\tau}/\rho_{i}^{\phi}=\lambda_{\tau}h_{i-1}^{\phi}/h_{i}^{\phi}$, minimizing the above is equivalent to minimizing the following (note that $\lambda_{\tau}$ is a constant and $h_{0}^{\phi}=1$),

[TABLE]

It is immediate to show that $d_{0}^{*}={\underline{d}}$ (only $P_{B}$ depends upon $d_{0}$ and $E[N]$ is independent of it) , thus equivalently we consider the following:

[TABLE]

The above is equivalent to the following simplified version:

[TABLE]

This proves the first part.

Proof of Part (b): Consider any SM policy $\phi\coloneqq\left({\underline{d}},d_{1},\cdots,d_{i},\cdots\right)$ (note that $d_{0}$ is fixed at ${\underline{d}}$ in view of Part (a)). Since there is a one-to-one and onto relation between $\rho_{j}^{\phi}$ and $d_{j}$ (as $\rho_{j}^{\phi}=\lambda_{\tau}/(c+\mu_{\tau}\rho_{\epsilon}d_{j})$) for every $j$, we consider optimization with respect to the $\rho_{j}$'s and redefine the policy as $\phi\coloneqq\left({\bar{\rho}},\rho_{1}^{\phi},\cdots,\rho_{i}^{\phi},\cdots\right)$, where ${\bar{\rho}}$ corresponds to the minimum blocking ${\underline{d}}$ (given by (13)). This redefinition is only for this proof. Thus one can rewrite the optimization problem as (since $\rho_{0}={\bar{\rho}}$ is fixed):

[TABLE]

Let $\phi^{*}=\left(\rho_{1}^{*},\rho_{2}^{*},\cdots,\rho_{i}^{*},\cdots\right)$ represent the optimal policy of the optimization problem defined above. We want to show that the constraint of the optimization problem is satisfied with equality at $\phi^{*}$. Assume, on the contrary, that the constraint is not satisfied with equality at $\phi^{*}$, i.e., that $g(\phi^{*})<0$. Then there exists an $i$ such that $\rho_{i}^{*}<{\bar{\rho}}$, as $C<{\bar{\rho}}/(1-{\bar{\rho}})$.

First, consider the case when there exists one such $i$ with $i\geq C$. By Lemma 11, we can get a policy $\phi^{\epsilon}$ in the feasible region for which $f(\phi^{*})<f(\phi^{\epsilon})$. This contradicts the optimality of $\phi^{*}$; the contradiction arises from the wrong assumption that the constraint is not satisfied with equality at $\phi^{*}$.

Next, consider the case when $i<C$ . Define $\bar{i}$ as

[TABLE]

Clearly, $0<\bar{i}\leq C$ and note $\rho_{\bar{i}-1}<\rho_{\bar{i}}$. Using Lemma 10, we can get an improved policy by swapping $\rho_{\bar{i}-1}$ and $\rho_{\bar{i}}$. This again contradicts the optimality of $\phi^{*}$, and therefore the constraint is satisfied with equality at the optimal policy.

Proof of Part (c) and (d): Consider the case where ${\underline{\rho}}<\rho^{\phi}_{i+1}$ and $\rho^{\phi}_{i}<{\bar{\rho}}$ for some $i$ . Then we claim that, there exists an alternate policy, say $\tilde{\phi}$ , which is better than $\phi$ , i.e., such that $f(\phi)<f(\tilde{\phi})$ and $g(\phi)\geq g(\tilde{\phi})$ .

The alternate policy, say ${\tilde{\phi}}:=\{\tilde{\rho_{i}}\}_{i\geq 0}$, would differ from $\phi$ only in the $i$-th and $(i+1)$-th components, i.e., $\tilde{\rho_{j}}=\rho^{\phi}_{j},\ \ \forall j\neq i,i+1$ and $\tilde{\rho_{i}}=\rho^{\phi}_{i}+\epsilon,\tilde{\rho_{i+1}}=\rho^{\phi}_{i+1}-\tilde{\epsilon}$, for some positive $\epsilon$ and $\tilde{\epsilon}$. We would need the following to make it a valid policy:

[TABLE]

First, consider the difference in the objective functions under the two policies:

[TABLE]

In a similar way, the difference in the constraints is:

[TABLE]

The above difference is equal to 0, i.e., the constraint function remains the same under $\tilde{\phi}$ , if the following holds,

[TABLE]

With such a policy the difference in objective functions (see equation (60)) equals:

[TABLE]

Case with $i>C$ : From the above equation, it is clear that we need $\tilde{\rho_{i}}-\rho^{\phi}_{i}>0$ , for improvement in the objective function. That is, we want $\epsilon>0$ (this is obviously true). Now, for all $\epsilon>0$ , we get an $\tilde{\epsilon}$ strictly positive as,

[TABLE]

If $i>C$ (more specifically, if the last term in the RHS is positive), the above ${\tilde{\phi}}$ provides the required improved policy, as one can choose $\epsilon,{\tilde{\epsilon}}>0$ sufficiently small to satisfy (59).
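The case $i>C$ can be checked numerically. The sketch below assumes the objective $f(\phi)=\sum_{k}h_{k}^{\phi}$ and the constraint function $g(\phi)=\sum_{k}(k-C)h_{k}^{\phi}$, which is our reading of the displayed problem (it matches the sign statements in Lemmas 10 and 11 but is not quoted from the paper). The policy is truncated to a finite list, i.e., $h_{k}$ is treated as zero beyond the list. Given $\epsilon$, the code solves for the $\tilde{\epsilon}$ that leaves $g$ unchanged; all names are hypothetical, and the bounds in (59) are not checked.

```python
def h_terms(rhos):
    """h_0 = 1 and h_k = rho_1 * ... * rho_k, with rhos[0] holding rho_1."""
    h = [1.0]
    for r in rhos:
        h.append(h[-1] * r)
    return h


def f_obj(rhos):
    return sum(h_terms(rhos))  # assumed objective


def g_con(rhos, C):
    return sum((k - C) * hk for k, hk in enumerate(h_terms(rhos)))  # assumed constraint


def perturb(rhos, i, eps, C):
    """Raise rho_i by eps and lower rho_{i+1} by eps_tilde so that g is unchanged."""
    h = h_terms(rhos)
    tail_g = sum((k - C) * h[k] for k in range(i + 1, len(h)))
    if tail_g <= 0:
        raise ValueError("tail of the constraint sum must be positive (holds for i > C)")
    # Common factor multiplying h_k for all k >= i + 1 after the perturbation.
    ratio = 1.0 - (i - C) * h[i - 1] * eps / tail_g
    new_next = ratio * rhos[i - 1] * rhos[i] / (rhos[i - 1] + eps)
    eps_tilde = rhos[i] - new_next
    new = list(rhos)
    new[i - 1] += eps
    new[i] = new_next
    return new, eps_tilde


if __name__ == "__main__":
    rhos, C = [0.6, 0.5, 0.4, 0.3, 0.2], 1.5
    new, eps_tilde = perturb(rhos, 2, 0.01, C)
    print(eps_tilde, f_obj(new) - f_obj(rhos), g_con(new, C) - g_con(rhos, C))
```

In this example $\tilde{\epsilon}$ comes out strictly positive, $g$ is unchanged up to rounding, and $f$ increases, consistent with the argument for $i>C$.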

Case with $i<C$ : Further, if $\rho^{\phi}_{i}<\rho^{\phi}_{i+1}$ , then by Lemma 10 we can get the required improved policy. We are left with the case when $i<C$ and $\rho^{\phi}_{i}=\rho^{\phi}_{i+1}$ with ${\underline{\rho}}<\rho^{\phi}_{i}<{\bar{\rho}}$ . If now the numerator of (62) is negative, i.e., if

[TABLE]

one can improve $\phi$ by considering a ${\tilde{\phi}}$ that differs from $\phi$ only in $i$ -th component, and with ${\tilde{\rho}}_{i}={\bar{\rho}}$ . With such a ${\tilde{\phi}}$ , a) the constraint remains satisfied, as (by substituting ${\tilde{\rho}}_{i+1}=\rho^{\phi}_{i+1}$ in (61))

[TABLE]

and b) the difference in the objective function becomes positive (by substituting ${\tilde{\rho}}_{i+1}=\rho^{\phi}_{i+1}$ in (60)):

[TABLE]

Thus, we get an improved policy. From (62), it further suffices to consider the case when,

[TABLE]

Note that the second term in the above is the numerator of the first term and hence for the conditions to hold the denominator of the first term should be negative, i.e.,

[TABLE]

But with $i<C$, both the components of the second term of (64) are negative, implying that the second term should be negative. Thus such a case does not exist. Thus in all cases, we have an alternate policy which strictly improves upon $\phi$.

Optimal Policy

In view of the above result it is clear that the optimal policy is of the type (14). Further it is easy to solve the given optimization problem for any $C\leq{\bar{\rho}}/(1-{\bar{\rho}})$ as given below: Define $L^{*}=L^{*}(C)$ in the following manner

[TABLE]

$\blacksquare$

Lemma 10.

(Improvement Through Swapping)

For $i<C$, if $\rho_{i}^{\phi}<\rho_{i+1}^{\phi}$ in an SM policy $\phi$ for the optimization problem defined in part (a) of Theorem 6, then we can get a policy, say $\phi^{swap}$, by swapping the $i$-th and $(i+1)$-th components in $\phi$, such that:

[TABLE]

That means, we can improve the solution of the optimization problem by defining $\phi^{swap}=\left(\rho^{swap}_{1},\rho^{swap}_{2},\cdots,\rho^{swap}_{i},\cdots\right)$ as follows:

[TABLE]

Proof.

The term $h^{\phi^{swap}}_{j}$ corresponding to the policy $\phi^{swap}$ , will be given as:

[TABLE]

First, the difference between the objective functions corresponding to these two policies is given by:

[TABLE]

Since, $\rho^{\phi}_{i}<\rho^{\phi}_{i+1}$ it is clear from above equation that, $f(\phi^{swap})>f(\phi)$ . Therefore, the objective function improves.

Next, consider the difference between the constraints,

[TABLE]

From above equation, using $\rho^{\phi}_{i}<\rho^{\phi}_{i+1}$ and $i<C$ , we conclude that $g(\phi^{swap})-g(\phi)<0$ . That means $g(\phi^{swap})<g(\phi)$ and hence we get an improved solution. ∎
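The swapping argument can be verified on a small example. The sketch assumes $f(\phi)=\sum_{k}h_{k}^{\phi}$ and $g(\phi)=\sum_{k}(k-C)h_{k}^{\phi}$ (our reading of the problem in part (a), not a quotation) and a finite truncated policy; the function names are hypothetical. Swapping changes only $h_{i}$, so both differences reduce to multiples of $h_{i-1}(\rho_{i+1}-\rho_{i})$.

```python
def h_terms(rhos):
    """h_0 = 1 and h_k = rho_1 * ... * rho_k, with rhos[0] holding rho_1."""
    h = [1.0]
    for r in rhos:
        h.append(h[-1] * r)
    return h


def swap_effect(rhos, i, C):
    """Change in the assumed f and g when rho_i and rho_{i+1} are swapped."""
    h = h_terms(rhos)
    swapped = list(rhos)
    swapped[i - 1], swapped[i] = swapped[i], swapped[i - 1]
    hs = h_terms(swapped)
    delta_f = sum(hs) - sum(h)
    delta_g = sum((k - C) * (b - a) for k, (a, b) in enumerate(zip(h, hs)))
    return delta_f, delta_g


if __name__ == "__main__":
    # rho_2 = 0.3 < rho_3 = 0.6 and i = 2 < C = 2.5
    print(swap_effect([0.5, 0.3, 0.6, 0.4], 2, 2.5))
```

Here the objective difference equals $h_{1}(\rho_{3}-\rho_{2})=0.15$ and the constraint difference equals $(2-2.5)\times 0.15=-0.075$, so the swapped policy has a larger objective and a smaller constraint value.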

Lemma 11.

(Improvement Through Adding $\epsilon$)

Consider an SM policy $\phi$ for the optimization problem defined in part (a) of Theorem 6. For $i\geq C$, if there exists a $\rho^{\phi}_{i}<{\bar{\rho}}$, then we can get a policy, say $\phi^{\epsilon}$, by adding $\epsilon>0$ to the $i$-th component of policy $\phi$, such that:

[TABLE]

Proof.

Since $C<C_{max}=\frac{{\bar{\rho}}}{1-{\bar{\rho}}}$ , there exists at least one $i$ such that $\rho^{\phi}_{i}<{\bar{\rho}}$ . Let $\rho^{\phi}_{i}<{\bar{\rho}}$ . Define a policy $\phi^{\epsilon}=\left(\rho^{\phi^{\epsilon}}_{1},\rho^{\phi^{\epsilon}}_{2},\cdots,\rho^{\phi^{\epsilon}}_{i},\cdots\right)$ as follows:

[TABLE]

Define, $h_{0}^{\phi^{\epsilon}}=1,$ and for $i\geq 1$ , $h_{i}^{\phi^{\epsilon}}=\prod_{1\leq j\leq i}\rho_{j}^{\phi^{\epsilon}}$ . First consider the difference in the objective function under the two policies:

[TABLE]

The objective function improves if the above difference is positive. Clearly, the first and last terms in the above equation are positive. So, to make the difference positive we require the middle term to be positive, i.e., $(\rho_{i}^{\phi^{\epsilon}}-\rho^{\phi}_{i})>0$. This is obviously true as $(\rho_{i}^{\phi^{\epsilon}}-\rho^{\phi}_{i})=\epsilon$, which is positive.

Now we claim that the policy $\phi^{\epsilon}$ is in feasible region, as

[TABLE]

$g(\phi)\leq 0$ and we can choose $\epsilon>0$ small enough such that the overall term in the RHS of above equation will remain negative.

Thus, the policy $\phi^{\epsilon}$ lies in the feasible region and also improves the objective function. ∎
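A small numerical check of this lemma is given below. As a hedged sketch, it again assumes $f(\phi)=\sum_{k}h_{k}^{\phi}$ and $g(\phi)=\sum_{k}(k-C)h_{k}^{\phi}$ with a finite truncated policy; these forms and the function names are our assumptions. Adding $\epsilon$ to $\rho_{i}$ with $i\geq C$ raises both $f$ and $g$, so feasibility is kept only when $g(\phi)<0$ leaves enough slack for the chosen $\epsilon$.

```python
def h_terms(rhos):
    """h_0 = 1 and h_k = rho_1 * ... * rho_k, with rhos[0] holding rho_1."""
    h = [1.0]
    for r in rhos:
        h.append(h[-1] * r)
    return h


def add_eps_effect(rhos, i, eps, C):
    """Objective change and constraint values when eps is added to rho_i."""
    h = h_terms(rhos)
    bumped = list(rhos)
    bumped[i - 1] += eps
    hb = h_terms(bumped)
    delta_f = sum(hb) - sum(h)
    g_old = sum((k - C) * x for k, x in enumerate(h))
    g_new = sum((k - C) * x for k, x in enumerate(hb))
    return delta_f, g_old, g_new


if __name__ == "__main__":
    # i = 2 >= C = 2 and the constraint is strictly slack at the start
    print(add_eps_effect([0.5, 0.4, 0.3], 2, 0.01, 2.0))
```

In the example the objective increases by $0.0065$, while the constraint value moves from $-2.44$ to $-2.4385$ and therefore stays negative.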

Bibliography

[1] K. Chaudhary, V. Kavitha, and J. Nair, “Dynamic scheduling in a partially fluid, partially lossy queueing system,” in Proceedings of WiOpt, 2019.
[2] S. R. Mahabhashyam and N. Gautam, “On queues with Markov modulated service rates,” Queueing Systems, vol. 51, no. 1-2, pp. 89–113, 2005.
[3] V. Kavitha and R. K. Sinha, “Achievable region with impatient customers,” in Proceedings of Valuetools, 2017.
[4] V. Kavitha, J. Nair, and R. K. Sinha, “Pseudo conservation for partially fluid, partially lossy queueing systems,” Annals of Operations Research, pp. 1–38, 2018.
[5] A. Sleptchenko, A. van Harten, and M. C. van der Heijden, “An exact analysis of the multi-class M/M/k priority queue with partial blocking,” Stochastic Models, vol. 19, no. 4, pp. 527–548, 2003.
[6] B. Li, L. Li, B. Li, K. M. Sivalingam, and X.-R. Cao, “Call admission control for voice/data integrated cellular networks: performance analysis and comparative study,” IEEE Journal on Selected Areas in Communications, vol. 22, no. 4, pp. 706–718, 2004.
[7] S. Tang and W. Li, “A channel allocation model with preemptive priority for integrated voice/data mobile networks,” in Proceedings of the First International Conference on Quality of Service in Heterogeneous Wired/Wireless Networks, 2004.
[8] Y. Zhang, B.-H. Soong, and M. Ma, “A dynamic channel assignment scheme for voice/data integration in GPRS networks,” Computer Communications, vol. 29, no. 8, pp. 1163–1173, 2006.

