Redundancy Management for Fast Service (Rates) in Edge Computing Systems
Pei Peng, Emina Soljanin

TL;DR
This paper addresses optimizing resource allocation in edge computing to minimize latency and blocking probability through novel algorithms, considering the trade-offs of redundancy in resource-constrained environments.
Contribution
It introduces new algorithms for blocking probability and system time optimization in edge computing, accounting for resource limitations and dynamic system parameters.
Findings
Algorithms outperform benchmarks in simulations
Optimal replication varies with system parameters
Redundancy reduces latency but increases resource contention
Abstract
Edge computing operates between the cloud and end users and strives to provide low-latency computing services for simultaneous users. Redundant use of multiple edge nodes can reduce latency, as edge systems often operate in uncertain environments. However, since edge systems have limited computing and storage resources, directing more resources to some computing jobs will either block the execution of others or pass their execution to the cloud, thus increasing latency. This paper uses the average system computing time and blocking probability to evaluate edge system performance and analyzes the optimal resource allocation accordingly. We also propose blocking probability and average system time optimization algorithms. Simulation results show that both algorithms significantly outperform the benchmark for different service time distributions and show how the optimal replication factor…
Table I: Optimal strategy (C: concentrate computing resources on a few jobs; S: spread computing resources to more jobs).

| | Average System Time | | | | | | | System Service Rate | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Arrival Rate | | | Cloud Time | | | Large number of workers | Shift Parameter | | | Arrival Rate | | |
| | Large | Moderate | Small | Large | Moderate | Small | | Large | Moderate | Small | Large | Moderate | Small |
| Exp | C | S | C | S | C&S | C | C | \ | \ | \ | C | C | C |
| S-Exp | C | S | C | S | C&S | C | C&S | S | C&S | C | S | C&S | C |
Taxonomy
Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Distributed and Parallel Computing Systems
Redundancy Management for Fast Service (Rates) in Edge Computing Systems
This work was supported in part by the National Natural Science Foundation of China (Grant No. 62301273), the University Science Research Project of Jiangsu Province (Grant No. 23KJB120009), the Natural Science Research Start-up Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications (Grant No. NY222008) and the NSF-BSF Award FET-2120262. This article was presented in part at the 2023 15th annual International Conference on Wireless Communications and Signal Processing (WCSP) [1].
Pei Peng
Nanjing University of Posts and Telecommunications
Email: [email protected]
Emina Soljanin
Rutgers University
Email: [email protected]
Abstract
Edge computing operates between the cloud and end-users and strives to provide fast computing services for multiple users. Because of their proximity to users, edge services have a low communication delay and can provide low latency with sufficient computing and storage resources. However, edge computing and storage resources are limited. Thus, directing more resources to some computing jobs will block (and pass to the cloud) the execution of others. We evaluate the edge system performance using two metrics: job computing time and job blocking probability. Edge nodes often operate in highly unpredictable environments and handle jobs needing fast or no service. Thus, jobs not getting into service upon arrival get blocked and passed to the cloud. In unpredictable environments, replicating a job to multiple servers when resources allow shortens its computing time. However, such replication makes the resources unavailable to other users, and their execution is blocked. We show that the job computing time decreases with increasing replication factor, but the job blocking probability does not. Therefore, there is a tradeoff. This paper uses the average system time and service rates as performance metrics to evaluate the tradeoff. We conclude that the optimal replication factor that minimizes the average system time changes with the distribution parameters and the arrival rate.
I Introduction
With the rapid increase in IoT applications, such as smart cities and homes, autonomous vehicles, and artificial intelligence, billions of IoT devices are coming into our everyday lives [2, 3]. The demand for low latency storage and computing services is increasing to accommodate novel IoT platforms (e.g., deep learning) [4, 5, 6]. Some applications, for example, connected and autonomous vehicles, smart healthcare, and ocean monitoring require fast or no service. Cloud services are inefficient at responding to such applications.
Edge computing is an inter-layer between the cloud and the end-user. It provides storage and computing infrastructure at the node located one or two network hops from the end-user [7]. According to [8], the round trip time between the cloud and end-user (about ms) is over times larger than the time between the edge and end-user (about ms). Therefore, in the edge system, the bottleneck of the computing service is no longer the communication delay. As illustrated in Fig. 1, an edge computing system receives and processes jobs from end-users, and on a much smaller scale of storage and computing resources than the cloud [9, 10]. Service requests get sent to the cloud when all edge workers are busy, and communication time becomes a significant part of the delay.
Edge nodes often operate in highly unpredictable environments and handle jobs needing fast or no service. Using resources redundantly is a known strategy to enable fast service. However, with increased resource usage comes a decrease in the number of jobs the edge system can execute simultaneously. Thus, more jobs get sent to the cloud, which increases their execution time. An edge computing system that sends jobs to the cloud once the resources become unavailable acts locally as a blocking system. The jobs sent to the cloud will experience higher latency as they forgo the geographic benefit [11].
We, therefore, need to address two opposing objectives to maximize the benefits of edge computing. The first objective is to maximize the computing speed of each job executed locally at the edge. (Computing speed, expressed in terms of computing time or service rate, is a classical and essential performance metric [12, 13].) The second objective is to maximize the number of jobs processed by the edge system (i.e., to minimize the number of blocked jobs sent to the cloud). Due to the limited storage and computing resources, the edge system may be unable to process all jobs independently of the cloud.
We focus on edge computing systems in which a single controller node manages a computing cluster of workers [14]. When a job arrives, the controller replicates it to several workers. Here, we refer to the workers processing the same job as the replication group and its size as the replication factor. Since the total number of workers in the system is limited, we have to decrease the number of groups when we increase the replication factor. The edge computing system with a higher replication factor may have a higher blocking probability. That is, it can process fewer jobs simultaneously. Therefore, processing more jobs locally may decrease the expected computing speed.
Simultaneously minimizing the local job computing time and job blocking probability is generally impossible. Since both are crucial to edge computing, we will focus on finding the tradeoff between these performance metrics. We combine the local edge computing time and the time spent to reach the cloud (which happens with the job-blocking probability at the edge) into a single metric representing the average service time. Recent literature has considered the service rate a critical performance metric, especially in distributed storage systems. For example, [15] evaluates the service rate of a distributed storage system to find an optimal storage allocation strategy. [16] uses the service rate region as an essential consideration in the design of erasure-coded distributed systems. In some scenarios, we will show the benefits of using the service rate instead of the expected time. We will use both metrics to evaluate the tradeoff between the local job computing time and the blocking probability at the edge.
In this paper and [1], we initiate a study of the value of using replication in edge-blocking systems to improve their service rate. (We remind the reader that there are other related crucial edge performance metrics, such as offload latency, power consumption, and energy efficiency, [17, 18, 19, 20], which are beyond the scope of this paper. ) A vast body of literature argues the benefits of redundancy in distributed computing (see, e.g., [21, 22], and references therein), including edge systems [23, 24, 25, 26]. However, because of the adverse effects of temporarily redundantly using more system resources, there are limits to these benefits. The optimal levels of redundancy have not been fully characterized; see, e.g., [27, 28] and references therein, and are the subject of current research.
The paper is organized as follows. In Sec. II, we describe the architecture of the edge computing system and the performance metrics. In Sec. III, we state the problem and summarize the contributions of this paper. In Sec. IV, we theoretically and numerically analyze how the job computing time and the job blocking probability change with the number of groups. In Sec. V and VI, we analyze the optimal number of groups (or replication factor) that minimizes the average system time for different cloud times and numbers of workers. In Sec. VII, we maximize the system service rate for the scenario in which the time a job spends in the cloud is long. The conclusions are given in Sec. VIII.
II System Model
II-A System Architecture
We consider the edge computing system model shown in Fig. 2 as a combination of the distributed computing system and the blocking system. The edge computing system has limited storage and computing resources[3]. It consists of a single front-end controller node and multiple computing servers, which we refer to as workers. The single controller node manages the entire computing cluster of nodes. This architecture is commonly implemented in modern frameworks, such as Apache Mesos [29], and edge computing systems with limited storage and computing resources [14].
The controller node divides the workers into several groups. It creates copies of each arriving job and assigns each copy to a worker in one group. In Fig. 2, the controller node sends one job and its copy to workers 1 and 2, and another job and its copy to workers 3 and 4. This execution replication mitigates straggling and reduces the job’s expected completion time. The larger the replication factor, the greater the reduction in the job’s average completion time.
On the other hand, when all workers are busy, the new request for job execution gets blocked. An edge computing system may send such jobs to the cloud (see Fig. 2). There is a significant communication delay between the edge and the cloud, which may be much longer than the expected job computing time. Therefore, we want the system to serve more jobs and send fewer to the cloud.
II-B Redundancy and Straggler Mitigation
In an edge computing system, the job computing time is a crucial performance metric. However, task straggling is a fundamental problem in distributed systems that significantly affects system performance. Replication is an effective technique that introduces redundancy to mitigate stragglers. In Fig. 2, the controller node sends a job and its copy to workers 1 and 2 separately. Unlike in a system without redundancy, even if one worker takes a long time to process the job, the controller node can still receive the result from the other worker. This problem is well studied in [28].
II-C Traditional Performance Metrics
There are two performance metrics of interest in the described system: 1) the job computing time and 2) the job blocking probability. The system’s goal is to minimize both. The design parameters for a given system size (a fixed number of workers) are the replication factor and the number of groups. Increasing the replication factor, and thus decreasing the number of groups, reduces the job computing time (improves the first performance metric). However, the effect of these parameters on the blocking probability is not immediately apparent. Increasing the replication factor temporarily occupies more servers per job but also makes jobs stay in the system for a shorter time.
II-D Average Time and Service Rate as Performance Metrics
In general, there may not be a replication factor that simultaneously minimizes both the job computing time and the job blocking probability. We use the average system time to evaluate the tradeoff between these two metrics. The average system time is defined as
[TABLE]
where is the completion time of the jobs blocked by the edge system and executed in the cloud.
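As a minimal sketch, assuming the average system time is the blocking-probability-weighted mixture that the text describes (edge computing time for admitted jobs, cloud completion time for blocked jobs), the metric could be computed as follows. The function and parameter names are illustrative and do not come from the paper.

```python
def average_system_time(edge_time, blocking_prob, cloud_time):
    """Mean time per job when a fraction `blocking_prob` of jobs is blocked.

    Assumed form: admitted jobs take `edge_time` on average, and blocked
    jobs take `cloud_time` on average.
    """
    if not 0.0 <= blocking_prob <= 1.0:
        raise ValueError("blocking_prob must lie in [0, 1]")
    return (1.0 - blocking_prob) * edge_time + blocking_prob * cloud_time


# Example with made-up numbers: 10% of jobs are blocked and sent to the cloud.
print(average_system_time(edge_time=0.5, blocking_prob=0.1, cloud_time=4.0))
```

With no blocking the value reduces to the edge computing time, and with certain blocking it reduces to the cloud time.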
In addition to the average system time, we combine the above two metrics into a new metric, which we refer to as the system service rate and define as follows:
[TABLE]
We consider the average system time the main performance metric for evaluating the tradeoff. However, according to the results in Sec. V, the average system time is not informative when the cloud time is sufficiently large. In that case, the system service rate is the better metric for evaluating the tradeoff.
II-E Job Arrival and Service Time
M/M/1 or M/M/n queues have been used in task allocation models in edge systems [30, 31, 32]. Here, we assume that job arrivals follow a Poisson process, which allows us to model the blocking system as an M/G/c/k queue. Analyzing the blocking probability of an M/G/c/k queue that models the blocking operation of the system in Fig. 2 is a highly complex problem. To better understand it, we first consider the Erlang B model, the M/M/c/c queue, where the queue length equals the number of groups. In this model, the service time follows the exponential distribution, which is a simple model widely used in distributed computing [33, 34, 35]. Similarly, we also consider the M/G/c/c queue with shifted exponential service time. The shifted exponential distribution is also widely used in distributed computing [36, 34, 37]. Here, the shift is an initial handshake time, after which the worker completes the job in an exponentially distributed time.
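The loss model described above can be sketched as a small event-driven simulation. This is an illustrative sketch, not the authors' simulator: the parameter names (`num_groups`, `arrival_rate`, `mean_exp_part`, `shift`) are assumptions, each group is treated as one server, and a job that finds all groups busy is counted as blocked.

```python
import heapq
import random


def simulate_blocking(num_groups, arrival_rate, mean_exp_part, shift,
                      num_jobs=100000, seed=0):
    """Estimate the blocking probability of a loss system with Poisson arrivals.

    The service time of a group is `shift` plus an exponential time with
    mean `mean_exp_part` (set shift=0.0 for plain exponential service).
    """
    rng = random.Random(seed)
    busy_until = []  # min-heap holding the completion times of busy groups
    now = 0.0
    blocked = 0
    for _ in range(num_jobs):
        now += rng.expovariate(arrival_rate)  # next Poisson arrival
        # Free every group whose job finished before this arrival.
        while busy_until and busy_until[0] <= now:
            heapq.heappop(busy_until)
        if len(busy_until) >= num_groups:
            blocked += 1  # all groups busy: the job goes to the cloud
        else:
            service = shift + rng.expovariate(1.0 / mean_exp_part)
            heapq.heappush(busy_until, now + service)
    return blocked / num_jobs


# Two service models with the same mean service time (1.0).
print(simulate_blocking(2, 1.0, mean_exp_part=1.0, shift=0.0))
print(simulate_blocking(2, 1.0, mean_exp_part=0.5, shift=0.5))
```

Both runs have an offered load of one on two groups, so both estimates should land near the Erlang B value for that load, which reflects the known insensitivity of the M/G/c/c blocking probability to the service distribution beyond its mean.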
III Problem Statement and Contributions
[TABLE]
System parameters and notations are summarized in the above list. Our goal is to evaluate two performance metrics, the expected job computing time and the job blocking probability, for systems with Poisson job arrivals and (shifted) exponential service time. We separately compute the number of groups (or the replication factor) that minimizes each metric. When the number of groups is small, the system concentrates limited computing resources on a few jobs. When the number of groups is large, the system spreads the computing resources to more jobs. A job is processed faster with more computing resources and slower with fewer resources. We use the average system time and the system service rate to evaluate the tradeoff between these two metrics, and we show how to achieve the desired tradeoff by selecting an appropriate number of groups.
We summarize the optimal strategy for the average system time in Table I. We conclude that it is better to spread computing resources to more jobs when the job arrival rate is moderate and the cloud time is large. Otherwise, we should concentrate computing resources on a few jobs. When the computing resources are unlimited, it is always better to concentrate them on a few jobs for the exponential service time. However, we should find the balance between concentrating and spreading the computing resources for the shifted exponential service time.
We also summarize the optimal strategy for the system service rate in Table I for the scenario with large cloud time. We conclude that it is better to concentrate computing resources on a few jobs with exponential service time. For shifted exponential service time, we should spread computing resources to more jobs when the shift parameter and job arrival rate are large. Otherwise, we should concentrate computing resources on a few jobs.
We include here the relevant results of [1], which focuses on analyzing the blocking system with blocking probability and system service rate. Further contributions include the following:
- We propose an edge computing system model as a combination of the distributed computing system and the blocking system. We aim to find the optimal number of groups (or replication factor) that minimizes the job computing time and the job blocking probability under different service time distributions. The theoretical and numerical results show that the job computing time increases with decreasing replication, and the system blocking probability increases with the replication factor.
- We adopt the average system time to evaluate the tradeoff between job computing time and the system blocking probability. We analyze the optimal number of groups that minimizes the average system time for different scenarios. We conclude that the optimal number of groups changes with the computing time distribution parameters and the arrival rate.
- We propose a new performance metric, the system service rate, to evaluate the tradeoff for the scenario when the time spent in the cloud is long. Our theoretical and numerical results show that concentrating limited computing resources on each job is the better scheduling strategy.
IV Computing Time and Blocking Probability
IV-A Job Computing Time
The job computing time measures how much time the job spends in the system occupying resources. We consider a system that uses replication, in which the worker computing time follows the (shifted) exponential distribution. When the worker computing time is exponential, the expected job computing time is given by
[TABLE]
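Observe that the expected job computing time reaches its minimum when all workers form a single group. Moreover, the job computing time follows the exponential distribution whose rate is the worker's rate multiplied by the replication factor. When the worker computing time is shifted exponential, the expected job computing time is given by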
[TABLE]
Observe that the expected job computing time again reaches its minimum when all workers form a single group. Moreover, the job computing time follows the shifted exponential distribution with the same shift and with the worker's rate multiplied by the replication factor; see, e.g., [38]. The above conclusions show that increasing the computing resources for each job decreases the job computing time. Next, we explore whether this strategy performs well when the blocking probability is taken into account.
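The statements above rest on a standard fact: the minimum of several independent exponential times with a common rate is exponential with the rates added, and a common shift carries over unchanged. The following sketch computes the resulting mean and checks it against a small Monte Carlo run. The names `num_workers`, `num_groups`, `rate`, and `shift` are illustrative, and the replication factor is assumed to be the number of workers divided by the number of groups.

```python
import random


def expected_job_time(num_workers, num_groups, rate, shift=0.0):
    """Mean completion time of the fastest replica in one group.

    With replication factor r = num_workers / num_groups, the fastest of r
    (shifted) exponential replicas has mean shift + 1 / (r * rate).
    """
    if num_groups < 1 or num_workers % num_groups != 0:
        raise ValueError("num_groups must be a positive divisor of num_workers")
    replication = num_workers // num_groups
    return shift + 1.0 / (replication * rate)


def simulated_job_time(replication, rate, shift=0.0, trials=20000, seed=1):
    """Monte Carlo estimate of the same quantity, for a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        fastest = min(rng.expovariate(rate) for _ in range(replication))
        total += shift + fastest
    return total / trials


# 12 workers split into 3 groups gives 4 replicas per job.
print(expected_job_time(12, 3, rate=1.0))
print(simulated_job_time(4, rate=1.0))
```

For a fixed number of workers, the mean grows linearly with the number of groups, which matches the linear trend reported in the numerical analysis.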
IV-B Job Blocking Probability
We adopt the (shifted) exponential distribution as a classical and simple service time model to analyze the edge computing system. Analyzing the blocking probability of an M/M/c/k (or M/G/c/k) queue that models the blocking operation of the system in Fig. 2 is a highly complex problem. To better understand it, we consider the Erlang B model, the M/M/c/c (or M/G/c/c) queue, where the queue length equals the number of groups. Then, for a blocking system with a given number of groups in which jobs arrive as a Poisson process, the job blocking probability is
[TABLE]
where . The above expression shows that for a given , increases with . From (3), we know that is a function of , in which . Then we will take . Thus, is a function of . When the worker computing time follows , we can rewrite (5) in the following,
[TABLE]
where is a constant. When the worker computing time follows , we can rewrite (5) in the following,
[TABLE]
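For readers who want to evaluate the blocking probability numerically, the sketch below uses the standard Erlang B recursion, which avoids large factorials. It assumes the offered load is the arrival rate times the expected job computing time, and that the expected job computing time is the shift plus the number of groups divided by the product of the number of workers and the worker rate. All names are illustrative rather than the paper's notation.

```python
def erlang_b(num_groups, offered_load):
    """Erlang B blocking probability via the stable recursion
    B(0) = 1,  B(k) = a * B(k-1) / (k + a * B(k-1))."""
    blocking = 1.0
    for k in range(1, num_groups + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def blocking_probability(num_workers, num_groups, arrival_rate, rate, shift=0.0):
    """Blocking probability when each group serves one replicated job."""
    job_time = shift + num_groups / (num_workers * rate)  # assumed mean service time
    return erlang_b(num_groups, arrival_rate * job_time)


# Illustrative values only: 12 workers, arrival rate 6, worker rate 1.
for groups in (1, 2, 3, 4, 6, 12):
    print(groups, blocking_probability(12, groups, arrival_rate=6.0, rate=1.0))
```

The recursion is algebraically equivalent to the closed-form Erlang B expression, and it only needs the mean service time, so the same function serves both the exponential and the shifted exponential cases.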
Using (6), we find the number of groups that minimizes the job blocking probability in Theorem 1.
Theorem 1**.**
For the blocking system with Poisson arrivals and exponential service time, the job blocking probability decreases with the number of groups and reaches its minimum when the number of groups equals the number of workers (i.e., when the replication factor is one).
Proof.
Assume , from (6), the blocking probability is
[TABLE]
Since , reaches its minimum at . ∎
According to (7) and Theorem 1, we have the following lemma, which gives the number of groups that minimizes the job blocking probability for shifted exponential service time.
Lemma 1**.**
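For the blocking system with Poisson arrivals and shifted exponential service time, the job blocking probability decreases with the number of groups and reaches its minimum when the number of groups equals the number of workers (i.e., when the replication factor is one).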
Proof.
Similar to the proof of Theorem 1, assume , from (7), the blocking probability is
[TABLE]
Therefore, the job blocking probability decreases with the number of groups and reaches its minimum when the number of groups equals the number of workers. ∎
From Theorem 1 and Lemma 1, we conclude that spreading the computing resources to more jobs is the better way to let the system process more jobs. Therefore, jointly optimizing the blocking probability and the expected job computing time over the number of groups is a dilemma.
IV-C Numerical Analysis
We evaluate (3) and (6) to plot the expected job computing time and the job blocking probability versus the number of groups in Fig. 3. Similarly, we evaluate (4) and (7) to plot the same two quantities in Fig. 4. We consider a system with a fixed number of workers in which jobs arrive according to a Poisson process.
In Fig. 3, the service time of each worker is exponential. The left subfigure shows that the expected job computing time increases linearly with the number of groups, which means that introducing more redundancy provides a higher computing speed. The right subfigure shows that the job blocking probability decreases as the number of groups increases, which means that introducing more redundancy sends more jobs to the cloud. We also observe that the blocking probability is close to [math] and decreases slowly when the number of groups is large, whereas it decreases sharply when the number of groups is small. These observations are consistent with the theoretical analysis in Theorem 1. We also conclude that some minimal replication significantly reduces the computing time with almost no change in the blocking probability.
In Fig. 4, the service time of each worker is shifted exponential. We observe that the expected job computing time increases linearly with the number of groups and that the job blocking probability decreases as the number of groups increases. We also observe that the blocking probability is close to [math] only when the number of groups is large enough, and it does not decrease as sharply as in Fig. 3. These observations are consistent with the theoretical analysis in Lemma 1.
The above two figures clearly show that simultaneously minimizing the job computing time and the job blocking probability is impossible: each metric requires a very different number of groups. However, the job blocking probability may not always decrease with the number of groups for some heavy-tailed distributions, e.g., the Pareto distribution. We will explore this problem in future work.
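The opposing trends discussed in this subsection can be reproduced with a short sweep over the admissible numbers of groups. The parameter values below (12 workers, arrival rate 6, worker rate 1, shift 0.5) are assumptions chosen for illustration, since the values used for the figures are not reproduced here, and the formulas are the standard replication mean and Erlang B expressions rather than a transcription of the paper's equations.

```python
def erlang_b(num_groups, offered_load):
    """Erlang B blocking probability via the standard recursion."""
    blocking = 1.0
    for k in range(1, num_groups + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def metrics(num_workers, num_groups, arrival_rate, rate, shift=0.0):
    """Return (expected job computing time, job blocking probability)."""
    job_time = shift + num_groups / (num_workers * rate)
    return job_time, erlang_b(num_groups, arrival_rate * job_time)


NUM_WORKERS = 12     # assumed system size
ARRIVAL_RATE = 6.0   # assumed Poisson arrival rate
RATE = 1.0           # assumed worker service rate

# Only divisors of the number of workers give equal-sized groups.
groups = [c for c in range(1, NUM_WORKERS + 1) if NUM_WORKERS % c == 0]
rows_exp = [(c,) + metrics(NUM_WORKERS, c, ARRIVAL_RATE, RATE) for c in groups]
rows_sexp = [(c,) + metrics(NUM_WORKERS, c, ARRIVAL_RATE, RATE, shift=0.5)
             for c in groups]

print("groups  job_time  blocking   (exponential)")
for c, job_time, p_block in rows_exp:
    print(f"{c:6d}  {job_time:8.4f}  {p_block:8.4f}")
print("groups  job_time  blocking   (shifted exponential)")
for c, job_time, p_block in rows_sexp:
    print(f"{c:6d}  {job_time:8.4f}  {p_block:8.4f}")
```

In this sketch the job computing time grows with the number of groups while the blocking probability shrinks, so no single number of groups minimizes both.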
V Average System Time
From Section IV, we know that simultaneously minimizing the job computing time and the blocking probability is impossible. Interestingly, some minimal replication significantly reduces computing time with almost no blocking probability change. Therefore, we adopt the average system time as a single performance indicator describing the tradeoff between the job computing time and the job blocking probability. For a system with a given number of workers and groups, the expression of the average system time is
[TABLE]
In this paper, we assume that the edge system can provide a better computing performance by reducing the communication time between the users and the cloud. When , it is easy to prove that . This means that the edge system does not have enough computing resources for each job. Then we have the following claim.
Claim 1**.**
If , the job should always be sent to the cloud.
From (3) (or (4)), the edge system will spend (or ) time to process the job with minimum computing resource, and (or ) time to process the job with a maximum computing resource. Therefore, Claim 1 holds for (shifted) exponential distributions. In the following, we will consider the scenarios when .
V-A Cloud Time:
When cloud time satisfies , it is faster to process the job in the edge system. Then, we should use the edge system to reduce the average system time. That is, the main purpose of this section is to find the optimal (or ) that minimizes the .
First, we consider the exponential service time. The cloud time condition is . According to Claim 1, when , the job should be sent to the cloud; otherwise, the job should be sent to the edge. That is, the optimal should be in the region . In the following theorem, we find the conditions that the average system time reaches the minimum at ().
Theorem 2**.**
For the edge system with Poisson() arrivals and computing time, when , the average system time reaches the minimum at () under the conditions or .
Proof.
When and , the average system time is
[TABLE]
Then, for any we have
[TABLE]
For the above inequality to hold, we need
[TABLE]
Since , the inequality (9) holds when . As we know , then we have
[TABLE]
Apparently, the right-hand side increases with decreasing . Therefore, , that is, .
According to (5),
[TABLE]
Then the inequality (9) holds when . The right-hand side reaches the maximum at , therefore we have
[TABLE]
Therefore, reaches the minimum at () under the conditions that or . ∎
Next, we consider the shifted exponential service time. The cloud time condition is . Similarly, when , the job should be sent to the cloud; otherwise, the job should be sent to the edge. That is, the optimal should be in the region . According to Theorem 2, we can find the conditions that the average system time reaches the minimum at () in the following lemma.
Lemma 2**.**
For the edge system with Poisson() arrivals and computing time, when , the average system time reaches the minimum at () under the conditions or .
Proof.
When and , the average system time is
[TABLE]
Then, for any we have
[TABLE]
Since , then the above inequality holds when
[TABLE]
Similarly to the proof of Theorem 2, the above inequality holds when . Then we have
[TABLE]
Since . Therefore, , that is, .
Then we consider the second case.
[TABLE]
We have when . The right-hand side reaches the maximum at , therefore we have , that is, ∎
From Theorem 2 and Lemma 2, the optimal number of groups is determined by the ratio of the arrival rate to the service rate. When this ratio is small or large, it is better to concentrate the limited computing resources on a few jobs. Otherwise, the system can spread computing resources to more jobs. However, because of the restriction of Claim 1, the optimal number of groups must stay within the region identified above.
V-B Cloud Time:
First, we consider the exponential service time. The cloud time condition is . In this scenario, the edge system is always the first choice for each job. The job will only be sent to the cloud when the edge system is busy. Similarly to the conclusion of Theorem 2, the optimal that minimizes the average system time changes with different system parameters. However, we can still draw certain conclusions from the following theorem.
Theorem 3**.**
For the edge system with Poisson() arrivals and job computing time, when is sufficiently large, the average system time reaches the minimum at ().
Proof.
When and , the average system time is
[TABLE]
Then, for any we have
[TABLE]
Apparently, the right-hand side of the above inequality is finite. Therefore, reaches the minimum at () when is sufficiently large. ∎
Next, we consider the shifted exponential service time. The cloud time condition is . According to the proof of Theorem 3, we can also prove in the following lemma that the system should spread the computing resources to more jobs when is sufficiently large.
Lemma 3**.**
For the edge system with Poisson() arrivals and job computing time, when is sufficiently large, the average system time reaches the minimum at ().
Proof.
When and , the average system time is
[TABLE]
Then, for any we have
[TABLE]
Apparently, the right-hand side of the above inequality is finite. Therefore, reaches the minimum at () when is sufficiently large. ∎
The conclusion of Theorem 3 and Lemma 3 is not surprising: when the cloud time is sufficiently large, minimizing the average system time is almost equivalent to minimizing the blocking probability. However, under this condition, the average system time may not be a good metric for evaluating the tradeoff between the job computing time and the blocking probability. Therefore, we consider another metric for this scenario in Section VII.
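The reasoning behind this observation can be written out in one line. The notation here is assumed for illustration ($T_{\mathrm{avg}}$ for the average system time, $P_b$ for the blocking probability, $\mathbb{E}[T]$ for the expected job computing time, $T_c$ for the cloud time), and the average system time is taken to be the blocking-probability-weighted mixture described in Sec. II-D.

```latex
\begin{align}
T_{\mathrm{avg}} &= (1 - P_b)\,\mathbb{E}[T] + P_b\,T_c , \\
\frac{T_{\mathrm{avg}}}{T_c} &= P_b + (1 - P_b)\,\frac{\mathbb{E}[T]}{T_c}
  \;\longrightarrow\; P_b \qquad \text{as } T_c \to \infty .
\end{align}
```

Since $P_b$ does not depend on $T_c$ and $\mathbb{E}[T]$ is bounded for a fixed number of workers, the ranking of the candidate numbers of groups by $T_{\mathrm{avg}}$ eventually coincides with their ranking by $P_b$.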
V-C Numerical Analysis
We evaluate (8) to see how the average system time changes with the number of groups in a system with a fixed number of workers. We consider two service time models: the exponential service time and the shifted exponential service time. Jobs arrive according to a Poisson process with different arrival rates, and the expected cloud time is smaller than the maximum job computing time.
In Fig. 5, we consider a cloud time smaller than the maximum job computing time and plot the average system time versus the number of groups for the two service time models under different arrival rates. The upper graph shows that, when the arrival rate is small enough, the average system time reaches its minimum at the smallest number of groups, which is consistent with the result in Theorem 2. In this regime, decreasing the number of groups, i.e., increasing the replication factor, significantly reduces the average system time. For the intermediate arrival rates, we should slightly increase the number of groups to achieve a lower average system time. When the arrival rate is sufficiently large, the average system time again reaches its minimum at the smallest number of groups. However, the optimal number of groups then does not provide a significant reduction of the average system time. We can observe similar results in the lower graph. Compared with the upper graph, we observe that the average system time increases significantly when the arrival rate is small and remains almost the same when the arrival rate is large. From the results, we conclude that when fewer jobs arrive, almost all jobs can be processed in the edge system, so it is better to increase the computing speed. When more jobs arrive, the system should reduce the computing speed and provide the computing resources for more jobs. When there are too many jobs, the resource allocation strategy cannot change the system performance significantly.
In Fig. 6, we consider the job arrives following the Poisson distribution with , and plot vs. for two service time models with . The upper graph shows when , reaches its minimum at ; When , reaches its minimum at ; When , reaches its minimum at . We also observe that the optimal increases with . We can observe similar results in the lower graph. Compared with the upper graph, we observe that can be optimal with a smaller value of . From the results, we have the following conclusions. Generally, we should balance the tradeoff between the job computing time and the blocking probability. However, when is sufficiently large, we should only focus on reducing the blocking probability and assigning resources for more jobs.
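A brute-force search makes the dependence of the optimal number of groups on the arrival rate and the cloud time concrete. This is a sketch under stated assumptions: the average system time is taken as the blocking-probability-weighted mixture of the edge and cloud times, the blocking probability is Erlang B, and the parameter values are illustrative rather than those behind Figs. 5 and 6.

```python
def erlang_b(num_groups, offered_load):
    """Erlang B blocking probability via the standard recursion."""
    blocking = 1.0
    for k in range(1, num_groups + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def average_system_time(num_workers, num_groups, arrival_rate, rate,
                        cloud_time, shift=0.0):
    """Assumed form: (1 - P_block) * edge time + P_block * cloud time."""
    job_time = shift + num_groups / (num_workers * rate)
    p_block = erlang_b(num_groups, arrival_rate * job_time)
    return (1.0 - p_block) * job_time + p_block * cloud_time


def best_num_groups(num_workers, arrival_rate, rate, cloud_time, shift=0.0):
    """Exhaustive search over the divisors of the number of workers."""
    candidates = [c for c in range(1, num_workers + 1) if num_workers % c == 0]
    return min(candidates,
               key=lambda c: average_system_time(num_workers, c, arrival_rate,
                                                 rate, cloud_time, shift))


# Illustrative sweep: 12 workers with unit service rate.
for cloud_time in (0.5, 5.0, 1000.0):
    for arrival_rate in (0.01, 6.0, 100.0):
        c_opt = best_num_groups(12, arrival_rate, 1.0, cloud_time)
        print(f"cloud_time={cloud_time:7.1f}  arrival_rate={arrival_rate:6.2f}"
              f"  best groups={c_opt}")
```

Exhaustive search is adequate here because the candidate set is only the divisors of the number of workers. For a very light load the search returns a single group, and for a very long cloud time with a moderate load it returns one worker per group, in line with the qualitative conclusions of this section.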
VI Average System Time with Large
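Because of the complexity of the expression of the average system time, it is difficult to find the optimal number of groups and replication factor. The above section concludes that they change with the arrival rate and the parameters of the exponential or shifted exponential service time. However, this is not enough to analyze the overall performance of the system. In practice, some edge systems may have more computing resources. Therefore, it is valuable to analyze how the optimal number of groups and replication factor change when the number of workers is large. In this section, we assume that the number of workers goes to infinity. Since the number of workers is the product of the number of groups and the replication factor, there are three possible scenarios: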
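- Scenario 1: the replication factor goes to infinity and the number of groups is finite.
- Scenario 2: the number of groups goes to infinity and the replication factor is finite.
- Scenario 3: both the number of groups and the replication factor go to infinity.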
Clearly, the above three scenarios are mutually exclusive, and only one of them provides the best performance in terms of the average system time. We evaluate the performance of each scenario under different service time distributions in the following.
VI-A Analysis of Scenario 1
In Scenario 1, since the number of groups is finite, it is bounded above by some constant. First, we consider the exponential service time. We find the minimum average system time in the following theorem.
Theorem 4**.**
In the edge system with exponential service time, as the number of workers goes to infinity, when the number of groups is finite, the average system time reaches its minimum value of zero.
Proof.
We adopt the expression of the average system time from (8) and replace with , then
[TABLE]
Since we assume that is finite, then we have
[TABLE]
Since is a constant,
[TABLE]
Therefore, when goes to infinity and is finite, the average system time will reduce to [math]. ∎
From Theorem 4, we know that the average system time can reach the minimum at [math]. Then we can conclude that the system with exponential service time can get the best performance by adopting Scenario 1.
Next, we consider the shifted exponential service time. According to the proof of Theorem 4, we find the minimum average system time for the shifted exponential service time in the following lemma.
Lemma 4.
In the edge system with service time, as the number of workers goes to infinity, when the number of groups is finite, the average time reaches the minimum at .
Proof.
We adopt the expression of the average system time from (8) and replace with , then
[TABLE]
Since we assume that is finite, then we have
[TABLE]
Since is a constant and ,
[TABLE]
∎
In contrast to Theorem 4, we cannot yet decide whether the system with shifted exponential service time reaches the best performance by adopting Scenario 1.
VI-B Analysis of Scenario 2
In Scenario 2, is finite; that is, there exists a constant that is always smaller than . First, we consider the exponential service time. We find the minimum average system time in the following theorem.
Theorem 5.
In the edge system with service time, as the number of workers goes to infinity, when the replication factor is finite, the average time reaches the minimum at .
Proof.
We adopt the expression of the average system time from (8), then
[TABLE]
Since we assume that is finite, we have
[TABLE]
Since is finite, is also finite. Then, we have
[TABLE]
Since is a constant,
[TABLE]
Therefore, when goes to infinity and reaches the maximum at , the average system time reaches the minimum at . ∎
Comparing the results of Theorems 4 and 5, we conclude that Scenario 2 is not the preferred choice when the goal is to minimize the average system time.
Next, we consider the shifted exponential service time. According to the proof of Theorem 5, we find the minimum average system time for shifted-exponential service time in the following lemma.
Lemma 5.
In the edge system with service time, as the number of workers goes to infinity, when the replication factor is finite, the average time reaches the minimum at .
Proof.
We adopt the expression of the average system time from (8), then
[TABLE]
Since and are finite, is also finite. Then, we have
[TABLE]
and
[TABLE]
Thus,
[TABLE]
Since is a constant and ,
[TABLE]
∎
Comparing this result with that of Lemma 4, we cannot easily decide which minimum is smaller. We therefore make an overall comparison after calculating the average system time for Scenario 3.
VI-C Analysis of Scenario 3
In Scenario 3, the replication factor and the number of groups go to infinity. First, we consider the exponential service time. We find the minimum average system time with Scenario 3 in the following theorem.
Theorem 6.
In the edge system with service time, when the replication factor and the number of groups go to infinity, the average time reaches the minimum at [math].
Proof.
We adopt the expression of the average system time from (8), then
[TABLE]
First, since all the parameters are positive, we have
[TABLE]
Then we have
[TABLE]
It is obvious that
[TABLE]
Thus, we have
[TABLE]
That is,
[TABLE]
Therefore, when and go to infinity, the average system time reaches the minimum at [math]. ∎
From Theorem 6, we find that can also reach the minimum at [math]. Then we conclude that the system can get the best performance by adopting Scenario 3. Compared to the results of Theorem 4, we find that it does not matter whether the number of groups is finite or infinite. To achieve the best system performance, we only need to let the replication factor increase with .
Next, we consider the shifted exponential service time. According to the proof of Theorem 6, we find the minimum average system time for the shifted exponential service time in the following lemma.
Lemma 6.
In the edge system with service time, when the replication factor and the number of groups go to infinity, the average time reaches the minimum at .
Proof.
We adopt the expression of the average system time from (8), then
[TABLE]
First, since all the parameters are positive, we have
[TABLE]
Then we have
[TABLE]
As goes to infinity, increases much faster than . Thus, we have
[TABLE]
That is,
[TABLE]
∎
Finally, we compare the minimum values of from Lemmas 4, 5, and 6. The minimum average system time under Scenario 3 attains the overall minimum at . Therefore, to reach the best system performance, we should let both the replication factor and the number of groups increase with .
VI-D Numerical Analysis
We evaluate (3) and (8) to find the optimal and minimize the average system time. Then we separately plot the optimal vs. and the optimal vs. for exponential service time in Fig. 7. Similarly, we separately plot the optimal vs. and the optimal vs. for shifted exponential service time in Fig. 8 by evaluating (4) and (8). We consider a system with workers in which jobs arrive according to a Poisson process with rate and the expected time a job spends in the cloud is . Since and are integers, we assume . Although the actual values of may become small, this does not affect the conclusions of the numerical analysis.
In Fig. 7, the service time of each worker is exponential with . We observe that when is small (e.g., ), the optimal increases with ; when becomes large (e.g., ), the optimal decreases with increasing . We also observe that the optimal always increases with except when is small. These observations are consistent with the theoretical analysis in Theorems 4, 5, and 6.
In Fig. 8, the service time of each worker is shifted exponential with and . We observe that the optimal increases sharply with when is small and increases smoothly with when is large. Notice that the curve fluctuates when is large. This is because , and are all integers with the relation , so the optimal cannot take every integer value. We also observe that when is small (e.g., ), the optimal decreases with increasing ; when becomes large (e.g., ), the optimal increases with .
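The integer constraint behind these fluctuations can be illustrated with a small search over feasible group counts. The following is only a sketch under stated assumptions, since expressions (4) and (8) are not reproduced here: blocking follows the Erlang B formula (which for an M/G/c/c system depends on the service time only through its mean), the replicated shifted exponential service time has mean `shift + 1 / (replication * mu)`, and blocked jobs spend `cloud_time` in the cloud. The helper names and the parameter values are hypothetical.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Erlang B blocking probability, computed with the standard recursion."""
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def divisors(n: int) -> list:
    """Feasible numbers of groups: every group gets the same number of workers."""
    return [c for c in range(1, n + 1) if n % c == 0]


def avg_system_time_sexp(n: int, groups: int, lam: float, mu: float,
                         shift: float, cloud_time: float) -> float:
    """Assumed average system time for shifted exponential workers."""
    replication = n // groups
    # Replication shortens only the random part; the shift is unaffected.
    mean_service = shift + 1.0 / (replication * mu)
    p_block = erlang_b(groups, lam * mean_service)
    return (1.0 - p_block) * mean_service + p_block * cloud_time


def best_groups(n: int, lam: float, mu: float, shift: float,
                cloud_time: float) -> int:
    """Exhaustive search over the divisors of n for the smallest assumed time."""
    return min(divisors(n),
               key=lambda c: avg_system_time_sexp(n, c, lam, mu, shift, cloud_time))


if __name__ == "__main__":
    for n in (12, 24, 60, 120):
        c = best_groups(n, lam=5.0, mu=1.0, shift=0.2, cloud_time=10.0)
        print(f"n={n}: groups={c}, replication={n // c}")
```

Because only divisors of the number of workers are admissible, the selected group count can jump between neighboring values of the number of workers rather than change smoothly.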
VII System Service Rate
From Theorem 3 and Lemma 3, we know that the average system time may not be a good performance metric to evaluate the tradeoff between the job computing time and the blocking probability when is large. In this section, we provide the system service rate as another performance metric. This metric is adopted in [16] to evaluate the average computing speed. We will analyze the optimal and that maximize with different service times. For a system with workers and groups, the expression of the system service rate is
[TABLE]
VII-A Exponential Service Time
From (10), considering the job completion distribution with -fold replication as and , the system service rate is
[TABLE]
The following theorem gives the optimal number of groups .
Theorem 7.
For the blocking system with Poisson() arrivals and computing time, the system service rate reaches the maximum at (i.e., ).
Proof.
From (11), we know that . Assume that ,
[TABLE]
First, we consider the left-hand side of the above formula. Since we have . Second, we consider the right-hand side of . Since , we have
[TABLE]
Then we have,
[TABLE]
Therefore, we have the result that the system service rate reaches the maximum at . ∎
VII-A1 Numerical Analysis
We evaluate (11) for different values of . Since it is easy to see from (5) and (11) that decreases as increases, we plot the normalized vs. in Fig. 9 to show how the system service rate changes. We consider a system with workers, where the service time of each worker is exponential with .
The figure shows that the system service rate always decreases with increasing and reaches the maximum at . This observation is consistent with the conclusion of Theorem 7. Comparing the curves and , we see that decreases more smoothly when the arrival rate is larger. When the arrival rate is sufficiently large (e.g., ), decreases sharply.
VII-B Shifted-Exponential Service Time
For a blocking system with workers and groups, when the job completion distribution with -fold replication is , the expression of the system service rate is
[TABLE]
We know that the optimal that maximizes the system service rate changes with different parameters. However, considering the complexity of (12), optimizing under a general shifted exponential service time is difficult. In the following, we separately analyze two important system parameters: the shift parameter and the job arrival rate. From the analysis, we want to know how the optimal changes with different values of the system parameters.
VII-B1 Shift Parameter
We analyze two special parameter regions: 1) , which makes exponentially distributed, and 2) , which makes equal to the constant . In the first case, the random part is much larger than the constant part ; in the second case, the opposite holds.
In the first case, the job completion distribution with -fold replication is . According to Theorem 7, the system service rate reaches the maximum at . That is, when the random part is much larger than the constant part, it is better to decrease the number of groups to achieve a larger system service rate. Next, we analyze the second case, where the computing time is approximately constant. Then is approximated by the constant . The system service rate is . According to Theorem 1, increases with . Therefore, increases with and reaches the maximum at .
The above analysis implies that the optimal that maximizes the system service rate lies between and for a general value of the shift parameter . When the random part is larger, the optimal is smaller and we should concentrate the computing resource on a few jobs; when the constant part becomes larger, the optimal also becomes larger and we should spread the resource to more jobs.
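The role of the shift parameter can be made concrete with a short check. It assumes, as the first special case above suggests, that a replicated job completes when its fastest replica completes; the minimum of several i.i.d. shifted exponential variables then keeps the same shift while the rate of the random part scales with the number of replicas. The function names, trial count, and parameter values are illustrative only.

```python
import random


def mean_min_shifted_exp(replication: int, shift: float, mu: float) -> float:
    """Mean of the minimum of `replication` i.i.d. (shift + Exp(mu)) variables."""
    return shift + 1.0 / (replication * mu)


def simulate_min_shifted_exp(replication: int, shift: float, mu: float,
                             trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same mean, used as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(shift + rng.expovariate(mu) for _ in range(replication))
    return total / trials


if __name__ == "__main__":
    for shift in (0.0, 1.0, 10.0):
        single = mean_min_shifted_exp(1, shift, 1.0)
        replicated = mean_min_shifted_exp(8, shift, 1.0)
        print(f"shift={shift}: speedup from 8 replicas = {single / replicated:.2f}")
```

With no shift, eight replicas cut the mean completion time by a factor of eight; as the shift grows, the speedup approaches one, which is why spreading workers over more groups becomes attractive when the constant part dominates.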
VII-B2 Job Arrival Rate
The arrival rate is also an important parameter that affects the system service rate. According to the system structure, we may infer that the system service rate changes as follows. When is small, few jobs arrive. Even with a large replication factor, the blocking system can serve almost all jobs. Thus, introducing more redundancy may improve the system performance. When is large, the system has to handle many jobs concurrently. It is, thus, reasonable to decrease the replication factor to process more jobs in the system.
In Theorem 8, we verify the first scenario in which is sufficiently small.
Theorem 8.
For the blocking system with Poisson() arrivals and computing time, the system service rate always reaches the maximum at when the arrival rate .
Proof.
When the replication factor , the number of groups . From Eq.(12), we have , where . When , we have .
To satisfy , we need
[TABLE]
That is,
[TABLE]
Since , the right-hand side reaches the minimum at . Therefore, holds for any .
∎
Next, we consider the scenario where is sufficiently large. Using a method similar to the proof of Theorem 8, we find the optimal in the following lemma.
Lemma 7.
When the arrival rate (where ), the system service rate always reaches the maximum at .
Proof.
From Eq.(12), we have and , where and .
For , we have
[TABLE]
To satisfy , we need
[TABLE]
Then we have,
[TABLE]
When , we have . Then it is clear that holds. ∎
Generally speaking, decreasing the number of groups and increasing the replication factor provides better performance when the arrival rate is low. When the arrival rate is high, it is better to increase the number of groups. However, when the arrival rate is sufficiently high, the number of groups and the replication factor do not affect the system service rate.
VII-B3 Numerical Analysis
We evaluate to see how the system service rate changes with the number of groups . We consider a system with workers in which jobs arrive according to a Poisson process with rate , and the service time of each worker follows the shifted exponential distribution. Since the replication factor and both and are integers, we have ; we therefore assume .
In Fig. 10, we evaluate (12) for three different values of . We normalize to observe the changes. When , reaches the maximum at , which means that introducing more redundancy provides a higher system service rate. When , reaches the maximum at , which means that reducing redundancy provides a higher system service rate. Otherwise, the optimal lies between and (the optimal in Fig. 10).
We evaluate to see how the system service rate changes with the number of groups . We consider a system with workers. Job arrivals follow a Poisson process, and the worker computing time follows the shifted exponential distribution with the shift and the rate .
In Fig. 11, we evaluate (12) for three different values of . We normalize to observe the changes. When is sufficiently small, reaches the maximum at . When is sufficiently large, reaches the maximum at . Otherwise, the optimal lies between and (the optimal in Fig. 11). Therefore, we conclude that the optimal increases with .
VIII Conclusion and Future Directions
We consider an edge computing system with limited storage and computing resources in which arriving jobs are sent to the cloud when the system is busy. We address the problem of choosing the number of groups that optimizes the job computing time and the job blocking probability. We find that it is impossible to minimize the job computing time and the job blocking probability simultaneously, for both exponential and shifted exponential service times. Therefore, we use the average system time to evaluate the tradeoff between the two. We find that the optimal number of groups minimizing the average system time changes with the job arrival rate and the cloud time parameters. We also analyze the average system time when the computing resources are unlimited. We show that, for the exponential service time, we need to concentrate the computing resources on a few jobs to achieve the minimum average system time. For the shifted exponential service time, however, we should balance concentrating and spreading the computing resources.
When the cloud time is significant, the average system time does not capture the tradeoff between the job computing time and the job blocking probability well. We therefore introduce the system service rate as a new combined metric and a single system performance indicator. We find that, for the exponential service time, the system should always concentrate the computing resources on a few jobs to achieve the maximum system service rate. For the shifted exponential service time, however, the optimal number of groups changes with the shift parameter and the job arrival rate. This work sets the stage for many problems of interest to be studied in the future. We briefly describe three directions of immediate interest.
VIII-1 Computing Time and Blocking Probability Tradeoff for M/G/c/k Queues
As we mentioned in Sec. IV-B, our edge computing system is generally modeled as an M/G/c/k queue, where the capacity of the queue . When all servers are busy, a newly arriving job waits in the finite-length queue, and it is blocked when the queue is full. In this model, when we concentrate the computing resources on a few jobs to obtain a smaller computing time, the queue’s capacity does not decrease linearly. Thus, in contrast to the M/G/c/c queue results, spreading resources may not be the best strategy with respect to the job blocking probability.
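To give a feel for how a waiting room changes blocking, the sketch below computes the blocking probability of an M/M/c/K queue, that is, the exponential special case of the model above. This is a substitution made for illustration: the general M/G/c/k queue discussed in the text is not covered by this formula. Here `capacity` counts jobs in service plus jobs waiting, `offered_load` is the arrival rate times the mean service time, and all names are hypothetical.

```python
from math import factorial


def mmck_blocking(servers: int, capacity: int, offered_load: float) -> float:
    """Blocking probability of an M/M/c/K queue (exponential service only)."""
    if capacity < servers:
        raise ValueError("capacity must be at least the number of servers")
    weights = []
    for j in range(capacity + 1):
        if j <= servers:
            # All j jobs are in service.
            weights.append(offered_load ** j / factorial(j))
        else:
            # All servers are busy and j - servers jobs are waiting.
            weights.append(offered_load ** j
                           / (factorial(servers) * servers ** (j - servers)))
    # An arrival is blocked when it finds the system full (PASTA).
    return weights[-1] / sum(weights)


if __name__ == "__main__":
    for extra in (0, 2, 4):
        print(extra, mmck_blocking(servers=2, capacity=2 + extra, offered_load=2.0))
```

With no waiting room the expression reduces to the Erlang B formula of the M/M/c/c case, and each additional waiting position lowers the blocking probability in this example.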
VIII-2 Computing Time and Blocking Probability Tradeoff for Other Service Time Models
We analyzed the most common service time models. Other distributions, e.g., Pareto, bimodal, and Weibull, are also of interest. The optimal number of groups may behave differently for light-tailed and heavy-tailed distributions since their computing cost vs. time tradeoffs are qualitatively different [27].
VIII-3 System Service Rate for Blocking Systems
The system service rate is a good performance metric for evaluating distributed systems [15]. We may analyze this metric for a general blocking system in the future. In such a system, a blocked job is dropped rather than sent to the cloud. If the users require every computing result, the system is similar to the one in Sec. VII. If not, the users will send the request to the system again, increasing the job arrival rate. Analyzing the job computing time and blocking probability under such schemes remains open.
References
- [1] P. Peng and E. Soljanin, “Computing redundancy in blocking systems: Fast service or no service,” in Fifteenth International Conference on Wireless Communications and Signal Processing (WCSP), 2023.
- [2] W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu, “Edge computing: Vision and challenges,” IEEE Internet of Things J., vol. 3, pp. 637–646, 2016.
- [3] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey on mobile edge computing: The communication perspective,” IEEE Communications Surveys & Tutorials, vol. 19, pp. 2322–2358, 2017.
- [4] H. Li, K. Ota, and M. Dong, “Learning IoT in edge: Deep learning for the internet of things with edge computing,” IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.
- [5] J. Hochstetler, R. Padidela, Q. Chen, Q. Yang, and S. Fu, “Embedded deep learning for vehicular edge computing,” in 2018 IEEE/ACM Symposium on Edge Computing (SEC). IEEE, 2018, pp. 341–343.
- [6] B. Li, P. Chen, H. Liu, W. Guo, X. Cao, J. Du, C. Zhao, and J. Zhang, “Random sketch learning for deep neural networks in edge computing,” Nature Computational Science, vol. 1, no. 3, pp. 221–228, 2021.
- [7] C. C. Byers, “Architectural imperatives for fog computing: Use cases, requirements, and architectural techniques for fog-enabled IoT networks,” IEEE Communications Magazine, vol. 55, no. 8, pp. 14–20, 2017.
- [8] S. Yi, Z. Hao, Z. Qin, and Q. Li, “Fog computing: Platform and applications,” in 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb). IEEE, 2015, pp. 73–78.
