The More the Merrier: Enhancing Reliability of 5G Communication Services with Guaranteed Delay
Prabhu Kaliyammal Thiruvasagam, Vijeth J Kotagi, and C Siva Ram Murthy

TL;DR
This paper proposes eRESERV, a novel solution to improve the reliability of service chains in 5G networks, addressing challenges posed by network softwarization and ensuring service level agreements.
Contribution
The paper introduces eRESERV, a new method to enhance 5G service chain reliability amid softwarization challenges, ensuring SLA compliance.
Findings
eRESERV significantly improves service chain reliability in 5G networks.
The approach effectively meets service level agreements under failure conditions.
Enhanced reliability reduces service disruptions and improves user experience.
Abstract
Although network functions virtualization and software-defined networking offer many dynamic features such as flexibility, scalability, and programmability for easy provisioning of services at a lower cost and in less time through service function chaining, they introduce new challenges in terms of reliability, availability, and latency of services. In particular, softwarization of network and service functions (e.g., virtualization, anything as a service, dynamic virtual chaining, and routing) makes network failures more likely to stem from software issues than from hardware. In this letter, we propose a novel solution called eRESERV to enhance the reliability of service chains in 5G while meeting the service level agreements.
Table I: Simulation parameters

| Parameter | Value |
|---|---|
| Arrival rate, $\lambda$ | 100 |
| Serving rate of VNFs, $\mu$ | 200 |
| Maximum allowed packet delay, $D$ | 0.125 seconds |
| Reliability rate of VNFs, $r$ | 0.9 |
Indian Institute of Technology Madras, Chennai 600036, India
This work was supported by the Department of Science and Technology, New Delhi, India.
Index Terms:
5G network, Communication service, Virtual network function, Service function chaining, Reliability, Resource management, Service level agreement.
I Introduction
Softwarization in 5G networks to support services such as enhanced mobile broadband and ultra-reliable low-latency communications has revolutionized the networking industry. It is expected that 5G networks will meet the stringent requirements of communication services and business models of 2020 and beyond [1]. Network Functions Virtualization (NFV) and Software-Defined Networking (SDN) are considered as the key technology enablers for softwarization in 5G networks [2].
NFV allows network functions (or middleboxes) to run as software modules on commercial-off-the-shelf servers rather than on specialized hardware appliances. These virtualized software modules are called Virtual Network Functions (VNFs). NFV leverages virtualization, cloud computing, and SDN technologies to provide anything as a service (e.g., core network as a service, security as a service, etc.) dynamically over the network, and reduces capital and operational expenditures.
Traditionally, network/communication services are provided through one or more network functions to deliver an end-to-end (e2e) service. Service Function Chaining (SFC) or simply service chain involves instantiation of an ordered list of network/service functions (e.g., firewall, IDS, and proxy), and connecting them together as a chain of network functions to provide the e2e services [3]. NFV facilitates easy provisioning of services by dynamically placing VNFs in the virtual environment and chaining them together as an SFC. As a 5G network aims at providing services to diverse industry verticals, tens to hundreds of VNFs are placed on a set of servers and chained to create multiple SFCs.
Although NFV and SDN provide many benefits in terms of cost reduction and flexible management of resources to dynamically provide diverse services, they create avenues for reliability, availability, and latency related issues as they are prone to software failures. In particular, softwarization of network and service functions makes network failures more likely to stem from software issues than from hardware. For instance, the failure of a VNF or a virtual link in an SFC will bring down the entire chain and disrupt the service, which may result in customer dissatisfaction and revenue loss. Failures may happen in both the substrate network and the virtual network, but they are more frequent in the virtual network than in the substrate network [4]. Generally, failures in the virtual network may happen due to software bugs, API failures, incorrect specifications, network design flaws, improper testing, and network operator errors.
Another important aspect of NFV is meeting the Service Level Agreements (SLAs) of all services in terms of delay and resources. A common approach to achieving higher reliability while meeting delay constraints is to place redundant network elements (also called backups) [5][6][7]. However, placing redundant network elements is expensive and makes poor use of the available resources.
Therefore, in this letter, we propose a novel solution dubbed enhancing REliability of SERVice chain (eRESERV), which enhances the reliability of an SFC while meeting the delay constraints of the SFC. The proposed solution also minimizes the resource requirement of an SFC to achieve higher reliability without compromising the delay constraints. By extensive simulations we show the effectiveness of the proposed solution in terms of reliability, expected response time, and resource requirement when compared to traditional backup settings. We further analyze our solution using queuing theory by modeling an SFC as M/M/1 and M/M/m tandem network of queues.
II Network Model
We represent the physical/substrate network as a graph $G_p = (N_p, E_p)$, where $N_p$ represents the set of physical nodes and $E_p$ represents the set of physical links. Physical resources are virtualized to create virtual networks and are controlled with the help of a hypervisor (or virtual machine manager) and an SDN controller. SFCs are created in the virtual network in the cloud data center to serve various network service requests. We represent the virtual network as a graph $G_v = (N_v, E_v)$, where $N_v$ represents a set of VNFs (e.g., load balancer, firewall, intrusion detection system, proxy, mobility management entity, serving/packet gateway, home subscriber server, etc.) and $E_v$ represents a set of virtual links in the system. VNFs are hosted on the physical servers, and virtual links are created to interconnect VNFs and carry virtual network traffic over the physical links. The physical and virtual network resources together form the NFV infrastructure (NFVI), which is managed and controlled by the Virtual Infrastructure Manager (VIM) along with the SDN controller.
We assume that service providers offer a finite set of services using SFCs. Let the set of all SFCs provided by a service provider be denoted by $S$. Each SFC $s \in S$ provides a particular service and is represented as an acyclic directed graph $G_s = (V_s, E_s)$, where $V_s$ and $E_s$ represent the set of VNFs in sequential order and the set of links that interconnect these VNFs, respectively. For example, consider a web service request $s$, where the VNFs required to cater to the service are, in order, firewall, intrusion detection system, and proxy. As each SFC provides a particular service, we use the terms SFC and service interchangeably.
In this letter, we consider that any service request $s$ has a latency requirement denoted by $D$, and that its arrivals follow a Poisson process with rate $\lambda$. Each VNF $v$ is considered to have an exponentially distributed service time with processing rate $\mu_v$, and a corresponding response (waiting plus processing) time $T_v$.
Our aim is to design an SFC $s$ to serve a service request such that the SFC offers high reliability, meets the delay constraint $D$, and satisfies the request with minimal resources. In this letter, we take as the resources the number of virtual cores (which directly relates to $\mu_v$) assigned to all VNFs in $V_s$ to process the traffic of the requested service.
III Enhancing Reliability of an SFC
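Consider an SFC $s$ as shown in Fig. 1, which requires four VNFs, i.e., $|V_s| = 4$, and has arrival rate $\lambda$. Each VNF is reliable with probability $r$. If the SFC provides a service for a service request which has a delay constraint of $D$, then its expected response time $T_s$ must satisfy $T_s \le D$. The resource requirement of $s$ is proportional to $|V_s|\,\mu$, where $\mu$ is the processing rate of each VNF. With per-VNF reliability $r$, the reliability of the SFC can be calculated as,

$$R_s = \prod_{v \in V_s} r = r^{|V_s|}. \tag{1}$$

Now consider a backup setting where each VNF is provided with a dedicated number of backups, say $b$. Fig. 2 shows one example where each VNF is provided with one backup ($b = 1$). A VNF now fails only if the primary and all $b$ backups fail, so the reliability of each VNF can be calculated as,

$$r_b = 1 - (1 - r)^{b+1}. \tag{2}$$

Now, by substituting the above equation in Equation 1, the new reliability of the SFC is,

$$R_s^{b} = \left(1 - (1 - r)^{b+1}\right)^{|V_s|}. \tag{3}$$

Clearly, $R_s^{b} > R_s$. However, the number of resources now required for the SFC will be proportional to

$$(b + 1)\,|V_s|\,\mu. \tag{4}$$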
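As a quick numeric check of the single-chain and backup reliabilities discussed above, the following sketch evaluates both for the Table I reliability of 0.9 and a four-VNF chain. The function names and the symbol `b` for the number of backups per VNF are illustrative assumptions, not notation confirmed by the letter, and the resource multiplier assumes each backup reserves the same capacity as its primary.

```python
def chain_reliability(r: float, n: int) -> float:
    """Series system: all n VNFs must work, each with probability r."""
    return r ** n


def backup_chain_reliability(r: float, n: int, b: int) -> float:
    """Each VNF has b dedicated backups (b is an assumed symbol).

    A VNF stage fails only when the primary and all b backups fail.
    """
    stage = 1.0 - (1.0 - r) ** (b + 1)
    return stage ** n


def backup_resource_multiplier(b: int) -> int:
    """Resources relative to the plain chain, assuming every backup
    reserves the same capacity as its primary VNF."""
    return b + 1


if __name__ == "__main__":
    r, n = 0.9, 4  # reliability from Table I, four VNFs as in Fig. 1
    print(f"single chain:       {chain_reliability(r, n):.4f}")
    print(f"one backup per VNF: {backup_chain_reliability(r, n, 1):.4f} "
          f"(resources x{backup_resource_multiplier(1)})")
```

With these inputs the plain chain is reliable with probability of about 0.656, while one backup per VNF raises this to about 0.961 at twice the resources.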
Although assigning redundant backup resources increases the reliability of service chains, this approach utilizes resources inefficiently. The redundant backup resources are idle until a failure happens in the primary nodes or links. Also, since a failure may happen at any point in time, the assigned backup resources cannot be used for any other purpose. In this letter, we propose a novel solution called eRESERV, which enhances the reliability of service chains with fewer additional resources than assigning redundant backups.
Instead of providing additional backups to the VNFs in an SFC, we propose to divide the SFC into multiple subchains with lower-capacity VNFs to increase the reliability.
Theorem 1.
Dividing an SFC into subchains of SFC with VNFs of lesser capacity will increase the reliability of the system.
Proof.
Consider an SFC $s$ with arrival rate $\lambda$ that is divided into $l$ subchains, with each VNF having a processing rate of $\mu/l$ and each subchain having an arrival rate of $\lambda/l$, as shown in Fig. 3. As the reduced-capacity VNFs perform the same software functionality as the original VNF, the reliability of each VNF is still $r$. Let each subchain of $s$ be represented by $s_i$, $i = 1, \dots, l$. Now, the reliability of each subchain can be calculated as,

$$R_{s_i} = r^{|V_s|}. \tag{5}$$

However, for the system to be reliable, at least one of the subchains must be active. Therefore, the reliability of the system can now be calculated as,

$$R_s(l) = 1 - \left(1 - r^{|V_s|}\right)^{l}. \tag{6}$$

Differentiating the above equation with respect to $l$, we get

$$\frac{d R_s(l)}{d l} = -\left(1 - r^{|V_s|}\right)^{l} \ln\left(1 - r^{|V_s|}\right). \tag{7}$$

Since $0 < 1 - r^{|V_s|} < 1$ for $0 < r < 1$, the logarithm is negative and the derivative is strictly positive. As $R_s(l)$ is an increasing function of $l$ with no extreme points for any $l \ge 1$, we can say that $R_s(l)$ increases with $l$. ∎
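The monotonicity argued in the proof can be checked numerically. The sketch below assumes the system reliability takes the "at least one subchain alive" form, 1 - (1 - r^n)^l, with `r` the per-VNF reliability, `n` the number of VNFs per chain, and `l` the number of subchains; the function names are hypothetical.

```python
import math


def subchain_reliability(r: float, n: int, l: int) -> float:
    """Probability that at least one of l independent subchains,
    each a series of n VNFs with reliability r, is fully working."""
    chain = r ** n
    return 1.0 - (1.0 - chain) ** l


def subchain_reliability_slope(r: float, n: int, l: float) -> float:
    """Derivative of the reliability with respect to l (l treated as real)."""
    q = 1.0 - r ** n  # probability that a single subchain is down
    return -(q ** l) * math.log(q)


if __name__ == "__main__":
    r, n = 0.9, 4
    for l in range(1, 6):
        print(l, round(subchain_reliability(r, n, l), 6),
              round(subchain_reliability_slope(r, n, l), 6))
```

The slope stays positive for every `l` but shrinks geometrically, so each extra subchain adds less reliability than the previous one.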
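*Discussion:*
1. Although the reliability of an SFC increases with the number of subchains $l$, the reliability of this system is lower than that of the backup setting.
2. However, the number of resources now required for an SFC will not increase, i.e., $l$ subchains of $|V_s|$ VNFs with processing rate $\mu/l$ each require resources proportional to $l \cdot |V_s| \cdot \mu/l = |V_s|\,\mu$.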
III-A Analysing Delay
From Theorem 1, it is clear that if an SFC is divided into multiple subchains, the reliability of the system increases. However, the new system with subchains must also meet the latency requirement $D$ of the SFC. To understand the effect of subchaining on latency, we model each chain or subchain as a tandem network of M/M/1 queues. By Burke’s theorem [8], the departure process of a stable M/M/1 queue is Poisson with the same rate as its arrivals, so the arrival rates of all M/M/1 queues in the tandem are the same (refer Fig. 3).
Here, we model every VNF in a chain/subchain as an M/M/1 queue. The response time of the VNF $v$ can be calculated as,

$$T_v = \frac{1}{\mu_v - \lambda_v}, \tag{8}$$

where $\lambda_v$ is the arrival rate to the VNF $v$ and $\mu_v$ is its processing rate.
Consider an SFC $s$ with $n$ VNFs, each with processing rate $\mu$, and arrival rate $\lambda$. By Burke’s theorem, $\lambda_v = \lambda$ for every VNF $v$. The expected response time of the SFC can be calculated as,

$$T_s = \sum_{v=1}^{n} \frac{1}{\mu - \lambda} = \frac{n}{\mu - \lambda}. \tag{9}$$

Now, consider an SFC which is divided into $l$ subchains. As every packet traverses one of the subchains, say $s_i$, whose VNFs have processing rate $\mu/l$ and arrival rate $\lambda/l$, the expected response time of the SFC can be calculated as,

$$T_s(l) = \sum_{v=1}^{n} \frac{1}{\mu/l - \lambda/l} = \frac{l\,n}{\mu - \lambda}. \tag{10}$$

Clearly, Equation 10 is $l$ times Equation 9, showing that the response time of an SFC with subchains increases linearly in $l$ compared to the original SFC without subchains.
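The linear growth of the response time can be illustrated with the standard M/M/1 mean response time 1/(service rate - arrival rate), summed over the VNFs of a chain. In this sketch the helper names are hypothetical, the rates are those of Table I, and the four-VNF chain follows the example of Fig. 1.

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response (waiting plus service) time of a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def chain_response_time(arrival_rate: float, service_rate: float, n: int) -> float:
    """Tandem of n identical M/M/1 queues; by Burke's theorem every
    queue sees the same Poisson arrival rate."""
    return n * mm1_response_time(arrival_rate, service_rate)


def subchain_response_time(arrival_rate: float, service_rate: float,
                           n: int, l: int) -> float:
    """One of l subchains: both the traffic and the VNF capacity are split by l."""
    return chain_response_time(arrival_rate / l, service_rate / l, n)


if __name__ == "__main__":
    lam, mu, n = 100.0, 200.0, 4
    base = chain_response_time(lam, mu, n)
    for l in (1, 2, 3):
        t = subchain_response_time(lam, mu, n, l)
        print(f"l={l}: {t:.4f} s (x{t / base:.1f} of the undivided chain)")
```

With these inputs the undivided chain takes 0.04 s on average, and three subchains take about 0.12 s, just under the 0.125 s bound of Table I.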
III-B Identifying Number of Subchains
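As every SFC has a delay constraint of $D$, the delay incurred in an SFC with $l$ subchains should not exceed $D$. Therefore, for an SFC with $n$ VNFs of processing rate $\mu$ and arrival rate $\lambda$,

$$\frac{l\,n}{\mu - \lambda} \le D \quad\Longrightarrow\quad l \le \left\lfloor \frac{D\,(\mu - \lambda)}{n} \right\rfloor. \tag{11}$$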
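A small worked example of this bound, assuming the subchain response time is l·n/(μ − λ) as in the M/M/1 analysis above, so that the largest admissible number of subchains is the floor of D(μ − λ)/n. The function name is hypothetical and the parameters are those of Table I.

```python
import math


def max_subchains_mm1(delay_bound: float, arrival_rate: float,
                      service_rate: float, n: int) -> int:
    """Largest l with l * n / (service_rate - arrival_rate) <= delay_bound.

    Returns 0 when even the undivided chain (l = 1) misses the bound.
    """
    if arrival_rate >= service_rate:
        return 0  # unstable queues can never meet a finite delay bound
    return math.floor(delay_bound * (service_rate - arrival_rate) / n)


if __name__ == "__main__":
    D, lam, mu = 0.125, 100.0, 200.0
    for n in (2, 4, 8, 12, 13):
        print(f"n={n:2d} VNFs -> at most {max_subchains_mm1(D, lam, mu, n)} subchains")
```

For the four-VNF chain this gives three subchains; longer chains leave less delay budget per VNF, so the admissible number of subchains drops as the number of VNFs grows.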
III-C Decreasing Response Time
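In the previous subsection, we saw that the expected response time increases linearly with the number of subchains $l$. In this subsection we propose an alternate way to improve the response time. We propose to have a common scheduler for every VNF as shown in Fig. 4, instead of an individual scheduler for each VNF as in Fig. 3. The new system can be modeled as an M/M/m queuing system with $m = l$ servers. If a VNF with processing rate $\mu$ is divided into $l$ smaller VNFs, each with processing rate $\mu/l$, then the expected response time of a VNF can be calculated as,

$$\hat{T}_v(l) = \frac{l}{\mu} + \frac{P_Q}{\mu - \lambda}, \qquad P_Q = \frac{\dfrac{a^{l}}{l!\,(1-\rho)}}{\displaystyle\sum_{j=0}^{l-1} \frac{a^{j}}{j!} + \frac{a^{l}}{l!\,(1-\rho)}}, \tag{12}$$

where $a = l\lambda/\mu$ is the offered load, $\rho = \lambda/\mu$ is the utilization, and $P_Q$ is the probability that an arriving packet has to wait (the Erlang C formula).
Now, the expected response time of an SFC with $n$ VNFs modeled as a tandem of M/M/m systems can be calculated as,

$$\hat{T}_s(l) = \sum_{v=1}^{n} \hat{T}_v(l) = n\left(\frac{l}{\mu} + \frac{P_Q}{\mu - \lambda}\right). \tag{13}$$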
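To see how much the common scheduler helps, the sketch below evaluates the textbook M/M/m mean response time (service time plus Erlang C waiting probability divided by the spare capacity) for one VNF split into `l` smaller VNFs of rate μ/l. It uses the standard Erlang C formula, which is an assumption about the form used in the letter; all names are hypothetical.

```python
from math import factorial


def erlang_c(servers: int, offered_load: float) -> float:
    """Erlang C: probability that an arrival must wait in an M/M/m queue."""
    rho = offered_load / servers
    if rho >= 1.0:
        raise ValueError("unstable queue: utilization must be below 1")
    top = offered_load ** servers / factorial(servers) / (1.0 - rho)
    bottom = sum(offered_load ** j / factorial(j) for j in range(servers)) + top
    return top / bottom


def shared_scheduler_vnf_time(arrival_rate: float, service_rate: float, l: int) -> float:
    """Mean response time of one VNF split into l servers of rate
    service_rate / l that share a single queue (M/M/l)."""
    per_server = service_rate / l
    p_wait = erlang_c(l, arrival_rate / per_server)
    return 1.0 / per_server + p_wait / (service_rate - arrival_rate)


if __name__ == "__main__":
    lam, mu = 100.0, 200.0
    for l in (1, 2, 3, 4):
        separate = l / (mu - lam)  # per-VNF time with one queue per subchain
        shared = shared_scheduler_vnf_time(lam, mu, l)
        print(f"l={l}: separate queues {separate:.5f} s, shared queue {shared:.5f} s")
```

For `l = 1` the two models coincide, and for larger `l` the shared queue is faster because a packet never waits behind a busy small VNF while another one is idle.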
III-D Analyzing Reliability and Estimating l
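Here, the system will be active if at least one of the $l$ smaller VNFs is active at every VNF of the original SFC. Therefore, with per-VNF reliability $r$ and $n$ VNFs in the SFC, the reliability of the new M/M/$l$ system is given by,

$$\hat{R}_s(l) = \left(1 - (1 - r)^{l}\right)^{n}. \tag{14}$$

As the reliability increases with $l$ and the expected response time $\hat{T}_s(l)$ of the M/M/$l$ tandem must not exceed the delay constraint $D$, $l$ can be calculated as,

$$l = \max\left\{\, l' \ge 1 : \hat{T}_s(l') \le D \,\right\}. \tag{15}$$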
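The following sketch compares the two eRESERV reliabilities for the Table I value of 0.9 and a four-VNF chain. It assumes that the common-scheduler system is up when every stage has at least one working small VNF, and that the separate-subchain system is up when at least one whole subchain works; the function names are hypothetical.

```python
def shared_scheduler_reliability(r: float, n: int, l: int) -> float:
    """Every one of the n stages needs at least one of its l small VNFs up."""
    return (1.0 - (1.0 - r) ** l) ** n


def separate_subchain_reliability(r: float, n: int, l: int) -> float:
    """At least one of the l subchains must have all n of its VNFs up."""
    return 1.0 - (1.0 - r ** n) ** l


if __name__ == "__main__":
    r, n = 0.9, 4
    print(" l  separate  shared")
    for l in range(1, 5):
        print(f"{l:2d}  {separate_subchain_reliability(r, n, l):.6f}  "
              f"{shared_scheduler_reliability(r, n, l):.6f}")
```

The shared-scheduler expression is the same as that of a chain whose VNFs each have `l - 1` dedicated backups, which is consistent with the later observation that the M/M/m setting matches the reliability of the backup setting while the total processing capacity stays that of a single chain.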
The M/M/1 setting would be preferred in resource-constrained systems and environments, and it also reduces the migration delay when compared to the M/M/m setting. Algorithm 1 gives the framework to identify the number of subchains in the M/M/1 and M/M/m settings. If the preferred setting is M/M/1, then the number of subchains can be calculated in $O(1)$ time. If the maximum number of subchains that can be made is $L$, then the number of subchains in the M/M/m setting can be calculated in $O(\log L)$ time using binary search.
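The binary search mentioned above can be sketched as follows. This is not the letter's Algorithm 1, which is not reproduced here, but a minimal illustration under two assumptions: the chain response time is non-decreasing in the number of subchains, and the M/M/m response time follows the standard Erlang C form. All names, and the cap of 16 subchains, are hypothetical.

```python
from typing import Callable


def erlang_c(servers: int, offered_load: float) -> float:
    """Erlang C waiting probability, computed via the Erlang B recursion."""
    blocking = 1.0
    for j in range(1, servers + 1):
        blocking = offered_load * blocking / (j + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


def shared_chain_time(l: int, n: int, lam: float, mu: float) -> float:
    """n stages in tandem, each an M/M/l queue with servers of rate mu / l.
    Requires lam < mu so that every stage is stable."""
    p_wait = erlang_c(l, lam * l / mu)
    return n * (l / mu + p_wait / (mu - lam))


def largest_feasible(response_time: Callable[[int], float],
                     delay_bound: float, l_max: int) -> int:
    """Largest l in [1, l_max] with response_time(l) <= delay_bound,
    or 0 if none; assumes response_time is non-decreasing in l."""
    lo, hi, best = 1, l_max, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if response_time(mid) <= delay_bound:
            best, lo = mid, mid + 1  # feasible: try more subchains
        else:
            hi = mid - 1             # too slow: try fewer subchains
    return best


if __name__ == "__main__":
    lam, mu, D, n, l_max = 100.0, 200.0, 0.125, 4, 16
    shared = largest_feasible(lambda l: shared_chain_time(l, n, lam, mu), D, l_max)
    separate = largest_feasible(lambda l: l * n / (mu - lam), D, l_max)
    print(f"M/M/m setting: {shared} subchains, M/M/1 setting: {separate} subchains")
```

For the Table I parameters and a four-VNF chain, the search admits more subchains in the M/M/m setting than in the M/M/1 setting, since the shared queue leaves more of the delay budget unused at each value of `l`.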
IV Performance Analysis
In this section, we evaluate the performance of the solution methods proposed in the previous section. The simulation parameters are shown in Table I. Simulation results are obtained using the discrete-event simulator MATLAB Simulink. We compare the proposed eRESERV M/M/1 and M/M/m settings with i) the single service chain (SC) setting, where there is one service chain for every service, and ii) the backup (SCB) setting, where there is a dedicated backup for every VNF in an SFC. We compare the results in terms of reliability, expected response time, and number of resources required for an SFC. Note that, in the proposed M/M/1 and M/M/m settings, we first identify the number of subchains that can be created and then measure the performance metrics.
Fig. 5(a) shows the effect of the number of subchains created on the reliability. In the SC setting, the reliability is constant, whereas in the SCB setting the reliability increases with the number of backups. In both eRESERV settings, the reliability increases with the number of subchains (hence the title, “the more the merrier”). The M/M/m setting matches the reliability offered by the SCB setting, but consumes far fewer resources than the SCB setting, as shown in Fig. 5(b).
Fig. 5(c) shows the number of subchains created by our proposed algorithm when the number of VNFs in an SFC is varied. As can be seen, the number of subchains created decreases as the number of VNFs in an SFC increases in both settings. However, the number of subchains created is higher in the M/M/m setting than in any other setting. This is because the expected response time in the M/M/m setting is lower than in the M/M/1 setting as the number of subchains is increased, as shown in Fig. 5(d).
Fig. 5(e) shows the effect of the number of VNFs on the expected response time. As can be seen, the proposed settings are able to meet the delay constraint at every instance of time. Although the response time in the proposed settings is higher than in the SC and SCB settings, the reliability is the highest in the M/M/m setting (with the minimum amount of resources) when compared to all other settings, as shown in Fig. 5(f).
V Conclusion
In this letter, we proposed a novel solution called eRESERV for enhancing the reliability of an SFC in 5G communication services. We first proposed utilization of subchains to enhance the reliability and decrease the amount of resource consumption. We analyzed this setting by modeling a VNF as an M/M/1 queue. Furthermore, to decrease the expected response time, we proposed a common scheduler for every VNF which was modeled as an M/M/m queue. Using queuing theory, we identified the number of subchains that can be created without violating the service delay constraints. By extensive simulations we showed the effectiveness of our proposed settings in terms of reliability, expected response time, and the amount of resources requested.
In this letter, we considered that the substrate network is completely reliable. The placement of the proposed VNFs in an unreliable substrate network is an interesting study which we plan to pursue in our future work.
References
- [1] NGMN, “5G Extreme Requirements: End-to-End Considerations,” Aug. 2018.
- [2] F. Z. Yousaf et al., “NFV and SDN - Key Technology Enablers for 5G Networks,” IEEE Journal on Selected Areas in Communications, vol. 35, no. 11, pp. 2468–2478, Nov. 2017.
- [3] W. Haeffner et al., “Service Function Chaining Use Cases in Mobile Networks,” Internet Engineering Task Force, Internet-Draft draft-haeffner-sfc-use-case-mobility-00, Jan. 2014.
- [4] K. Benz, “VM Reliability Tester: A tool for measuring cloud reliability of OpenStack VMs using Python,” Ph.D. dissertation, July 2015.
- [5] M. R. Rahman et al., “SVNE: Survivable Virtual Network Embedding Algorithms for Network Virtualization,” IEEE Transactions on Network and Service Management, vol. 10, no. 2, pp. 105–118, June 2013.
- [6] L. Qu et al., “Reliability-Aware Service Chaining In Carrier-Grade Softwarized Networks,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 3, pp. 558–573, Mar. 2018.
- [7] J. Sun et al., “A Reliability-Aware Approach for Resource Efficient Virtual Network Function Deployment,” IEEE Access, vol. 6, pp. 18238–18250, 2018.
- [8] P. J. Burke, “The Output of a Queuing System,” Oper. Res., vol. 4, no. 6, pp. 699–704, Dec. 1956.
