QoE-Aware Resource Allocation for Crowdsourced Live Streaming: A Machine Learning Approach
Fatima Haouari, Emna Baccour, Aiman Erbad, Amr Mohamed, and Mohsen Guizani

TL;DR
This paper proposes a machine learning-based framework for predicting viewer distribution in crowdsourced live streaming to optimize resource allocation, improve QoE, and reduce costs through proactive, geo-distributed cloud infrastructure management.
Contribution
It introduces a prediction-driven resource allocation approach that leverages viewer location data to enhance QoE and cost-efficiency in live streaming services.
Findings
Predicted viewer numbers closely match actual data.
Optimized resource allocation reduces access delay.
Trade-off analysis between delay and cost.
Table I: Problem formulation notations

| Notation | Description |
| --- | --- |
| $V(t)$ | Set of incoming live videos at period $t$ |
| $R$ | Set of regions |
| $B(t)$ | Set of broadcasters' regions for videos at period $t$ |
| $r_a$ | Region of video allocation |
| $r_w$ | Region of serving |
| $r_b$ | Region of broadcasting |
| $P(t)$ | Set of predicted viewers for live videos at period $t$ |
| $P_v$ | Set of predicted viewers at different regions for video $v$ |
| $U$ | Set of storage used at each region |
| $W(v, r_a, r_w)$ | Binary decision variable that indicates the serving site |
| $A(v, r_a)$ | Binary decision variable that indicates the allocation site |
| $E(v, r_w)$ | Binary variable that indicates viewers existence |
| $d_{r_a r_w}$ | Round trip delay between $r_a$ and $r_w$ |
| $D$ | Matrix for round trip delay between the different regions |
| $D_{th}$ | Delay threshold |
| $\kappa$ | Video size |
| $\alpha_{r_a}$ | Storage cost per GB at region $r_a$ |
| $\omega_{r_b r_a}$ | Migration cost per GB from broadcaster region $r_b$ to $r_a$ |
| $\eta_{r_a}$ | Serving request cost per GB from $r_a$ |
| $S$ | Total storage cost |
| $M$ | Total migration cost |
| $G$ | Total serving request cost |
| $C$ | Overall cost |
QoE-Aware Resource Allocation for Crowdsourced Live Streaming: A Machine Learning Approach
Fatima Haouari, Emna Baccour, Aiman Erbad, Amr Mohamed, and Mohsen Guizani
CSE department, College of Engineering, Qatar University
Abstract
Driven by the tremendous technological advancement of personal devices and the prevalence of wireless mobile network access, the world has witnessed an explosion in crowdsourced live streaming. Ensuring a better quality of experience (QoE) for viewers is key to maximizing audience size and increasing streaming providers’ profits. This can be achieved by adopting a geo-distributed cloud infrastructure that allocates multimedia resources as close as possible to viewers, in order to minimize access delay and video stalls. Moreover, allocating exactly the needed resources beforehand avoids over-provisioning, which may incur significant costs for service providers. Under-provisioning, on the contrary, might cause significant delays for viewers. In this paper, we introduce a prediction-driven resource allocation framework to maximize the QoE of viewers and minimize the resource allocation cost. First, by exploiting the viewers’ locations available in our unique dataset, we implement a machine learning model to predict the number of viewers near each geo-distributed cloud site. Second, based on the predicted results, which proved to be close to the actual values, we formulate an optimization problem to proactively allocate resources in the viewers’ proximity. Additionally, we present a trade-off between the video access delay and the cost of resource allocation.
Index Terms:
QoE, Crowdsourced live video, Resource allocation, Cloud computing, Machine learning.
I Introduction
Crowdsourced live video streaming is on the rise and continues to grow every day. According to Cisco mobile video traffic statistics, mobile video content is predicted to represent 82% of global Internet traffic in 2021, as opposed to 73% in 2016 [1]. The rise in popularity of crowdsourced live streaming can be attributed to technological advancement, the proliferation of smartphones and the availability of wireless networks, which have led crowdsourcers to broadcast their live videos to various content providers. One of the most popular live streaming platforms is Facebook, which had 2.19 billion monthly active users in the first quarter of 2018 [2]. According to [3], 78% of online Facebook users watch live videos, and 1 out of 5 videos on Facebook is live.
Industry and academia have recently shown overwhelming interest in crowdsourced streaming, in particular in achieving the best QoE, as it is the key to increasing audience size and content providers’ revenues. A series of recent studies have been conducted to determine the main factors that affect the viewers’ QoE [4, 5]. These studies revealed that viewers’ QoE depends primarily on two factors: first, the video startup delay and playback buffering stalls; and second, the video quality, which depends on the quality of the viewers’ Internet connectivity and the available video representations. The authors in [5] highlighted that the higher the startup delay, the higher the viewer abandonment rate. They also showed that viewers who experienced low QoE are less likely to revisit the content provider’s application within a specific period of time. Therefore, video startup and rebuffering delays have a high impact on viewers’ QoE. The challenge, however, is to serve viewers with the best possible QoE while minimizing the cost of resource allocation.
Geo-distributed clouds have been proposed to enhance the QoE. In this context, many efforts aim at efficient resource allocation by proposing heuristics and optimizations. Wu et al. [6] formulated an optimal viewing request distribution in geo-distributed clouds and predicted users’ future demands based on their social influences using an epidemic model. He et al. [7] presented a resource allocation framework to allocate geo-distributed cloud services to crowdsourcers for transcoding and serving viewers. Bilal et al. [8] presented a QoE-aware resource allocation optimization for crowdsourced multiview live streaming to choose the optimal transcoding cloud site location and the optimal set of video representations. The drawback of these traditional algorithms is that they provide only near-optimal solutions: they lack the ability to allocate the exact resources needed beforehand. This may lead either to over-provisioning of resources, which may incur significant costs for the service providers, or to under-provisioning of resources, which may cause delays for the viewers. Addressing this trade-off proactively is therefore a real challenge that requires accurate prediction techniques.
In this work, we address proactive resource allocation by adopting machine learning techniques to design a predictive model for the viewers’ locations. In particular, we predict the number of viewers near each geo-distributed cloud site for each incoming live video, in order to proactively allocate resources in the proximity of the viewers. To the best of our knowledge, no prior research work has applied machine learning techniques to resource allocation in order to maximize QoE and minimize cost. Only a few studies have adopted machine learning to improve the viewers’ QoE, with a focus ranging from buffering and bitrate selection [9] to determining the best Adaptive Bitrate (ABR) parameters for adaptive video streaming [10]. The authors in [9] proposed a video freeze predictive model to detect possible factors that lead to video stalling on the viewers’ side. A recent study [10] proposed using decision trees to choose the best ABR parameters to improve adaptive video streaming. Moreover, a few recent studies have used machine learning to predict the viewers’ QoE. The authors in [11] predicted the users’ engagement score by considering user engagement as a function of Quality of Service (QoS) factors and viewers’ preferences. Another work [4] proposed a classification model for user engagement, where engagement was quantified in terms of the users’ number of visits and video watching time.
The contributions of this paper are summarized as follows:
- Using our collected Facebook 2018 live videos dataset [12], which contains records of viewers’ locations for each video, we develop a regression model using machine learning techniques that predicts the number of viewers near different geo-distributed cloud sites for each incoming live video.
- To serve the predicted viewers such that they experience the minimum startup delay at a minimal cost to the content provider, we formulate an optimization problem for allocating resources as near as possible to the viewers.
The rest of this paper is organized as follows: Section II presents our system model, composed of (1) the viewers predictive model and (2) the proactive resource allocation optimizer. We evaluate our system and present a trade-off between minimizing latency and maximizing cost gain in Section III. Finally, Section IV concludes the paper and discusses future directions.
II System model
In our system, we adopt a geo-distributed cloud infrastructure, as shown in Fig. 1, that consists of multiple geographically distributed cloud sites owned by a content provider. Our predictive model and resource allocation optimizer are deployed on a centralized master server. A set of geo-distributed crowdsourcers broadcast their videos in real time, and each video is allocated by default at the broadcaster’s nearest cloud site. Each broadcaster cloud site reports the incoming live video information to the master server. The predictive model then predicts the number of viewers expected near each cloud site. Based on the predicted results, the optimizer allocates live video replicas across the geo-distributed cloud sites in the viewers’ proximity to minimize delay and video stalls at the minimum possible cost. Moreover, the optimizer determines from which cloud site the viewers should be served. In our work, we consider only the storage resources; the computation resources for video transcoding are out of the scope of this paper.
II-A Predicting live video viewers
II-A1 Dataset
In our work, we use the Facebook 2018 live videos dataset collected by our team [12], which contains more than two million Facebook live video streams. The metadata of active video streams was fetched every 3 minutes during different periods in January, February, March, May, June and July 2018. As a result, we obtained, for each video, a list of fetches containing the number of viewers at the recording time. The live videos were collected with many features, such as creation time and day, broadcaster location, number of likes and, most importantly, the viewers’ locations. In this work, we selected six features for each video, namely the broadcaster name, content category, created time, created day, broadcaster location and the viewers’ locations, as illustrated in Fig. 2. The viewers’ locations were taken from the video fetch with the maximum number of viewers.
II-A2 Preprocessing
As our objective is to predict the number of viewers near various geo-distributed cloud sites, we needed to preprocess our raw data. First, we mapped the viewers’ locations to 10 Amazon Web Services (AWS) cloud site locations [13], namely Asia-Mumbai, Asia-Seoul, Asia-Singapore, China-Ningxia, Europe-Frankfurt, Europe-Paris, South America-Sao Paulo, US East-Ohio, US East-Virginia and US West-California. This was done by calculating the shortest distance between each viewer’s location and the 10 AWS cloud site locations. We then calculated the number of viewers near each cloud site for each video. We did the same for the broadcaster location, mapping it to the nearest AWS cloud site. Moreover, we clustered the created time into 6 time periods. Finally, we applied categorical one-hot encoding to the time period, created day and broadcaster location features, and used the feature hashing introduced in [14] to transform the high-cardinality features, namely broadcaster name and content category, into hashed feature vectors.
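The nearest-site mapping described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes great-circle (haversine) distance as the distance measure, and the site coordinates, `SITES`, `nearest_site` and `viewers_per_site` are hypothetical names and approximate values introduced only for this example.

```python
import math
from collections import Counter

# Approximate (latitude, longitude) of each cloud site.
# These coordinates are illustrative assumptions, not values from the paper.
SITES = {
    "Asia-Mumbai": (19.08, 72.88),
    "Asia-Seoul": (37.57, 126.98),
    "Asia-Singapore": (1.35, 103.82),
    "China-Ningxia": (38.47, 106.26),
    "Europe-Frankfurt": (50.11, 8.68),
    "Europe-Paris": (48.86, 2.35),
    "South America-Sao Paulo": (-23.55, -46.63),
    "US East-Ohio": (39.96, -83.00),
    "US East-Virginia": (39.04, -77.49),
    "US West-California": (37.35, -121.96),
}

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_site(location):
    """Name of the cloud site closest to a (lat, lon) location."""
    return min(SITES, key=lambda site: haversine_km(location, SITES[site]))

def viewers_per_site(viewer_locations):
    """Count viewers mapped to each site; sites without viewers get 0."""
    counts = Counter(nearest_site(loc) for loc in viewer_locations)
    return {site: counts.get(site, 0) for site in SITES}

tokyo, london = (35.68, 139.69), (51.51, -0.13)
counts = viewers_per_site([tokyo, london, london])
```

The resulting per-site counts play the role of the 10 regression targets for one video; the same `nearest_site` call can be applied to the broadcaster location.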
II-A3 Predictive model
The dataset used to train our models included 224,839 live video records collected in March, May and June 2018. 80% of the records were randomly selected for training and 20% were used for validation. We trained our regression models to produce 10 outputs, as illustrated in Fig. 2, each representing the number of viewers near one of the 10 AWS cloud sites mentioned previously. We adopted three different ML algorithms, namely Multilayer Perceptron (MLP), Decision Trees (DT) and Random Forest (RF). We built several models using each ML algorithm, as there is no method to predetermine the best combination of hyperparameters, such as the number of hidden layers and neurons for MLP models, the number of trees for RF models and the maximum depth for DT models. Finally, the best models were selected based on the coefficient of determination ($R^2$), which is used to assess the goodness of fit of our regression models. $R^2$ values approaching 1 indicate that the model provides accurate predictions, and $R^2$ is calculated according to Eq. (1):

$$R^2 = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2} \tag{1}$$

where $m$ is the number of videos, $y_i$ is the actual number of viewers for video $i$, $\hat{y}_i$ is the predicted number of viewers for video $i$, and $\bar{y}$ is the mean of the actual number of viewers of all videos.
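As a small illustration of how the coefficient of determination is evaluated for one cloud site, the following sketch computes it from lists of actual and predicted viewer counts. The function name `r2_score` and the sample numbers are made up for the example.

```python
def r2_score(actual, predicted):
    """Coefficient of determination for one output (one cloud site)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared gaps between actual and predicted values.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    # Total sum of squares: squared gaps between actual values and their mean.
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot

# Made-up viewer counts for four videos near one site.
score = r2_score([10, 20, 30, 40], [12, 18, 33, 39])
```

A model that always predicts the mean of the actual values scores 0, and a perfect model scores 1, which is why values approaching 1 are read as accurate predictions.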
II-A4 Predictive model results
After training the models, the validation results, depicted in Fig. 3, showed that RF outperforms the other ML algorithms, achieving for example an $R^2$ of 0.91 for Seoul, 0.89 for Sao Paulo, 0.85 for Ohio, 0.86 for California and 0.74 for China. The DT model achieved the lowest $R^2$ compared to MLP and RF. The results showed that increasing the number of layers for the MLP models improves the results. However, due to the complexity of the models, and because we noticed only a slight difference between the performance of the 5-layer model and the 7-layer model, we did not increase the number of layers above 7. The results also showed that, for all ML models, the predicted number of viewers near some regions achieved a higher $R^2$ than for other regions: China achieved the lowest $R^2$, while Seoul and Sao Paulo achieved the best. We further tested our models on unseen data of live videos collected from July 1 to July 6, 2018. The models performed the same as on the validation data in some regions, and slightly worse or better in other regions, as shown in Fig. 4. We then extended our experiments by performing the predictions on an hourly basis for 24 hours using the live videos of July 3, 2018. The RF and 7-layer MLP models were used for prediction, since they performed better than the other models. The predicted number of viewers for the hourly incoming live videos versus the actual number of viewers for the Seoul, Frankfurt and China cloud sites is presented in Fig. 5. Since our results demonstrate that the RF predictions are the closest to the actual values, we adopt this model in our system.
II-B Proactive live video allocation and viewers serving
In this section, we formulate the problem of proactive resource allocation to derive the optimal number of video allocation cloud sites and the nearest cloud site to serve the viewers, with the objective of minimizing the cost subject to a constraint on the access delay. We then present our proactive resource allocation algorithm.
II-B1 Problem formulation
The set of incoming live videos at period $t$ is denoted by $V(t)=\{v_1, v_2, v_3, \dots, v_m\}$. The set of regions is represented by $R=\{r_1, r_2, r_3, \dots, r_n\}$. Let $r_b$, $r_a$ and $r_w$ denote the broadcasting region, video allocation region and video serving region respectively. The round trip delay from $r_a$ to $r_w$ is represented by $d_{r_a r_w}$. Let $P(t)$ represent the set of predicted viewers for the incoming videos at period $t$. As each video has predicted viewers in different regions, let $P_v=\{p_{r_1}, p_{r_2}, p_{r_3}, \dots, p_{r_n}\}$ denote the set of the number of predicted viewers at different regions for each video $v$. The broadcasters’ regions for the incoming videos at period $t$ are denoted by $B(t)$. Since some videos do not have any viewers near some cloud sites, let $E(v, r_w)$ be a binary variable equal to 1 if video $v$ has predicted viewers near the region $r_w$, and 0 otherwise.
We consider renting S3 storage [15] servers at each cloud site. Three types of costs are taken into account: (1) the storage cost at each cloud site; (2) the migration cost of a video replica from one cloud site to another; and (3) the cost of serving viewers. We assume that the storage capacity can be provisioned based on the application demand. For an allocation cloud site at region $r_a$, let $\alpha_{r_a}$ be the storage cost per GB, which varies based on the site location and the storage thresholds fixed by Amazon S3. For example, Amazon charges 0.023 when exceeding 500TB in the case of US East Virginia region [15]. Given that $\kappa$ is the video size, the total storage cost $S$ can be calculated as presented in Eq. 2. Given that $\omega_{r_b r_a}$ is the cost to migrate a copy of a video from the broadcaster region $r_b$ to the allocation region $r_a$, which is the data transfer cost from one cloud site to another per GB, the total migration cost $M$ is calculated as presented in Eq. 3. Given that $\eta_{r_a}$ is the serving request cost from region $r_a$, which is the data transfer cost from that region to the Internet per GB, the total serving request cost $G$ is calculated as presented in Eq. 4, where $p_{r_w}$ is the predicted number of viewers at region $r_w$. The overall cost $C$ to serve viewers is shown in Eq. 5.

$$S = \sum_{v \in V(t)} \sum_{r_a \in R} \alpha_{r_a}\, \kappa\, A(v, r_a) \tag{2}$$

$$M = \sum_{v \in V(t)} \sum_{r_a \in R} \omega_{r_b r_a}\, \kappa\, A(v, r_a) \tag{3}$$

$$G = \sum_{v \in V(t)} \sum_{r_a \in R} \sum_{r_w \in R} \eta_{r_a}\, \kappa\, p_{r_w}\, W(v, r_a, r_w) \tag{4}$$

$$C = S + M + G \tag{5}$$

$$\min_{A,\, W} \; C \tag{6}$$

subject to the following constraints.
Every video is allocated by default in the broadcaster’s nearest cloud site.

$$A(v, r_b) = 1, \quad \forall v \in V(t)$$

A video can be served from region $r_a$ to viewers at region $r_w$ only if it is allocated at region $r_a$.

$$W(v, r_a, r_w) \le A(v, r_a), \quad \forall v \in V(t),\ \forall r_a, r_w \in R$$

A video can be served from region $r_a$ to $r_w$ only if there exist viewers at $r_w$.

$$W(v, r_a, r_w) \le E(v, r_w), \quad \forall v \in V(t),\ \forall r_a, r_w \in R$$

If there exist viewers for video $v$ at region $r_w$, they can only be served from one region.

$$\sum_{r_a \in R} W(v, r_a, r_w) = E(v, r_w), \quad \forall v \in V(t),\ \forall r_w \in R$$

The average serving request delay for each video should not exceed a threshold $D_{th}$.

$$\frac{\sum_{r_a \in R} \sum_{r_w \in R} d_{r_a r_w}\, p_{r_w}\, W(v, r_a, r_w)}{\sum_{r_w \in R} p_{r_w}} \le D_{th}, \quad \forall v \in V(t)$$

The decision variables are binary and can be set to 0 or 1.

$$A(v, r_a) \in \{0, 1\}, \quad W(v, r_a, r_w) \in \{0, 1\}$$

The decision variable $A(v, r_a)$ is equal to 1 if video $v$ is allocated in region $r_a$, and 0 otherwise, while the decision variable $W(v, r_a, r_w)$ is equal to 1 if viewers at region $r_w$ are served from region $r_a$, and 0 otherwise. The problem formulation notations are presented in Table I.
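To make the formulation concrete, the sketch below solves a toy instance for a single video by exhaustive search. It is not the authors' solver: it assumes the delay constraint is a viewer-weighted average, that a copy kept at the broadcaster's site has no migration cost, and that the problem can be handled one video at a time because storage is provisioned on demand. The function name `allocate_video`, the two regions and all prices and delays are made-up values for illustration only.

```python
from itertools import combinations, product

def allocate_video(broadcaster, viewers, rtt, storage, migration, serving,
                   size_gb, max_avg_delay):
    """Exhaustively search allocation sites and serving sites for one video.

    Returns (cost, allocation_sites, serving_site_per_viewer_region),
    or None when no choice satisfies the average-delay threshold.
    """
    regions = sorted(storage)
    others = [r for r in regions if r != broadcaster]
    watched = [w for w in regions if viewers.get(w, 0) > 0]
    total_viewers = sum(viewers[w] for w in watched)
    best = None
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            # The broadcaster's site always holds a copy of the video.
            sites = (broadcaster,) + extra
            fixed = sum((storage[a] + migration.get((broadcaster, a), 0.0)) * size_gb
                        for a in sites)
            # Each region with viewers is served from exactly one allocated site.
            for choice in product(sites, repeat=len(watched)):
                delay = sum(rtt[(a, w)] * viewers[w] for a, w in zip(choice, watched))
                if delay > max_avg_delay * total_viewers:
                    continue  # viewer-weighted average delay exceeds the threshold
                cost = fixed + sum(serving[a] * size_gb * viewers[w]
                                   for a, w in zip(choice, watched))
                if best is None or cost < best[0]:
                    best = (cost, set(sites), dict(zip(watched, choice)))
    return best

# Made-up two-region instance: delays in ms, prices per GB.
rtt = {("A", "A"): 10, ("B", "B"): 10, ("A", "B"): 100, ("B", "A"): 100}
storage = {"A": 0.023, "B": 0.025}
migration = {("A", "B"): 0.02}
serving = {"A": 0.09, "B": 0.12}
viewers = {"A": 10, "B": 100}

strict = allocate_video("A", viewers, rtt, storage, migration, serving, 1.0, 10)
relaxed = allocate_video("A", viewers, rtt, storage, migration, serving, 1.0, 100)
```

With the strict threshold the video has to be replicated to region B and each region is served locally, while the relaxed threshold lets the cheaper site A serve everyone at a lower total cost. Exhaustive search grows exponentially with the number of regions, so a real deployment would hand the same model to an integer programming solver.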
II-B2 Proactive resource allocation
The proposed proactive resource allocation algorithm is presented in Algorithm 1. At each period t, the system receives a set of incoming videos, which are the input to the viewers predictive model. Based on the predicted viewers, the optimizer decides the optimal number of allocation cloud sites and the nearest cloud site to serve the viewers. The storage resources at each cloud site are reserved based on the allocation decisions, and released for live videos from previous periods that have ended. Moreover, the viewers are served from their closest cloud site based on the serving decisions.
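The per-period loop described above can be sketched as follows. Algorithm 1 itself is not reproduced in this text, so this is only an outline under stated assumptions: `predict` and `optimize` are hypothetical callables standing in for the predictive model and the optimizer, every video has the same size and a fixed duration in periods, and `run_periods` is a name introduced for the example.

```python
from collections import defaultdict

def run_periods(incoming_by_period, predict, optimize, size_gb, duration):
    """Reserve and release storage period by period.

    Returns the storage reserved per site (GB) at each period and the
    serving decision recorded for each video.
    """
    active = []      # (end_period, allocation_sites) for videos still streaming
    decisions = {}   # video id -> serving decision returned by the optimizer
    usage_log = []   # one {site: reserved GB} dict per period
    for t, videos in enumerate(incoming_by_period):
        # Release storage held by videos that ended before this period.
        active = [(end, sites) for end, sites in active if end > t]
        for video in videos:
            predicted = predict(video)                   # viewers per region
            sites, serving = optimize(video, predicted)  # allocation + serving
            active.append((t + duration, sites))
            decisions[video] = serving
        usage = defaultdict(float)
        for _, sites in active:
            for site in sites:
                usage[site] += size_gb
        usage_log.append(dict(usage))
    return usage_log, decisions

# Stub model and optimizer, used only to exercise the loop.
usage_log, decisions = run_periods(
    [["v1"], ["v2"], [], []],
    predict=lambda video: {"A": 5, "B": 50},
    optimize=lambda video, predicted: ({"A", "B"}, {"A": "A", "B": "B"}),
    size_gb=1.0,
    duration=2,
)
```

Keeping the allocation of a video fixed until it ends matches the assumption that a video stays in the same cloud sites for its remaining streaming periods.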
III Performance Evaluation
III-A Simulation settings
In this section, we evaluate the performance of our system using the RF hourly predicted viewers of July 3, 2018 to obtain the hourly optimal resource allocation over 24 hours with a period length of 1 hour. The number of hourly incoming videos and the hourly predicted viewers used in our simulation are presented in Fig. 6. In our system, we assume that the video duration is 4 hours, which is the maximum duration of a Facebook live video. We assume that if a video is allocated in a set of cloud sites at a given period, it remains allocated in the same cloud sites for the remaining time periods of streaming. Moreover, because video quality is out of the scope of this paper, we assume that the viewers are served with the best video quality, where we set the video size to 0.738 Gbit. We constructed our round trip time (RTT) matrix by calculating the average RTT between the different cloud sites using [16], accessed on September 19, 2018. The storage and data transfer prices of Amazon S3 [15] are used in our simulation to model the storage, migration and serving request costs. We varied the latency threshold for serving a video over 8.8ms, 60ms, 120ms, 171ms, 220ms and 371ms, where 8.8ms is the latency needed to serve a viewer from its closest cloud region [8].
III-B Simulation results
Fig. 7(a) shows that we can establish a trade-off between the video access delay and the resource allocation cost. Indeed, the hourly optimal cost is high when the system is forced to serve the viewers from their own region by setting the latency threshold to 8.8ms. Relaxing the threshold reduces the cost. Therefore, the content provider can accept a higher cost to enhance the QoE, or the opposite, based on its requirements. It is worth mentioning that the optimal cost is higher in some periods than in others because, as illustrated in Fig. 6, the number of incoming videos and predicted viewers varies from one period to another.
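In order to evaluate the total system cost over the 24 hours with various latency thresholds, we calculated the hourly total cost, as presented in Fig. 7(b). The hourly total cost is defined as the sum of the network cost at period t and the storage cost of the still running videos, as presented in Eq. 7, given that $u_{r_a}(t)$ is the storage usage at region $r_a$ until period $t$, $N(t)$ is the network cost at period $t$, $\alpha_{r_a}$ is the storage cost per GB at region $r_a$ and $R$ is the set of regions.

$$C_h(t) = N(t) + \sum_{r_a \in R} \alpha_{r_a}\, u_{r_a}(t) \tag{7}$$

The system total cost is calculated as shown in Eq. 8:

$$C_{total} = \sum_{t=1}^{24} C_h(t) \tag{8}$$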
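The two totals described above can be computed as in the sketch below, which takes the network cost of each period as a given input and assumes storage prices are already expressed per GB per period. The function names and the sample numbers are hypothetical.

```python
def hourly_total_costs(network_costs, storage_usage, storage_price):
    """Hourly total cost: network cost plus storage cost of running videos.

    network_costs: list with one network cost per period.
    storage_usage: list with one {region: GB in use} dict per period.
    storage_price: {region: price per GB per period} (assumed unit).
    """
    return [
        net + sum(storage_price[region] * gb for region, gb in usage.items())
        for net, usage in zip(network_costs, storage_usage)
    ]

def system_total_cost(network_costs, storage_usage, storage_price):
    """Sum of the hourly total costs over all periods."""
    return sum(hourly_total_costs(network_costs, storage_usage, storage_price))

# Made-up numbers for two periods and two regions.
hourly = hourly_total_costs(
    [1.0, 2.0],
    [{"A": 10}, {"A": 10, "B": 20}],
    {"A": 0.5, "B": 0.25},
)
total = system_total_cost(
    [1.0, 2.0],
    [{"A": 10}, {"A": 10, "B": 20}],
    {"A": 0.5, "B": 0.25},
)
```

Because storage stays reserved for as long as a video is running, the storage term can keep growing for several periods even when few new videos arrive.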
Furthermore, we calculated the hits percentage, which represents the percentage of videos served from the same region as their viewers, as shown in Fig. 7(c). Setting the latency threshold to 8.8ms resulted in a hits percentage of 100% in every hour, as all viewers are served from their own region, while it is in the range of 20% to 30% with a 60ms latency threshold. Moreover, when the latency threshold was set to 120ms, 171ms, 220ms or 371ms, less than 20% of videos were served from the same region as their viewers. The hits percentage is very low with high latency thresholds, as the system is not forced to serve the viewers from their closest region.
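One possible way to compute such a hits percentage is sketched below. The text does not spell out the counting unit, so this sketch assumes that every (video, viewer region) serving decision counts once; `hits_percentage` and the sample decisions are hypothetical.

```python
def hits_percentage(serving_decisions):
    """Percentage of serving decisions where viewers are served locally.

    serving_decisions: list with one {viewer_region: serving_region} dict
    per video.
    """
    pairs = [(viewer_region, serving_region)
             for serving in serving_decisions
             for viewer_region, serving_region in serving.items()]
    if not pairs:
        return 0.0
    hits = sum(1 for viewer_region, serving_region in pairs
               if viewer_region == serving_region)
    return 100.0 * hits / len(pairs)

# Two videos: the first serves region B remotely from A, the second is local.
example = hits_percentage([{"A": "A", "B": "A"}, {"B": "B"}])
```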
Finally, to evaluate the accuracy of our resource allocation framework, we calculated the hourly average latency using the proactive serving decisions under the various latency thresholds. Specifically, we calculated the latency of serving the actual number of viewers based on our proactive video allocation and compared it to the latency derived from the predictive model. The results, shown in Fig. 8, indicate that the average latency to serve the actual viewers is very close to the average latency to serve the predicted viewers. Moreover, the average latency to serve the actual viewers did not exceed the latency thresholds.
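The comparison described above can be illustrated by evaluating the same serving decision against two viewer distributions. The sketch assumes a viewer-weighted average and ignores actual viewers in regions for which no serving decision was made; `average_latency` and all numbers are made up for the example.

```python
def average_latency(serving, viewers, rtt):
    """Viewer-weighted average delay of one video's serving decision.

    serving: {viewer_region: serving_region}
    viewers: {viewer_region: number of viewers}
    rtt: {(serving_region, viewer_region): round trip delay in ms}
    """
    total = sum(viewers.get(w, 0) for w in serving)
    if total == 0:
        return 0.0
    weighted = sum(rtt[(a, w)] * viewers.get(w, 0) for w, a in serving.items())
    return weighted / total

rtt = {("A", "A"): 10, ("A", "B"): 100}
serving = {"A": "A", "B": "A"}   # decided from the predicted viewers
predicted_latency = average_latency(serving, {"A": 10, "B": 100}, rtt)
actual_latency = average_latency(serving, {"A": 20, "B": 90}, rtt)
```

When the predicted distribution is close to the actual one, the two averages stay close, which is the behaviour reported for Fig. 8.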
IV Conclusion
In this paper, we proposed a proactive resource allocation framework. First, we adopted machine learning to build a predictive model that captures the number of viewers near each geo-distributed cloud site. Then, based on the predicted results, we formulated our resource allocation model as an optimization problem to optimally allocate resources across the geo-distributed cloud sites based on the viewers’ proximity. As future work, we plan to design a distributed proactive resource allocation framework. We are also interested in implementing predictive models for the number of incoming live videos, the live video duration, the live video viewing time and the computation resources.
Acknowledgment
This publication was made possible by NPRP grant 8-519-1-108 from the Qatar National Research Fund (a member of Qatar Foundation). The findings achieved herein are solely the responsibility of the author(s).
References
- [1] “Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016-2021 White Paper” In Cisco, 2017 URL: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-white-paper-c11-520862.html
- [2] “Facebook users worldwide 2018” In Statista, 2018 URL: https://www.statista.com/statistics/264810/number-of-monthly-active-facebook-users-worldwide/
- [3] “Facebook Statistics for 2018” In Word Stream URL: https://www.wordstream.com/blog/ws/2017/11/07/facebook-statistics
- [4] Athula Balachandran et al. “Developing a predictive model of quality of experience for internet video” In ACM SIGCOMM 43.4, 2013, pp. 339–350
- [5] S Shunmuga Krishnan and Ramesh K Sitaraman “Video stream quality impacts viewer behavior: inferring causality using quasi-experimental designs” In IEEE/ACM Transactions on Networking (TON) 21.6 IEEE Press, 2013, pp. 2001–2014
- [6] Yu Wu et al. “Scaling social media applications into geo-distributed clouds” In IEEE/ACM Transactions on Networking (TON) 23.3 IEEE Press, 2015, pp. 689–702
- [7] Qiyun He, Jiangchuan Liu, Chonggang Wang and Bo Li “Coping with heterogeneous video contributors and viewers in crowdsourced live streaming: A cloud-based approach” In IEEE Transactions on Multimedia 18.5 IEEE, 2016, pp. 916–928
- [8] K Bilal, A Erbad and M Hefeeda “QoE-aware distributed cloud-based live streaming of multisourced multiview videos” In Journal of Network and Computer Applications 120 Elsevier, 2018, pp. 130–144
