Learning from Experience: A Dynamic Closed-Loop QoE Optimization for Video Adaptation and Delivery
Imen Triki, Quanyan Zhu, Rachid Elazouzi, Majed Haddad, Zhiheng Xu

TL;DR
This paper introduces a dynamic closed-loop framework that uses user feedback to learn and optimize QoE for video delivery, accounting for individual user perceptions and improving overall experience.
Contribution
It presents a novel closed-loop control system that adapts video quality based on subjective user feedback, addressing heterogeneity in user perceptions.
Findings
System converges to a steady state with improved QoE
User feedback-driven optimization enhances personalized video quality
Framework effectively learns and adapts to individual user preferences
Table I: NS3 parameter settings

| Parameter | Value |
|---|---|
| Number of macro cells | 1 |
| Number of UEs per cell | 10 |
| eNB Tx power | 46 dBm |
| eNB noise figure | 5 dB |
| UE noise figure | 9 dB |
| Pathloss model | COST 231 |
| MAC scheduler | Proportional fair, 50 RBs |
| Fading model | Pedestrian |
| Transmission model | MIMO transmit diversity |
| Mobility model | RandomWalk2dMobilityModel |
| Velocity of users | Uniform [5,16] m/s |
| EPS bearer | NGBR-VIDEO-TCP-DEFAULT |
| Simulation length | 70 s |

Table II: Matlab parameter settings

| Parameter | Value |
|---|---|
| Window size | 70 s |
| Throughput time slot | 1 s |
| Video length | 30 s |
| Segment length | 1 s |
| Video frame rate | 30 fps |
| Playback cache | 5 s |
| Bit-rate levels (Mbps) | [0.4, 0.75, 1, 2.5, 4.5] |
| Maximum number of stalls (p) | 1 |
Learning from Experience: A Dynamic Closed-Loop QoE Optimization for Video Adaptation and Delivery
Imen Triki⋆, Quanyan Zhu∘, Rachid El-Azouzi⋆, Majed Haddad⋆ and Zhiheng Xu∘
⋆CERI/LIA, University of Avignon, Avignon, France.
∘ NYU Tandon School of Engineering, New York, USA.
Abstract
The quality of experience (QoE) is known to be subjective and context-dependent. Identifying and quantifying the factors that affect QoE is a difficult task. Recently, a lot of effort has been devoted to estimating the users’ QoE in order to improve video delivery. In the literature, most QoE-driven optimization schemes that realize trade-offs among different quality metrics have been studied under the assumption of homogeneous populations. Nevertheless, people’s perceptions of a given video quality may differ, which makes QoE optimization harder. This paper takes a step further to address this limitation and match users’ profiles. To do so, we propose a closed-loop control framework based on the users’ (subjective) feedback to learn the QoE function and optimize it at the same time. Our simulation results show that our system converges to a steady state, where the resulting QoE function noticeably improves the users’ feedback.
Index Terms:
QoE, learning, neural network, average video quality, startup delay, video quality switching, video stalls, rebuffering delay.
I Introduction
Due to the emergence of smartphones in daily life and the tremendous advances in broadband access technologies, video streaming has evolved over the last years to become one of the most widely used services on the Internet. According to Cisco [1], HTTP video streaming is forecast to account for a growing share of all consumer Internet traffic relative to 2015. Consequently, more and more attention is being paid to video streaming services, in the hope of satisfying all users with the delivered video quality. Nevertheless, satisfying all users at once is a hard task: one user may appreciate a video quality that another does not appreciate at all. This makes the study of QoE particularly complex.
In the literature, various studies have sought to express the user’s QoE as an explicit function of some metrics. Some works claim that the QoE can be directly mapped to QoS metrics such as throughput, jitter and packet loss [2, 3]. Other recent works found that the QoE could be expressed through application metrics such as the frequency of video freezing (stalls), the startup delay, the average video quality and the dynamics of quality changes during the streaming session [4, 5]. However, the QoE may change depending on the video context and on other external factors linked to the user [6], which explains the trend of using standardized subjective metrics such as the MOS (Mean Opinion Score) and the users’ engagement rate [7, 8, 9]. A direct relation between the time spent in rebuffering and the user’s engagement has been studied in [9].
In the industry, various adaptive video streaming solutions have been explored to meet the users’ expectations, such as Microsoft’s Smooth Streaming, Adobe’s HTTP Dynamic Streaming and Apple’s HTTP Live Streaming. All these solutions rely on the same HTTP adaptive streaming principle that the well-known DASH (Dynamic Adaptive Streaming over HTTP) standard formalizes. Despite the emergence of several proposals to improve the QoE, there is still no consensus across these solutions since the users’ perceptions are quite different.
The main motivation of this work is to make DASH cope with the very wide heterogeneity of users. At its core lies the idea of supervising the users’ actual perceptions in real time to continuously improve the performance of quality adaptation. To the best of our knowledge, such a paradigm has not yet been investigated for adaptive video streaming. In the literature, we found that users’ feedback on delivered video quality was only used to study human perception or to validate analytical QoE models that help predict the QoE [6, 10, 4, 11].
In [11], QoE prediction was performed by combining machine learning, users’ feedback and some QoE-related features such as rebuffering and the memory effect. Following the same trend, we combine machine learning with a QoE-maximization problem in a closed-loop architecture to dynamically adapt video quality with respect to the users’ feedback. We focus on two main ideas: maximizing the feedback returned by users and exploiting the knowledge of future throughput. Note that, thanks to the exploitation of Big Data in network capacity modelling and prediction, throughput estimation is now possible and may reach a few seconds into the future [12]. However, very few papers exploit the knowledge of future throughput variation [13, 14, 15]. In [13], the authors designed a QoE-driven optimization problem and proposed a cross-layer scheme to minimize the cost of capacity usage and to avoid video stalls under the assumption of perfect throughput knowledge. The main shortcoming of their approach is that it is only suited to classical video streaming, as it ignores important visual quality metrics related to adaptive streaming such as video resolution and bitrate distribution. Under the same assumption, the authors of [14] proposed a proactive video content delivery algorithm, called NEWCAST, to adjust the quality of adaptive video streaming over the future horizon. Their work was later extended in [15] to make the algorithm suited to imperfect throughput prediction.
Our contribution in this paper is twofold:
- We exploit the knowledge of future throughput variations in order to solve the optimization problem addressed in [16] in a smoother and faster manner, based on a mathematical analysis similar to that in [14].
- We design a closed-loop framework based on client-server interactions to learn the overall users’ perceptions and to optimize the streaming quality accordingly. The performance of the proposed framework is evaluated using Matlab and NS3 simulations in a multi-user scenario.
The paper is organized as follows: In Section II, we formulate the single-user QoE-optimization problem. Then, in Section III, we discuss the strategy of the optimal solution and propose a heuristic that performs close to the optimal solution. In Section IV, we address the multi-user case and propose a closed-loop based framework using neural networks. Then, in Section V, we evaluate the performance of this framework through some numerical results. Section VI concludes the paper.
II Single-user QoE problem formulation
II-A The video streaming model
We model a video as a set of segments (or chunks) of equal duration in seconds. Each segment is composed of N frames and is stored on the streaming server in several quality representations. Each representation corresponds to a video encoding rate (hereinafter called bit-rate). Denote by the available video bit-rates where . We suppose that all the video frames are played at a deterministic rate, e.g., 25 frames per second (fps). Denote by this frame rate.
For each segment, the player indicates to the server the quality needed for streaming it. Let be the bit-rate associated to segment and be the set of bit-rates associated to all video segments. We assume that the video playback buffer is large enough to avoid any buffer overflow. We denote by the number of segments that the playback buffer contains at time . At the beginning of the streaming session, a prefetching stage is introduced to avoid future buffer underflows; seconds of video (corresponding to segments) have to be completely appended to the buffer before the video starts playing. When there are no segments in the playback buffer, the video stops and a new prefetching stage is introduced to append again seconds of video before playback resumes. This event is hereinafter referred to as a video stall.
In our study, we exploit the knowledge of the user’s throughput over a given horizon to the future . Before starting the session, we propose to set all the video segments’ bit-rates, to be optimally streamed over that horizon . We denote by the user’s estimated throughput at time , and by the video bit-rate scheduled to be streamed at that time. Note that will only depend on the throughput variation and the set of segments’ bit-rates . In the following we denote it by .
To model the dynamic of the playback buffer, we define two different phases:
- the start-up/rebuffering phase, referred to as the BaW-phase (BaW: Buffer and Wait), where the media player only downloads the video without playing it;
- the playback phase, referred to as the BaP-phase (BaP: Buffer and Play), where the player downloads and plays the video at the same time.
Depending on the state of the buffer at each time of observation , we define variables and as follows:
1. if the player is in a BaP-phase, defines the time at which that phase started;
2. if the player is in a BaW-phase, defines the time at which the next BaP-phase will start;
3. if the buffer is empty, determines the duration of the following BaW-phase;
4. if the buffer is not empty, takes zero,
which can be mathematically expressed as
[TABLE]
[TABLE]
where
[TABLE]
Hence the dynamic of the playback buffer can be written as
[TABLE]
where is used to ensure that the playback buffer occupancy cannot be negative.
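The buffer dynamics described above can be illustrated with a short discrete-time sketch. It is not the paper's formulation: it assumes that one segment lasts exactly one throughput slot, that the future throughput is known per slot, and that playback (re)starts once `prefetch` complete segments are buffered or nothing is left to download. The function name, its parameters and the returned keys are hypothetical.

```python
def simulate_playback(throughput, bitrates, prefetch=2, seg_len=1.0):
    """Discrete-time sketch of the BaW/BaP buffer dynamics.

    throughput: Mbps available in each slot of seg_len seconds (assumed known).
    bitrates:   Mbps chosen for each segment, in playback order.
    prefetch:   segments required in the buffer before playback (re)starts.
    Delays are returned as a number of slots.
    """
    n = len(bitrates)
    next_seg = 0        # index of the segment currently being downloaded
    partial = 0.0       # megabits of that segment already received
    buffered = 0        # complete segments waiting in the buffer
    played = 0
    playing = False     # False: BaW-phase, True: BaP-phase
    started = False     # becomes True after the first BaW-phase
    startup = stalls = rebuffer = 0

    for rate in throughput:
        if played == n:
            break
        if playing and buffered == 0:   # buffer ran dry: stall, back to BaW
            playing = False
            stalls += 1
        if playing:
            buffered -= 1               # one segment is played during this slot
            played += 1
        elif started:
            rebuffer += 1               # slot spent in a rebuffering BaW-phase
        else:
            startup += 1                # slot spent in the initial BaW-phase

        budget = rate * seg_len         # megabits downloadable in this slot
        while next_seg < n and budget > 0:
            need = bitrates[next_seg] * seg_len - partial
            if budget >= need:
                budget -= need
                partial = 0.0
                next_seg += 1
                buffered += 1
            else:
                partial += budget
                budget = 0

        # Leave the BaW-phase once enough video is buffered
        # (or once there is nothing left to fetch).
        if not playing and (buffered >= prefetch or next_seg == n):
            playing = True
            started = True

    return {"startup": startup, "stalls": stalls,
            "rebuffer": rebuffer, "played": played}
```

For instance, `simulate_playback([2, 0, 0, 0, 2, 2], [1, 1, 1])` plays two segments, runs dry during the throughput outage, and counts one stall before the last segment arrives.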
II-B The QoE-Optimization Problem
The goal of bit-rate adaptation in video streaming services is to improve the users’ globally perceived quality of the video. In the literature, it is challenging to quantitatively define the QoE as it encompasses many complex factors such as the user’s mood, the time and the way one watches the video, the video context, etc. In this work, we use five of the most common key QoE metrics to express our objective QoE function.
1. The average video quality , which is the average per-segment quality over all segments, given by
[TABLE]
2. The startup delay ratio, which is the proportion of time taken by the first BaW-phase before the video starts:
[TABLE]
where is the video length in seconds.
3. The average number of video quality switches, given by
[TABLE]
4. The number of video stalls:
[TABLE]
5. The rebuffering delay ratio, which is the proportion of time taken by all the rebuffering events:
[TABLE]
As the user’s preference on each of these QoE metrics may not be the same, we assign to each metric a weighting parameter to adjust its impact on the global QoE variation. As done in previous works [16], we model our global QoE as a linear function of the weighted five aforementioned QoE metrics, namely,
[TABLE]
where .
Let be the vector of weights and be the vector of QoE metrics.
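As an illustration of the weighted linear QoE described above, the following sketch computes the five metrics from a bit-rate schedule and combines them with a weight vector. The exact normalizations and signs of the original equations were lost in extraction, so the choices below are assumptions: quality is rewarded, the four other metrics are penalized, and the switching metric is averaged per segment. Function and parameter names are hypothetical.

```python
def qoe_metrics(bitrates, startup_delay, n_stalls, rebuffer_time, video_len):
    """Return the five QoE metrics as a list (normalizations are assumptions)."""
    n = len(bitrates)
    avg_quality = sum(bitrates) / n
    # A switch is counted whenever two consecutive segments differ in bit-rate.
    switches = sum(1 for a, b in zip(bitrates, bitrates[1:]) if a != b)
    return [
        avg_quality,                 # average per-segment quality
        startup_delay / video_len,   # startup delay ratio
        switches / n,                # average number of quality switches
        n_stalls,                    # number of stalls
        rebuffer_time / video_len,   # rebuffering delay ratio
    ]


def qoe(weights, metrics):
    """Weighted linear QoE: reward quality, penalize the other four metrics."""
    signs = [+1, -1, -1, -1, -1]     # assumed sign convention
    return sum(s * w * m for s, w, m in zip(signs, weights, metrics))


metrics = qoe_metrics([1, 1, 2, 2], startup_delay=2, n_stalls=0,
                      rebuffer_time=0, video_len=4)
score = qoe([1, 1, 1, 1, 1], metrics)
```

With unit weights, this example schedule gives metrics `[1.5, 0.5, 0.25, 0, 0.0]` and a score of 0.75.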
If we assume that the user tolerates at most stalls during the whole session, we end up formulating our single-user QoE optimization problem as follows
[TABLE]
[TABLE]
where the first constraint ensures that the whole video will be streamed at the end of the future horizon.
III Proposed Solution for single-user QoE problem
The QoE optimization problem defined in (10) is a combinatorial problem with very high complexity (NP-hard). In [16], the authors addressed a similar problem but assumed an inaccurate throughput estimation, which justifies their choice of an MPC model to solve their QoE optimization problem. The assumption of accurately knowing the future throughput in adaptive video streaming was explored in [14], where the authors proposed an ascending bitrate strategy to optimize the video delivery. In this paper, we characterize an important property of the optimal strategy, which allows us to propose a heuristic approach that performs close to the optimal solution.
III-A Property of the optimal solution: Ascending bit-rate strategy per BaW-BaP cycle
Definition 1.
We say a bit-rate strategy is ascending per BaW-BaP cycle if the quality level of the segments increases during each BaW-BaP cycle of the streaming session.
Proposition 1.
Assume that there exists a solution that satisfies the constraints in (10), then there exists an ascending bit-rate per BaW-BaP cycle solution that optimizes problem (10).
Proof.
We shall show that for any feasible strategy that satisfies the constraints in (10), there exists an ascending bitrate per BaW-BaP cycle strategy such that
[TABLE]
Here, we distinguish two cases:
- Case where the session is composed of one BaW-BaP cycle, i.e., no stall during the session: Without loss of generality, and for the sake of illustration, we assume that we can stream and play the video smoothly under a non-ascending bitrate strategy . Then, there such that . Let and be the requesting times of and , respectively, as illustrated in Fig. 2. If we swap the qualities of segments and , then the buffer state will be more relaxed with respect to the stall constraint, since segment will be streamed in a shorter time and the following segments will therefore be appended sooner to the buffer, which does not induce buffer stalls. That said, if we reorder in ascending order, the video will not experience any stall. Let be the resulting set after reordering in ascending order, then we have
[TABLE]
As we keep the same selected bitrates in as in , the average per segment bitrate will not change, which gives
[TABLE]
Since is an ascending strategy, the video session will start with the lowest quality used by . Hence, the startup delay will be reduced by using compared to . Therefore,
[TABLE]
Now, let be the number of qualities selected by . Thus, the number of quality switching under strategy will be at least . On the other hand, strategy will experience exactly quality switching since the quality level of segments increases during the session. Therefore, we have
[TABLE]
All things considered, we have
[TABLE]
- Case where the session is composed of more than one BaW-BaP cycle, i.e., one or more stalls during the session: Here, we assume that, for a given horizon window, we can stream the video under a non-ascending bitrate strategy with stall events over the session (). Reordering all the segments’ bitrates in ascending order protects the buffer further against the stall constraint, which may reduce the number of stalls . However, this does not mean that the global QoE will increase, because the duration of the stalls will change depending on the new moments of their occurrence, their corresponding requested qualities and the dynamics of the user throughput. For these reasons, our ascending bitrate strategy does not apply to a whole session. On the other hand, when a stall happens, the buffer state becomes independent of its states before the stall, which makes all the BaW-BaP cycles independent of each other. Let us write , where denotes the set of bitrates used on the BaW-BaP cycle.
If we apply our previous ascending strategy on each of the () BaW-BaP cycles, we end up reducing the duration of all the BaW-phases (including the startup and the rebuffering delays) and the global number of quality switching, while maintaining the same number of stalls and the same average quality.
Let , then we have
[TABLE]
This concludes the proof.
∎
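Two steps of the proof can be checked numerically on a toy schedule: reordering the selected bit-rates in ascending order leaves the average quality unchanged, and the reordered schedule has exactly one switch fewer than the number of distinct qualities. The sketch below only illustrates these two counting arguments. It does not check the stall constraint, and the example schedule is made up.

```python
def count_switches(bitrates):
    """Number of positions where consecutive segments differ in bit-rate."""
    return sum(1 for a, b in zip(bitrates, bitrates[1:]) if a != b)


def reorder_ascending(bitrates):
    """Same multiset of bit-rates, sorted in non-decreasing order."""
    return sorted(bitrates)


original = [2.5, 0.4, 1, 0.4, 2.5]        # made-up non-ascending schedule (Mbps)
ascending = reorder_ascending(original)   # [0.4, 0.4, 1, 2.5, 2.5]

switches_before = count_switches(original)     # 4
switches_after = count_switches(ascending)     # 2 = distinct qualities - 1
distinct = len(set(original))                  # 3
```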
III-B Algorithm for Optimal Solution
In this section, we describe the main steps to build an optimal solution with at most stalls during the streaming session. The key idea of this algorithm is stall enforcement: as we assume that the future throughput is known, we are able to enforce stalls at any moment of the streaming session. Once we locate the stalls’ positions (i.e., the segment at which each stall should happen), we divide the session into multiple BaW-BaP cycles and then look for the optimal ascending bit-rate strategy over each cycle. The optimal number of stalls is obtained through an exhaustive search: we start by computing the optimal strategy with zero stalls, then with one stall, and so on up to stalls. The stalls’ distribution is also obtained through an exhaustive search. In what follows, we describe the main steps to build an optimal ascending bit-rate strategy over one BaW-BaP cycle:
i) Find all the possible ascending bit-rate combinations of the BaW-phase that allow building an ascending bit-rate strategy over the whole BaW-BaP cycle (steps A and B in Fig. 3).
ii) For each BaW-phase combination, find all the possible ascending strategies that satisfy the constraints of (10) (steps 1, 2 and 3 in Fig. 3).
iii) For each strategy, compute the QoE metrics, then apply the vector of weights to find the best solution.
To find all the possible ascending strategies, we use the tree of choice described in [14], Section IV-3.
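To give a sense of the search space, the sketch below enumerates every ascending (here taken as non-decreasing, which is an assumption) bit-rate schedule for a cycle of a given length. It is a brute-force enumeration with the standard library, not the tree of choice of [14], and it does not apply the feasibility constraints of (10), which would be checked on each candidate afterwards.

```python
from itertools import combinations_with_replacement
from math import comb


def ascending_strategies(levels, n_segments):
    """All non-decreasing bit-rate schedules of length n_segments.

    combinations_with_replacement yields tuples in the order of its input,
    so sorting the levels first guarantees non-decreasing tuples.
    """
    return list(combinations_with_replacement(sorted(levels), n_segments))


levels = [0.4, 0.75, 1, 2.5, 4.5]          # bit-rate levels of Table II (Mbps)
candidates = ascending_strategies(levels, 3)

# The count grows as C(L + n - 1, n) for L levels and n segments,
# far fewer than the L**n unconstrained schedules.
n_candidates = len(candidates)             # C(7, 3) = 35, versus 5**3 = 125
```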
III-C Heuristic for a Sub-optimal Solution
The key idea of our heuristic is twofold:
- The way we find the ascending strategy on a BaW-BaP cycle differs from the optimal algorithm: once we fix the bit-rate combination of the BaW-phase, we progressively increase the bit-rates of the BaP-phase, starting from the end of the BaW-BaP cycle, until reaching the point (segment) at which a stall would happen if we kept increasing the quality ( segment in step A-2 and segment in step A-3 of Fig. 3). Given that the number of segments of the BaW-phase is small in general, it does not take much time to find all the possible ascending BaW-phase combinations, which makes our heuristic fast (see Algorithm 1).
- Rather than doing an exhaustive search on the number of stalls, we proceed as follows: we start by finding the optimal strategy with zero stalls, then we check whether the global QoE increases when one stall is enforced (we try all the possible positions). If it does, we try to enforce a second stall; if not, we stop and return the latest strategy. We keep increasing the number of stalls in this way until reaching the maximum or until the QoE function decreases. See Algorithm 2, where , denotes the position of the stall.
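The second idea, the greedy search over the number of enforced stalls, can be sketched as follows. The callable `best_for` is a hypothetical stand-in for the per-stall-count search (trying all stall positions and returning the best QoE and strategy found). Algorithms 1 and 2 themselves are not reproduced in this text, so this is only an outline of the stopping rule.

```python
def greedy_stall_search(best_for, max_stalls):
    """Increase the number of enforced stalls while the QoE keeps improving.

    best_for(k) -> (qoe, strategy): best result found with k enforced stalls
    (hypothetical callable supplied by the caller).
    """
    best_qoe, best_strategy = best_for(0)
    for k in range(1, max_stalls + 1):
        candidate_qoe, candidate = best_for(k)
        if candidate_qoe <= best_qoe:      # no improvement: stop early
            break
        best_qoe, best_strategy = candidate_qoe, candidate
    return best_qoe, best_strategy


# Toy lookup table standing in for the real per-k optimization.
toy = {0: (1.0, "no stall"), 1: (1.5, "one stall"),
       2: (1.2, "two stalls"), 3: (9.0, "three stalls")}
result = greedy_stall_search(lambda k: toy[k], max_stalls=3)
```

In the toy table the search stops at two stalls and never evaluates three, which shows the trade-off of the heuristic: it is faster than the exhaustive search but may miss a better solution that lies beyond a temporary decrease.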
IV Multi-user QoE optimization problem
IV-A Problem formulation
In this section, we extend the QoE optimization problem to the multi-user case. We propose to find the vector of weights that maximizes the QoE among all users. The main objective is to maximize the users’ feedbacks on the video delivery using a synthetic QoE dataset. The QoE problem of the multi-user case can be mathematically expressed as
[TABLE]
where is the throughput of user and is his feedback on the quality he received after QoE optimization (10) using vector .
IV-B Practical solution: Closed-loop-based framework with users’ feedback
IV-B1 Framework design
The multi-user QoE optimization problem requires solving problem (10) for each user , knowing the exact value of vector that meets all the users’ preferences. The challenge is then to combine single-user QoE optimization with a QoE training mechanism in a closed-loop manner to progressively learn the value of . To do so, we develop two sub-frameworks and make them interact within a closed-loop framework: one for QoE optimization and the other for QoE training (see Fig. 4 and Fig. 5).
IV-B2 QoE training tool
To compute , we use a simple neural network [17], where the training samples are pairs of QoE metrics and user feedback. We define the training dataset as , where is the vector of QoE metrics delivered by (10) under throughput and vector , and is the corresponding feedback.
We define the activation function of the neural network as a linear function , where is the input vector and is the vector of weights to learn (See Fig. 6).
We make use of a mini-batch learning algorithm based on gradient descent. The goal of gradient descent is to minimize the average error between and the network output , .
Let be the half squared error corresponding to the training sample and be the averaged error among training samples, namely
[TABLE]
[TABLE]
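The two expressions above did not survive extraction. One form consistent with the description (a half squared error per training sample, averaged over the mini-batch, with a linear network output) is sketched below. All symbol names are assumptions chosen for illustration: $x_k$ is the vector of QoE metrics of sample $k$, $y_k$ the corresponding feedback, $\omega$ the weight vector and $K$ the mini-batch size.

```latex
\hat{y}_k = \omega^{\top} x_k, \qquad
E_k = \tfrac{1}{2}\,\bigl(y_k - \hat{y}_k\bigr)^{2}, \qquad
E = \frac{1}{K} \sum_{k=1}^{K} E_k .
```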
To reduce the average loss, the gradient descent updates the vector of weights in a way that it moves oppositely to the direction of the gradient vector . The algorithm stops when a predefined minimum loss is reached or when the number of updating steps is above a given threshold . (See Algorithm 3).
The partial derivatives of with respect to the weights , are given by
[TABLE]
[TABLE]
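The training step can be sketched as a plain gradient descent on a single linear neuron. This is a minimal illustration, not the paper's Algorithm 3: it assumes the half squared error averaged over the mini-batch, for which the gradient with respect to weight i is the average of (prediction minus feedback) times input i. The learning rate, the stopping thresholds and the toy samples are made up.

```python
def loss_and_grad(samples, w):
    """Average half squared error of a linear neuron, and its gradient."""
    n = len(samples)
    loss, grad = 0.0, [0.0] * len(w)
    for x, y in samples:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y   # prediction - feedback
        loss += 0.5 * err * err / n
        for i, xi in enumerate(x):
            grad[i] += err * xi / n
    return loss, grad


def train(samples, w, lr=0.5, min_loss=1e-9, max_steps=5000):
    """Move the weights against the gradient until the loss is small enough
    or the step budget is exhausted."""
    for _ in range(max_steps):
        loss, grad = loss_and_grad(samples, w)
        if loss <= min_loss:
            break
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w, loss_and_grad(samples, w)[0]


# Toy mini-batch generated by the weights [2, -1] (made-up data).
batch = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
weights, final_loss = train(batch, [0.0, 0.0])
```

In the closed loop, each new mini-batch of (metrics, feedback) pairs would trigger such an update, and the resulting weights would be handed back to the QoE optimization sub-framework.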
V Numerical results
V-A Simulation environment
We evaluate the performance of the proposed framework through extensive simulations using NS3 and Matlab. NS3 was used to generate standard-compliant correlated throughputs. To get many throughput samples, we performed extensive simulations of an LTE network, varying the mobility of users each time. All NS3 parameter settings are listed in Table I. The QoE optimization sub-framework and the QoE trainer were both developed in Matlab. As in the real world, we consider users’ feedback as scores rated from 1 to 5. When a quality is delivered to a user, we look it up in the predefined synthetic QoE dataset to find the score the user may give. In the dataset, we put all the possible values of vector in a specific priority order, i.e., . These vectors were then grouped into classes. To each class we associated a MOS and a specific distribution of scores. When is delivered, we determine the class to which it belongs. Then, according to that class, we randomly generate a score based on the distribution of scores in the dataset. Note that the throughput samples used by the QoE optimization sub-framework were randomly selected (according to a uniform distribution) among the throughput samples generated with NS3. All Matlab parameter settings are listed in Table II.
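The score generation step can be sketched as sampling from a per-class distribution over the scores 1 to 5. The class identifiers and probabilities below are invented for illustration. The actual synthetic dataset, its classes and its distributions are not given in this text.

```python
import random

SCORES = [1, 2, 3, 4, 5]

# Hypothetical classes: probability of each score 1..5 (each row sums to 1).
SCORE_DIST = {
    "good": [0.0, 0.0, 0.0, 0.5, 0.5],
    "poor": [0.5, 0.5, 0.0, 0.0, 0.0],
}


def class_mos(class_id, dist=SCORE_DIST):
    """Mean opinion score of a class: expected value of its distribution."""
    return sum(s * p for s, p in zip(SCORES, dist[class_id]))


def draw_score(class_id, dist=SCORE_DIST, rng=random):
    """Randomly generate one user score according to the class distribution."""
    return rng.choices(SCORES, weights=dist[class_id], k=1)[0]


rng = random.Random(0)                      # seeded for repeatability
feedback = [draw_score("good", rng=rng) for _ in range(10)]
```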
V-B Performance results
The performance evaluation of the closed-loop based framework allows us to show that (i) the learning converges ultimately to a steady state, in which the learning output is a quasi-constant vector , and that (ii), more importantly, this vector achieves the highest QoE compared to the other vectors computed throughout the learning process.
In Fig. 10, we show the evolution of the mean square variation of vector during the learning process for different values of the mini-batch size. Results show that in all cases the variation tends to zero, although the decrease is slow in some cases (5 and 50 scores). A fast convergence is however observed in the case of 10 scores. The difference in convergence time is due to the random character of the throughput selection and of the score generation. In a second step, we compared the final outputs and noticed that they were not exactly the same. Hence, we computed the MOS obtained when each of the successive updated values of was applied in the QoE optimization sub-framework under randomly selected throughputs. Fig. 10 shows that for the four mini-batch sizes, the MOS fluctuates for the first values of . Then, as the vector approaches the values obtained at the steady state, the MOS converges to its highest value (around 4.8 in the four cases). These results suggest that the proposed closed-loop framework can be applied to QoE optimization for video adaptation and delivery in real-world environments.
VI Conclusion
In this paper, we have addressed a QoE optimization problem with machine learning to optimize the quality of the delivered video by fitting the real profiles of the users. We have proposed a closed-loop framework based on the users’ feedback to learn their corresponding QoE function and to carry out their QoE optimization. Using a synthetic QoE dataset, we have shown the efficiency of the proposed closed-loop system. Indeed, the QoE function learned at the steady state ensures high-quality delivery for the majority of users. These promising results give insight into how the QoE optimization problem can be handled in a heterogeneous population. As a future step, real scores on real video streaming will be collected in order to study the robustness of the proposed solution.
Acknowledgement
This research is partially supported by NSF grants CNS-1720230, CNS-1544782, and SES-1541164.
References
- [1] Cisco Visual Networking Index: Forecast and Methodology, 2015-2020. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/complete-white-paper-c11-481360.pdf.
- [2] M. G. Pineda, S. Felici-Castell, and J. Segura-Garcia, “Using factor analysis techniques to find out objective video quality metrics for live video streaming over cloud mobile media services,” Network Protocols & Algorithms, 2016.
- [3] M. S. Mushtaq, B. Augustin, and A. Mellouk, “Empirical study based on machine learning approach to assess the QoS/QoE correlation,” in Networks and Optical Communications (NOC), 2012 17th European Conference on, 2012.
- [4] J. D. Vriendt, D. D. Vleeschauwer, and D. C. Robinson, “QoE model for video delivered over an LTE network using HTTP adaptive streaming,” Bell Labs Technical Journal, 2014.
- [5] A. Balachandran, V. Sekar, A. Akella, S. Seshan, I. Stoica, and H. Zhang, “A quest for an internet video quality-of-experience metric,” in Proceedings of the 11th ACM Workshop on Hot Topics in Networks, ACM, 2012.
- [6] L. Amour, S. Souihi, S. Hoceini, and A. Mellouk, “A Hierarchical Classification Model of QoE Influence Factors,” 13th International Conference Wired/Wireless Internet Communications (WWIC) Revised Selected Papers, pp. 225–238, 2015.
- [7] F. Dobrian, V. Sekar, A. Awan, I. Stoica, D. Joseph, A. Ganjam, J. Zhan, and H. Zhang, “Understanding the impact of video quality on user engagement,” ACM SIGCOMM Computer Communication Review, vol. 41, no. 4, pp. 362–373, 2011.
- [8] Youtube: Measure video ad performance. https://support.google.com/youtube/answer/2375431.
