Perceptual Evaluation of 360 Audiovisual Quality and Machine Learning Predictions
Randy Frans Fela, Nick Zacharov, Søren Forchhammer

TL;DR
This study evaluates how well objective metrics and machine learning predict perceived 360 audiovisual quality, and finds that support vector machine (SVM) models built on the VMAF and AMBIQUAL metrics give the most accurate predictions.
Contribution
It introduces a machine learning framework combining multiple quality metrics to predict subjective audiovisual quality in 360 content, with comprehensive evaluation methods.
Findings
SVM outperforms other models in prediction accuracy.
VMAF and AMBIQUAL combination yields the best results.
k-Fold cross-validation yields higher performance scores than Leave-One-Out.
Abstract
In an earlier study, we gathered perceptual evaluations of the audio, video, and audiovisual quality for 360 audiovisual content. This paper investigates perceived audiovisual quality prediction based on objective quality metrics and subjective scores of 360 video and spatial audio content. Thirteen objective video quality metrics and three objective audio quality metrics were evaluated for five stimuli for each coding parameter. Four regression-based machine learning models were trained and tested here, i.e., multiple linear regression, decision tree, random forest, and support vector machine. Each model was constructed using a combination of audio and video quality metrics, and two cross-validation methods (k-Fold and Leave-One-Out) were investigated, producing 312 predictive models. The results indicate that the model based on the evaluation of VMAF and AMBIQUAL is better than other…
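The count of 312 predictive models is consistent with pairing one video metric with one audio metric and crossing the pairs with the four regressors and two cross-validation methods (13 × 3 × 4 × 2 = 312). The sketch below enumerates that grid. The one-video-plus-one-audio pairing is an assumption inferred from the count, not something the abstract states outright, and every metric name other than VMAF and AMBIQUAL is a hypothetical placeholder.

```python
from itertools import product

# Only VMAF (video) and AMBIQUAL (audio) are named in the abstract;
# the remaining metric names are hypothetical placeholders.
video_metrics = ["VMAF"] + [f"video_metric_{i:02d}" for i in range(2, 14)]  # 13 in total
audio_metrics = ["AMBIQUAL"] + [f"audio_metric_{i}" for i in range(2, 4)]   # 3 in total

# The four regression models and two cross-validation methods listed in the abstract.
regressors = [
    "multiple_linear_regression",
    "decision_tree",
    "random_forest",
    "support_vector_machine",
]
cv_methods = ["k_fold", "leave_one_out"]

# Assumption: each predictive model uses exactly one video metric and one audio metric.
configs = list(product(video_metrics, audio_metrics, regressors, cv_methods))

print(len(video_metrics), len(audio_metrics), len(configs))
```

Each tuple in `configs` identifies one candidate model, for example `("VMAF", "AMBIQUAL", "support_vector_machine", "k_fold")`.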
Taxonomy
Topics: Image and Video Quality Assessment
