TL;DR
This paper introduces a large-scale real-world video quality dataset and two novel no-reference VQA models, achieving state-of-the-art results and enabling better localization of perceptual distortions in user-generated videos.
Contribution
It presents the largest real-world UGC video quality dataset and two innovative NR-VQA models, advancing the accuracy and interpretability of video quality assessment.
Findings
PVQ achieves state-of-the-art performance on three UGC datasets.
The dataset contains 39,000 videos and 5.5 million annotations.
PVQ Mapper effectively localizes perceptual distortions.
Abstract
No-reference (NR) perceptual video quality assessment (VQA) is a complex, unsolved, and important problem for social and streaming media applications. Efficient and accurate video quality predictors are needed to monitor and guide the processing of billions of shared, often imperfect, user-generated content (UGC). Unfortunately, current NR models are limited in their prediction capabilities on real-world, "in-the-wild" UGC video data. To advance progress on this problem, we created the largest (by far) subjective video quality dataset, containing 39,000 real-world distorted videos, 117,000 space-time localized video patches ('v-patches'), and 5.5M human perceptual quality annotations. Using this, we created two unique NR-VQA models: (a) a local-to-global region-based NR VQA architecture (called PVQ) that learns to predict global video quality and achieves state-of-the-art performance…
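To put the dataset figures in the abstract in proportion, the short Python sketch below derives two ratios from them: v-patches per video and annotations per rated item. The function name `dataset_ratios` is hypothetical, and the second ratio rests on an assumption that the abstract does not state: that the 5.5M annotations are spread across both the full videos and the v-patches. Treat that number as a rough average, not a reported statistic.

```python
# Counts quoted in the abstract ("5.5M" is a rounded figure).
NUM_VIDEOS = 39_000
NUM_V_PATCHES = 117_000
NUM_ANNOTATIONS = 5_500_000


def dataset_ratios(videos: int, patches: int, annotations: int) -> tuple[float, float]:
    """Return (v-patches per video, annotations per rated item).

    Assumption (not stated in the abstract): every video and every
    v-patch counts as one rated item sharing the annotation pool.
    """
    patches_per_video = patches / videos
    rated_items = videos + patches
    annotations_per_item = annotations / rated_items
    return patches_per_video, annotations_per_item


patches_per_video, annotations_per_item = dataset_ratios(
    NUM_VIDEOS, NUM_V_PATCHES, NUM_ANNOTATIONS
)
print(f"v-patches per video: {patches_per_video:.1f}")
print(f"annotations per rated item (approx.): {annotations_per_item:.1f}")
```

The first ratio follows directly from the two counts in the abstract; the second changes if the annotations were collected on a different set of items than assumed here.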
