A Deep Learning based No-reference Quality Assessment Model for UGC Videos
Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai

TL;DR
This paper introduces a simple, efficient deep learning model for no-reference quality assessment of UGC videos, directly learning spatial features from raw pixels and incorporating motion analysis to improve accuracy.
Contribution
The proposed model trains its spatial feature extractor end-to-end on raw video frames and employs a multi-scale quality fusion strategy, outperforming existing methods on five UGC VQA datasets.
Findings
Achieves state-of-the-art performance on five UGC VQA datasets.
Uses sparse frames for spatial features and dense low-resolution frames for motion features, reducing computational complexity.
Incorporates human visual system insights through multi-scale quality fusion.
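The sparse/dense frame split described in the findings can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `split_for_quality_model` and `pool_scores`, the one-key-frame-per-second choice, and the strided downsampling (standing in for a proper resize) are all assumptions made for the example. The average pooling of per-chunk scores follows the "Average Pooling" method listed in the taxonomy.

```python
import numpy as np


def split_for_quality_model(video, fps, stride=4):
    """Split a video into sparse key frames and dense low-resolution clips.

    video:  array of shape (T, H, W, C)
    fps:    frames per second; one chunk covers one second (assumption)
    stride: spatial subsampling step used as a crude stand-in for resizing
    """
    n_chunks = video.shape[0] // fps
    if n_chunks == 0:
        raise ValueError("video is shorter than one chunk")

    key_frames, motion_clips = [], []
    for i in range(n_chunks):
        chunk = video[i * fps:(i + 1) * fps]
        # Sparse path: a single full-resolution frame per chunk.
        key_frames.append(chunk[0])
        # Dense path: every frame of the chunk, at reduced resolution.
        motion_clips.append(chunk[:, ::stride, ::stride])
    return np.stack(key_frames), np.stack(motion_clips)


def pool_scores(chunk_scores):
    """Average per-chunk quality scores into one video-level score."""
    return float(np.mean(chunk_scores))


if __name__ == "__main__":
    video = np.zeros((60, 64, 96, 3), dtype=np.float32)
    keys, clips = split_for_quality_model(video, fps=30)
    print(keys.shape, clips.shape)
```

For a 60-frame clip at 30 fps with 64x96 frames, the sketch yields 2 full-resolution key frames and 2 motion clips of 30 frames each at 16x24. The spatial network therefore sees far fewer frames than the video contains, while the motion branch sees every frame but at a fraction of the pixel count.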
Abstract
Quality assessment for User Generated Content (UGC) videos plays an important role in ensuring the viewing experience of end-users. Previous UGC video quality assessment (VQA) studies use either image recognition models or image quality assessment (IQA) models to extract frame-level features of UGC videos for quality regression, which are regarded as sub-optimal solutions because of the domain shifts between these tasks and the UGC VQA task. In this paper, we propose a very simple but effective UGC VQA model, which tries to address this problem by training an end-to-end spatial feature extraction network to directly learn the quality-aware spatial feature representation from raw pixels of the video frames. We also extract the motion features to measure the temporal-related distortions that the spatial features cannot model. The proposed model utilizes very sparse frames to…
Taxonomy
Topics: Image and Video Quality Assessment · Advanced Image Fusion Techniques · Visual Attention and Saliency Detection
Methods: Average Pooling
