C3DVQA: Full-Reference Video Quality Assessment with 3D Convolutional Neural Network
Munan Xu, Junming Chen, Haiqiang Wang, Shan Liu, Ge Li, Zhiqiang Bai

TL;DR
C3DVQA is a 3D convolutional neural network architecture for full-reference video quality assessment. It learns spatial and temporal features jointly, and the authors report that it outperforms existing methods on the LIVE and CSIQ datasets.
Contribution
The paper proposes C3DVQA, an architecture that merges feature learning and score pooling into a single spatiotemporal learning process built on 3D convolutions.
Findings
Achieves state-of-the-art performance on the LIVE and CSIQ datasets.
Effectively captures temporal masking effects with 3D convolutional layers.
Outperforms traditional methods in full-reference VQA tasks.
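To make the role of the 3D kernels concrete, here is a minimal pure-Python sketch of a single-channel 3D convolution (strictly, a cross-correlation, as in most deep learning libraries). It is not the authors' implementation: the function name `conv3d_valid`, the nested-list layout `[T][H][W]`, and the hand-picked temporal-difference kernel are assumptions made for illustration. The point is only that a kernel with temporal extent greater than one mixes neighbouring frames, which a 2D kernel cannot do.

```python
def conv3d_valid(video, kernel):
    """Naive single-channel 3D cross-correlation, stride 1, no padding.

    video:  nested list indexed as [T][H][W]      (hypothetical layout)
    kernel: nested list indexed as [kt][kh][kw]
    Returns a nested list of shape [T-kt+1][H-kh+1][W-kw+1].
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal AND spatial extent of the kernel.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Three 1x1 "frames" with intensities 0, 1, 3.
video = [[[0.0]], [[1.0]], [[3.0]]]
# A 2x1x1 kernel that subtracts the previous frame from the current one.
temporal_diff = [[[-1.0]], [[1.0]]]
motion = conv3d_valid(video, temporal_diff)
```

In a trained network the kernel weights are learned rather than fixed, so the layers can pick up whatever spatiotemporal patterns help predict the quality score.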
Abstract
Traditional video quality assessment (VQA) methods evaluate localized picture quality, and the video score is predicted by temporally aggregating frame scores. However, video quality exhibits different characteristics from static image quality due to the existence of temporal masking effects. In this paper, we present a novel architecture, namely C3DVQA, that uses a Convolutional Neural Network with 3D kernels (C3D) for the full-reference VQA task. C3DVQA combines feature learning and score pooling into one spatiotemporal feature learning process. We use 2D convolutional layers to extract spatial features and 3D convolutional layers to learn spatiotemporal features. We empirically found that 3D convolutional layers are capable of capturing temporal masking effects of videos. We evaluated the proposed method on the LIVE and CSIQ datasets. The experimental results demonstrate that the proposed method…
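For contrast, the frame-then-pool pipeline that the abstract describes as traditional can be sketched as follows. This is an illustrative baseline only: the choice of PSNR as the per-frame metric, mean pooling as the temporal aggregator, and the names `frame_psnr` and `video_score_mean_pool` are assumptions, not details taken from the paper.

```python
import math


def frame_mse(ref, dist):
    """Mean squared error between two frames given as [H][W] nested lists."""
    total, count = 0.0, 0
    for ref_row, dist_row in zip(ref, dist):
        for r, d in zip(ref_row, dist_row):
            total += (r - d) ** 2
            count += 1
    return total / count


def frame_psnr(ref, dist, peak=255.0):
    """Per-frame PSNR in dB; identical frames give infinity."""
    mse = frame_mse(ref, dist)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / mse)


def video_score_mean_pool(ref_video, dist_video, peak=255.0):
    """Score each frame independently, then average over time."""
    scores = [frame_psnr(r, d, peak) for r, d in zip(ref_video, dist_video)]
    return sum(scores) / len(scores)


# Two 2x2 reference frames (all zeros) and uniformly shifted distorted frames.
ref_video = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
dist_video = [[[25.5, 25.5], [25.5, 25.5]] for _ in range(2)]
score = video_score_mean_pool(ref_video, dist_video)
```

Because every frame is scored in isolation, this kind of baseline has no way to represent interactions between frames, which is the gap the spatiotemporal approach in the abstract is meant to address.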
