Towards Explainable In-the-Wild Video Quality Assessment: A Database and a Language-Prompted Approach

Haoning Wu; Erli Zhang; Liang Liao; Chaofeng Chen; Jingwen Hou; Annan Wang; Wenxiu Sun; Qiong Yan; Weisi Lin

arXiv:2305.12726·cs.CV·August 4, 2023·1 cites

TL;DR

This paper introduces a multi-dimensional database for in-the-wild video quality assessment and proposes a language-prompted model, MaxVQA, that jointly evaluates specific quality factors and overall video quality with high accuracy.

Contribution

The paper presents the Maxwell database with detailed quality factor annotations and a novel CLIP-based language-prompted VQA method, MaxVQA, for comprehensive quality evaluation.
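As a rough illustration of what "language-prompted" scoring can look like in a CLIP-style model, the sketch below compares one visual embedding against a positive and a negative text prompt embedding and turns the two cosine similarities into a score between 0 and 1. It is a minimal sketch, not the MaxVQA implementation: the function names, the toy three-dimensional vectors, the prompt pair ("sharp" vs. "blurry"), and the temperature value are all assumptions made for the example, and real embeddings would come from pretrained encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prompted_score(visual, text_pos, text_neg, temperature=0.01):
    """Two-way softmax over similarities to a positive and a negative prompt.

    Returns the probability mass assigned to the positive prompt.
    The temperature is an assumed value for this sketch.
    """
    s_pos = cosine(visual, text_pos) / temperature
    s_neg = cosine(visual, text_neg) / temperature
    # A two-way softmax equals a sigmoid of the logit difference.
    return 1.0 / (1.0 + math.exp(s_neg - s_pos))


# Toy stand-ins for encoder outputs (hypothetical values).
video_feature = [0.9, 0.1, 0.2]
prompt_sharp = [1.0, 0.0, 0.0]   # stand-in for the embedding of a "sharp" prompt
prompt_blurry = [0.0, 1.0, 0.0]  # stand-in for the embedding of a "blurry" prompt

sharpness = prompted_score(video_feature, prompt_sharp, prompt_blurry)
print(f"sharpness score: {sharpness:.3f}")
```

Repeating the same comparison with a different prompt pair per quality factor would give one score per dimension, which is the general idea behind evaluating specific factors alongside overall quality.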

Findings

01

MaxVQA achieves state-of-the-art accuracy across all quality dimensions.

02

The Maxwell database enables detailed analysis of quality factors and their relation to subjective scores.

03

MaxVQA generalizes well to existing datasets, demonstrating robustness.

Abstract

The proliferation of in-the-wild videos has greatly expanded the Video Quality Assessment (VQA) problem. Unlike early definitions that usually focus on limited distortion types, VQA on in-the-wild videos is especially challenging because it can be affected by complicated factors, including various distortions and diverse contents. Though subjective studies have collected overall quality scores for these videos, how the abstract quality scores relate to specific factors is still obscure, hindering VQA methods from making more concrete quality evaluations (e.g., the sharpness of a video). To solve this problem, we collect over two million opinions on 4,543 in-the-wild videos on 13 dimensions of quality-related factors, including in-capture authentic distortions (e.g., motion blur, noise, flicker), errors introduced by compression and transmission, and higher-level experiences on semantic contents and…

Code & Models

Repositories

vqassessment/maxvqa
PyTorch · Official

Taxonomy

Topics: Image and Video Quality Assessment · Advanced Image Processing Techniques · Visual Attention and Saliency Detection

Methods: Contrastive Language-Image Pre-training