Towards Robust Text-Prompted Semantic Criterion for In-the-Wild Video Quality Assessment
Haoning Wu, Liang Liao, Annan Wang, Chaofeng Chen, Jingwen Hou, Wenxiu Sun, Qiong Yan, Weisi Lin

TL;DR
This paper introduces a novel zero-shot video quality assessment method leveraging CLIP-based semantic affinity measures, achieving superior generalization and performance without human annotations.
Contribution
It proposes the SAQI and BVQI indices that integrate semantic and low-level features, enabling robust, annotation-free video quality assessment with state-of-the-art results.
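To make the idea of a text-prompted semantic affinity score concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's formulation: the function names, the sigmoid mapping, the `scale` value, and the random stand-in embeddings are all hypothetical. In practice the embeddings would come from a CLIP-style image and text encoder, which this sketch does not call.

```python
import numpy as np


def cosine_similarity(frames, prompt):
    """Cosine similarity between each row of `frames` (T, D) and `prompt` (D,)."""
    frames = frames / np.linalg.norm(frames, axis=-1, keepdims=True)
    prompt = prompt / np.linalg.norm(prompt, axis=-1, keepdims=True)
    return frames @ prompt  # shape (T,)


def semantic_affinity_score(frame_embs, pos_emb, neg_emb, scale=10.0):
    """Hypothetical affinity score in (0, 1).

    Assumption: quality is read off as how much closer each frame embedding
    is to a positive text prompt (e.g. "high quality") than to a negative
    one (e.g. "low quality"), squashed by a sigmoid and averaged over frames.
    """
    sim_pos = cosine_similarity(frame_embs, pos_emb)
    sim_neg = cosine_similarity(frame_embs, neg_emb)
    per_frame = 1.0 / (1.0 + np.exp(-scale * (sim_pos - sim_neg)))
    return float(per_frame.mean())


if __name__ == "__main__":
    # Random vectors stand in for real encoder outputs (assumption).
    rng = np.random.default_rng(0)
    frame_embs = rng.normal(size=(8, 512))  # 8 sampled frames
    pos_emb = rng.normal(size=512)          # embedding of a positive prompt
    neg_emb = rng.normal(size=512)          # embedding of a negative prompt
    print(semantic_affinity_score(frame_embs, pos_emb, neg_emb))
```

Using a pair of opposing prompts rather than a single one means the score depends on the difference of two similarities, so any offset shared by both prompts cancels out. No human quality labels are involved at any step, which is what makes an index of this kind zero-shot.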
Findings
Surpasses existing zero-shot indices by at least 24% on all datasets.
Achieves state-of-the-art performance with a fine-tuning scheme.
Demonstrates superior generalization compared to opinion-driven methods.
Abstract
The proliferation of videos collected during in-the-wild natural settings has pushed the development of effective Video Quality Assessment (VQA) methodologies. Contemporary supervised opinion-driven VQA strategies predominantly hinge on training from expensive human annotations for quality scores, which limited the scale and distribution of VQA datasets and consequently led to unsatisfactory generalization capacity of methods driven by these data. On the other hand, although several handcrafted zero-shot quality indices do not require training from human opinions, they are unable to account for the semantics of videos, rendering them ineffective in comprehending complex authentic distortions (e.g., white balance, exposure) and assessing the quality of semantic content within videos. To address these challenges, we introduce the text-prompted Semantic Affinity Quality Index (SAQI) and…
Taxonomy
Topics: Image and Video Quality Assessment · Advanced Image Processing Techniques · Visual Attention and Saliency Detection
