DCVQE: A Hierarchical Transformer for Video Quality Assessment
Zutong Li, Lei Yang

TL;DR
DCVQE is a hierarchical Transformer-based approach to no-reference video quality assessment (NR-VQA). It captures quality features at the frame, clip, and video levels and uses a novel correlation loss to improve robustness.
Contribution
The paper proposes a novel hierarchical Transformer architecture (DCVQE) with a divide-and-conquer strategy for improved NR-VQA performance.
Findings
Outperforms existing methods on multiple datasets
Demonstrates robustness and accuracy in video quality prediction
Effectively models hierarchical quality relationships
Abstract
The explosion of user-generated videos stimulates a great demand for no-reference video quality assessment (NR-VQA). Inspired by our observation of how human annotators work, we put forward a Divide and Conquer Video Quality Estimator (DCVQE) for NR-VQA. Starting from extracting the frame-level quality embeddings (QE), our proposal splits the whole sequence into a number of clips and applies Transformers to learn the clip-level QE and update the frame-level QE simultaneously; another Transformer is introduced to combine the clip-level QE to generate the video-level QE. We call this hierarchical combination of Transformers a Divide and Conquer Transformer (DCTr) layer. Accurate video quality feature extraction can be achieved by repeating this DCTr layer several times. Taking the order relationship among the annotated data into account, we also propose a novel…
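The data flow described in the abstract (frames split into clips, per-clip attention, then attention across clips) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single attention head with identity projections, no feed-forward sublayers, and a summary token initialized as the mean of its inputs. The names self_attention, mean_token, dctr_layer, and dcvqe_features are hypothetical and chosen only for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity query/key/value projections (no learned weights).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def mean_token(tokens):
    # Assumption: the summary token starts as the mean of the tokens it summarizes.
    d = len(tokens[0])
    return [sum(t[j] for t in tokens) / len(tokens) for j in range(d)]

def dctr_layer(frame_qe, clip_len):
    # One divide-and-conquer pass over a non-empty list of frame-level QE.
    # Returns (updated frame-level QE, clip-level QE, video-level QE).
    updated_frames, clip_qe = [], []
    # Divide: attend within each clip, with a summary token prepended.
    for start in range(0, len(frame_qe), clip_len):
        clip = frame_qe[start:start + clip_len]
        attended = self_attention([mean_token(clip)] + clip)
        clip_qe.append(attended[0])          # clip-level QE
        updated_frames.extend(attended[1:])  # updated frame-level QE
    # Conquer: attend across clip-level QE to form the video-level QE.
    video_qe = self_attention([mean_token(clip_qe)] + clip_qe)[0]
    return updated_frames, clip_qe, video_qe

def dcvqe_features(frame_qe, clip_len=4, num_layers=2):
    # Repeat the layer; each pass consumes the updated frame-level QE.
    video_qe = None
    for _ in range(num_layers):
        frame_qe, _, video_qe = dctr_layer(frame_qe, clip_len)
    return video_qe

# Toy input: 10 frames with 2-dimensional quality embeddings.
frames = [[0.1 * i, 1.0 - 0.1 * i] for i in range(10)]
video_embedding = dcvqe_features(frames, clip_len=4, num_layers=2)
```

With 10 frames and a clip length of 4, the sketch forms three clips (4, 4, and 2 frames). A real model would replace the identity attention with learned multi-head Transformer layers and map the video-level QE to a quality score with a regression head.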
Taxonomy
Topics: Image and Video Quality Assessment · Advanced Image Processing Techniques · Advanced Computing and Algorithms
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Softmax · Label Smoothing · Multi-Head Attention · Adam · Dense Connections
