Apples and Oranges? Assessing Image Quality over Content Recognition
Junyong You, Zheng Zhang

TL;DR
This paper proposes a unified Transformer-based model with a novel attention mechanism to perform both image quality assessment and content recognition, demonstrating promising results in multitask learning.
Contribution
It introduces a sequential spatial-channel attention module integrated into a Transformer to jointly perform image quality assessment and content recognition.
Findings
The model achieves competitive performance on both tasks.
Shared spatial attention benefits both recognition and quality assessment.
Channel attention enhances quality assessment accuracy.
Abstract
Image recognition and quality assessment are two important viewing tasks that potentially follow different visual mechanisms. This paper investigates whether the two tasks can be performed in a multitask learning manner. A sequential spatial-channel attention module is proposed to simulate the visual attention and contrast sensitivity mechanisms that are crucial for content recognition and quality assessment. Spatial attention is shared between content recognition and quality assessment, while channel attention is used solely for quality assessment. This attention module is integrated into a Transformer to build a uniform model for the two viewing tasks. The experimental results demonstrate that the proposed uniform model can achieve promising performance on both quality assessment and content recognition.
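The sequential arrangement described in the abstract (spatial attention first and shared by both tasks, channel attention second and used only for quality assessment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `spatial_channel_attention`, the parameters `w_spatial` and `w_channel`, the softmax and sigmoid gating, and the toy sizes are all assumptions chosen for illustration, and the paper's actual formulation may differ.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_channel_attention(tokens, w_spatial, w_channel):
    """Hypothetical sequential spatial-channel attention.

    tokens:    list of N token vectors, each with C channels
    w_spatial: list of C weights scoring each token (assumed parameter)
    w_channel: C x C weight matrix gating each channel (assumed parameter)
    """
    n_channels = len(tokens[0])

    # Step 1: spatial attention, one weight per token, shared by both tasks.
    scores = [sum(t[c] * w_spatial[c] for c in range(n_channels)) for t in tokens]
    spatial_w = softmax(scores)
    shared = [[a * v for v in t] for a, t in zip(spatial_w, tokens)]

    # Step 2: channel attention, applied after the spatial step and
    # used only by the quality-assessment branch.
    descriptor = [sum(row[c] for row in shared) for c in range(n_channels)]
    channel_w = [
        sigmoid(sum(descriptor[k] * w_channel[k][c] for k in range(n_channels)))
        for c in range(n_channels)
    ]
    quality = [[v * g for v, g in zip(row, channel_w)] for row in shared]

    return shared, quality, spatial_w, channel_w


# Toy input: 16 tokens with 8 channels each (sizes are arbitrary).
random.seed(0)
tokens = [[random.gauss(0, 1) for _ in range(8)] for _ in range(16)]
w_spatial = [random.gauss(0, 1) for _ in range(8)]
w_channel = [[random.gauss(0, 1) for _ in range(8)] for _ in range(8)]

recog_feats, quality_feats, a_s, a_c = spatial_channel_attention(
    tokens, w_spatial, w_channel
)
```

In this sketch, `recog_feats` would feed a recognition head and `quality_feats` a quality-prediction head. The two outputs differ only by the channel gate `a_c`, which mirrors the abstract's statement that channel attention is applied solely for quality assessment.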
Taxonomy
Topics: Visual Attention and Saliency Detection · Image and Video Quality Assessment · Advanced Image Fusion Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Adam · Position-Wise Feed-Forward Layer · Softmax · Linear Layer · Absolute Position Encodings · Dropout · Label Smoothing
