Half of an image is enough for quality assessment
Junyong You, Yuan Lin, Jari Korhonen

TL;DR
This paper introduces a positional masked transformer for IQA, revealing that only half of an image often suffices for quality assessment, with certain regions being more critical than others.
Contribution
The paper presents a novel positional masked transformer for IQA and provides new insights into how much individual image regions contribute to the overall quality score.
Findings
Half of an image can be sufficient for quality assessment.
Certain image regions have a disproportionate impact on perceived quality.
Three semantic measures (saliency, frequency, and objectness) correlate highly with region importance in IQA.
Abstract
Deep networks have demonstrated promising results in the field of Image Quality Assessment (IQA). However, there has been limited research on understanding how deep models in IQA work. This study introduces a novel positional masked transformer for IQA and provides insights into the contribution of different regions of an image towards its overall quality. Results indicate that half of an image may play a trivial role in determining image quality, while the other half is critical. This observation is extended to several other CNN-based IQA models, revealing that half of the image regions can significantly impact the overall image quality. To further enhance our understanding, three semantic measures (saliency, frequency, and objectness) were derived and found to have high correlation with the importance of image regions in IQA.
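The two ideas in the abstract, scoring an image from only half of its regions and correlating region importance with a semantic measure, can be illustrated with a minimal sketch. This is not the paper's positional masked transformer: the function names, the patch grid, and the per-patch importance scores are all hypothetical, and the importance values are assumed to come from some external model.

```python
import numpy as np

def split_into_patches(image, patch):
    """Cut an (H, W, C) image into non-overlapping patch x patch tiles.

    Returns an array of shape (num_patches, patch, patch, C), ordered
    row by row. Border pixels that do not fill a whole tile are dropped.
    """
    h, w = image.shape[:2]
    gh, gw = h // patch, w // patch
    cropped = image[: gh * patch, : gw * patch]
    tiles = cropped.reshape(gh, patch, gw, patch, -1).swapaxes(1, 2)
    return tiles.reshape(gh * gw, patch, patch, -1)

def keep_top_half(patches, importance):
    """Keep the half of the patches with the highest importance scores.

    `importance` is a hypothetical per-patch score (assumption: higher
    means more relevant to quality). Returns the kept patches and their
    original indices, so positional information is preserved.
    """
    n_keep = len(patches) // 2
    keep = np.sort(np.argsort(importance)[::-1][:n_keep])
    return patches[keep], keep

def spearman(a, b):
    """Spearman rank correlation for score vectors without ties."""
    rank_a = np.argsort(np.argsort(a)).astype(float)
    rank_b = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(rank_a, rank_b)[0, 1])
```

A quality model would then be fed only the kept patches together with their indices, and `spearman` could compare the importance scores against a per-patch saliency, frequency, or objectness value.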
Taxonomy
Topics: Visual Attention and Saliency Detection · Image and Video Quality Assessment · Advanced Image Fusion Techniques
