Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network
Shanshan Lao, Yuan Gong, Shuwei Shi, Sidi Yang, Tianhe Wu, Jiahao Wang, Weihao Xia, Yujiu Yang

TL;DR
This paper introduces a hybrid attention-based neural network combining vision transformers and CNNs to improve image quality assessment, especially for GAN-generated images, outperforming existing methods.
Contribution
The paper proposes a novel hybrid architecture with attention mechanisms and deformable convolutions to enhance spatial relationship modeling in IQA models.
Findings
Outperforms state-of-the-art IQA methods on four datasets.
Ranks first in NTIRE 2022 Perceptual IQA Challenge.
Effectively models spatial relationships among image patches.
Abstract
Image quality assessment (IQA) algorithms aim to quantify human perception of image quality. Unfortunately, performance drops when assessing distorted images generated by generative adversarial networks (GANs) with seemingly realistic textures. In this work, we conjecture that this maladaptation lies in the backbone of IQA models: patch-level prediction methods take independent image patches as input and score them separately, but do not model spatial relationships among patches. Therefore, we propose an Attention-based Hybrid Image Quality Assessment Network (AHIQ) to address this challenge and improve performance on the GAN-based IQA task. First, we adopt a two-branch architecture, with a vision transformer (ViT) branch and a convolutional neural network (CNN) branch for feature extraction. The hybrid architecture combines interaction…
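The abstract's conjecture, that scoring patches independently ignores spatial relationships among them, can be illustrated with a toy sketch. This is not the AHIQ implementation: the function names, the two-dimensional patch features, the weight vector `w`, and the use of single-head self-attention with identity projections are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def independent_scores(patches, w):
    """Patch-level baseline: each patch is scored from its own features only."""
    return [dot(p, w) for p in patches]


def attention_mix(patches):
    """Single-head self-attention with identity Q/K/V projections (a simplification).

    Each output feature is a weighted average of all patch features, so every
    patch's representation depends on the other patches in the image.
    """
    d = len(patches[0])
    mixed = []
    for q in patches:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in patches])
        mixed.append(
            [sum(wt * v[j] for wt, v in zip(weights, patches)) for j in range(d)]
        )
    return mixed


def context_scores(patches, w):
    """Score patches after mixing information across patches with attention."""
    return [dot(p, w) for p in attention_mix(patches)]


if __name__ == "__main__":
    w = [0.5, -0.5]  # hypothetical scoring weights
    image_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    image_b = [[1.0, 0.0], [0.0, 3.0], [1.0, 1.0]]  # only the second patch differs

    # The first patch is identical in both images.
    print("independent:", independent_scores(image_a, w)[0],
          independent_scores(image_b, w)[0])
    print("with attention:", context_scores(image_a, w)[0],
          context_scores(image_b, w)[0])
```

In this sketch the independent score of the first patch is identical for both images, while the attention-mixed score changes when a neighbouring patch changes. That dependence on context is what patch-level prediction lacks.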
Taxonomy
Topics: Image and Video Quality Assessment · Advanced Image Fusion Techniques · Visual Attention and Saliency Detection
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Residual Connection · Deformable Convolution · Dense Connections · Vision Transformer · Convolution
