MSTRIQ: No Reference Image Quality Assessment Based on Swin Transformer with Multi-Stage Fusion
Jing Wang, Haotian Fan, Xiaoxia Hou, Yitian Xu, Tao Li, Xuechao Lu, and Lean Fu

TL;DR
This paper introduces MSTRIQ, a no-reference image quality assessment method using Swin Transformer with multi-stage feature fusion, effective data augmentation, and ranking-based optimization, achieving state-of-the-art results on standard datasets.
Contribution
The paper presents a novel Swin Transformer-based IQA model with multi-stage feature fusion and ranking optimization, addressing dataset limitations and improving performance.
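As a rough illustration of what multi-stage feature fusion can look like, the sketch below pools the feature map from each backbone stage into a vector, concatenates the vectors, and feeds the result to a linear scoring head. The summary does not say how MSTRIQ fuses its stages, so the pooling-plus-concatenation scheme, the function names (`global_average_pool`, `fuse_stages`, `linear_head`), and the toy shapes are all assumptions made for illustration, not the authors' implementation.

```python
def global_average_pool(feature_map):
    """Collapse a [tokens][channels] feature map into one [channels] vector."""
    n_tokens = len(feature_map)
    n_channels = len(feature_map[0])
    return [sum(token[c] for token in feature_map) / n_tokens
            for c in range(n_channels)]


def fuse_stages(stage_features):
    """Concatenate the pooled vectors of every stage (assumed fusion scheme).

    Early stages have many tokens and few channels (local detail); later
    stages have few tokens and many channels (global context).
    """
    fused = []
    for feature_map in stage_features:
        fused.extend(global_average_pool(feature_map))
    return fused


def linear_head(fused, weights, bias):
    """Map the fused vector to a single quality score."""
    return sum(w * x for w, x in zip(weights, fused)) + bias


# Toy input: stage 1 has 2 tokens x 2 channels, stage 2 has 1 token x 3 channels.
stages = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0, 7.0]],
]
fused = fuse_stages(stages)
score = linear_head(fused, weights=[0.1] * len(fused), bias=0.0)
```

Concatenation keeps each stage's information separate and lets the head learn how much weight to give local versus global features; other fusion schemes (summation, attention) would fit the same description.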
Findings
Outperforms previous methods on standard IQA datasets.
Ranks 2nd in NTIRE 2022 IQA challenge no-reference track.
Demonstrates robustness across diverse image distortions.
Abstract
Automatically measuring the perceptual quality of images is an essential task in computer vision, as image quality can degrade at many stages, from acquisition and transmission to enhancement. Many Image Quality Assessment (IQA) algorithms have been designed to tackle this problem. However, it remains unsettled because of the variety of image distortions and the lack of large-scale human-rated datasets. In this paper, we propose a novel algorithm based on the Swin Transformer [31] with fused features from multiple stages, which aggregates information from both local and global features to better predict the quality. To address the issue of small-scale datasets, relative rankings of images are taken into account together with a regression loss to simultaneously optimize the model. Furthermore, effective data augmentation strategies are also used…
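The idea of optimizing a regression loss together with relative rankings can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not give the exact loss terms, so the mean absolute error, the pairwise hinge formulation, and the `rank_weight` and `margin` parameters are hypothetical choices rather than the paper's definition.

```python
from itertools import combinations


def regression_loss(pred, target):
    """Mean absolute error between predicted and human-rated scores."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def pairwise_ranking_loss(pred, target, margin=0.0):
    """Hinge penalty on pairs whose predicted order contradicts the labels."""
    total, count = 0.0, 0
    for i, j in combinations(range(len(pred)), 2):
        if target[i] == target[j]:
            continue  # tied labels carry no ranking information
        sign = 1.0 if target[i] > target[j] else -1.0
        # Zero when the prediction gap has the right sign and exceeds margin.
        total += max(0.0, margin - sign * (pred[i] - pred[j]))
        count += 1
    return total / count if count else 0.0


def combined_loss(pred, target, rank_weight=1.0, margin=0.0):
    """Regression term plus weighted ranking term (assumed combination)."""
    return (regression_loss(pred, target)
            + rank_weight * pairwise_ranking_loss(pred, target, margin))


# Toy batch: three images with ground-truth quality scores.
target = [0.2, 0.5, 0.9]
loss_correct = combined_loss([0.2, 0.5, 0.9], target)   # perfect predictions
loss_reversed = combined_loss([0.9, 0.5, 0.2], target)  # order fully inverted
```

A batch of n images yields up to n(n-1)/2 ordered pairs, so the ranking term extracts more supervision from a small labeled dataset than the per-image regression term alone.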
Taxonomy
Topics: Advanced Image Fusion Techniques · Image and Video Quality Assessment · Infrared Target Detection Methodologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Byte Pair Encoding · Dense Connections · Dropout · Absolute Position Encodings · Stochastic Depth · Position-Wise Feed-Forward Layer
