Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild

Xiongkuo Min; Yixuan Gao; Yuqin Cao; Guangtao Zhai; Wenjun Zhang; Huifang Sun; Chang Wen Chen

arXiv:2409.05540·cs.CV·September 10, 2024·25 cites

Open Access

TL;DR

This paper introduces RichIQA, a novel image quality assessment method that leverages rich subjective rating information, such as the mean (MOS), standard deviation (SOS), and distribution (DOS) of opinion scores, using a three-stage network and multi-label training to improve prediction accuracy and generalization.

Contribution

RichIQA is the first to exploit detailed subjective quality ratings beyond MOS for in-the-wild IQA, combining a three-stage CvT-based network with multi-label training.
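
To make the three label types concrete, the sketch below computes MOS, SOS, and DOS from one image's raw opinion scores. It is a minimal illustration, not the paper's preprocessing: the function name `rating_labels`, the 5-point scale, the use of the population standard deviation, and the example ratings are all assumptions made here.

```python
from collections import Counter
from statistics import mean, pstdev


def rating_labels(ratings, levels=(1, 2, 3, 4, 5)):
    """Return (MOS, SOS, DOS) for one image's raw opinion scores.

    Hypothetical helper: the rating scale and the choice of population
    standard deviation are assumptions, not taken from the paper.
    """
    mos = mean(ratings)        # mean opinion score
    sos = pstdev(ratings)      # standard deviation of opinion scores
    counts = Counter(ratings)
    # Distribution of opinion scores: fraction of ratings at each level.
    dos = [counts[level] / len(ratings) for level in levels]
    return mos, sos, dos


# Made-up ratings from ten annotators on a 5-point scale.
mos, sos, dos = rating_labels([3, 4, 4, 5, 2, 4, 3, 4, 5, 4])
print(mos, round(sos, 4), dos)
```

Two images can share the same MOS while differing in SOS and DOS, which is the extra information a MOS-only label discards.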

Findings

01. Outperforms state-of-the-art methods on multiple IQA datasets.

02. Effectively predicts rich quality information, including MOS, SOS, and DOS.

03. Enhances generalization through a multi-label training strategy.
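
The three quantities in the findings are related: once a distribution of opinion scores is available, a mean and a standard deviation follow from it. The sketch below shows that relationship for a discrete distribution. It assumes a 5-point scale and a hypothetical function name `moments_from_dos`; it is not a description of how RichIQA produces its outputs.

```python
import math


def moments_from_dos(dos, levels=(1, 2, 3, 4, 5)):
    """Derive (MOS, SOS) from a discrete distribution of opinion scores.

    Hypothetical helper; the 5-point scale is an assumption.
    """
    total = sum(dos)
    probs = [p / total for p in dos]  # renormalise in case of rounding
    mos = sum(level * p for level, p in zip(levels, probs))
    variance = sum(p * (level - mos) ** 2 for level, p in zip(levels, probs))
    return mos, math.sqrt(variance)


# Made-up distribution over the five rating levels.
mos, sos = moments_from_dos([0.0, 0.1, 0.2, 0.5, 0.2])
print(round(mos, 4), round(sos, 4))
```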

Abstract

Traditional in-the-wild image quality assessment (IQA) models are generally trained with mean opinion score (MOS) quality labels, and so miss the rich subjective quality information contained in the quality ratings, for example, the standard deviation of opinion scores (SOS) or even the distribution of opinion scores (DOS). In this paper, we propose a novel IQA method named RichIQA to explore the rich subjective rating information beyond MOS to predict image quality in the wild. RichIQA is characterized by two key novel designs: (1) a three-stage image quality prediction network which exploits the powerful feature representation capability of the Convolutional vision Transformer (CvT) and mimics the short-term and long-term memory mechanisms of the human brain; (2) a multi-label training strategy in which rich subjective quality information like MOS, SOS and DOS are concurrently used…
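
As a rough illustration of what using several labels concurrently can look like, the sketch below sums one error term per label. The abstract shown here is truncated and does not specify the loss, so everything in the block is an assumption: the function names, the equal weights, the absolute-error terms for MOS and SOS, and the cumulative-distribution distance used for DOS (an earth mover's style distance swapped in for illustration).

```python
import math


def cdf_distance(p_true, p_pred):
    """Root-mean-square gap between two cumulative distributions.

    Assumed stand-in for a DOS loss; not taken from the paper.
    """
    cdf_true = cdf_pred = 0.0
    total = 0.0
    for t, q in zip(p_true, p_pred):
        cdf_true += t
        cdf_pred += q
        total += (cdf_true - cdf_pred) ** 2
    return math.sqrt(total / len(p_true))


def multi_label_loss(pred, target, w_mos=1.0, w_sos=1.0, w_dos=1.0):
    """Weighted sum of MOS, SOS, and DOS errors (hypothetical weights)."""
    return (
        w_mos * abs(pred["mos"] - target["mos"])
        + w_sos * abs(pred["sos"] - target["sos"])
        + w_dos * cdf_distance(target["dos"], pred["dos"])
    )


# Made-up target labels and a prediction that is off only in MOS.
target = {"mos": 3.8, "sos": 0.87, "dos": [0.0, 0.1, 0.2, 0.5, 0.2]}
pred = {"mos": 3.6, "sos": 0.87, "dos": [0.0, 0.1, 0.2, 0.5, 0.2]}
loss = multi_label_loss(pred, target)
print(round(loss, 4))
```

In a real training setup the same idea would be expressed with differentiable tensor operations; plain Python is used here only to keep the arithmetic visible.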

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image Fusion Techniques

Methods: Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Vision Transformer · Softmax · Label Smoothing · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer