ATTIQA: Generalizable Image Quality Feature Extractor using Attribute-aware Pretraining

Daekyu Kwon, Dongyoung Kim, Sehwan Ki, Younghyun Jo, Hyong-Euk Lee, and Seon Joo Kim

arXiv:2406.01020·cs.CV·October 8, 2024

TL;DR

This paper introduces ATTIQA, a pretraining framework that leverages vision-language models and attribute-aware pseudo-labels to create a generalizable and scalable image quality assessment model with state-of-the-art performance.

Contribution

It proposes a novel attribute-aware pretraining method that extracts quality-related knowledge from vision-language models to improve generalization in no-reference image quality assessment.

Findings

1. Achieves state-of-the-art results on multiple IQA datasets.

2. Demonstrates strong generalization capabilities across different datasets.

3. Enables applications like evaluating image generation and enhancement models.

Abstract

In no-reference image quality assessment (NR-IQA), limited dataset sizes hamper the development of robust and generalizable models. Conventional methods address this issue by utilizing large datasets to extract rich representations for IQA. Other approaches build IQA on vision-language models (VLMs), but the domain gap between generic VLMs and IQA constrains their scalability. In this work, we propose a novel pretraining framework that constructs a generalizable representation for IQA by selectively extracting quality-related knowledge from a VLM and leveraging the scalability of large datasets. Specifically, we select optimal text prompts for five representative image quality attributes and use the VLM to generate pseudo-labels. Numerous attribute-aware pseudo-labels can be generated with large image datasets, allowing our IQA model to learn rich representations…
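The pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. It assumes a CLIP-style VLM in which an image and each text prompt are encoded into a shared embedding space, and it scores each attribute with a softmax over cosine similarities to a positive and a negative prompt. That scoring rule, the attribute names, the prompt wording, and the temperature value are all assumptions for illustration; the abstract does not specify them. The encoders are not included, so the function takes precomputed embeddings.

```python
import numpy as np

# Hypothetical attribute names and prompt pairs (positive, negative).
# The abstract mentions five attributes but does not list them or their prompts.
ATTRIBUTE_PROMPTS = {
    "sharpness": ("a sharp photo", "a blurry photo"),
    "contrast": ("a high contrast photo", "a low contrast photo"),
    "brightness": ("a bright photo", "a dark photo"),
    "colorfulness": ("a colorful photo", "a dull photo"),
    "noisiness": ("a clean photo", "a noisy photo"),
}


def l2_normalize(x):
    """Scale a vector to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x)


def pseudo_label(image_emb, pos_emb, neg_emb, temperature=0.01):
    """Return the softmax probability of the positive prompt, a value in [0, 1].

    Assumed scoring rule: softmax over the cosine similarities between the
    image embedding and the (positive, negative) prompt embeddings.
    """
    img = l2_normalize(image_emb)
    sims = np.array([img @ l2_normalize(pos_emb), img @ l2_normalize(neg_emb)])
    logits = sims / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs[0])


def attribute_pseudo_labels(image_emb, prompt_embs):
    """Compute one pseudo-label per attribute.

    prompt_embs maps attribute name -> (positive_embedding, negative_embedding).
    """
    return {
        name: pseudo_label(image_emb, pos, neg)
        for name, (pos, neg) in prompt_embs.items()
    }
```

In a pretraining pipeline of the kind the abstract outlines, the resulting per-attribute values would serve as regression targets for the IQA model over a large unlabeled image set; the text and image embeddings would come from whichever VLM is used.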

Taxonomy

Topics: Advanced Image Fusion Techniques · Image Retrieval and Classification Techniques · Face and Expression Recognition