Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Weixia Zhang, Guangtao Zhai, Ying Wei, Xiaokang Yang, Kede Ma

TL;DR
This paper introduces a multitask learning framework for blind image quality assessment (BIQA), which predicts perceived image quality without a reference image. Auxiliary tasks, namely scene classification and distortion type identification, improve its accuracy and robustness.
Contribution
It proposes a general multitask learning scheme that determines parameter sharing and loss weighting automatically, improving BIQA performance by exploiting auxiliary knowledge from other tasks.
Findings
Outperforms state-of-the-art methods on multiple IQA datasets
More robust in the group maximum differentiation competition
Effectively realigns quality annotations across datasets
Abstract
We aim at advancing blind image quality assessment (BIQA), which predicts the human perception of image quality without any reference information. We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks, in a way that the model parameter sharing and the loss weighting are determined automatically. Specifically, we first describe all candidate label combinations (from multiple tasks) using a textual template, and compute the joint probability from the cosine similarities of the visual-textual embeddings. Predictions of each task can be inferred from the joint distribution, and optimized by carefully designed loss functions. Through comprehensive experiments on learning three tasks (BIQA, scene classification, and distortion type identification), we verify that the proposed BIQA method 1) benefits from the scene classification…
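The pipeline described in the abstract (textual template, cosine similarities, joint probability, per-task predictions) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the label sets, the template wording, the temperature value, and the 1-to-5 expected-value quality score are all assumed for illustration, and random vectors stand in for the outputs of a real vision-language encoder.

```python
import numpy as np

# Hypothetical label sets for the three tasks (assumed for illustration).
scenes = ["animal", "cityscape", "human"]
distortions = ["blur", "noise", "jpeg compression"]
quality_levels = ["bad", "poor", "fair", "good", "perfect"]

# One text prompt per candidate label combination (template wording is assumed).
prompts = [
    f"a photo of a {s} with {d} artifacts, which is of {q} quality"
    for s in scenes for d in distortions for q in quality_levels
]

def joint_from_embeddings(image_emb, text_embs, temperature=0.1):
    """Softmax over cosine similarities between one image and all candidate texts."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=-1, keepdims=True)
    logits = (text_embs @ image_emb) / temperature  # cosine similarity, scaled
    logits = logits - logits.max()                  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()                      # normalize over ALL combinations

# Placeholder embeddings: a real system would encode the image and the prompts
# with a vision-language model; random vectors are used here only to run the math.
rng = np.random.default_rng(0)
dim = 16
image_emb = rng.normal(size=dim)
text_embs = rng.normal(
    size=(len(scenes), len(distortions), len(quality_levels), dim)
)

# Joint distribution over (scene, distortion, quality level).
joint = joint_from_embeddings(image_emb, text_embs)

# Each task's prediction is a marginal of the joint distribution.
p_scene = joint.sum(axis=(1, 2))
p_distortion = joint.sum(axis=(0, 2))
p_quality = joint.sum(axis=(0, 1))

# Assumed scoring rule: expected quality level on a 1..5 scale.
quality_score = float((np.arange(1, len(quality_levels) + 1) * p_quality).sum())

print("scene:", scenes[int(p_scene.argmax())])
print("distortion:", distortions[int(p_distortion.argmax())])
print("quality score:", round(quality_score, 3))
```

Because every task reads its prediction from the same joint distribution, a single forward pass serves all three tasks; how the paper's loss functions and automatic loss weighting act on these marginals is not covered by this sketch.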
Taxonomy
Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Advanced Image Fusion Techniques
