Enhancing Image Quality Assessment Ability of LMMs via Retrieval-Augmented Generation
Kang Fu, Huiyu Duan, Zicheng Zhang, Yucheng Zhu, Jun Zhao, Xiongkuo Min, Jia Wang, and Guangtao Zhai

TL;DR
This paper introduces IQARAG, a training-free retrieval-augmented framework that improves Large Multimodal Models' image quality assessment without fine-tuning, by retrieving reference images with known Mean Opinion Scores and placing them in the prompt alongside the input image.
Contribution
IQARAG is a novel, training-free method that enhances LMMs' image quality assessment performance by integrating retrieval-augmented generation techniques.
Findings
IQARAG improves IQA performance across multiple datasets.
It offers a resource-efficient alternative to fine-tuning.
Extensive experiments validate its effectiveness.
Abstract
Large Multimodal Models (LMMs) have recently shown remarkable promise in low-level visual perception tasks, particularly in Image Quality Assessment (IQA), demonstrating strong zero-shot capability. However, achieving state-of-the-art performance often requires computationally expensive fine-tuning, which aims to align the distribution of quality-related tokens in the output with image quality levels. Inspired by recent training-free work on LMMs, we introduce IQARAG, a novel, training-free framework that enhances LMMs' IQA ability. IQARAG leverages Retrieval-Augmented Generation (RAG) to retrieve semantically similar but quality-variant reference images, together with their corresponding Mean Opinion Scores (MOSs), for the input image. The retrieved images and the input image are integrated into a specific prompt. The retrieved images provide the LMM with a visual perception anchor for the IQA task. IQARAG…
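The retrieval idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature extractor, the selection rule, or the prompt template, so everything below is an assumption made for illustration: the `Reference` type, the `retrieve` and `build_prompt` helpers, the use of cosine similarity over precomputed embeddings, and the rule of picking references spread evenly across the MOS range are all hypothetical, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Reference:
    name: str               # hypothetical image identifier
    embedding: list[float]  # semantic feature vector, assumed precomputed
    mos: float              # Mean Opinion Score attached to the image


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_emb, pool, top_n=4, k=2):
    """Keep the top_n most similar references, then pick k spanning the MOS range.

    Assumed rule: "semantically similar" is modelled by cosine similarity,
    "quality-variant" by evenly spaced picks over the MOS-sorted candidates.
    """
    similar = sorted(pool, key=lambda r: cosine(query_emb, r.embedding),
                     reverse=True)[:top_n]
    by_mos = sorted(similar, key=lambda r: r.mos)
    if k >= len(by_mos):
        return by_mos
    if k == 1:
        return [by_mos[len(by_mos) // 2]]
    # Evenly spaced indices from the lowest to the highest MOS.
    idx = [round(i * (len(by_mos) - 1) / (k - 1)) for i in range(k)]
    return [by_mos[i] for i in idx]


def build_prompt(refs):
    """Assemble a text prompt that lists each reference with its MOS (assumed template)."""
    lines = [f"Reference image {i + 1} ({r.name}) has quality score {r.mos:.1f}."
             for i, r in enumerate(refs)]
    lines.append("Using the references as anchors, rate the quality of the input image.")
    return "\n".join(lines)
```

In a real pipeline the reference images themselves would be passed to the LMM along with this text; the sketch covers only the selection and prompt-assembly logic.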
Taxonomy
Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Advanced Image Processing Techniques
