iDETEX: Empowering MLLMs for Intelligent DETailed EXplainable IQA
Zhaoran Zhao, Xinli Yue, Jianhui Sun, Yuhao Xie, Tao Shao, Liangchao Yao, Fan Xia, Yuetang Deng

TL;DR
iDETEX is a novel multimodal large language model that provides detailed, explainable image quality assessments by integrating multiple tasks and advanced training strategies, achieving state-of-the-art results on large-scale benchmarks.
Contribution
The paper introduces iDETEX, a unified MLLM capable of simultaneous quality grounding, perception, and description, with innovative training modules for improved generalization and interpretability.
Findings
Achieves state-of-the-art performance across all subtasks on the ViDA-UGC benchmark.
Ranks first in the ICCV MIPI 2025 Detailed Image Quality Assessment Challenge.
Demonstrates robustness and effectiveness in detailed, explainable IQA.
Abstract
Image Quality Assessment (IQA) has progressed from scalar quality prediction to more interpretable, human-aligned evaluation paradigms. In this work, we address the emerging challenge of detailed and explainable IQA by proposing iDETEX, a unified multimodal large language model (MLLM) capable of simultaneously performing three key tasks: quality grounding, perception, and description. To facilitate efficient and generalizable training across these heterogeneous subtasks, we design a suite of task-specific offline augmentation modules and a data mixing strategy. These are further complemented by online enhancement strategies to fully exploit multi-sourced supervision. We validate our approach on the large-scale ViDA-UGC benchmark, where iDETEX achieves state-of-the-art performance across all subtasks. Our model ranks first in the ICCV MIPI 2025 Detailed Image Quality Assessment Challenge,…
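The abstract mentions a data mixing strategy across the three subtasks but does not specify it. As a rough illustration only, one common way to mix heterogeneous task data is weighted sampling over per-task pools. The sketch below assumes that approach; the function name `mix_tasks`, the weights, and the placeholder samples are all hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Tuple

def mix_tasks(
    pools: Dict[str, List[str]],
    weights: Dict[str, float],
    n: int,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Draw n (task, sample) pairs, picking the task for each draw by weight.

    Hypothetical helper: the paper's actual mixing strategy is not described
    in the abstract, so plain weighted sampling is assumed here.
    """
    rng = random.Random(seed)  # local RNG keeps the draw reproducible
    names = list(pools)
    probs = [weights[name] for name in names]
    mixed = []
    for _ in range(n):
        task = rng.choices(names, weights=probs, k=1)[0]
        mixed.append((task, rng.choice(pools[task])))
    return mixed

# Placeholder pools keyed by the three subtasks named in the abstract.
pools = {
    "grounding": ["g0", "g1"],
    "perception": ["p0", "p1", "p2"],
    "description": ["d0"],
}
# Assumed weights: oversample the task with the smallest pool.
weights = {"grounding": 1.0, "perception": 1.0, "description": 2.0}

batch = mix_tasks(pools, weights, n=8)
```

Sampling by task weight rather than by pool size keeps a small task from being drowned out by larger ones, which is one plausible motivation for mixing heterogeneous subtasks this way.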
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment
