Vision-Language Models vs Human: Perceptual Image Quality Assessment
Imran Mehmood, Imad Ali Shah, Ming Ronnier Luo, and Brian Deegan

TL;DR
This study benchmarks Vision Language Models against human psychophysical data for perceptual image quality assessment, revealing attribute-dependent variability and insights into model alignment with human perception.
Contribution
It systematically evaluates VLMs for IQA, highlighting their strengths and limitations in approximating human perceptual judgments across different image attributes.
Findings
VLMs correlate strongly with human judgments on colorfulness (ρ up to 0.93)
VLMs underperform on contrast assessment compared to colorfulness
Model consistency does not always equate to better human alignment
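A correlation such as the 0.93 reported above can be computed by ranking the human and model scores for the same images and correlating the ranks. The sketch below assumes the reported ρ is a Spearman rank correlation, which the summary does not confirm; the score lists and function names are hypothetical and are not data or code from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(human, model):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(human), average_ranks(model))


# Hypothetical per-image colorfulness scores (illustrative only).
human_colorfulness = [1.2, 2.5, 3.1, 4.0, 4.8]
model_colorfulness = [1.0, 3.0, 2.8, 4.5, 5.0]
print(round(spearman_rho(human_colorfulness, model_colorfulness), 3))
```

A rank-based measure is a natural choice here because psychophysical scale values and model ratings need not share units; only the ordering of images has to agree.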
Abstract
Psychophysical experiments remain the most reliable approach for perceptual image quality assessment (IQA), yet their cost and limited scalability encourage automated approaches. We investigate whether Vision Language Models (VLMs) can approximate human perceptual judgments across three image quality scales: contrast, colorfulness, and overall preference. Six VLMs (four proprietary and two open-weight models) are benchmarked against psychophysical data. This work presents a systematic benchmark of VLMs for perceptual IQA through comparison with human psychophysical data. The results reveal strong attribute-dependent variability: models with high human alignment for colorfulness (ρ up to 0.93) underperform on contrast, and vice versa. Attribute weighting analysis further shows that most VLMs assign higher weights to colorfulness compared to contrast when evaluating overall preference…
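One way to picture an attribute weighting analysis is to regress overall preference on contrast and colorfulness scores and compare the fitted weights. The sketch below is an assumption, not the paper's stated procedure: it standardizes each score vector and solves a two-predictor least-squares fit through the normal equations. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def attribute_weights(contrast, colorfulness, preference):
    """Least-squares weights for preference ~ w_c * contrast + w_f * colorfulness.

    All three inputs are z-scored first, so no intercept term is needed and
    the two weights are directly comparable.
    """
    x1, x2, y = zscore(contrast), zscore(colorfulness), zscore(preference)
    a11 = sum(a * a for a in x1)
    a12 = sum(a * b for a, b in zip(x1, x2))
    a22 = sum(b * b for b in x2)
    b1 = sum(a * c for a, c in zip(x1, y))
    b2 = sum(b * c for b, c in zip(x2, y))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("contrast and colorfulness scores are collinear")
    # Cramer's rule on the 2x2 normal equations.
    w_contrast = (b1 * a22 - b2 * a12) / det
    w_colorfulness = (a11 * b2 - a12 * b1) / det
    return w_contrast, w_colorfulness


# Hypothetical scores for six images from one model (illustrative only).
contrast = [1, 2, 3, 4, 5, 6]
colorfulness = [2, 1, 4, 3, 6, 5]
preference = [0.3 * c + 0.7 * f for c, f in zip(contrast, colorfulness)]
w_c, w_f = attribute_weights(contrast, colorfulness, preference)
print(w_f > w_c)
```

In this toy setup the colorfulness weight comes out larger than the contrast weight, which is the pattern the abstract describes for most VLMs; real data would include noise and would call for a goodness-of-fit check.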
Taxonomy
Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Aesthetic Perception and Analysis
