Beyond Cosine Similarity: Magnitude-Aware CLIP for No-Reference Image Quality Assessment

Zhicheng Liao; Dongxu Wu; Zhenshan Shi; Sijie Mai; Hanwei Zhu; Lingyu Zhu; Yuncheng Jiang; Baoliang Chen

arXiv:2511.09948 · cs.CV · February 3, 2026

TL;DR

This paper enhances CLIP-based no-reference image quality assessment by incorporating a magnitude-aware cue and adaptive fusion, significantly improving performance without additional training.

Contribution

Introduces a magnitude-aware auxiliary cue and confidence-guided fusion scheme to improve CLIP-based IQA performance.

Findings

01  Outperforms standard CLIP-based IQA methods on multiple benchmarks.

02  Applies a Box-Cox transformation to normalize the absolute CLIP image features.

03  Requires no task-specific training.

Abstract

Recent efforts have repurposed the Contrastive Language-Image Pre-training (CLIP) model for No-Reference Image Quality Assessment (NR-IQA) by measuring the cosine similarity between the image embedding and textual prompts such as "a good photo" or "a bad photo." However, this semantic similarity overlooks a critical yet underexplored cue: the magnitude of the CLIP image features, which we empirically find to exhibit a strong correlation with perceptual quality. In this work, we introduce a novel adaptive fusion framework that complements cosine similarity with a magnitude-aware quality cue. Specifically, we first extract the absolute CLIP image features and apply a Box-Cox transformation to statistically normalize the feature distribution and mitigate semantic sensitivity. The resulting scalar summary serves as a semantically-normalized auxiliary cue that complements cosine-based prompt…
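
The two cues described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract is cut off before the fusion rule is stated, so `fuse` below is a purely hypothetical confidence-weighted blend, and the Box-Cox parameter `lam`, the softmax `temperature`, the choice of the mean as the scalar summary, and all function names are assumptions made for illustration. The embeddings are assumed to be precomputed CLIP vectors.

```python
import numpy as np


def box_cox(x, lam, eps=1e-6):
    """Box-Cox transform for non-negative inputs; lam == 0 gives the log case."""
    x = np.asarray(x, dtype=np.float64) + eps  # eps keeps zeros in the valid domain
    if abs(lam) < 1e-12:
        return np.log(x)
    return (x ** lam - 1.0) / lam


def cosine_cue(image_feat, good_text_feat, bad_text_feat, temperature=100.0):
    """Softmax over cosine similarities to the two prompts; returns P("good")."""
    v = image_feat / np.linalg.norm(image_feat)
    t = np.stack([good_text_feat, bad_text_feat])
    t = t / np.linalg.norm(t, axis=1, keepdims=True)
    logits = temperature * (t @ v)
    logits = logits - logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return float(p[0] / p.sum())


def magnitude_cue(image_feat, lam=0.5):
    """Scalar summary: mean of the Box-Cox-transformed absolute feature values.

    Using the mean and lam=0.5 is an assumption, not taken from the paper.
    """
    return float(np.mean(box_cox(np.abs(image_feat), lam)))


def fuse(p_cos, m, m_center=0.0, m_scale=1.0):
    """Hypothetical confidence-guided fusion (NOT the paper's rule).

    The magnitude cue is squashed to (0, 1) with a logistic function, and the
    cosine cue is trusted in proportion to how far it is from 0.5.
    """
    q_mag = 1.0 / (1.0 + np.exp(-(m - m_center) / m_scale))
    conf = abs(2.0 * p_cos - 1.0)  # 0 when the prompts are tied, 1 when decisive
    return conf * p_cos + (1.0 - conf) * q_mag
```

With real CLIP embeddings, `cosine_cue` corresponds to the prompt-based score that the abstract attributes to prior work, while `magnitude_cue` uses the unnormalized image feature, which the cosine score discards by construction. The centering and scaling constants in `fuse` would need to be chosen from data.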

Taxonomy

Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Advanced Image Fusion Techniques