Foundation Models Boost Low-Level Perceptual Similarity Metrics
Abhijay Ghildyal, Nabajeet Barman, Saman Zadtootaghaj

TL;DR
This paper shows that using intermediate features from foundation models improves low-level perceptual similarity metrics for image quality assessment without additional training.
Contribution
The work demonstrates that intermediate features from foundation models are more effective for perceptual similarity than the final-layer features or embeddings, and that metrics built on them outperform traditional and state-of-the-art metrics without any training.
Findings
Intermediate features outperform final layer features.
Untrained metrics surpass some learned metrics.
Foundation model features enhance image quality assessment.
Abstract
For full-reference image quality assessment (FR-IQA) using deep-learning approaches, the perceptual similarity score between a distorted image and a reference image is typically computed as a distance measure between features extracted from a pretrained CNN or, more recently, a Transformer network. Often, these intermediate features require further fine-tuning or processing with additional neural network layers to align the final similarity scores with human judgments. So far, most IQA models based on foundation models have relied primarily on the final layer or the embedding for quality score estimation. In contrast, this work explores the potential of the intermediate features of these foundation models, which have so far been largely unexplored in the design of low-level perceptual similarity metrics. We demonstrate that the intermediate features are comparatively more…
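The general idea described in the abstract, scoring a distorted image against a reference by a distance between intermediate features with no training step, can be sketched as follows. This is a minimal illustration and not the authors' exact formulation: the choice of cosine distance, the averaging over tokens and layers, and the helper names (`feature_distance`, `toy_intermediate_features`) are all assumptions made here. The toy extractor is a hypothetical stand-in for a real foundation model, whose intermediate activations would be collected instead.

```python
import numpy as np


def normalize_tokens(feats, eps=1e-10):
    """Scale each token's feature vector to unit L2 norm. feats: (N, D)."""
    norms = np.linalg.norm(feats, axis=-1, keepdims=True)
    return feats / (norms + eps)


def feature_distance(ref_layers, dist_layers):
    """Training-free distance between two lists of per-layer features.

    For each layer, take the mean over tokens of (1 - cosine similarity),
    then average the per-layer values. No weights are learned.
    """
    per_layer = []
    for ref, dist in zip(ref_layers, dist_layers):
        ref_n = normalize_tokens(ref)
        dist_n = normalize_tokens(dist)
        cosine = np.sum(ref_n * dist_n, axis=-1)  # one value per token
        per_layer.append(np.mean(1.0 - cosine))
    return float(np.mean(per_layer))


def toy_intermediate_features(image, weights):
    """Hypothetical stand-in for a foundation model (assumption).

    Treats every pixel as a token and applies a stack of fixed random
    projections, returning the activations after each "layer".
    """
    tokens = image.reshape(-1, image.shape[-1])  # (H*W, C)
    layers = []
    for w in weights:
        tokens = np.tanh(tokens @ w)
        layers.append(tokens)
    return layers


rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 16)), rng.standard_normal((16, 16))]

reference = rng.random((8, 8, 3))
noise = 0.1 * rng.standard_normal(reference.shape)
distorted = np.clip(reference + noise, 0.0, 1.0)

ref_feats = toy_intermediate_features(reference, weights)
d_same = feature_distance(ref_feats, ref_feats)
d_diff = feature_distance(ref_feats, toy_intermediate_features(distorted, weights))
```

In this sketch an image compared with itself scores essentially zero, and a noisy copy scores higher. With a real model, the per-layer features would typically be patch tokens or feature maps gathered from chosen intermediate blocks, and which blocks to use is exactly the kind of design choice the paper investigates.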
Taxonomy
Topics: Neural Networks and Applications · Image Retrieval and Classification Techniques
Methods: Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Dropout · Layer Normalization · Attention Is All You Need · Position-Wise Feed-Forward Layer · Linear Layer · Adam
