
TL;DR
This paper introduces the concept of image specificity, proposing methods to measure and predict it, and demonstrates its importance for improving text-based image retrieval by distinguishing between specific and ambiguous images.
Contribution
It defines image specificity, develops automated and human-based measures, and trains models to predict specificity from image features, enhancing retrieval performance.
Findings
Automated measures correlate with human judgments of specificity.
Predicting image specificity improves retrieval accuracy.
Specificity analysis reveals content features influencing image ambiguity.
Abstract
For some images, descriptions written by multiple people are consistent with each other. But for other images, descriptions across people vary considerably. In other words, some images are specific (they elicit consistent descriptions from different people) while other images are ambiguous. Applications involving images and text can benefit from an understanding of which images are specific and which ones are ambiguous. For instance, consider text-based image retrieval. If a query description is moderately similar to the caption (or reference description) of an ambiguous image, that query may be considered a decent match to the image. But if the image is very specific, a moderate similarity between the query and the reference description may not be sufficient to retrieve the image. In this paper, we introduce the notion of image specificity. We present two mechanisms to measure…
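The intuition in the abstract can be illustrated with a minimal sketch. It assumes specificity is scored as the mean pairwise similarity among an image's descriptions, and it uses Jaccard token overlap as a stand-in similarity; the paper's actual measures are not given in this excerpt. The function names, the `is_match` rule, and its `base` and `scale` parameters are hypothetical and exist only to show how a specificity score could raise the similarity a query needs before it counts as a match.

```python
from __future__ import annotations

from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two sentences (stand-in measure, an assumption)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def specificity(descriptions: list[str]) -> float:
    """Mean pairwise similarity across the descriptions of one image."""
    pairs = list(combinations(descriptions, 2))
    if not pairs:
        raise ValueError("need at least two descriptions")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def is_match(query: str, reference: str, spec: float,
             base: float = 0.2, scale: float = 0.6) -> bool:
    """Hypothetical rule: the required similarity grows with the image's specificity."""
    return jaccard(query, reference) >= base + scale * spec


specific_image = ["a dog on a beach", "a dog on a beach", "a dog on the beach"]
ambiguous_image = ["people in a room", "a busy office", "colorful abstract painting"]

spec_high = specificity(specific_image)
spec_low = specificity(ambiguous_image)

# The same moderately similar query is accepted for the ambiguous image
# but rejected for the specific one.
query, reference = "a dog on sand", "a dog on a beach"
accepted_if_ambiguous = is_match(query, reference, spec_low)
accepted_if_specific = is_match(query, reference, spec_high)
```

Here the query and reference share three of five distinct tokens, so their similarity is moderate. Under the assumed rule that is enough when the image's descriptions disagree with one another, and not enough when they agree closely.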
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
