PinPoint: Evaluation of Composed Image Retrieval with Explicit Negatives, Multi-Image Queries, and Paraphrase Testing
Rohan Mahadev, Joyce Yuan, Patrick Poirson, David Xue, Hao-Yu Wu, Dmitry Kislyuk

TL;DR
PinPoint introduces a comprehensive benchmark for Composed Image Retrieval that includes multiple correct answers, explicit negatives, paraphrase robustness, multi-image queries, and fairness metrics, revealing current limitations and proposing a reranking solution.
Contribution
The paper presents PinPoint, a new CIR benchmark dataset with rich annotations, evaluates existing methods on it, uncovers key challenges, and offers a reranking approach to improve performance.
Findings
The best methods achieve 28.5% mAP@10 but still retrieve irrelevant results (hard negatives) 9% of the time.
Performance varies by 25.1% across different paraphrases, indicating robustness issues.
Multi-image queries perform 40-70% worse than single-image queries.
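The first finding combines two per-query measurements: how well the correct answers are ranked, and how often annotated hard negatives appear among the top results. The following is a minimal Python sketch of both, not the paper's evaluation code. The function names, the normalization of AP@k by min(k, number of correct answers), and the definition of the negative rate as the share of top-k slots filled by hard negatives are all assumptions made for illustration; the paper may define its metrics differently.

```python
from typing import Iterable, Sequence, Set


def average_precision_at_k(ranked: Sequence[str], positives: Set[str], k: int = 10) -> float:
    """AP@k for one query that may have several correct answers.

    Assumption: normalize by min(k, number of positives); other
    conventions exist and the benchmark may use a different one.
    """
    if not positives:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in positives:
            hits += 1
            precision_sum += hits / rank  # precision at this hit's rank
    return precision_sum / min(k, len(positives))


def negative_rate_at_k(ranked: Sequence[str], negatives: Set[str], k: int = 10) -> float:
    """Share of the top-k results that are annotated hard negatives (assumed definition)."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(item in negatives for item in top) / len(top)


def mean_over_queries(values: Iterable[float]) -> float:
    """Average a per-query metric over all queries, e.g. AP@10 -> mAP@10."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical query: three correct answers, one hard negative.
ranked = ["a", "x", "b", "n1", "c"]
positives = {"a", "b", "c"}
negatives = {"n1"}

ap = average_precision_at_k(ranked, positives, k=5)
neg = negative_rate_at_k(ranked, negatives, k=5)
```

Reporting the two numbers together matters because a ranking can score well on AP@k while still surfacing a hard negative near the top, which a single-answer benchmark cannot detect.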
Abstract
Composed Image Retrieval (CIR) has made significant progress, yet current benchmarks are limited to single ground-truth answers and lack the annotations needed to evaluate false-positive avoidance, robustness, and multi-image reasoning. We present PinPoint, a comprehensive real-world benchmark with 7,635 queries and 329K relevance judgments across 23 query categories. PinPoint advances the field by providing: (1) multiple correct answers (averaging 9.1 per query), (2) explicit hard negatives, (3) six instruction paraphrases per query for robustness testing, (4) multi-image composition support (13.4% of queries), and (5) demographic metadata for fairness evaluation. Based on our analysis of 20+ methods across 4 major paradigms, we uncover three significant drawbacks: the best methods, while achieving mAP@10 of 28.5%, still retrieve irrelevant results (hard negatives) 9% of the…
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
