QuRe: Query-Relevant Retrieval through Hard Negative Sampling in Composed Image Retrieval
Jaehyun Kwak, Ramahdani Muhammad Izaaz Inhar, Se-Young Yun, Sung-Ju Lee

TL;DR
This paper introduces QuRe, a novel method for composed image retrieval that reduces false negatives through hard negative sampling and a reward model, improving alignment with human preferences and achieving state-of-the-art results.
Contribution
The paper proposes QuRe, a new approach combining hard negative sampling and reward modeling to enhance composed image retrieval accuracy and human preference alignment.
Findings
QuRe outperforms existing methods on FashionIQ and CIRR datasets.
QuRe shows the strongest alignment with human preferences on HP-FashionIQ.
The hard negative sampling strategy filters false negatives by selecting images positioned between two steep drops in relevance scores.
Abstract
Composed Image Retrieval (CIR) retrieves relevant images based on a reference image and accompanying text describing desired modifications. However, existing CIR methods focus only on retrieving the target image and disregard the relevance of other images. This limitation arises because most methods employing contrastive learning, which treats the target image as positive and all other images in the batch as negatives, can inadvertently include false negatives. This may result in retrieving irrelevant images, reducing user satisfaction even when the target image is retrieved. To address this issue, we propose Query-Relevant Retrieval through Hard Negative Sampling (QuRe), which optimizes a reward model objective to reduce false negatives. Additionally, we introduce a hard negative sampling strategy that selects images positioned between two steep drops in relevance scores following the…
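To make the "between two steep drops" idea concrete, here is a minimal sketch in plain Python. It is an illustrative reading of the abstract, not the authors' implementation. The function name `select_hard_negatives`, the choice of the two largest adjacent score gaps as the "steep drops", and the exclusion of the target from the ranking are all assumptions. The abstract is cut off before it states what the drops follow, so the sketch simply considers the full ranking of non-target candidates.

```python
def select_hard_negatives(scores, target_index):
    """Return indices of candidates lying between the two steepest drops
    in the descending relevance ranking (hypothetical helper, assumed logic)."""
    # Rank candidate indices from most to least relevant, excluding the target.
    ranked = sorted(
        (i for i in range(len(scores)) if i != target_index),
        key=lambda i: scores[i],
        reverse=True,
    )
    if len(ranked) < 3:
        # Two drops need at least three ranked candidates.
        return []

    # drops[k] is the score decrease between rank k and rank k + 1.
    drops = [scores[ranked[k]] - scores[ranked[k + 1]] for k in range(len(ranked) - 1)]

    # Assumption: the "steep drops" are the two largest adjacent gaps.
    top_two = sorted(range(len(drops)), key=lambda k: drops[k], reverse=True)[:2]
    first, second = sorted(top_two)

    # Keep candidates after the first drop and up to the second drop.
    return ranked[first + 1 : second + 1]


# Toy relevance scores for eight candidates; index 0 plays the target.
scores = [0.95, 0.93, 0.92, 0.70, 0.68, 0.66, 0.30, 0.28]
print(select_hard_negatives(scores, target_index=0))  # [3, 4, 5]
```

In this toy ranking, candidates 1 and 2 score almost as high as the target and are treated as likely false negatives. Candidates 6 and 7 fall below the second drop and are easy negatives. Only the middle band is kept as hard negatives.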
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
