DQE-CIR: Distinctive Query Embeddings through Learnable Attribute Weights and Target Relative Negative Sampling in Composed Image Retrieval

Geon Park; Ji-Hoon Park; Seong-Whan Lee

arXiv:2603.04037·cs.CV·March 5, 2026

Open Access

TL;DR

DQE-CIR enhances composed image retrieval by learning distinctive query embeddings with attribute weights and target relative negative sampling, improving fine-grained attribute discrimination and reducing semantic confusion.

Contribution

The paper introduces a novel method combining learnable attribute weights and target relative negative sampling to improve query discriminativeness in composed image retrieval.

Findings

01 Improved retrieval accuracy for fine-grained attribute modifications.

02 Reduced semantic confusion among similar images.

03 Enhanced discriminativeness of query embeddings.

Abstract

Composed image retrieval (CIR) addresses the task of retrieving a target image by jointly interpreting a reference image and a modification text that specifies the intended change. Most existing methods are still built upon contrastive learning frameworks that treat the ground truth image as the only positive instance and all remaining images as negatives. This strategy inevitably introduces relevance suppression, where semantically related yet valid images are incorrectly pushed away, and semantic confusion, where different modification intents collapse into overlapping regions of the embedding space. As a result, the learned query representations often lack discriminativeness, particularly for fine-grained attribute modifications. To overcome these limitations, we propose distinctive query embeddings through learnable attribute weights and target relative negative sampling (DQE-CIR), a…
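The conventional contrastive setup that the abstract criticizes can be illustrated with a small sketch. This is not the DQE-CIR method; it is a generic in-batch InfoNCE-style loss in which each query's ground truth target is the only positive and every other target in the batch is a negative. The function names, the temperature value of 0.07, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def in_batch_contrastive_loss(queries, targets, temperature=0.07):
    """Mean InfoNCE-style loss; targets[i] is the only positive for queries[i].

    Hypothetical helper for illustration, not the paper's objective.
    """
    q = [normalize(v) for v in queries]
    t = [normalize(v) for v in targets]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity of query i to every target in the batch.
        logits = [dot(qi, tj) / temperature for tj in t]
        # Numerically stable log-sum-exp over all targets.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the ground truth target as the single positive.
        total += log_denom - logits[i]
    return total / len(q)

# Toy batch (assumed values): targets 0 and 2 are near-duplicates, so each
# acts as a hard negative for the other's query even though it may be a
# valid match.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
loss = in_batch_contrastive_loss(queries, targets)

# A batch with clearly distinct targets, for comparison.
separated_loss = in_batch_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                           [[1.0, 0.0], [0.0, 1.0]])
```

In this toy batch the loss stays well above the near-zero value of the separated batch, because the near-duplicate target is penalized as a negative. That is one way to picture the relevance suppression described above, under the stated assumptions.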

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications