Triplet Synthesis For Enhancing Composed Image Retrieval via Counterfactual Image Generation

Kenta Uesugi; Naoki Saito; Keisuke Maeda; Takahiro Ogawa; Miki Haseyama

arXiv:2501.13968·cs.CV·January 27, 2025

Open Access

TL;DR

This paper introduces a novel triplet synthesis method using counterfactual image generation to automatically create diverse training data for composed image retrieval, reducing manual effort and improving model performance.

Contribution

It proposes a new automatic triplet synthesis approach leveraging counterfactual images, enhancing dataset diversity and model training efficiency for CIR.

Findings

01. Generated diverse triplets improve CIR accuracy.

02. Automatic synthesis reduces manual annotation effort.

03. Enhanced datasets lead to better retrieval performance.

Abstract

Composed Image Retrieval (CIR) provides an effective way to manage and access large-scale visual data. CIR models are trained on triplets, each consisting of a reference image, modification text describing the desired changes, and a target image that reflects those changes. Training CIR models effectively requires extensive manual annotation to construct high-quality datasets, which is time-consuming and labor-intensive. To address this problem, this paper proposes a novel triplet synthesis method that leverages counterfactual image generation. By controlling visual feature modifications via counterfactual image generation, our approach automatically generates diverse training triplets without any manual intervention. This approach facilitates the creation of larger and more expressive datasets, which improves CIR model performance.
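
To make the triplet structure concrete, the following is a minimal Python sketch of how a (reference image, modification text, target image) record might be represented and assembled from single-attribute edits. It is not the paper's implementation: the abstract does not specify the generator, the attribute set, or how modification text is produced, so the `Triplet` class, the `synthesize_triplets` function, the text template, and the `stub_generator` stand-in are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Triplet:
    reference_image: str    # path or ID of the reference image
    modification_text: str  # text describing the desired change
    target_image: str       # path or ID of the image reflecting the change


def synthesize_triplets(
    reference_image: str,
    attributes: Dict[str, str],
    alternatives: Dict[str, List[str]],
    generate_counterfactual: Callable[[str, str, str], str],
) -> List[Triplet]:
    """Build one triplet per single-attribute change (assumed scheme)."""
    triplets = []
    for name, current in attributes.items():
        for new_value in alternatives.get(name, []):
            if new_value == current:
                continue  # skip edits that would not change the image
            # Assumption: the generator edits exactly one visual attribute.
            target = generate_counterfactual(reference_image, name, new_value)
            # Assumption: modification text comes from a fixed template.
            text = f"change the {name} from {current} to {new_value}"
            triplets.append(Triplet(reference_image, text, target))
    return triplets


def stub_generator(image: str, attribute: str, value: str) -> str:
    # Stand-in for a real counterfactual image generator; returns a fake ID.
    return f"{image}|{attribute}={value}"


triplets = synthesize_triplets(
    "bird_001.jpg",
    {"color": "red", "background": "forest"},
    {"color": ["red", "blue"], "background": ["beach"]},
    stub_generator,
)
for t in triplets:
    print(t)
```

In this sketch each triplet differs from its reference in exactly one attribute, so the modification text can be derived directly from the edit that produced the target. A real pipeline would replace `stub_generator` with an actual image generation model.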

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques