Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval

Zhihao Fan; Zhongyu Wei; Zejun Li; Siyuan Wang; Jianqing Fan

arXiv:2111.03349 · cs.CV · November 8, 2021 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces TAGS-DC, a method for generating challenging synthetic negative sentences for image-text retrieval, improving model training by enhancing negative sample difficulty and semantic mismatch detection.

Contribution

The paper proposes a negative sentence generation approach based on masking and refilling, together with auxiliary tasks that make better use of the semantic mismatch in negative sentences for image-text retrieval.

Findings

01. Improves retrieval performance on the MS-COCO and Flickr30K datasets.

02. Demonstrates robustness and faithfulness through extensive analysis.

03. Outperforms current state-of-the-art models.

Abstract

A matching model is essential to the image-text retrieval framework. Existing research usually trains the model with a triplet loss and explores various strategies for retrieving hard negative sentences from the dataset. We argue that the current retrieval-based approach to negative sample construction is limited by the scale of the dataset and thus fails to identify high-difficulty negative samples for every image. We propose TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to generate synthetic sentences automatically as negative samples. TAGS-DC uses masking and refilling to generate synthetic negative sentences of higher difficulty. To maintain the difficulty during training, we mutually improve retrieval and generation through parameter sharing. To further exploit the fine-grained semantics of the mismatch in a negative sentence, we propose two auxiliary tasks, namely…
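The retrieval-based baseline the abstract refers to can be illustrated with a hinge-based triplet loss that picks the hardest negative sentence available in a batch. This is a minimal sketch under stated assumptions, not the TAGS-DC implementation: the function name `hardest_negative_triplet_loss`, the precomputed similarity matrix, and the margin value of 0.2 are all illustrative choices that do not come from the paper.

```python
from typing import List


def hardest_negative_triplet_loss(sim: List[List[float]], margin: float = 0.2) -> float:
    """Image-to-text triplet loss using the hardest in-batch negative.

    sim[i][j] is an assumed, precomputed similarity between image i and
    sentence j; matched pairs sit on the diagonal. Requires at least two pairs.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        positive = sim[i][i]
        # Hardest negative for image i: the highest-scoring mismatched sentence.
        hardest = max(sim[i][j] for j in range(n) if j != i)
        # Hinge: penalize only when the negative comes within `margin` of the positive.
        total += max(0.0, margin + hardest - positive)
    return total / n


# Two image-sentence pairs; the second image scores a mismatched sentence higher.
example = [[0.9, 0.1],
           [0.8, 0.5]]
loss = hardest_negative_triplet_loss(example)
```

Because `hardest` can only be chosen from sentences already in the batch or dataset, its difficulty is bounded by what the data happens to contain, which is the limitation the abstract argues against. A generated negative would enter the same hinge as one more candidate for `hardest`.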

Code & Models

Repositories

libertfan/tags (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Triplet Loss