TL;DR
This paper introduces a False Negative Elimination strategy for image-text matching that improves training by better selecting negative samples, reducing the impact of false negatives, and enhancing retrieval performance.
Contribution
It proposes a novel sampling method using Bayes' rule and a memory buffer to effectively identify and exclude false negatives during training.
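The summary above does not give the paper's exact formulation, so the following is only a minimal sketch of how Bayes' rule and a buffer of past similarity scores could be combined to down-weight likely false negatives. The Gaussian density model, the `prior_pos` value, and every function and parameter name here are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, evaluated element-wise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def false_negative_posterior(sims, pos_buffer, neg_buffer, prior_pos=0.01):
    """P(candidate is semantically positive | similarity), via Bayes' rule.

    pos_buffer / neg_buffer: similarity scores of paired and unpaired samples
    remembered from earlier iterations (a hypothetical memory buffer).
    Modelling both score distributions as Gaussians is an assumption.
    """
    sims = np.asarray(sims, dtype=float)
    pos_buffer = np.asarray(pos_buffer, dtype=float)
    neg_buffer = np.asarray(neg_buffer, dtype=float)
    like_pos = gaussian_pdf(sims, pos_buffer.mean(), pos_buffer.std() + 1e-8)
    like_neg = gaussian_pdf(sims, neg_buffer.mean(), neg_buffer.std() + 1e-8)
    numerator = like_pos * prior_pos
    return numerator / (numerator + like_neg * (1.0 - prior_pos) + 1e-12)

def sample_negative(sims, pos_buffer, neg_buffer, rng, prior_pos=0.01):
    """Sample one negative index, favouring candidates unlikely to be false negatives."""
    posterior = false_negative_posterior(sims, pos_buffer, neg_buffer, prior_pos)
    weights = 1.0 - posterior + 1e-12  # small constant avoids an all-zero weight vector
    probs = weights / weights.sum()
    return int(rng.choice(len(probs), p=probs))
```

In this sketch a candidate whose similarity looks like that of a true pair receives a posterior near 1 and is almost never drawn as a negative, while clearly dissimilar candidates keep most of the sampling mass.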
Findings
Outperforms existing methods on Flickr30K and MS-COCO datasets.
Reduces false negative impact, improving retrieval accuracy.
Demonstrates the effectiveness of the FNE strategy in image-text matching.
Abstract
Most existing image-text matching methods adopt triplet loss as the optimization objective, and choosing a proper negative sample for the triplet of <anchor, positive, negative> is important for effectively training the model, e.g., hard negatives make the model learn efficiently and effectively. However, we observe that existing methods mainly employ the most similar samples as hard negatives, which may not be true negatives. In other words, the samples with high similarity but not paired with the anchor may reserve positive semantic associations, and we call them false negatives. Repelling these false negatives in triplet loss would mislead the semantic representation learning and result in inferior retrieval performance. In this paper, we propose a novel False Negative Elimination (FNE) strategy to select negatives via sampling, which could alleviate the problem introduced by false…
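To make the baseline described in the abstract concrete, here is a minimal NumPy sketch of a triplet loss that picks the most similar non-paired sample as the hard negative. The function name, the `margin` value, and the convention that `sim[i, j]` scores image `i` against text `j` with true pairs on the diagonal are assumptions for illustration.

```python
import numpy as np

def hardest_negative_triplet_loss(sim, margin=0.2):
    """Bidirectional triplet loss using the hardest negative in the batch.

    sim: (n, n) similarity matrix; sim[i, i] is the score of the true pair.
    """
    sim = np.asarray(sim, dtype=float)
    pos = np.diag(sim)                    # scores of the <anchor, positive> pairs
    masked = sim.copy()
    np.fill_diagonal(masked, -np.inf)     # exclude the positives from the search
    hardest_txt = masked.max(axis=1)      # per image: most similar non-paired text
    hardest_img = masked.max(axis=0)      # per text: most similar non-paired image
    loss_i2t = np.maximum(0.0, margin - pos + hardest_txt)
    loss_t2i = np.maximum(0.0, margin - pos + hardest_img)
    return float((loss_i2t + loss_t2i).sum())
```

Because the selection relies only on similarity, a non-paired sample that is in fact semantically relevant to the anchor would be chosen as the negative and pushed away, which is the false-negative problem the abstract describes.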
Taxonomy
Methods Focus · Triplet Loss
