FALCON: False-Negative Aware Learning of Contrastive Negatives in Vision-Language Alignment
Myunsoo Kim, Seongwoong Shim, Byung-Jun Lee

TL;DR
FALCON introduces a dynamic negative mining strategy for vision-language pretraining that adaptively balances hard negatives and false negatives, significantly improving embedding quality and downstream task performance.
Contribution
The paper presents FALCON, a novel negative mining scheduler that adaptively selects negatives during training, addressing false negatives in vision-language alignment.
Findings
Improves performance across multiple VLP frameworks.
Enhances downstream task accuracy.
Robustly mitigates the effects of false negatives.
Abstract
False negatives pose a critical challenge in vision-language pretraining (VLP) due to the many-to-many correspondence between images and texts in large-scale datasets. These false negatives introduce conflicting supervision signals that degrade the learned embedding space and diminish the effectiveness of hard negative sampling. In this paper, we propose FALCON (False-negative Aware Learning of COntrastive Negatives), a learning-based mini-batch construction strategy that adaptively balances the trade-off between hard and false negatives during VLP. Rather than relying on fixed heuristics, FALCON employs a negative mining scheduler that dynamically selects negative samples of appropriate hardness for each anchor instance during mini-batch construction, guided by a proxy for cross-modal alignment improvement. Experimental results demonstrate that FALCON significantly improves performance…
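The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: FALCON learns its scheduler, whereas the hardness quantile `q` here is set by hand, and the function name `select_negatives_by_quantile` and its parameters are hypothetical. The sketch assumes a square image-text similarity matrix whose diagonal holds the positive pairs. A low `q` picks the most similar (hardest) negatives, which are also the most likely to be false negatives; a high `q` picks easier, safer ones.

```python
import numpy as np


def select_negatives_by_quantile(sim, q, k):
    """Pick k negatives per anchor at hardness quantile q (hypothetical helper).

    sim : (n, n) array of image-text similarities; sim[i, i] is the positive pair.
    q   : 0.0 selects the hardest negatives, 1.0 the easiest.
    k   : number of negatives to return for each anchor.
    """
    n = sim.shape[0]
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    if not 1 <= k <= n - 1:
        raise ValueError("k must lie in [1, n - 1]")

    # Exclude the positive pair by giving it the lowest possible similarity.
    masked = sim.astype(float)
    np.fill_diagonal(masked, -np.inf)

    # Sort candidates hardest-first; the masked positive ends up last and is dropped.
    order = np.argsort(-masked, axis=1, kind="stable")[:, : n - 1]

    # Slide a window of k candidates along the hardness ranking according to q.
    start = int(round(q * (n - 1 - k)))
    return order[:, start : start + k]


sim = np.array([
    [1.0, 0.9, 0.5, 0.1],
    [0.9, 1.0, 0.2, 0.6],
    [0.5, 0.2, 1.0, 0.3],
    [0.1, 0.6, 0.3, 1.0],
])
hardest = select_negatives_by_quantile(sim, q=0.0, k=1)
easiest = select_negatives_by_quantile(sim, q=1.0, k=1)
```

With this toy matrix, `hardest` contains index 1 for the first anchor (similarity 0.9, a plausible false negative) and `easiest` contains index 3 (similarity 0.1). A learned scheduler, as the abstract describes, would choose the hardness per anchor rather than use one fixed `q`.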
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
Methods: Attentive Walk-Aggregating Graph Neural Network
