Sample-Specific Debiasing for Better Image-Text Models
Peiqi Wang, Yingcheng Liu, Ching-Yun Ko, William M. Wells, Seth Berkowitz, Steven Horng, Polina Golland

TL;DR
This paper introduces a novel sample-specific debiasing method for contrastive learning in image-text models, addressing false negatives caused by nonuniform class distributions, and demonstrates its effectiveness on medical image datasets.
Contribution
The paper proposes a new debiased contrastive learning approach that uses estimated class probabilities to correct false negatives, improving representation quality.
Findings
Empirical improvements on image and image-text datasets.
Theoretical analysis supports the effectiveness of the method.
Addresses false negatives in healthcare data with nonuniform class distributions.
Abstract
Self-supervised representation learning on image-text data facilitates crucial medical applications, such as image classification, visual grounding, and cross-modal retrieval. One common approach involves contrasting semantically similar (positive) and dissimilar (negative) pairs of data points. Drawing negative samples uniformly from the training data set introduces false negatives, i.e., samples that are treated as dissimilar but belong to the same class. In healthcare data, the underlying class distribution is nonuniform, implying that false negatives occur at a highly variable rate. To improve the quality of learned representations, we develop a novel approach that corrects for false negatives. Our method can be viewed as a variant of debiased contrastive learning that uses estimated sample-specific class probabilities. We provide theoretical analysis of the objective function and…
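As a rough illustration of the idea in the abstract, the sketch below takes the debiased contrastive loss of Chuang et al. (2020) and replaces its single global class prior with a per-sample value. It is not necessarily the exact objective of this paper. The function name, the argument names, and the assumption that the per-sample probabilities `tau_plus` have already been estimated elsewhere are all illustrative choices, not details taken from the source.

```python
import numpy as np


def debiased_contrastive_loss(anchor, positive, negatives, tau_plus, temperature=0.5):
    """Debiased contrastive loss with a per-sample false-negative prior.

    Hypothetical interface (assumed for illustration):
      anchor:    (B, D) unit-normalized embeddings
      positive:  (B, D) unit-normalized embeddings of the paired samples
      negatives: (B, N, D) unit-normalized embeddings drawn from the data set
      tau_plus:  (B,) assumed probability, per anchor, that a drawn
                 "negative" actually shares the anchor's class
    """
    # Exponentiated similarity of each anchor with its positive: shape (B,)
    pos = np.exp(np.sum(anchor * positive, axis=-1) / temperature)
    # Exponentiated similarity of each anchor with its N negatives: shape (B, N)
    neg = np.exp(np.einsum("bd,bnd->bn", anchor, negatives) / temperature)
    n = neg.shape[1]

    # Subtract the expected contribution of false negatives, then rescale.
    g = (neg.mean(axis=1) - tau_plus * pos) / (1.0 - tau_plus)
    # Clamp at the smallest value a true-negative term can take for unit vectors.
    g = np.maximum(g, np.exp(-1.0 / temperature))

    loss = -np.log(pos / (pos + n * g))
    return loss.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    def unit(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a = unit(rng.normal(size=(4, 8)))
    p = unit(rng.normal(size=(4, 8)))
    negs = unit(rng.normal(size=(4, 16, 8)))
    # Made-up per-sample priors: common classes get larger values.
    tau = np.array([0.05, 0.10, 0.30, 0.50])
    print(debiased_contrastive_loss(a, p, negs, tau))
```

With `tau_plus` set to zero for every sample, the correction vanishes and the expression reduces to the standard contrastive (InfoNCE-style) loss. A larger `tau_plus` for a given anchor removes more of the negative term, which is how a nonuniform class distribution would translate into a sample-specific correction under these assumptions.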
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning in Healthcare
Methods: Contrastive Learning
