Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis

Hanbin Ko; Chang-Min Park

arXiv:2505.22079·cs.CV·May 29, 2025

Open Access

TL;DR

This paper enhances medical vision-language models by integrating dynamic soft labels, negation-aware learning, and graphical alignment, significantly improving clinical language understanding and performance in medical imaging tasks.

Contribution

The paper introduces a framework that combines clinically-enhanced dynamic soft labels, negation-based hard negatives, and medical graphical alignment to improve medical CLIP models' understanding of clinical language.

Findings

01. Achieved state-of-the-art results in zero-shot and fine-tuned classification.

02. Demonstrated improved understanding of negation and clinical details.

03. Generalized across multiple contrastive learning frameworks.

Abstract

The development of large-scale image-text pair datasets has significantly advanced self-supervised learning in Vision-Language Processing (VLP). However, directly applying general-domain architectures such as CLIP to medical data presents challenges, particularly in handling negations and addressing the inherent data imbalance of medical datasets. To address these issues, we propose a novel approach that integrates clinically-enhanced dynamic soft labels and medical graphical alignment, thereby improving clinical comprehension and the applicability of contrastive loss in medical contexts. Furthermore, we introduce negation-based hard negatives to deepen the model's understanding of the complexities of clinical language. Our approach is easily integrated into the medical CLIP training pipeline and achieves state-of-the-art performance across multiple tasks, including zero-shot,…
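
To make the idea of soft labels and negation-based hard negatives in a contrastive loss more concrete, here is a minimal NumPy sketch. It is not the authors' formulation: the helper names (`soft_targets_from_labels`, `soft_label_clip_loss`), the label-overlap rule used to build the targets, the static (rather than dynamic) targets, the temperature value, and the image-to-text-only direction are all assumptions made for illustration.

```python
import numpy as np


def log_softmax(x, axis=-1):
    """Numerically stable log-softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def soft_targets_from_labels(labels):
    """Hypothetical soft targets: reports that share any finding with the
    anchor share the target mass equally (assumption, not from the paper)."""
    overlap = labels @ labels.T              # shared-finding counts, shape (B, B)
    sim = (overlap > 0).astype(float)
    np.fill_diagonal(sim, 1.0)               # the paired report is always a positive
    return sim / sim.sum(axis=1, keepdims=True)


def soft_label_clip_loss(img_emb, txt_emb, soft_targets,
                         neg_txt_emb=None, temperature=0.07):
    """Image-to-text contrastive loss with soft targets.

    neg_txt_emb optionally holds one negated-report embedding per image;
    it is appended as an extra logit column whose target weight is zero.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # (B, B) cosine similarities
    targets = soft_targets
    if neg_txt_emb is not None:
        neg = neg_txt_emb / np.linalg.norm(neg_txt_emb, axis=1, keepdims=True)
        neg_logits = (img * neg).sum(axis=1, keepdims=True) / temperature
        logits = np.concatenate([logits, neg_logits], axis=1)   # (B, B + 1)
        zeros = np.zeros((targets.shape[0], 1))
        targets = np.concatenate([targets, zeros], axis=1)
    # Cross-entropy against a soft target distribution, averaged over the batch.
    return float(-(targets * log_softmax(logits, axis=1)).sum(axis=1).mean())


# Toy batch: 4 image/report pairs, 3 findings; random embeddings stand in for encoders.
rng = np.random.default_rng(0)
img, txt, neg = (rng.normal(size=(4, 8)) for _ in range(3))
labels = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
targets = soft_targets_from_labels(labels)
print(soft_label_clip_loss(img, txt, targets))
print(soft_label_clip_loss(img, txt, targets, neg_txt_emb=neg))
```

In this sketch the first two reports share a finding, so each image splits its target mass between both reports instead of treating the other as a pure negative. The negated report only ever enters the softmax denominator, so it can only push the loss up when the image embedding sits close to it.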

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification