LLM-Guided Diagnostic Evidence Alignment for Medical Vision-Language Pretraining under Limited Pairing
Huimin Yan, Liang Bai, Xian Yang, Long Chen

TL;DR
This paper introduces LGDEA, a novel medical vision-language pretraining method that uses LLMs to align diagnostic evidence across modalities, reducing dependence on paired data and improving performance in medical image analysis tasks.
Contribution
The paper proposes evidence-level alignment guided by LLMs, enabling effective use of unpaired data and improving diagnostic representation learning in medical vision-language pretraining.
Findings
Significant improvements in phrase grounding, image-text retrieval, and zero-shot classification.
Rivals methods relying on large amounts of paired data.
Effective exploitation of unpaired medical images and reports.
Abstract
Most existing CLIP-style medical vision-language pretraining methods rely on global or local alignment with substantial paired data. However, global alignment is easily dominated by non-diagnostic information, while local alignment fails to integrate key diagnostic evidence. As a result, learning reliable diagnostic representations becomes difficult, which limits their applicability in medical scenarios with limited paired data. To address this issue, we propose an LLM-Guided Diagnostic Evidence Alignment method (LGDEA), which shifts the pretraining objective toward evidence-level alignment that is more consistent with the medical diagnostic process. Specifically, we leverage LLMs to extract key diagnostic evidence from radiology reports and construct a shared diagnostic evidence space, enabling evidence-aware cross-modal alignment and allowing LGDEA to effectively exploit abundant…
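To make the idea of evidence-level alignment more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that every image and every report has already been assigned a set of diagnostic evidence labels (for reports, for example, by prompting an LLM), and it replaces the one-hot pairing targets of a CLIP-style loss with soft targets derived from evidence overlap, so that unpaired images and reports can still supervise each other. The Jaccard overlap, the function names, the temperature value, and the example labels are all illustrative assumptions; the abstract does not specify how the shared evidence space is built.

```python
import math


def jaccard(a, b):
    """Overlap between two sets of evidence labels (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def softmax(row, temperature):
    """Numerically stable softmax over temperature-scaled scores."""
    scaled = [x / temperature for x in row]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def evidence_soft_targets(image_evidence, text_evidence):
    """Row-normalised evidence overlap, used in place of one-hot pair labels.

    Assumption: evidence labels are available for both modalities.
    """
    targets = []
    for ev_img in image_evidence:
        row = [jaccard(ev_img, ev_txt) for ev_txt in text_evidence]
        total = sum(row)
        if total == 0:
            # No shared evidence with any report: fall back to a uniform target.
            row = [1.0 / len(row)] * len(row)
        else:
            row = [r / total for r in row]
        targets.append(row)
    return targets


def evidence_alignment_loss(image_emb, text_emb, image_evidence, text_evidence,
                            temperature=0.07):
    """Image-to-text cross-entropy between similarity scores and evidence targets."""
    targets = evidence_soft_targets(image_evidence, text_evidence)
    loss = 0.0
    for i, img in enumerate(image_emb):
        sims = [cosine(img, txt) for txt in text_emb]
        probs = softmax(sims, temperature)
        loss -= sum(t * math.log(p) for t, p in zip(targets[i], probs) if t > 0)
    return loss / len(image_emb)


# Toy example with hypothetical evidence labels and 2-D embeddings.
image_evidence = [{"pleural effusion"}, {"cardiomegaly"}]
text_evidence = [{"pleural effusion"}, {"cardiomegaly"}]
image_emb = [[1.0, 0.0], [0.0, 1.0]]
aligned_text_emb = [[1.0, 0.0], [0.0, 1.0]]
swapped_text_emb = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = evidence_alignment_loss(
    image_emb, aligned_text_emb, image_evidence, text_evidence)
swapped_loss = evidence_alignment_loss(
    image_emb, swapped_text_emb, image_evidence, text_evidence)
```

In this toy setup the loss is lower when embeddings that share evidence point in the same direction (`aligned_loss`) than when they are swapped (`swapped_loss`), which is the behaviour an evidence-aware objective is meant to encourage. A real system would compute this on batched tensors with a symmetric text-to-image term.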
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
