Aligning Medical Images with General Knowledge from Large Language Models

Xiao Fang; Yi Lin; Dong Zhang; Kwang-Ting Cheng; Hao Chen

arXiv:2409.00341·cs.CV·September 4, 2024

TL;DR

This paper introduces ViP, a prompt learning framework that adapts large vision-language models such as CLIP to medical image analysis. It extracts visual symptoms from pre-trained large language models and uses them to guide the training of two learnable prompt modules.

Contribution

The paper proposes a novel visual symptom-guided prompt learning framework that transfers knowledge from large language models to medical imaging tasks.

Findings

01  ViP outperforms state-of-the-art methods on two datasets.

02  The framework effectively extracts visual symptoms from language models.

03  ViP demonstrates strong generalization in medical image analysis.

Abstract

Pre-trained large vision-language models (VLMs) like CLIP have revolutionized visual representation learning by using natural language as supervision, and have demonstrated promising generalization ability. In this work, we propose ViP, a novel visual symptom-guided prompt learning framework for medical image analysis, which facilitates general knowledge transfer from CLIP. ViP consists of two key components: a visual symptom generator (VSG) and a dual-prompt network. Specifically, VSG aims to extract explicable visual symptoms from pre-trained large language models, while the dual-prompt network uses these visual symptoms to guide the training of two learnable prompt modules, i.e., context prompt and merge prompt, which effectively adapt our framework to medical image analysis via large VLMs. Extensive experimental results demonstrate that ViP can outperform state-of-the-art methods on…
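
The abstract does not spell out how visual symptoms enter the classifier, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes that an image embedding and one text embedding per visual symptom are already available (in practice they would come from CLIP's image and text encoders), and it scores each class by the cosine similarity between the image and that class's symptom embeddings. The plain mean used to combine symptoms is a stand-in for the learnable merge prompt, the learnable context prompt is omitted entirely, and the names `class_scores`, `predict`, `class_a`, and `class_b` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def class_scores(image_feat, symptom_feats_by_class):
    """Score each class by the mean image-to-symptom similarity.

    Assumption: the mean is a fixed placeholder for the learned
    merge prompt described in the abstract.
    """
    scores = {}
    for label, feats in symptom_feats_by_class.items():
        sims = [cosine(image_feat, f) for f in feats]
        scores[label] = sum(sims) / len(sims)
    return scores


def predict(image_feat, symptom_feats_by_class):
    """Return the class whose symptom descriptions best match the image."""
    scores = class_scores(image_feat, symptom_feats_by_class)
    return max(scores, key=scores.get)


# Toy 3-d embeddings standing in for encoder outputs (hypothetical values).
symptoms = {
    "class_a": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "class_b": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}
image = [0.9, 0.1, 0.0]

print(predict(image, symptoms))
```

In this toy setup the image embedding lies close to the `class_a` symptom embeddings, so `class_a` receives the higher score. Replacing the fixed mean with trainable parameters is where a learned merge step would fit.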

Code & Models

Repositories

xiaofang007/vip
PyTorch · Official

Taxonomy

Topics: AI in cancer detection · Image Retrieval and Classification Techniques · Radiomics and Machine Learning in Medical Imaging

Methods: Contrastive Language-Image Pre-training