KPL: Training-Free Medical Knowledge Mining of Vision-Language Models

Jiaxiang Liu; Tianxiang Hu; Jiawei Du; Ruiyuan Zhang; Joey Tianyi Zhou; Zuozhu Liu

arXiv:2501.11231·cs.CV·January 22, 2025

TL;DR

This paper introduces KPL, a novel method that enhances zero-shot medical image classification by mining and leveraging knowledge from CLIP, addressing class representation and modal gap challenges.

Contribution

KPL is a training-free approach that enriches semantic proxies with knowledge descriptions and uses multimodal proxy learning to improve medical image classification performance.
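To make the idea of description-enriched proxies concrete, here is a minimal sketch of generic zero-shot classification in which each class proxy is the average of the text embeddings of its name and its descriptions. It is not the KPL algorithm from the paper. The toy embeddings, the class names `pneumonia` and `normal`, the helper names, and the temperature value are all assumptions made for illustration; in practice the vectors would come from CLIP's text and image encoders.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def class_proxy(text_embeddings):
    """Average the unit-normalized embeddings of a class name and its
    descriptions, then re-normalize (hypothetical helper)."""
    unit = [l2_normalize(e) for e in text_embeddings]
    dim = len(unit[0])
    mean = [sum(e[i] for e in unit) / len(unit) for i in range(dim)]
    return l2_normalize(mean)

def zero_shot_probs(image_embedding, proxies, temperature=0.01):
    """Softmax over cosine similarities between one image and each proxy.
    The temperature is an assumed value, not taken from the paper."""
    img = l2_normalize(image_embedding)
    logits = [sum(a * b for a, b in zip(img, p)) / temperature for p in proxies]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up 3-d embeddings standing in for encoder outputs:
# first vector = class name, second = one knowledge description.
text_embeddings = {
    "pneumonia": [[0.9, 0.1, 0.0], [0.8, 0.3, 0.1]],
    "normal": [[0.1, 0.9, 0.0], [0.0, 0.8, 0.3]],
}
proxies = {name: class_proxy(embs) for name, embs in text_embeddings.items()}

image_embedding = [0.7, 0.2, 0.1]  # made-up image feature
names = list(proxies)
probs = zero_shot_probs(image_embedding, [proxies[n] for n in names])
prediction = names[probs.index(max(probs))]
print(prediction, probs)
```

Averaging several text embeddings per class is one simple way to move beyond a single category name; the paper's multimodal proxy learning step, which addresses the modal gap, is not reproduced here.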

Findings

01 KPL outperforms baseline methods on medical datasets.

02 KPL demonstrates stability and effectiveness in zero-shot classification.

03 KPL also shows improvements on natural image datasets.

Abstract

Visual Language Models such as CLIP excel in image recognition due to extensive image-text pre-training. However, applying CLIP inference to zero-shot classification, particularly for medical image diagnosis, faces two challenges: 1) single category names alone represent image classes inadequately; 2) there is a modal gap between the visual and text spaces produced by the CLIP encoders. Despite attempts to enrich disease descriptions with large language models, the lack of class-specific knowledge often leads to poor performance. In addition, empirical evidence suggests that existing proxy learning methods for zero-shot image classification on natural image datasets are unstable when applied to medical datasets. To tackle these challenges, we introduce Knowledge Proxy Learning (KPL) to mine knowledge from CLIP. KPL is designed to leverage CLIP's multimodal…

Code & Models

Repositories

jxliu-ai/kpl (PyTorch, official)

Videos

KPL: Training-Free Medical Knowledge Mining of Vision-Language Models

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Multimodal Machine Learning Applications · Semantic Web and Ontologies

Methods: Contrastive Language-Image Pre-training · Balanced Selection