A Vision-Language Model for Focal Liver Lesion Classification
Song Jian, Hu Yuchang, Wang Hui, Chen Yen-Wei

TL;DR
This paper introduces Liver-VLM, a vision-language model tailored for classifying focal liver lesions, leveraging multimodal learning to improve accuracy with limited labeled medical data.
Contribution
Liver-VLM integrates class information into the text encoder and aligns image and text features, outperforming existing models in liver lesion classification.
Findings
Liver-VLM surpasses CLIP and MedCLIP in accuracy and AUC.
Using a ResNet18 backbone improves performance when data is limited.
The model effectively aligns image features with class-level text features.
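The alignment between image features and class-level text features can be illustrated with a small CLIP-style sketch. This is not the authors' implementation: the feature vectors are hand-made toy values standing in for encoder outputs, the class names are hypothetical placeholders, and the temperature of 0.07 is an assumption borrowed from CLIP's initial setting. The sketch only shows how one image embedding might be scored against one text embedding per class using cosine similarity.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_logits(image_feat, class_text_feats, temperature=0.07):
    """Cosine similarity between one image feature and each class text feature,
    divided by a temperature (assumed value, not taken from the paper)."""
    img = l2_normalize(image_feat)
    logits = []
    for text_feat in class_text_feats:
        txt = l2_normalize(text_feat)
        logits.append(sum(a * b for a, b in zip(img, txt)) / temperature)
    return logits

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(image_feat, class_text_feats, class_names):
    """Return the best-matching class name and the per-class probabilities."""
    probs = softmax(cosine_logits(image_feat, class_text_feats))
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs

# Toy stand-ins for encoder outputs (hypothetical values and class names).
class_names = ["class_a", "class_b", "class_c"]
class_text_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_feat = [0.2, 0.9, 0.1]

label, probs = classify(image_feat, class_text_feats, class_names)
print(label)  # class_b
```

Because the class text features depend only on the class set, they could be computed once and reused for every image, which is one plausible reading of why class information adds no extra inference cost.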
Abstract
Accurate classification of focal liver lesions is crucial for diagnosis and treatment in hepatology. However, traditional supervised deep learning models depend on large-scale annotated datasets, which are often limited in medical imaging. Recently, vision-language models (VLMs) such as the Contrastive Language-Image Pre-training model (CLIP) have been applied to image classification. Compared to a conventional convolutional neural network (CNN), which classifies images based on visual information only, a VLM leverages multimodal learning with text and images, allowing it to learn effectively even with a limited amount of labeled data. Inspired by CLIP, we propose Liver-VLM, a model specifically designed for focal liver lesion (FLL) classification. First, Liver-VLM incorporates class information into the text encoder without introducing additional inference overhead. Second, by…
Taxonomy
Topics: Brain Tumor Detection and Classification · Smart Systems and Machine Learning · Artificial Intelligence in Healthcare
Methods: Contrastive Language-Image Pre-training
