MedFILIP: Medical Fine-grained Language-Image Pre-training

Xinjie Liang; Xiangyu Li; Fanding Li; Jie Jiang; Qing Dong; Wei Wang; Kuanquan Wang; Suyu Dong; Gongning Luo; Shuo Li

arXiv:2501.10775·cs.CV·January 22, 2025

TL;DR

MedFILIP is a novel medical vision-language pretraining model that leverages fine-grained disease details, knowledge injection, and semantic similarity to improve diagnostic accuracy across multiple datasets.

Contribution

It introduces a new fine-grained VLP approach with disease-specific knowledge extraction, knowledge injection, and enhanced image-text alignment for medical imaging.

Findings

1. Achieves state-of-the-art performance on multiple datasets.
2. Improves classification accuracy by up to 6.69%.
3. Effectively models disease details and relationships in medical images.

Abstract

Medical vision-language pretraining (VLP) that leverages naturally-paired medical image-report data is crucial for medical image analysis. However, existing methods struggle to accurately characterize associations between images and diseases, leading to inaccurate or incomplete diagnostic results. In this work, we propose MedFILIP, a fine-grained VLP model that introduces medical image-specific knowledge through contrastive learning. Specifically: 1) An information extractor based on a large language model is proposed to decouple comprehensive disease details from reports; it excels at extracting disease details through flexible prompt engineering, thereby effectively reducing text complexity while retaining rich information at a tiny cost. 2) A knowledge injector is proposed to construct relationships between categories and visual attributes, which helps the model make judgments based…
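
The abstract is truncated here, so the exact training objective is not visible on this page. As a rough illustration of how contrastive image-text alignment can be combined with semantic-similarity targets (the idea mentioned in the TL;DR), the sketch below computes a cross-entropy between softmaxed image-text similarities and a soft target matrix. It is an assumption-laden toy in pure Python, not the MedFILIP implementation: the function names, the `temperature` value, and the row-normalization of `soft_targets` are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def soft_contrastive_loss(image_embs, text_embs, soft_targets, temperature=0.07):
    """Image-to-text contrastive loss with soft (similarity-based) targets.

    soft_targets[i][j] is a hypothetical non-negative score saying how well
    text j describes image i. Each row is normalized to sum to 1 and used as
    the target distribution in a cross-entropy against the softmax of the
    temperature-scaled cosine similarities.
    """
    total = 0.0
    for img, targets in zip(image_embs, soft_targets):
        logits = [cosine_similarity(img, txt) / temperature for txt in text_embs]
        log_probs = log_softmax(logits)
        z = sum(targets)
        total += -sum((t / z) * lp for t, lp in zip(targets, log_probs))
    return total / len(image_embs)


if __name__ == "__main__":
    # Toy embeddings: image i is perfectly aligned with text i.
    images = [[1.0, 0.0], [0.0, 1.0]]
    texts = [[1.0, 0.0], [0.0, 1.0]]
    hard = [[1.0, 0.0], [0.0, 1.0]]   # one-hot targets (standard contrastive)
    soft = [[1.0, 0.5], [0.5, 1.0]]   # partially related texts get some credit
    print(soft_contrastive_loss(images, texts, hard))
    print(soft_contrastive_loss(images, texts, soft))
```

With one-hot targets this reduces to the usual one-directional InfoNCE-style loss; soft targets let semantically related but non-paired descriptions contribute instead of being treated as pure negatives. A real model would compute this on learned embeddings in both directions using a tensor library.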

Code & Models

Repositories

perceptioncomputinglab/medfilip
PyTorch · Official

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging