MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li, Laurence T. Yang, Bocheng Ren, Xin Nie, Zhangyang Gao, Cheng Tan, Stan Z. Li

TL;DR
MLIP introduces a domain-specific contrastive learning framework that integrates medical knowledge and multi-granularity image-text alignment to improve medical visual representations, especially with limited annotated data.
Contribution
It proposes a novel divergence encoder and knowledge-guided contrastive learning techniques to better utilize image-text information across multiple granularities in medical imaging.
Findings
Outperforms state-of-the-art methods in medical image classification, detection, and segmentation.
Enhances transfer performance with limited annotated data.
Demonstrates the effectiveness of multimodal pre-training in medical imaging.
Abstract
The scarcity of annotated data has sparked significant interest in unsupervised pre-training methods that leverage medical reports as auxiliary signals for medical visual representation learning. However, existing research overlooks the multi-granularity nature of medical visual representation and lacks suitable contrastive learning techniques to improve the models' generalizability across different granularities, leading to the underutilization of image-text information. To address this, we propose MLIP, a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning. Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with…
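The abstract builds on image-text contrastive learning. As a rough illustration of that general idea only, the sketch below computes a generic symmetric InfoNCE loss (CLIP-style) over paired image and report embeddings. It is not MLIP's actual objective: the divergence encoder, the token-knowledge-patch alignment, and the knowledge-guided category-level terms are not specified in the text above and are not modeled here. The function names, the toy embeddings, and the temperature value of 0.07 are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_info_nce(image_embs, text_embs, temperature=0.07):
    """Generic symmetric image-text contrastive loss (hypothetical helper).

    image_embs[k] and text_embs[k] are assumed to come from the same
    image-report pair; every other combination in the batch is a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # Temperature-scaled cosine similarity between every image and every text.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]

    # Image-to-text direction: each row should peak on its own diagonal entry.
    i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n

    # Text-to-image direction: same computation over the columns.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n

    return 0.5 * (i2t + t2i)


if __name__ == "__main__":
    # Toy 2-D embeddings, invented purely for this example.
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_reports = [[1.0, 0.0], [0.0, 1.0]]
    swapped_reports = [[0.0, 1.0], [1.0, 0.0]]
    print(symmetric_info_nce(images, matched_reports))
    print(symmetric_info_nce(images, swapped_reports))
```

With correctly paired embeddings the loss is close to zero, while swapping the reports drives it up, which is the signal a contrastive pre-training objective of this kind uses to pull matching image-report pairs together.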
Taxonomy
Topics: AI in cancer detection · Image Retrieval and Classification Techniques
Methods: Contrastive Learning
