Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography
Yuexi Du, John Onofrey, Nicha C. Dvornek

TL;DR
This paper introduces MaMA, a novel multi-view and multi-scale alignment method for adapting CLIP to mammography, addressing data scarcity and high-resolution challenges, and achieving superior performance on real-world datasets.
Contribution
It presents a specialized supervision framework, a symmetric local alignment module, and a parameter-efficient fine-tuning approach for mammography CLIP adaptation.
Findings
Outperforms state-of-the-art baselines on multiple mammography tasks
Uses only 52% of the model size of the largest baseline
Effective in handling high-resolution images and data limitations
Abstract
Contrastive Language-Image Pre-training (CLIP) demonstrates strong potential in medical image analysis but requires substantial data and computational resources. Due to these restrictions, existing CLIP applications in medical imaging focus mainly on modalities like chest X-rays that have abundant image-report data available, leaving many other important modalities underexplored. Here, we propose one of the first adaptations of the full CLIP model to mammography, which presents significant challenges due to labeled data scarcity, high-resolution images with small regions of interest, and class-wise imbalance. We first develop a specialized supervision framework for mammography that leverages its multi-view nature. Furthermore, we design a symmetric local alignment module to better focus on detailed features in high-resolution images. Lastly, we incorporate a parameter-efficient…
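For readers unfamiliar with the CLIP objective the abstract builds on, the following is a minimal sketch of the standard symmetric image-text contrastive loss, written in NumPy. It is not the MaMA method and does not include the multi-view supervision or the symmetric local alignment module described above; the function name `clip_contrastive_loss`, the temperature value 0.07, and the random embeddings in the usage lines are assumptions made for illustration only.

```python
import numpy as np


def _log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_emb, text_emb: arrays of shape (batch, dim); row i of each
    is assumed to come from the same image-report pair.
    temperature: illustrative value, not taken from the paper.
    """
    # L2-normalize so the dot product is a cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix: rows index images, columns index texts.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])

    # Image-to-text: each image should pick out its own text (row-wise).
    loss_i2t = -_log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each text should pick out its own image (column-wise).
    loss_t2i = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


# Usage with synthetic embeddings (placeholders, not real mammography data).
rng = np.random.default_rng(0)
images = rng.normal(size=(8, 16))
texts = images + 0.01 * rng.normal(size=(8, 16))   # well-aligned pairs
loss_matched = clip_contrastive_loss(images, texts)
loss_shuffled = clip_contrastive_loss(images, np.roll(texts, 1, axis=0))
```

In this sketch, well-aligned pairs yield a lower loss than the same embeddings with the text rows shifted by one position, which is the behavior the contrastive objective is meant to reward.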
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · AI in cancer detection · Lung Cancer Diagnosis and Treatment
Methods: Focus · Contrastive Language-Image Pre-training
