ModAn-MulSupCon: Modality- and Anatomy-Aware Multi-Label Supervised Contrastive Pretraining for Medical Imaging
Eichi Takaya, Ryusei Inamori

TL;DR
ModAn-MulSupCon introduces a novel pretraining method leveraging modality and anatomy metadata in medical images, improving transferability and downstream classification performance, especially when fine-tuning is possible.
Contribution
It proposes a modality- and anatomy-aware multi-label supervised contrastive pretraining approach that effectively utilizes metadata to enhance medical imaging representations.
Findings
With fine-tuning, achieved the best AUC among the compared methods on MRNet-ACL (knee MRI, 0.964) and Thyroid (thyroid ultrasound, 0.763).
Outperformed baselines in fine-tuning scenarios, with significant improvements.
The learned representations pay off mainly when the encoder is adapted to the task by fine-tuning, rather than through linear probing of frozen features.
Abstract
Background and objective: Expert annotations limit large-scale supervised pretraining in medical imaging, while ubiquitous metadata (modality, anatomical region) remain underused. We introduce ModAn-MulSupCon, a modality- and anatomy-aware multi-label supervised contrastive pretraining method that leverages such metadata to learn transferable representations. Method: Each image's modality and anatomy are encoded as a multi-hot vector. A ResNet-18 encoder is pretrained on a mini subset of RadImageNet (miniRIN, 16,222 images) with a Jaccard-weighted multi-label supervised contrastive loss, and then evaluated by fine-tuning and linear probing on three binary classification tasks: ACL tear (knee MRI), lesion malignancy (breast ultrasound), and nodule malignancy (thyroid ultrasound). Result: With fine-tuning, ModAn-MulSupCon achieved the best AUC on MRNet-ACL (0.964) and Thyroid (0.763),…
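The Method part of the abstract names two ingredients, a multi-hot encoding of modality and anatomy and a Jaccard-weighted multi-label supervised contrastive loss, but does not give the formula. The NumPy sketch below is one plausible reading, not the authors' implementation: each pair of images is weighted by the Jaccard similarity of their label vectors, and that weight scales the pair's log-probability in a standard supervised contrastive objective. The label vocabularies, the function names, and the temperature of 0.1 are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical vocabularies; the real label sets come from the dataset metadata.
MODALITIES = ["CT", "MR", "US"]
ANATOMIES = ["knee", "breast", "thyroid", "abdomen"]


def multi_hot(modality, anatomy):
    """Encode one image's modality and anatomy as a single multi-hot vector."""
    vec = np.zeros(len(MODALITIES) + len(ANATOMIES), dtype=np.float64)
    vec[MODALITIES.index(modality)] = 1.0
    vec[len(MODALITIES) + ANATOMIES.index(anatomy)] = 1.0
    return vec


def jaccard_matrix(labels):
    """Pairwise Jaccard similarity |a AND b| / |a OR b| between multi-hot rows."""
    inter = labels @ labels.T
    counts = labels.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)


def jaccard_weighted_supcon(embeddings, labels, temperature=0.1):
    """Assumed form of the loss: Jaccard-weighted mean of contrastive log-probs."""
    n = len(embeddings)
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    not_self = ~np.eye(n, dtype=bool)

    # Log-softmax over every other sample in the batch (the anchor is excluded).
    logits = np.where(not_self, z @ z.T / temperature, -np.inf)
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    log_prob = np.where(not_self, log_prob, 0.0)  # drop the -inf diagonal

    # Pairs that share more labels pull together more strongly.
    weights = jaccard_matrix(labels) * not_self
    weight_sum = weights.sum(axis=1)
    has_pos = weight_sum > 0
    if not has_pos.any():
        return 0.0  # no anchor shares a label with any other sample
    per_anchor = -(weights * log_prob).sum(axis=1)[has_pos] / weight_sum[has_pos]
    return float(per_anchor.mean())
```

Under this encoding every image has exactly two active labels, so two images that match on both modality and anatomy get weight 1, images that match on only one get weight 1/3, and images that share neither contribute nothing as positives. In practice the embeddings would come from the ResNet-18 encoder, and the loss would be written in an autodiff framework rather than NumPy.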
