Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models

Jakob Krogh Petersen; Valdemar Licht; Mads Nielsen; Asbjørn Munk

arXiv:2501.14051·cs.CV·January 27, 2025

TL;DR

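This paper revisits CLIP-style alignment for 3D MRI and tabular data: it trains a domain-specific 3D foundation model as the image encoder and shows that alignment is feasible with only 62 MRI scans, enabled by an embedding accumulation strategy that scales the number of negative pairs across batches.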

Contribution

It demonstrates that 3D MRI and tabular data can be aligned from limited samples using a simple embedding accumulation method, and it evaluates design choices such as the backbone and the loss function.
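
The abstract describes the accumulation strategy only as scaling the number of negative pairs across batches. A minimal forward-only sketch of that idea, assuming accumulation means collecting embeddings from several small batches and computing one symmetric CLIP-style contrastive loss over the concatenation, might look as follows. The function names, the temperature of 0.07, and the toy embeddings are illustrative assumptions, not taken from the paper or its repository, and the sketch does not show how gradients reach the encoders.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def clip_loss(image_embs, table_embs, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each list is assumed to be a matched pair."""
    imgs = [l2_normalize(v) for v in image_embs]
    tabs = [l2_normalize(v) for v in table_embs]
    n = len(imgs)
    # logits[i][j]: scaled similarity between image i and tabular record j
    logits = [[sum(a * b for a, b in zip(imgs[i], tabs[j])) / temperature
               for j in range(n)] for i in range(n)]

    def cross_entropy_diag(rows):
        # Mean cross-entropy where the correct class of row i is column i
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    columns = [list(c) for c in zip(*logits)]  # tabular-to-image direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(columns))

def accumulated_clip_loss(micro_batches, temperature=0.07):
    """Assumed accumulation scheme: pool embeddings, then compute one loss."""
    image_bank, table_bank = [], []
    for image_embs, table_embs in micro_batches:
        image_bank.extend(image_embs)
        table_bank.extend(table_embs)
    negatives_per_positive = len(image_bank) - 1
    return clip_loss(image_bank, table_bank, temperature), negatives_per_positive

# Two hypothetical micro-batches of two (image, tabular) pairs each
micro_batches = [
    ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]),
    ([[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]], [[0.0, 0.1, 0.9], [0.7, 0.7, 0.1]]),
]
loss, negatives = accumulated_clip_loss(micro_batches)
```

With two pairs per micro-batch, each positive would see one negative if the loss were computed per batch; pooling two micro-batches raises that to three, which is the effect the abstract attributes to accumulation.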

Findings

01 Effective modality alignment with only 62 MRI scans

02 Successful zero-shot classification of MRI and tabular data

03 Challenges remain in zero-shot image retrieval
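
As a rough illustration of what zero-shot classification in a shared embedding space involves, the sketch below assigns an image embedding to the class whose reference embedding is most similar by cosine similarity. It assumes each class is represented by one embedding from the tabular encoder; the class labels, vectors, and function names are hypothetical, and the paper's actual evaluation protocol may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_emb, class_embs):
    """Return the label whose reference embedding is closest to the image."""
    scores = {label: cosine(image_emb, emb) for label, emb in class_embs.items()}
    return max(scores, key=scores.get), scores

# Hypothetical class reference embeddings (assumed to come from tabular data)
class_embs = {"class_a": [1.0, 0.2, 0.0], "class_b": [0.0, 0.3, 1.0]}
label, scores = zero_shot_classify([0.8, 0.1, 0.1], class_embs)
```

No classifier is trained on the image side here; the prediction depends only on how well the two encoders were aligned beforehand.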

Abstract

Multi-modal models require aligned, shared embedding spaces. However, common CLIP-based approaches need large amounts of samples and do not natively support 3D or tabular data, both of which are crucial in the medical domain. To address these issues, we revisit CLIP-style alignment by training a domain-specific 3D foundation model as an image encoder and demonstrate that modality alignment is feasible with only 62 MRI scans. Our approach is enabled by a simple embedding accumulation strategy required for training in 3D, which scales the amount of negative pairs across batches in order to stabilize training. We perform a thorough evaluation of various design choices, including the choice of backbone and loss functions, and evaluate the proposed methodology on zero-shot classification and image-retrieval tasks. While zero-shot image-retrieval remains challenging, zero-shot classification…

Code & Models

Repositories

jakekrogh/3d-clip-for-brain-mri (PyTorch, official)

Taxonomy

Topics: Medical Image Segmentation Techniques

Methods: ALIGN