Medical Slice Transformer: Improved Diagnosis and Explainability on 3D Medical Images with DINOv2
Gustav Müller-Franzes, Firas Khader, Robert Siepmann, Tianyu Han, Jakob Nikolas Kather, Sven Nebelung, Daniel Truhn

TL;DR
This paper introduces the Medical Slice Transformer (MST), a novel framework that adapts 2D self-supervised models like DINOv2 for 3D medical imaging, improving diagnostic accuracy and explainability across multiple clinical datasets.
Contribution
The study presents MST, a method that combines a Transformer architecture with a 2D feature extractor (DINOv2) for 3D medical image analysis, outperforming a 3D ResNet baseline in accuracy and interpretability.
Findings
MST achieved higher AUC scores than the 3D ResNet baseline on all three clinical datasets.
Saliency maps from MST were more precise and anatomically correct.
MST demonstrated improved diagnostic performance and explainability.
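The findings compare models by AUC. As a reminder of what that metric measures, here is a minimal pure-Python sketch that computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores from two classifiers on the same four cases.
labels = [0, 0, 1, 1]
scores_a = [0.1, 0.4, 0.35, 0.8]  # one positive ranked below a negative
scores_b = [0.1, 0.2, 0.7, 0.9]   # every positive ranked above every negative

auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; for real evaluation a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.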
Abstract
MRI and CT are essential clinical cross-sectional imaging techniques for diagnosing complex conditions. However, large 3D datasets with annotations for deep learning are scarce. While methods like DINOv2 are encouraging for 2D image analysis, these methods have not been applied to 3D medical images. Furthermore, deep learning models often lack explainability due to their "black-box" nature. This study aims to extend 2D self-supervised models, specifically DINOv2, to 3D medical imaging while evaluating their potential for explainable outcomes. We introduce the Medical Slice Transformer (MST) framework to adapt 2D self-supervised models for 3D medical image analysis. MST combines a Transformer architecture with a 2D feature extractor, i.e., DINOv2. We evaluate its diagnostic performance against a 3D convolutional neural network (3D ResNet) across three clinical datasets: breast MRI (651…
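The abstract describes MST as a Transformer combined with a 2D feature extractor. One plausible way to arrange this in PyTorch is sketched below: each slice of the volume is encoded independently by a 2D backbone, and a Transformer encoder then mixes information across the resulting slice tokens. Everything specific here is an assumption for illustration and not the authors' implementation: the class name `SliceTransformerSketch`, the class token, the learned positional embedding, the layer sizes, and the toy encoder that stands in for DINOv2.

```python
import torch
import torch.nn as nn


class SliceTransformerSketch(nn.Module):
    """Encode each 2D slice, then aggregate the slice tokens with a Transformer."""

    def __init__(self, slice_encoder, embed_dim, num_classes=1,
                 num_layers=1, num_heads=4, max_slices=64):
        super().__init__()
        # Any module mapping (N, C, H, W) -> (N, embed_dim); a stand-in for DINOv2.
        self.slice_encoder = slice_encoder
        # Assumed design choices: a class token and a learned positional embedding.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, max_slices + 1, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, volume):
        # volume: (batch, slices, channels, height, width)
        b, s, c, h, w = volume.shape
        # Fold the slice axis into the batch axis so the 2D encoder sees plain images.
        feats = self.slice_encoder(volume.reshape(b * s, c, h, w))
        feats = feats.reshape(b, s, -1)  # (batch, slices, embed_dim)
        tokens = torch.cat([self.cls_token.expand(b, -1, -1), feats], dim=1)
        tokens = tokens + self.pos_embed[:, : s + 1]
        tokens = self.transformer(tokens)
        return self.head(tokens[:, 0])  # logits read from the class token


# Toy 2D encoder so the sketch runs without downloading pretrained weights.
toy_encoder = nn.Sequential(
    nn.AdaptiveAvgPool2d(8), nn.Flatten(), nn.Linear(3 * 8 * 8, 32)
)
model = SliceTransformerSketch(toy_encoder, embed_dim=32)
logits = model(torch.randn(2, 16, 3, 64, 64))  # 2 volumes, 16 slices each
```

Folding slices into the batch axis is what lets a backbone pretrained on 2D images be reused unchanged. In a setup like this, the attention weights over slice tokens would also be a natural starting point for slice-level saliency.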
Taxonomy
Topics: Medical Imaging and Analysis · Radiomics and Machine Learning in Medical Imaging · Brain Tumor Detection and Classification
Methods: Attention Is All You Need · Dense Connections · Label Smoothing · Dropout · Linear Layer · Average Pooling · Convolution · Layer Normalization · Byte Pair Encoding · Adam
