Self-Supervised Modality-Agnostic Pre-Training of Swin Transformers

Abhiroop Talasila; Maitreya Maity; U. Deva Priyakumar

arXiv:2405.12781 · cs.CV · May 22, 2024

TL;DR

This paper introduces SwinFUSE, a modality-agnostic pre-training method for Swin Transformers that learns from multiple medical imaging modalities, improving generalization and out-of-distribution performance in 3D segmentation tasks.

Contribution

It proposes a novel multi-modal pre-training framework with a domain-invariance module, enabling Swin Transformers to effectively learn from diverse medical imaging modalities.

Findings

01 Incurs only a 1-2% performance trade-off on in-distribution data compared with single-modality models.

02 Surpasses single-modality models by up to 27% on out-of-distribution data.

03 Generalizes to medical imaging tasks beyond those it was pre-trained on.

Abstract

Unsupervised pre-training has emerged as a transformative paradigm, displaying remarkable advancements in various domains. However, susceptibility to domain shift, where the pre-training data distribution differs from that of fine-tuning, poses a significant obstacle. To address this, we augment the Swin Transformer to learn from different medical imaging modalities, enhancing downstream performance. Our model, dubbed SwinFUSE (Swin Multi-Modal Fusion for UnSupervised Enhancement), offers three key advantages: (i) it learns from both Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) data during pre-training, resulting in complementary feature representations; (ii) it includes a domain-invariance module (DIM) that effectively highlights salient input regions, enhancing adaptability; (iii) it exhibits remarkable generalizability, surpassing the confines of the tasks it was initially pre-trained on. Our…
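As a rough illustration of point (i), the sketch below shows one way a pre-training loop could draw batches that always contain both CT and MRI volumes. The abstract does not describe how SwinFUSE actually forms its batches, so the function name `mixed_modality_batches`, the 50/50 split, and the volume identifiers are all assumptions made for illustration, not the authors' method.

```python
import random
from itertools import cycle


def mixed_modality_batches(ct_ids, mri_ids, batch_size, num_batches, seed=0):
    """Yield batches of (modality, volume_id) pairs, half CT and half MRI.

    Hypothetical sampler: the even split is an assumption, not taken
    from the SwinFUSE paper.
    """
    if batch_size % 2 != 0:
        raise ValueError("batch_size must be even to balance the two modalities")
    if not ct_ids or not mri_ids:
        raise ValueError("both modalities need at least one volume")

    rng = random.Random(seed)
    ct_pool, mri_pool = list(ct_ids), list(mri_ids)
    rng.shuffle(ct_pool)
    rng.shuffle(mri_pool)

    # cycle() lets the smaller modality repeat instead of running out.
    ct_iter, mri_iter = cycle(ct_pool), cycle(mri_pool)
    half = batch_size // 2

    for _ in range(num_batches):
        batch = [("CT", next(ct_iter)) for _ in range(half)]
        batch += [("MRI", next(mri_iter)) for _ in range(half)]
        rng.shuffle(batch)  # avoid a fixed CT-then-MRI ordering within a batch
        yield batch


if __name__ == "__main__":
    for batch in mixed_modality_batches(
        ["ct0", "ct1", "ct2"], ["mr0", "mr1"], batch_size=4, num_batches=3
    ):
        print(batch)
```

Each yielded batch would then be mapped to the actual 3D volumes and passed to whatever self-supervised objective is in use. Balancing the modalities per batch is only one possible design; sampling in proportion to dataset size is an equally plausible alternative.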

Code & Models

Repositories

devalab/swinfuse (PyTorch, official)

Taxonomy

Topics: Metallurgy and Material Forming

Methods: Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Stochastic Depth · Residual Connection · Absolute Position Encodings · Byte Pair Encoding · Adam · Dropout · Softmax