SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation
Yiqing Wang, Zihan Li, Jieru Mei, Zihao Wei, Li Liu, Chen Wang, Shengtian Sang, Alan Yuille, Cihang Xie, Yuyin Zhou

TL;DR
SwinMM introduces a multi-view, self-supervised learning approach using Swin Transformers to improve 3D medical image segmentation accuracy and data efficiency, especially when limited pre-training data is available.
Contribution
The paper proposes a novel masked multi-view encoder and cross-view decoder framework that leverages diverse proxy tasks and mutual learning for enhanced medical image segmentation.
Findings
Outperforms previous state-of-the-art methods such as Swin UNETR.
Improves segmentation accuracy and data efficiency.
Demonstrates effective integration of multi-view information.
Abstract
Recent advancements in large-scale Vision Transformers have made significant strides in improving pre-trained models for medical image segmentation. However, these methods face a notable challenge in acquiring a substantial amount of pre-training data, particularly within the medical field. To address this limitation, we present Masked Multi-view with Swin Transformers (SwinMM), a novel multi-view pipeline for enabling accurate and data-efficient self-supervised medical image analysis. Our strategy harnesses the potential of multi-view information by incorporating two principal components. In the pre-training phase, we deploy a masked multi-view encoder devised to concurrently train masked multi-view observations through a range of diverse proxy tasks. These tasks span image reconstruction, rotation, contrastive learning, and a novel task that employs a mutual learning paradigm. This…
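As a rough illustration of what "masked multi-view observations" could look like in practice, the sketch below builds three views of a 3D volume and hides a fraction of non-overlapping patches in each. It is not the authors' implementation: the function names (`make_views`, `random_patch_mask`, `mask_view`), the use of axis permutations as views, the patch size, and the 50% mask ratio are all assumptions chosen for illustration, and the proxy-task losses are omitted.

```python
import numpy as np


def make_views(volume):
    """Return three axis-permuted copies of a (D, H, W) volume.

    Treating axis permutations as "views" is an assumption for illustration.
    """
    return {
        "axial": volume,
        "coronal": np.transpose(volume, (1, 0, 2)),
        "sagittal": np.transpose(volume, (2, 0, 1)),
    }


def random_patch_mask(shape, patch, ratio, rng):
    """Boolean voxel mask hiding about `ratio` of the non-overlapping patches.

    Assumes every dimension in `shape` is divisible by the matching patch size.
    """
    grid = tuple(s // p for s, p in zip(shape, patch))
    n_patches = int(np.prod(grid))
    n_masked = int(round(ratio * n_patches))
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True
    mask = flat.reshape(grid)
    # Expand each patch-level decision to voxel resolution.
    for axis, p in enumerate(patch):
        mask = np.repeat(mask, p, axis=axis)
    return mask


def mask_view(view, patch=(4, 4, 4), ratio=0.5, seed=0):
    """Zero out the masked patches of one view; return (masked_view, mask)."""
    rng = np.random.default_rng(seed)
    mask = random_patch_mask(view.shape, patch, ratio, rng)
    return np.where(mask, 0.0, view), mask


# Toy volume standing in for a CT or MRI crop (hypothetical size).
volume = np.random.default_rng(0).random((8, 16, 16))
views = make_views(volume)
masked = {name: mask_view(v, seed=i) for i, (name, v) in enumerate(views.items())}
```

In a pre-training loop, each masked view would be passed through a shared encoder, with the reconstruction target being the unmasked view; using a different seed per view means the views hide different regions, which is one plausible way for them to carry complementary information.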
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Radiomics and Machine Learning in Medical Imaging
Methods: Multi-Head Attention · Attention Is All You Need · Concatenated Skip Connection · Max Pooling · U-Net · Linear Layer · Softmax · Dense Connections · Convolution
