SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation

Yiqing Wang; Zihan Li; Jieru Mei; Zihao Wei; Li Liu; Chen Wang; Shengtian Sang; Alan Yuille; Cihang Xie; Yuyin Zhou

arXiv:2307.12591 · cs.CV · July 25, 2023 · 1 citation

TL;DR

SwinMM introduces a multi-view, self-supervised learning approach using Swin Transformers to improve 3D medical image segmentation accuracy and data efficiency, especially when limited pre-training data is available.

Contribution

The paper proposes a novel masked multi-view encoder and cross-view decoder framework that leverages diverse proxy tasks and mutual learning for enhanced medical image segmentation.

Findings

01 Outperforms previous state-of-the-art methods such as Swin UNETR.

02 Improves segmentation accuracy and data efficiency.

03 Demonstrates effective integration of multi-view information.

Abstract

Recent advancements in large-scale Vision Transformers have made significant strides in improving pre-trained models for medical image segmentation. However, these methods face a notable challenge in acquiring a substantial amount of pre-training data, particularly within the medical field. To address this limitation, we present Masked Multi-view with Swin Transformers (SwinMM), a novel multi-view pipeline for enabling accurate and data-efficient self-supervised medical image analysis. Our strategy harnesses the potential of multi-view information by incorporating two principal components. In the pre-training phase, we deploy a masked multi-view encoder devised to concurrently train masked multi-view observations through a range of diverse proxy tasks. These tasks span image reconstruction, rotation, contrastive learning, and a novel task that employs a mutual learning paradigm. This…
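As a rough illustration of what "masked multi-view observations" could look like in practice, the following NumPy sketch builds three axis-permuted views of a volume, applies a random in-plane rotation to each (giving a label for a rotation proxy task), and zeroes out random cubic patches (giving the input for a reconstruction proxy task). It is not taken from the official repository. The view definitions, the helper names (`make_views`, `random_rotate`, `mask_patches`, `masked_mse`), the patch size, and the masking ratio are all assumptions chosen for illustration, and the contrastive and mutual-learning tasks are not covered.

```python
import numpy as np


def make_views(volume):
    """Return three axis-permuted views of a 3D volume shaped (D, H, W).

    Assumption: a "view" is modeled here as a permutation of the axes.
    """
    return {
        "axial": volume,                              # (D, H, W)
        "coronal": np.transpose(volume, (1, 0, 2)),   # (H, D, W)
        "sagittal": np.transpose(volume, (2, 0, 1)),  # (W, D, H)
    }


def random_rotate(view, rng):
    """Rotate by k * 90 degrees in the last two axes; k is the rotation label."""
    k = int(rng.integers(0, 4))
    return np.rot90(view, k=k, axes=(1, 2)), k


def mask_patches(view, patch=4, ratio=0.5, rng=None):
    """Zero out a random subset of non-overlapping cubic patches.

    Returns the masked view and a boolean mask (True = masked voxel).
    """
    if rng is None:
        rng = np.random.default_rng()
    if any(s % patch for s in view.shape):
        raise ValueError("each dimension must be divisible by the patch size")
    grid = tuple(s // patch for s in view.shape)
    n_patches = grid[0] * grid[1] * grid[2]
    n_masked = int(round(ratio * n_patches))
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True
    # Expand the patch-level mask to voxel resolution.
    mask = flat.reshape(grid)
    for axis in range(3):
        mask = mask.repeat(patch, axis=axis)
    masked = view.copy()
    masked[mask] = 0
    return masked, mask


def masked_mse(prediction, target, mask):
    """Mean squared error restricted to the masked voxels."""
    if not mask.any():
        return 0.0
    diff = prediction[mask] - target[mask]
    return float(np.mean(diff ** 2))


# Illustrative usage on a synthetic 16x16x16 volume.
rng = np.random.default_rng(0)
volume = rng.random((16, 16, 16))
for name, view in make_views(volume).items():
    rotated, label = random_rotate(view, rng)
    masked, mask = mask_patches(rotated, patch=4, ratio=0.5, rng=rng)
    # A real encoder-decoder would predict `rotated` from `masked`;
    # here the masked input itself stands in for a prediction.
    loss = masked_mse(masked, rotated, mask)
    print(name, "rotation label:", label, "masked fraction:", mask.mean(), "loss:", loss)
```

In this sketch each view is masked independently, so different regions are hidden in different views. That is one way multi-view inputs could carry complementary information, although the exact masking scheme used by SwinMM is not specified in the text above.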

Code & Models

Repositories

ucsc-vlaa/swinmm
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Radiomics and Machine Learning in Medical Imaging

Methods: Multi-Head Attention · Attention Is All You Need · Concatenated Skip Connection · Max Pooling · U-Net · Linear Layer · Softmax · Dense Connections · Convolution