Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation
Hu Cao, Yueyue Wang, Joy Chen, Dongsheng Jiang, Xiaopeng Zhang, Qi Tian, Manning Wang

TL;DR
Swin-Unet introduces a pure Transformer-based U-shaped architecture for medical image segmentation, effectively capturing global context and outperforming traditional CNN-based methods on multiple tasks.
Contribution
This paper presents Swin-Unet, a novel pure Transformer architecture with hierarchical Swin Transformer modules for improved medical image segmentation.
Findings
Outperforms CNN-based and hybrid models on multi-organ segmentation
Effective global and local feature learning through hierarchical Swin Transformer
Achieves superior accuracy with direct 4× down-sampling of the input and 4× up-sampling of the output
Abstract
In the past few years, convolutional neural networks (CNNs) have achieved milestones in medical image analysis. Especially, the deep neural networks based on U-shaped architecture and skip-connections have been widely applied in a variety of medical image tasks. However, although CNN has achieved excellent performance, it cannot learn global and long-range semantic information interaction well due to the locality of the convolution operation. In this paper, we propose Swin-Unet, which is an Unet-like pure Transformer for medical image segmentation. The tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture with skip-connections for local-global semantic feature learning. Specifically, we use hierarchical Swin Transformer with shifted windows as the encoder to extract context features. And a symmetric Swin Transformer-based decoder with patch…
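The tokenization and shifted-window steps mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `patchify` and `window_partition` are hypothetical, and the 224×224 input, patch size of 4, and window size of 7 are assumed values chosen for illustration. The learned linear embedding, the attention computation, and the attention mask used with shifted windows are all omitted; only the reshaping logic is shown.

```python
import numpy as np

def patchify(image, patch_size=4):
    """Split an (H, W, C) image into non-overlapping patches, one flat token each.

    patch_size=4 is an assumption; it yields a 4x down-sampled token grid.
    """
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    x = image.reshape(gh, patch_size, gw, patch_size, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, c)
    return x.reshape(gh, gw, patch_size * patch_size * c)

def window_partition(tokens, window_size=7, shift=0):
    """Group a (gh, gw, d) token grid into non-overlapping square windows.

    With shift > 0 the grid is cyclically rolled first, so the windows
    straddle the boundaries of the unshifted partition.
    """
    if shift:
        tokens = np.roll(tokens, shift=(-shift, -shift), axis=(0, 1))
    gh, gw, d = tokens.shape
    assert gh % window_size == 0 and gw % window_size == 0
    x = tokens.reshape(gh // window_size, window_size,
                       gw // window_size, window_size, d)
    x = x.transpose(0, 2, 1, 3, 4)  # (win_rows, win_cols, ws, ws, d)
    return x.reshape(-1, window_size * window_size, d)

# Assumed input size for illustration only.
image = np.random.rand(224, 224, 3)
tokens = patchify(image)                          # 56 x 56 grid of 48-dim tokens
regular = window_partition(tokens)                # 64 windows of 49 tokens
shifted = window_partition(tokens, shift=7 // 2)  # same shape, offset windows
print(tokens.shape, regular.shape, shifted.shape)
```

Self-attention would then be computed independently inside each window. Alternating the regular and shifted partitions between consecutive blocks lets information pass across window boundaries.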
Taxonomy
Topics: Advanced Neural Network Applications · Radiomics and Machine Learning in Medical Imaging · COVID-19 diagnosis using AI
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Stochastic Depth · Swin Transformer · Dropout · Dense Connections · Adam · Layer Normalization
