CSWin-UNet: Transformer UNet with Cross-Shaped Windows for Medical Image Segmentation

Xiao Liu; Peng Gao; Tao Yu; Fei Wang; Ru-Yue Yuan

arXiv:2407.18070 · eess.IV · September 20, 2024

TL;DR

CSWin-UNet is a novel Transformer-based U-shaped segmentation model that improves efficiency and accuracy in medical image segmentation by integrating cross-shaped window self-attention and a content-aware reassembly decoder.

Contribution

The paper incorporates CSWin self-attention, which attends along horizontal and vertical stripes, into a U-shaped network and pairs it with a content-aware reassembly decoder, improving segmentation performance while keeping model complexity low.

Findings

01. Achieves high segmentation accuracy across multiple datasets.
02. Maintains low computational complexity compared to other Transformer models.
03. Demonstrates superior performance in diverse medical imaging scenarios.

Abstract

Deep learning, especially convolutional neural networks (CNNs) and Transformer architectures, has become the focus of extensive research in medical image segmentation, achieving impressive results. However, CNNs come with inductive biases that limit their effectiveness in more complex, varied segmentation scenarios. Conversely, while Transformer-based methods excel at capturing global and long-range semantic details, they suffer from high computational demands. In this study, we propose CSWin-UNet, a novel U-shaped segmentation method that incorporates the CSWin self-attention mechanism into the UNet to enable self-attention within horizontal and vertical stripes. This method significantly enhances both computational efficiency and receptive field interactions. Additionally, our innovative decoder utilizes a content-aware reassembly operator that strategically reassembles features, guided…
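
The stripe-based attention described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes the common cross-shaped window formulation in which half of the channels attend within horizontal stripes and the other half within vertical stripes, and it omits learned query/key/value projections, multiple heads, and positional encoding. The names `stripe_attention`, `cross_shaped_attention`, and the stripe width `sw` are hypothetical and chosen for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def stripe_attention(x, sw, horizontal):
    """Self-attention restricted to stripes of width `sw`.

    x: (H, W, C) feature map. Queries, keys and values are all x itself
    (no learned projections), a simplification made for illustration.
    """
    if not horizontal:
        # Vertical stripes are horizontal stripes of the transposed map.
        return stripe_attention(x.transpose(1, 0, 2), sw, True).transpose(1, 0, 2)
    H, W, C = x.shape
    assert H % sw == 0, "stripe width must divide the striped axis"
    # Each stripe spans `sw` rows and the full width: (H // sw, sw * W, C).
    tokens = x.reshape(H // sw, sw * W, C)
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(C)
    out = softmax(scores) @ tokens
    return out.reshape(H, W, C)


def cross_shaped_attention(x, sw):
    """Half the channels use horizontal stripes, the other half vertical ones."""
    C = x.shape[-1]
    h_out = stripe_attention(x[..., : C // 2], sw, horizontal=True)
    v_out = stripe_attention(x[..., C // 2 :], sw, horizontal=False)
    return np.concatenate([h_out, v_out], axis=-1)


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))  # toy 8x8 feature map, 16 channels
out = cross_shaped_attention(feat, sw=2)
print(out.shape)  # (8, 8, 16)
```

In this sketch each token attends to the `sw * W` (or `sw * H`) tokens of its own stripe rather than to all `H * W` tokens, which is where the reduced attention cost comes from; concatenating the two channel groups gives every position a cross-shaped receptive field.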

Code & Models

Repositories

eatbeanss/CSWin-UNet
PyTorch · Official

Taxonomy

Methods: Byte Pair Encoding · Layer Normalization · Focus · Label Smoothing · Linear Layer · Softmax · Attention Is All You Need · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention