STM-UNet: An Efficient U-shaped Architecture Based on Swin Transformer and Multi-scale MLP for Medical Image Segmentation
Lei Shi, Tianyu Gao, Zheng Zhang, Junxing Zhang

TL;DR
This paper introduces STM-UNet, a novel U-shaped architecture combining Swin Transformer and multi-scale MLP, which improves global feature modeling and segmentation accuracy in medical images while maintaining low complexity.
Contribution
The paper proposes a new U-shaped architecture integrating Swin Transformer and a novel PCAS-MLP for enhanced multi-scale feature extraction and segmentation performance.
Findings
Outperforms state-of-the-art methods in IoU and Dice scores.
Demonstrates effective global feature modeling with Swin Transformer.
Achieves a good balance between accuracy and model complexity.
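The IoU and Dice scores named above are overlap metrics between a predicted mask and a ground-truth mask. The sketch below uses their standard definitions for binary masks; the function name `dice_and_iou`, the flat 0/1 input format, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper's evaluation protocol.

```python
def dice_and_iou(pred, target):
    """Return (dice, iou) for two equal-length flat sequences of 0/1 labels.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# One overlapping pixel, two foreground pixels in each mask:
# dice = 2*1 / (2+2) = 0.5 and iou = 1 / 3.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

For a single pair of masks the two metrics are monotonically related by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically, although their averages over a dataset can differ.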
Abstract
Automated medical image segmentation can help doctors diagnose faster and more accurately. Deep-learning-based models for medical image segmentation have made great progress in recent years. However, existing models fail to leverage Transformer and MLP effectively to improve the U-shaped architecture efficiently. In addition, the multi-scale features of the MLP have not been fully extracted in the bottleneck of the U-shaped architecture. In this paper, we propose an efficient U-shaped architecture based on Swin Transformer and multi-scale MLP, namely STM-UNet. Specifically, the Swin Transformer block is added to the skip connection of STM-UNet in the form of a residual connection, which enhances the modeling of global features and long-range dependencies. Meanwhile, a novel PCAS-MLP with a parallel convolution module is designed and placed in the bottleneck of our architecture to…
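The residual form of the skip connection described in the abstract can be sketched as `output = features + block(features)`. The code below is a minimal illustration under loud assumptions: `self_attention` is plain single-head global attention with identity projections, used only as a stand-in for the Swin Transformer block (which uses shifted-window attention, learned projections, and an MLP), and `residual_skip` is a hypothetical helper name. None of this is the authors' implementation.

```python
import math


def self_attention(tokens):
    """Single-head global self-attention with identity Q/K/V projections.

    ASSUMPTION: a simplified stand-in for the Swin Transformer block,
    not the windowed attention the paper uses.
    tokens: list of equal-length float vectors, one per spatial position.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for query in tokens:
        # Scaled dot-product score of this query against every token.
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in tokens]
        # Numerically stable softmax over the scores.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of all tokens: each position sees global context.
        out.append([
            sum(w * value[d] for w, value in zip(weights, tokens)) / total
            for d in range(dim)
        ])
    return out


def residual_skip(tokens, block):
    """Return tokens + block(tokens), elementwise (residual connection)."""
    refined = block(tokens)
    return [[x + r for x, r in zip(tok, ref)] for tok, ref in zip(tokens, refined)]


# Encoder features flattened to 3 tokens of dimension 2 (toy values).
encoder_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
skip_features = residual_skip(encoder_tokens, self_attention)
```

Because the block's output is added to its input, the original encoder features pass through unchanged even if the block contributes little, and the output keeps the input's shape so it can be handed to the decoder as usual.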
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Brain Tumor Detection and Classification
Methods: Attention Is All You Need · Adam · Label Smoothing · Dropout · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Convolution · Softmax · Stochastic Depth
