DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation
Ailiang Lin, Bingzhi Chen, Jiayu Xu, Zheng Zhang, Guangming Lu

TL;DR
This paper introduces DS-TransUNet, a novel dual Swin Transformer-based U-Net architecture that effectively captures long-range dependencies and structural features for improved medical image segmentation.
Contribution
The authors describe it as possibly the first framework to integrate hierarchical Swin Transformers into both the encoder and the decoder of a U-Net, with the aim of improving segmentation performance.
Findings
Significantly outperforms state-of-the-art methods across four medical segmentation tasks.
Uses dual-scale encoders to capture coarse and fine features effectively.
Introduces Transformer Interactive Fusion for global feature dependencies.
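The dual-scale idea in the findings above can be illustrated with a minimal sketch. It only shows how one image yields a fine and a coarse token sequence when it is divided with two patch sizes. The function name `patch_partition`, the 16×16 toy image, and the patch sizes 4 and 8 are assumptions chosen for illustration. This is not the authors' implementation, and it omits the Swin Transformer blocks and the fusion step.

```python
def patch_partition(image, patch_size):
    """Split a 2D image (list of rows) into non-overlapping square patches.

    Each patch is flattened row by row into one token (a flat list of pixels).
    Hypothetical helper for illustration only.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            token = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            tokens.append(token)
    return tokens


# Toy 16x16 single-channel image whose pixel value encodes its position.
image = [[row * 16 + col for col in range(16)] for row in range(16)]

# Assumed patch sizes: a small one for the fine branch, a large one for the coarse branch.
fine_tokens = patch_partition(image, 4)
coarse_tokens = patch_partition(image, 8)

print(len(fine_tokens), len(fine_tokens[0]))      # 16 tokens of 16 pixels each
print(len(coarse_tokens), len(coarse_tokens[0]))  # 4 tokens of 64 pixels each
```

The smaller patch size produces more, shorter tokens that keep finer spatial detail, while the larger one produces fewer tokens that each cover a wider region. A dual-encoder design would process both sequences and then combine them.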
Abstract
Automatic medical image segmentation has made great progress, benefiting from the development of deep learning. However, most existing methods are based on convolutional neural networks (CNNs), which fail to build long-range dependencies and global context connections because of the limited receptive field of the convolution operation. Inspired by the success of the Transformer in modeling long-range contextual information, some researchers have expended considerable effort in designing robust variants of Transformer-based U-Net. Moreover, the patch division used in vision transformers usually ignores the pixel-level intrinsic structural features inside each patch. To alleviate these problems, we propose a novel deep medical image segmentation framework called Dual Swin Transformer U-Net (DS-TransUNet), which might be the first attempt to concurrently incorporate the advantages of…
Taxonomy
Topics: Advanced Neural Network Applications · COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Stochastic Depth · Swin Transformer · Concatenated Skip Connection · Byte Pair Encoding · Adam
