TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L. Yuille, Yuyin Zhou

TL;DR
TransUNet combines Transformers and U-Net architectures to improve global context understanding and localization in medical image segmentation, achieving superior results across multiple medical imaging tasks.
Contribution
The paper introduces TransUNet, a novel architecture that integrates Transformers as encoders with U-Net for enhanced medical image segmentation.
Findings
Outperforms existing methods on multi-organ segmentation
Achieves superior accuracy in cardiac segmentation
Effectively combines global context with local details
Abstract
Medical image segmentation is an essential prerequisite for developing healthcare systems, especially for disease diagnosis and treatment planning. On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard and achieved tremendous success. However, due to the intrinsic locality of convolution operations, U-Net generally demonstrates limitations in explicitly modeling long-range dependency. Transformers, designed for sequence-to-sequence prediction, have emerged as alternative architectures with innate global self-attention mechanisms, but can result in limited localization abilities due to insufficient low-level details. In this paper, we propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation. On one hand, the Transformer encodes tokenized image patches from a…
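The two ideas the abstract names, tokenized image patches and global self-attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`tokenize_patches`, `self_attention`), the 32x32 input, the patch size of 8, the projection width of 16, and the random weights are all assumptions chosen for illustration, and the sketch omits position encodings, multi-head attention, and the U-Net decoder entirely.

```python
import numpy as np


def tokenize_patches(image, patch_size):
    """Split an (H, W, C) array into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row.
    """
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens.

    Every token attends to every other token, which is what gives the
    encoder its global receptive field.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights


# Hypothetical toy input: a 32x32 image with 3 channels, cut into 8x8 patches.
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))
tokens = tokenize_patches(image, patch_size=8)

d_model = 16  # assumed projection width, for illustration only
w_q, w_k, w_v = (
    rng.standard_normal((tokens.shape[1], d_model)) * 0.02 for _ in range(3)
)
context, weights = self_attention(tokens, w_q, w_k, w_v)
```

Here `tokens` has one row per patch (16 rows of 192 values), and each row of `weights` is a distribution over all 16 patches, so a token in one corner of the image can draw on information from the opposite corner in a single layer. A convolution with a small kernel would need many stacked layers to cover the same distance, which is the locality limitation the abstract attributes to U-Net.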
Taxonomy
Topics: Advanced Neural Network Applications · Radiomics and Machine Learning in Medical Imaging · COVID-19 diagnosis using AI
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Attention Is All You Need · Dense Connections · Byte Pair Encoding · Softmax · Dropout · Concatenated Skip Connection · Label Smoothing
