Contextual Attention Network: Transformer Meets U-Net
Reza Azad, Moein Heidari, Yuli Wu, Dorit Merhof

TL;DR
This paper introduces a novel neural network that combines Transformer and U-Net architectures to improve medical image segmentation by capturing both long-range dependencies and local details, achieving state-of-the-art results.
Contribution
The proposed contextual attention network effectively integrates Transformer-based long-range context modeling with CNN local feature extraction, enhancing segmentation accuracy.
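To make the idea of combining a global branch with a local branch concrete, here is a minimal pure-Python sketch on a 1D signal. It is not the authors' architecture: the function names (`global_attention`, `local_conv`, `fuse`), the absence of learned projections, the fixed smoothing kernel, and the element-wise sum used for fusion are all assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def global_attention(x):
    """Single-head self-attention on a 1D sequence of scalars.

    Assumption: queries, keys and values are the raw inputs (no learned
    projections), so every output mixes information from every position.
    """
    out = []
    for q in x:
        # Dot-product scores against all positions (feature dim is 1).
        weights = softmax([q * k for k in x])
        out.append(sum(w * v for w, v in zip(weights, x)))
    return out

def local_conv(x, kernel=(0.25, 0.5, 0.25)):
    """3-tap convolution with zero padding: each output sees only its neighbours."""
    padded = [0.0] + list(x) + [0.0]
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]

def fuse(x):
    """Hypothetical fusion: element-wise sum of the global and local branches."""
    return [g + l for g, l in zip(global_attention(x), local_conv(x))]

impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
print("local :", local_conv(impulse))        # response stays next to the impulse
print("global:", global_attention(impulse))  # every position responds
print("fused :", fuse(impulse))
```

On the impulse input, the convolution branch is zero at the two end positions, while the attention branch is non-zero everywhere, which is the contrast between local detail and long-range context that the contribution describes.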
Findings
Achieves state-of-the-art performance on multiple medical image segmentation datasets.
Effectively models long-range dependencies and local details.
Improves boundary delineation in segmentation tasks.
Abstract
Currently, convolutional neural networks (CNN) (e.g., U-Net) have become the de facto standard and attained immense success in medical image segmentation. However, as a downside, CNN based methods are a double-edged sword as they fail to build long-range dependencies and global context connections due to the limited receptive field that stems from the intrinsic characteristics of the convolution operation. Hence, recent articles have exploited Transformer variants for medical image segmentation tasks which open up great opportunities due to their innate capability of capturing long-range correlations through the attention mechanism. Although being feasibly designed, most of the cohort studies incur prohibitive performance in capturing local information, thereby resulting in less lucidness of boundary areas. In this paper, we propose a contextual attention network to tackle the…
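The "limited receptive field" argument in the abstract can be illustrated with a short calculation. The sketch below assumes a plain stack of stride-1, undilated convolutions with no pooling; real U-Net variants downsample, which enlarges the receptive field faster, so the numbers are illustrative only. The helper names are hypothetical.

```python
def conv_receptive_field(num_layers, kernel_size=3):
    """Receptive field (pixels per axis) of stacked stride-1, undilated convolutions.

    Each layer adds (kernel_size - 1) pixels to the field of the previous one.
    """
    return 1 + num_layers * (kernel_size - 1)

def layers_to_cover(image_size, kernel_size=3):
    """Smallest number of such layers whose receptive field spans image_size pixels."""
    # Integer ceiling of (image_size - 1) / (kernel_size - 1).
    return -(-(image_size - 1) // (kernel_size - 1))

print(conv_receptive_field(5))   # five 3x3 layers
print(layers_to_cover(224))      # layers needed to span a 224-pixel axis
```

Under these assumptions the receptive field grows only linearly with depth, whereas a single self-attention layer relates every pair of positions directly.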
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · COVID-19 diagnosis using AI · AI in cancer detection
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Dense Connections · Residual Connection · Label Smoothing · Softmax · Absolute Position Encodings · Position-Wise Feed-Forward Layer
