Medical Transformer: Gated Axial-Attention for Medical Image Segmentation

Jeya Maria Jose Valanarasu; Poojan Oza; Ilker Hacihaliloglu; Vishal M. Patel

arXiv:2102.10662 · cs.CV · July 8, 2021

TL;DR

This paper introduces Medical Transformer, a novel gated axial-attention model with a local-global training strategy, effectively capturing long-range dependencies in medical images for improved segmentation performance, especially with limited data.

Contribution

The paper proposes a Gated Axial-Attention model and a Local-Global training strategy to enhance transformer-based medical image segmentation with limited datasets.

Findings

01. Outperforms existing convolutional and transformer-based models on three datasets.

02. Learns both global and local features effectively through the proposed training strategy.

03. Demonstrates the feasibility of transformer architectures for medical imaging with limited data.

Abstract

Over the past decade, Deep Convolutional Neural Networks have been widely adopted for medical image segmentation and shown to achieve adequate performance. However, due to the inherent inductive biases present in convolutional architectures, they lack understanding of long-range dependencies in the image. Recently proposed Transformer-based architectures that leverage the self-attention mechanism encode long-range dependencies and learn representations that are highly expressive. This motivates us to explore Transformer-based solutions and study the feasibility of using Transformer-based network architectures for medical image segmentation tasks. The majority of existing Transformer-based network architectures proposed for vision applications require large-scale datasets to train properly. However, compared to the datasets for vision applications, for medical imaging the number of data…

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Layer Normalization · Attention Is All You Need · Dense Connections · Softmax · Adam