Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image Segmentation

Md Mostafijur Rahman; Radu Marculescu

arXiv:2303.16892 · cs.CV · March 30, 2023 · 28 cites

TL;DR

This paper introduces MERIT, a multi-scale hierarchical vision transformer backbone, together with a cascaded attention decoder (CASCADE) and a multi-stage loss aggregation method (MUTATION). The authors report better segmentation performance than state-of-the-art methods on the Synapse Multi-organ and ACDC benchmarks.

Contribution

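The paper proposes three components: a multi-scale hierarchical transformer backbone (MERIT) that computes self-attention at multiple scales, a cascaded attention decoder (CASCADE) that refines the backbone's multi-stage features, and a multi-stage feature mixing loss aggregation method (MUTATION) that aims to improve training via implicit ensembling.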

Findings

01. MERIT outperforms state-of-the-art methods on two widely used benchmarks, Synapse Multi-organ and ACDC.

02. Computing self-attention at multiple scales improves generalization in medical image segmentation.

03. The MUTATION loss aggregation improves model training through implicit ensembling.

Abstract

Transformers have shown great success in medical image segmentation. However, transformers may exhibit a limited generalization ability due to the underlying single-scale self-attention (SA) mechanism. In this paper, we address this issue by introducing a Multi-scale hiERarchical vIsion Transformer (MERIT) backbone network, which improves the generalizability of the model by computing SA at multiple scales. We also incorporate an attention-based decoder, namely Cascaded Attention Decoding (CASCADE), for further refinement of multi-stage features generated by MERIT. Finally, we introduce an effective multi-stage feature mixing loss aggregation (MUTATION) method for better model training via implicit ensembling. Our experiments on two widely used medical image segmentation benchmarks (i.e., Synapse Multi-organ, ACDC) demonstrate the superior performance of MERIT over state-of-the-art…
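
The abstract does not spell out how MUTATION mixes the stage outputs. One plausible reading, sketched here as an assumption and not as the authors' implementation, is to form every non-empty subset of the per-stage prediction maps, add the maps in each subset, and sum the loss over all of the mixed predictions. The names `mixed_loss` and `squared_error` are hypothetical, and scalars stand in for prediction maps to keep the example dependency-free; the same function should work with any array type that supports `+`.

```python
from functools import reduce
from itertools import combinations
import operator


def mixed_loss(stage_preds, target, loss_fn):
    """Sum loss_fn over every non-empty subset of stage predictions.

    Assumption: "mixing" means adding the prediction maps in a subset.
    Returns the total loss and the number of subsets that were scored.
    """
    total = 0.0
    n_subsets = 0
    for k in range(1, len(stage_preds) + 1):
        for subset in combinations(stage_preds, k):
            mixed = reduce(operator.add, subset)  # element-wise sum of the subset
            total = total + loss_fn(mixed, target)
            n_subsets += 1
    return total, n_subsets


def squared_error(pred, target):
    # Toy loss for illustration; a segmentation loss would go here instead.
    return (pred - target) ** 2


# Two toy "stages": subsets are {1.0}, {2.0}, and {1.0, 2.0}.
total, n_subsets = mixed_loss([1.0, 2.0], 3.0, squared_error)
```

With n stages this scores 2^n - 1 mixed predictions, so the cost grows quickly with the number of decoder stages; under this reading, each mixed prediction acts like one member of an implicit ensemble.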

Code & Models

Repositories

SLDGroup/MERIT (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Medical Image Segmentation Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Dropout · Dense Connections · Layer Normalization · Linear Layer · Adam · Softmax · Residual Connection · Label Smoothing