Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image Segmentation
Md Mostafijur Rahman, Radu Marculescu

TL;DR
This paper introduces MERIT, a multi-scale hierarchical vision transformer backbone paired with a cascaded attention decoder (CASCADE) and a multi-stage feature mixing loss aggregation (MUTATION), reporting better medical image segmentation performance and generalization than existing methods.
Contribution
The paper proposes a multi-scale hierarchical transformer backbone and a cascaded attention decoder, along with a new loss aggregation method, enhancing segmentation accuracy and model robustness.
Findings
MERIT outperforms state-of-the-art methods on the Synapse Multi-organ and ACDC benchmarks.
Computing self-attention at multiple scales improves generalization in medical image segmentation.
The MUTATION loss aggregation improves model training through implicit ensembling.
Abstract
Transformers have shown great success in medical image segmentation. However, transformers may exhibit a limited generalization ability due to the underlying single-scale self-attention (SA) mechanism. In this paper, we address this issue by introducing a Multi-scale hiERarchical vIsion Transformer (MERIT) backbone network, which improves the generalizability of the model by computing SA at multiple scales. We also incorporate an attention-based decoder, namely Cascaded Attention Decoding (CASCADE), for further refinement of multi-stage features generated by MERIT. Finally, we introduce an effective multi-stage feature mixing loss aggregation (MUTATION) method for better model training via implicit ensembling. Our experiments on two widely used medical image segmentation benchmarks (i.e., Synapse Multi-organ, ACDC) demonstrate the superior performance of MERIT over state-of-the-art…
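The abstract describes MUTATION only as a multi-stage feature mixing loss aggregation that trains via implicit ensembling. The sketch below is one possible reading of that idea, not the authors' implementation: it assumes the decoder emits one logit map per stage, mixes stages by adding their logits, and sums a loss over every non-empty combination of stages. The function names, the additive mixing, and the use of binary cross-entropy are all assumptions made for illustration.

```python
from itertools import combinations
import math


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over flat lists of logits and 0/1 targets."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z*t + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def mixed_stage_loss(stage_logits, targets, loss_fn=binary_cross_entropy):
    """Hypothetical helper: sum loss_fn over all non-empty stage combinations.

    stage_logits: list of per-stage logit lists, all the same length as targets.
    Returns (summed loss, number of combinations used).
    """
    n = len(stage_logits)
    total = 0.0
    count = 0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            # Assumed mixing rule: add the logits of the chosen stages element-wise.
            mixed = [
                sum(stage_logits[i][p] for i in subset)
                for p in range(len(targets))
            ]
            total += loss_fn(mixed, targets)
            count += 1
    return total, count


# Toy usage: two decoder stages, two pixels.
loss, num_terms = mixed_stage_loss([[2.0, -1.0], [0.5, -0.5]], [1, 0])
```

With n stages this sums 2^n - 1 loss terms, so each stage is supervised both alone and in combination with the others, which is one way to obtain an ensembling effect from a single model.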
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Medical Image Segmentation Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Dropout · Dense Connections · Layer Normalization · Linear Layer · Adam · Softmax · Residual Connection · Label Smoothing
