MaskMed: Decoupled Mask and Class Prediction for Medical Image Segmentation
Bin Xie, Gady Agam

TL;DR
MaskMed introduces a decoupled segmentation approach with a deformable transformer module, significantly improving medical image segmentation accuracy by enabling flexible feature sharing and full-scale spatial attention.
Contribution
The paper presents a novel decoupled segmentation head and a deformable transformer module, enhancing feature sharing and spatial alignment in medical image segmentation.
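To make the deformable transformer idea concrete, the following is a minimal single-head, single-scale, 2D sketch of deformable attention in NumPy. It is not the paper's module: the function names, the weight matrices `w_offset` and `w_attn`, the pixel-unit offsets, and the border clamping are all illustrative assumptions. The core idea it shows is that each query samples only a handful of learned locations around a reference point instead of attending to every position.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bilinear_sample(feat, y, x):
    """Bilinearly sample feat (C, H, W) at float pixel coords, clamped to the border."""
    H, W = feat.shape[1:]
    y = float(np.clip(y, 0, H - 1))
    x = float(np.clip(x, 0, W - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * feat[:, y0, x0]
            + (1 - wy) * wx * feat[:, y0, x1]
            + wy * (1 - wx) * feat[:, y1, x0]
            + wy * wx * feat[:, y1, x1])

def deformable_attention(queries, ref_points, value_map, w_offset, w_attn):
    """
    queries:    (N, C) query features (e.g. from a low-resolution map)
    ref_points: (N, 2) normalized (y, x) reference points in [0, 1]
    value_map:  (C, H, W) feature map to sample from (e.g. high resolution)
    w_offset:   (C, K * 2) hypothetical projection predicting K offsets per query
    w_attn:     (C, K) hypothetical projection predicting K attention logits
    """
    N, C = queries.shape
    K = w_attn.shape[1]
    H, W = value_map.shape[1:]
    offsets = (queries @ w_offset).reshape(N, K, 2)   # offsets in pixels (assumption)
    attn = softmax(queries @ w_attn, axis=-1)         # weights over the K samples
    out = np.zeros((N, C))
    for n in range(N):
        base_y = ref_points[n, 0] * (H - 1)
        base_x = ref_points[n, 1] * (W - 1)
        for k in range(K):
            sample = bilinear_sample(value_map,
                                     base_y + offsets[n, k, 0],
                                     base_x + offsets[n, k, 1])
            out[n] += attn[n, k] * sample
    return out

# Toy usage: 5 queries, 8 channels, 4 sampling points, a 16x16 value map.
rng = np.random.default_rng(0)
queries = rng.normal(size=(5, 8))
ref_points = rng.uniform(size=(5, 2))
value_map = rng.normal(size=(8, 16, 16))
w_offset = rng.normal(size=(8, 4 * 2))
w_attn = rng.normal(size=(8, 4))
out = deformable_attention(queries, ref_points, value_map, w_offset, w_attn)
print(out.shape)
```

The cost per query scales with the number of sampling points K rather than with the size of the value map, which is why this style of attention is attractive for high-resolution volumes.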
Findings
Surpasses nnUNet by +2.0% Dice on AMOS 2022
Surpasses nnUNet by +6.9% Dice on BTCV
Demonstrates state-of-the-art segmentation performance
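For readers unfamiliar with the metric behind these numbers, the Dice coefficient for one class is 2|A ∩ B| / (|A| + |B|), where A is the predicted region and B the ground-truth region. The snippet below is a generic sketch of that formula, not the paper's evaluation code. The `eps` smoothing term and the function name are assumptions.

```python
import numpy as np

def dice_score(pred, target, label, eps=1e-8):
    """Dice overlap for a single class label between two integer label maps."""
    p = (pred == label)
    t = (target == label)
    intersection = np.logical_and(p, t).sum()
    # eps avoids division by zero when the class is absent from both maps
    return (2.0 * intersection + eps) / (p.sum() + t.sum() + eps)

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
d = dice_score(pred, target, label=1)
print(round(float(d), 4))
```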
Abstract
Medical image segmentation typically adopts a point-wise convolutional segmentation head to predict dense labels, where each output channel is heuristically tied to a specific class. This rigid design limits both feature sharing and semantic generalization. In this work, we propose a unified decoupled segmentation head that separates multi-class prediction into class-agnostic mask prediction and class label prediction using shared object queries. Furthermore, we introduce a Full-Scale Aware Deformable Transformer module that enables low-resolution encoder features to attend across full-resolution encoder features via deformable attention, achieving memory-efficient and spatially aligned full-scale fusion. Our proposed method, named MaskMed, achieves state-of-the-art performance, surpassing nnUNet by +2.0% Dice on AMOS 2022 and +6.9% Dice on BTCV.
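The decoupling described in the abstract can be illustrated with a small NumPy sketch. Each shared query produces both a class distribution and a class-agnostic mask, and the two are combined into a semantic label map. The abstract does not specify how the two predictions are merged, so the combination rule here (class probabilities times mask probabilities, summed over queries, in the style of MaskFormer-type inference) is an assumption. So are the extra "no object" class, the linear projections `w_class` and `w_mask`, and all names.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decoupled_head(queries, voxel_features, w_class, w_mask):
    """
    queries:        (N, C) shared object queries
    voxel_features: (C, D, H, W) per-voxel decoder features
    w_class:        (C, K + 1) hypothetical class projection; last column = "no object"
    w_mask:         (C, C) hypothetical mask-embedding projection
    """
    # Class branch: one distribution over K classes (+ "no object") per query.
    class_logits = queries @ w_class                          # (N, K + 1)
    # Mask branch: one class-agnostic mask per query via a dot product
    # between the query's mask embedding and every voxel feature.
    mask_embed = queries @ w_mask                             # (N, C)
    mask_logits = np.einsum("nc,cdhw->ndhw", mask_embed, voxel_features)
    # Assumed merge rule: weight each mask by its class probabilities.
    class_probs = softmax(class_logits, axis=-1)[:, :-1]      # drop "no object"
    mask_probs = sigmoid(mask_logits)
    semantic = np.einsum("nk,ndhw->kdhw", class_probs, mask_probs)
    return class_logits, mask_logits, semantic.argmax(axis=0)

# Toy usage: 4 queries, 8 channels, 3 classes, a 2x5x5 volume.
rng = np.random.default_rng(0)
N, C, K = 4, 8, 3
queries = rng.normal(size=(N, C))
voxel_features = rng.normal(size=(C, 2, 5, 5))
w_class = rng.normal(size=(C, K + 1))
w_mask = rng.normal(size=(C, C))
class_logits, mask_logits, labels = decoupled_head(
    queries, voxel_features, w_class, w_mask)
print(class_logits.shape, mask_logits.shape, labels.shape)
```

In contrast to a point-wise convolutional head, no output channel is hard-wired to a class here: any query can be assigned any label, and the number of queries N need not equal the number of classes K.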
Taxonomy
Topics: Advanced Neural Network Applications · COVID-19 diagnosis using AI · Medical Imaging and Analysis
