TL;DR
This paper introduces DSVM-UNet, a simple dual self-distillation method that enhances VM-UNet for medical image segmentation, achieving state-of-the-art results without complex architectural changes.
Contribution
The paper proposes a novel dual self-distillation approach to improve VM-UNet's performance efficiently, avoiding complex architectural modifications.
Findings
Achieves state-of-the-art performance on ISIC2017, ISIC2018, and Synapse benchmarks.
Maintains computational efficiency while improving segmentation accuracy.
Aligns features at both the global and local levels through dual self-distillation.
Abstract
Vision Mamba models, which address the limitations of previous models by handling long-range dependencies with linear-time overhead, have been extensively researched in various fields. Several studies have further designed Vision Mamba models based on UNet (VM-UNet) for medical image segmentation. These approaches primarily focus on optimizing architectural designs by creating more complex structures to enhance the model's ability to perceive semantic features. In this paper, we propose a simple yet effective approach, Dual Self-distillation for VM-UNet (DSVM-UNet), that improves the model without any complex architectural designs. To achieve this goal, we develop dual self-distillation methods to align the features at both the global and local levels. Extensive experiments conducted on the ISIC2017, ISIC2018, and Synapse benchmarks demonstrate that our approach…
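The abstract does not give the loss formulation, so the following is only a minimal sketch of one plausible reading of "aligning features at both the global and local levels". It assumes that each stage of the network yields a feature vector already pooled and projected to a common length, that the global term pulls every shallower stage toward the deepest stage, and that the local term pulls each stage toward its immediate deeper neighbour. The function names, the use of mean squared error, and the weights `alpha` and `beta` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def global_distill_loss(stage_feats: List[Vector]) -> float:
    """Assumed global term: every shallower stage is pulled toward the deepest stage."""
    if len(stage_feats) < 2:
        raise ValueError("need at least two stages")
    teacher = stage_feats[-1]  # in real training the teacher would be gradient-detached
    students = stage_feats[:-1]
    return sum(mse(s, teacher) for s in students) / len(students)


def local_distill_loss(stage_feats: List[Vector]) -> float:
    """Assumed local term: each stage is pulled toward its immediate deeper neighbour."""
    if len(stage_feats) < 2:
        raise ValueError("need at least two stages")
    pairs = list(zip(stage_feats[:-1], stage_feats[1:]))
    return sum(mse(student, teacher) for student, teacher in pairs) / len(pairs)


def total_loss(seg_loss: float, stage_feats: List[Vector],
               alpha: float = 0.1, beta: float = 0.1) -> float:
    """Segmentation loss plus the two weighted self-distillation terms (weights are hypothetical)."""
    return (seg_loss
            + alpha * global_distill_loss(stage_feats)
            + beta * local_distill_loss(stage_feats))


if __name__ == "__main__":
    # Three toy stages, shallow to deep, each a 2-dimensional feature vector.
    feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    print(global_distill_loss(feats), local_distill_loss(feats), total_loss(0.5, feats))
```

Both terms use only the network's own intermediate features as targets, so a scheme of this shape adds training-time loss terms without adding inference-time parameters. That is one way a self-distillation method could avoid architectural changes.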
