Unleashing Diffusion and State Space Models for Medical Image Segmentation

Rong Wu; Ziqi Chen; Liming Zhong; Heng Li; Hai Shu

arXiv:2506.12747 · cs.CV · July 2, 2025

TL;DR

This paper introduces DSM, a novel framework combining diffusion and state space models to improve medical image segmentation, especially for unseen tumors, by leveraging object queries, diffusion prompts, and CLIP embeddings for enhanced robustness.

Contribution

DSM is the first to integrate diffusion and state space models with object queries and CLIP embeddings for robust unseen tumor segmentation in medical imaging.

Findings

1. DSM outperforms existing models in unseen tumor segmentation tasks.

2. The model achieves higher accuracy and robustness across diverse datasets.

3. Diffusion-guided feature fusion enhances semantic segmentation performance.

Abstract

Existing segmentation models trained on a single medical imaging dataset often lack robustness when encountering unseen organs or tumors. Developing a robust model capable of identifying rare or novel tumor categories not present during training is crucial for advancing medical imaging applications. We propose DSM, a novel framework that leverages diffusion and state space models to segment unseen tumor categories beyond the training data. DSM utilizes two sets of object queries trained within modified attention decoders to enhance classification accuracy. Initially, the model learns organ queries using an object-aware feature grouping strategy to capture organ-level visual features. It then refines tumor queries by focusing on diffusion-based visual prompts, enabling precise segmentation of previously unseen tumors. Furthermore, we incorporate diffusion-guided feature fusion to improve…
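The abstract describes object queries that attend to visual features and are then used to segment regions. A minimal sketch of that general idea, in pure Python, might look like the following. It is not the DSM architecture: the function names, the toy feature values, and the single cross-attention step are all assumptions made for illustration, and the diffusion prompts, state space blocks, and CLIP embeddings are omitted.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(queries, features):
    # One cross-attention step (hypothetical simplification):
    # each query is replaced by an attention-weighted mean of the pixel features.
    d = len(queries[0])
    updated = []
    for q in queries:
        weights = softmax([dot(q, f) / math.sqrt(d) for f in features])
        updated.append(
            [sum(w * f[k] for w, f in zip(weights, features)) for k in range(d)]
        )
    return updated

def predict_masks(queries, features):
    # Assign each pixel to the query with the highest dot-product score.
    return [
        max(range(len(queries)), key=lambda i: dot(queries[i], f))
        for f in features
    ]

# Toy data (assumed values): six "pixels" with 2-D features in two clusters.
features = [[1.0, 0.1], [0.9, 0.0], [1.1, 0.2],
            [0.0, 1.0], [0.1, 0.9], [0.2, 1.1]]
# Two object queries, standing in for learned organ and tumor queries.
queries = [[1.0, 0.0], [0.0, 1.0]]

refined = cross_attend(queries, features)
labels = predict_masks(refined, features)
print(labels)
```

Each refined query drifts toward the cluster it already matches best, so the per-pixel assignment splits the toy features into two groups. In a trained model the queries and features would be learned rather than hand-set.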

Taxonomy

Topics: Neural Networks and Applications