VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis

Zhihan Ju; Wanting Zhou

arXiv:2405.05667·eess.IV·May 10, 2024·6 cites

Open Access

TL;DR

VM-DDPM introduces a hybrid architecture that combines a CNN with a State Space Model (SSM) for medical image synthesis, achieving state-of-the-art results with linear computational complexity and improved structural texture consistency.

Contribution

The paper presents the first medical image synthesis model based on a hybrid SSM-CNN architecture, integrating multi-level feature extraction and a plug-and-play sequence regeneration strategy.

Findings

01 Achieves state-of-the-art performance on multiple datasets
02 Maintains linear computational complexity
03 Qualitative evaluation by radiologists confirms quality

Abstract

In the realm of smart healthcare, researchers enhance the scale and diversity of medical datasets through medical image synthesis. However, existing methods are limited by CNN local perception and Transformer quadratic complexity, making it difficult to balance structural texture consistency. To this end, we propose the Vision Mamba DDPM (VM-DDPM) based on State Space Model (SSM), fully combining CNN local perception and SSM global modeling capabilities, while maintaining linear computational complexity. Specifically, we designed a multi-level feature extraction module called Multi-level State Space Block (MSSBlock), and a basic unit of encoder-decoder structure called State Space Layer (SSLayer) for medical pathological images. Besides, we designed a simple, Plug-and-Play, zero-parameter Sequence Regeneration strategy for the Cross-Scan Module (CSM), which enabled the S6 module to…
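As a rough illustration of why an SSM scales linearly with sequence length, the sketch below runs a discretized, diagonal state space recurrence over a 1-D sequence: each step does a fixed amount of work, so total cost grows linearly in the number of tokens. This is not the paper's implementation. The function name `ssm_scan` and the parameters `a_bar`, `b_bar`, and `c` are hypothetical, and the parameters are held fixed here for simplicity, whereas the S6 module in Mamba makes them input-dependent.

```python
import numpy as np

def ssm_scan(x, a_bar, b_bar, c):
    """Run a discretized diagonal state space model over a 1-D sequence.

    Illustrative only (hypothetical names, fixed parameters):
      h_t = a_bar * h_{t-1} + b_bar * x_t
      y_t = c . h_t

    x:     (L,) input sequence
    a_bar: (N,) diagonal state transition
    b_bar: (N,) input projection
    c:     (N,) output projection
    Returns y with shape (L,).
    """
    h = np.zeros_like(a_bar, dtype=float)  # hidden state, starts at zero
    y = np.empty(len(x), dtype=float)
    for t, x_t in enumerate(x):            # one pass: O(L * N) total work
        h = a_bar * h + b_bar * x_t        # elementwise state update
        y[t] = c @ h                       # scalar readout for this step
    return y
```

With a one-dimensional state, an impulse input simply decays geometrically: `ssm_scan(np.array([1.0, 0.0, 0.0]), np.array([0.5]), np.array([1.0]), np.array([2.0]))` yields values 2.0, 1.0, 0.5. By contrast, self-attention compares every token with every other token, which is the source of the quadratic cost the abstract refers to.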

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques

Methods: Attention Is All You Need · Dropout · Label Smoothing · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Layer · Byte Pair Encoding · Adam