FermatSyn: SAM2-Enhanced Bidirectional Mamba with Isotropic Spiral Scanning for Multi-Modal Medical Image Synthesis
Feng Yuan

TL;DR
FermatSyn introduces a novel multi-modal medical image synthesis framework that combines domain-aware anatomical priors, hierarchical feature fusion, and an isotropic spiral scanning strategy to improve global consistency and local detail fidelity.
Contribution
The paper presents FermatSyn, integrating SAM2-based priors, a hierarchical residual downsampling module, and a Fermat spiral scanning approach for enhanced multi-modal medical image synthesis.
Findings
Outperforms state-of-the-art methods on PSNR, SSIM, FID, and 3D consistency.
Synthesized images enable downstream segmentation comparable to real images.
Reduces directional bias with isotropic receptive fields.
Abstract
Multi-modal medical image synthesis is pivotal for alleviating clinical data scarcity, yet existing methods fail to reconcile global anatomical consistency with high-fidelity local detail. We propose FermatSyn, which addresses three persistent limitations: (1) a SAM2-based Prior Encoder that injects domain-aware anatomical knowledge via LoRA efficient fine-tuning of a frozen SAM2 Vision Transformer; (2) a Hierarchical Residual Downsampling Module (HRDM) coupled with a Cross-scale Integration Network (CIN) that preserves high-frequency lesion details and adaptively fuses global-local representations; and (3) a continuity-constrained Fermat Spiral Scanning strategy within a Bidirectional Fermat Scan Mamba (BFS-Mamba), constructing an approximately isotropic receptive field that substantially reduces the directional bias of raster or spiral serialization. Experiments on…
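To make the scanning idea in point (3) concrete, here is a minimal sketch of how a 2D token grid could be serialized along a Fermat spiral (r = a·sqrt(θ)) and then read in both directions. It is an illustration under stated assumptions, not the authors' BFS-Mamba implementation: the function names, the parameters `a` and `dtheta`, the nearest-cell rounding, and the fallback for cells the sampled curve misses are all hypothetical choices, and the paper's continuity constraint is not reproduced here.

```python
import math


def fermat_spiral_order(height, width, a=0.5, dtheta=0.02):
    """Return a permutation of flat indices for a height x width grid,
    ordered by first visit along a sampled Fermat spiral r = a * sqrt(theta).

    Hypothetical sketch: `a` and `dtheta` are illustrative parameters,
    not values taken from the paper.
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    r_max = math.hypot(cy, cx) + 1.0          # radius that covers the corners
    theta_max = (r_max / a) ** 2              # invert r = a * sqrt(theta)
    steps = int(theta_max / dtheta) + 1

    order, seen = [], set()
    for k in range(steps + 1):
        theta = k * dtheta
        radius = a * math.sqrt(theta)
        row = int(round(cy + radius * math.sin(theta)))
        col = int(round(cx + radius * math.cos(theta)))
        inside = 0 <= row < height and 0 <= col < width
        if inside and (row, col) not in seen:
            seen.add((row, col))
            order.append(row * width + col)

    # Assumption: any cell the sampled curve missed is appended by
    # increasing distance from the centre so the result is a full permutation.
    missing = [(i, j) for i in range(height) for j in range(width)
               if (i, j) not in seen]
    missing.sort(key=lambda ij: math.hypot(ij[0] - cy, ij[1] - cx))
    order.extend(i * width + j for i, j in missing)
    return order


def bidirectional_sequences(tokens, order):
    """Serialize tokens along `order` and also return the reversed sequence,
    as a bidirectional sequence model would consume them."""
    forward = [tokens[i] for i in order]
    backward = forward[::-1]
    return forward, backward


if __name__ == "__main__":
    order = fermat_spiral_order(8, 8)
    fwd, bwd = bidirectional_sequences(list(range(64)), order)
    print(fwd[:8])
```

Because the ordering starts at the grid centre and winds outward, tokens that are adjacent in the sequence tend to be spatially close in every direction, which is the intuition behind an approximately isotropic receptive field; a raster scan, by contrast, favours the row direction.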
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Advanced Neural Network Applications
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces
