Multiscale Structure-Guided Latent Diffusion for Multimodal MRI Translation

Jianqiang Lin (1, 2), Zhiqiang Shen (1, 2), Peng Cao (1, 2, 3), Jinzhu Yang (1, 2, 3), Osmar R. Zaiane (4), Xiaoli Liu (5) ((1) Northeastern University, Shenyang, China; (2) Key Laboratory of Intelligent Computing in Medical Image, Shenyang, China; (3) National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China; (4) University of Alberta, Edmonton, Canada; (5) AiShiWeiLai AI Research, Beijing, China)

arXiv:2603.12581 · eess.IV · March 16, 2026

Open Access

TL;DR

This paper introduces MSG-LDM, a latent diffusion framework for multi-modal MRI translation that preserves structural details and reduces anatomical inconsistencies by disentangling style and structure in the latent space.

Contribution

The proposed method combines style-structure disentanglement in the latent space with multi-scale feature modeling to improve MRI translation quality, especially in missing-modality scenarios.

Findings

01. Outperforms existing MRI synthesis methods on BraTS2020 and WMH datasets.
02. Effectively preserves boundary details and anatomical structures.
03. Reduces modality interference through style and structure loss functions.

Abstract

Although diffusion models have achieved remarkable progress in multi-modal magnetic resonance imaging (MRI) translation tasks, existing methods still tend to suffer from anatomical inconsistencies or degraded texture details when handling arbitrary missing-modality scenarios. To address these issues, we propose a latent diffusion-based multi-modal MRI translation framework, termed MSG-LDM. By leveraging the available modalities, the proposed method infers complete structural information, which preserves reliable boundary details. Specifically, we introduce a style-structure disentanglement mechanism in the latent space, which explicitly separates modality-specific style features from shared structural representations, and jointly models low-frequency anatomical layouts and high-frequency boundary details in a multi-scale feature space. During the structure disentanglement stage,…
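
The two separations named in the abstract (style versus structure, and low-frequency layout versus high-frequency boundary detail) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes, purely for illustration, that "style" is approximated by per-channel mean and standard deviation (as in instance normalization) and that the frequency split is a block average plus a residual. The function names `split_style_structure` and `split_frequency_bands` and the toy latent shape are hypothetical.

```python
import numpy as np


def split_style_structure(feat, eps=1e-5):
    """Split a (C, H, W) feature map into per-channel statistics ("style")
    and the instance-normalized remainder ("structure").

    Assumption: style is modeled by channel-wise mean/std only.
    """
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True) + eps
    return (mean, std), (feat - mean) / std


def split_frequency_bands(feat, factor=2):
    """Split a (C, H, W) feature map into a low-frequency band
    (block average, then nearest-neighbor upsampling) and the
    high-frequency residual. The two bands sum back to the input.
    """
    c, h, w = feat.shape
    if h % factor or w % factor:
        raise ValueError("H and W must be divisible by factor")
    # Average over non-overlapping factor x factor blocks.
    pooled = feat.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))
    # Upsample back to (H, W) by repeating each pooled value.
    low = pooled.repeat(factor, axis=1).repeat(factor, axis=2)
    return low, feat - low


# Toy latent standing in for an encoded MRI slice (hypothetical shape).
rng = np.random.default_rng(0)
latent = rng.normal(loc=3.0, scale=2.0, size=(4, 8, 8))

(style_mean, style_std), structure = split_style_structure(latent)
low, high = split_frequency_bands(structure)
```

In this toy setup the structure tensor is free of channel-wise intensity statistics, and `low + high` reconstructs it exactly, so each band could in principle be supervised separately. How MSG-LDM actually defines its scales and losses is not specified in the truncated abstract above.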

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications