X-Diffusion: Generating Detailed 3D MRI Volumes From a Single Image Using Cross-Sectional Diffusion Models
Emmanuelle Bourigault, Abdullah Hamdi, Amir Jamaludin

TL;DR
X-Diffusion introduces a cross-sectional diffusion model that reconstructs detailed 3D MRI volumes from minimal 2D input slices, outperforming existing methods and generalizing across different body parts, thereby potentially reducing MRI scan times and costs.
Contribution
It is the first approach to generate detailed 3D MRI volumes from extremely sparse 2D slices using a holistic cross-sectional diffusion model.
Findings
Surpasses state-of-the-art methods in PSNR on unseen data.
Preserves critical anatomical features such as tumors and brain structures.
Successfully generalizes to knee MRIs despite training only on brain data.
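The PSNR metric named in the findings can be illustrated with a minimal sketch. It assumes both volumes are flattened to equal-length sequences of intensities scaled to [0, max_value]; the function name `psnr` and the `max_value` parameter are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio (dB) between two flattened volumes.

    Assumption: both inputs are equal-length sequences of voxel
    intensities in the range [0, max_value].
    """
    if len(reference) != len(reconstruction):
        raise ValueError("volumes must contain the same number of voxels")
    if len(reference) == 0:
        raise ValueError("volumes must not be empty")
    # Mean squared error over all voxels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical volumes
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference; identical volumes give an infinite PSNR.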
Abstract
Magnetic Resonance Imaging (MRI) is a crucial diagnostic tool, but high-resolution scans are often slow and expensive due to extensive data acquisition requirements. Traditional MRI reconstruction methods aim to expedite this process by filling in missing frequency components in k-space, performing 3D-to-3D reconstructions that demand full 3D scans. In contrast, we introduce X-Diffusion, a novel cross-sectional diffusion model that reconstructs detailed 3D MRI volumes from extremely sparse spatial-domain inputs, achieving 2D-to-3D reconstruction from as few as a single 2D MRI slice or a few slices. A key aspect of X-Diffusion is that it models MRI data as holistic 3D volumes during cross-sectional training and inference, unlike previous learning approaches that treat MRI scans as collections of 2D slices in standard planes (coronal, axial, sagittal). We evaluated X-Diffusion on…
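To make the "standard planes" mentioned in the abstract concrete, here is a small sketch of how 2D slices relate to a 3D volume. It assumes the volume is a nested list indexed as `volume[z][y][x]`, with axial slices fixing z, coronal slices fixing y, and sagittal slices fixing x. This axis convention and the function names are assumptions for illustration; they do not describe the paper's data layout or method.

```python
def axial_slice(volume, z):
    """2D slice at a fixed z index (assumed axial plane)."""
    return [row[:] for row in volume[z]]


def coronal_slice(volume, y):
    """2D slice at a fixed y index (assumed coronal plane)."""
    return [plane[y][:] for plane in volume]


def sagittal_slice(volume, x):
    """2D slice at a fixed x index (assumed sagittal plane)."""
    return [[row[x] for row in plane] for plane in volume]


def stack_axial(slices):
    """Rebuild a volume by stacking axial slices along z."""
    return [[row[:] for row in s] for s in slices]


# Tiny 2x2x2 example where each voxel encodes its own (z, y, x) position.
volume = [[[100 * z + 10 * y + x for x in range(2)] for y in range(2)] for z in range(2)]
rebuilt = stack_axial([axial_slice(volume, z) for z in range(len(volume))])
```

A stack of slices along one axis fully determines the volume, which is why a slice-wise view and a volumetric view describe the same data; the 2D-to-3D task described in the abstract starts from far fewer slices than this.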
Peer Reviews
Decision · Submitted to ICLR 2025
The authors demonstrate superior performance in several benchmarking tasks, including brain tumor and full-body MRIs. The authors claim that the proposed model is able to generalize to novel domains not seen during training. The problem, motivation, and contributions are clearly stated. Thorough experiments were conducted, methodically supporting the claims in the paper. The idea of conditioning MRI reconstruction on different cross-sectional viewpoints seems novel.
There are no measurements of how fast X-Diffusion is. It would be beneficial to include comparisons with previous approaches. E.g., Figure 3 shows that almost T=1000 steps are required for good-quality reconstruction. The limitations discussed in Sections 6 and 7 appear to be strong, potentially limiting the practical value of the work. The authors dismiss the study of whether it is possible to reduce the number of diffusion steps T without sacrificing quality. And how would this reductio
Originality: While the proposed method is built upon recent advances in computer vision, its combination of key components and area of application is novel. In particular, the cross-sectional MRI synthesis approach, which enables the stacking of slices from arbitrary viewing directions, not just axial, coronal, and sagittal planes, is of particular interest to the medical imaging community and constitutes, to the best of my knowledge, a unique contribution of the paper. Quality: The paper is writ
1) Clinical relevance of the presented method: While the clinicians in this paper argue that the model is clinically relevant, I disagree. While "scout" or "localizer" scans with a single or a few slices may be acquired and thus used for X-Diffusion, I believe it is dangerous to assume that these will substitute for a real volumetric scan. While the authors argue that tumor information is preserved, I would rather say this information is correctly hallucinated. Especially small
- Both the approach and the application appear quite novel.
- Many experiments were performed.
- The experiments are confusing. It is as if the authors tried all they could think of and let the readers do what they can with the results.
- A major point that is often unclear in the experiments/results is which slice(s) were used as input. Although the authors show in Figure VI that it has an impact on the results, it is not specified in other sections, e.g. 5.1, 5.2, 5.3, 5.4.
- The way the authors split the BraTS dataset is not clear. On the one hand, they state that the dataset includes 5,8
Taxonomy
Topics: MRI in cancer diagnosis · Radiomics and Machine Learning in Medical Imaging · Medical Image Segmentation Techniques
Methods: Diffusion
