Towards 3D Semantic Image Synthesis for Medical Imaging

Wenwu Tang; Khaled Seyam; Bin Yang

arXiv:2507.00206·eess.IV·July 2, 2025

TL;DR

This paper introduces Med-LSDM, a 3D semantic image synthesis model for medical imaging that generates volumetric data from de-identified semantic maps by running a diffusion process in latent space, with the aim of supporting privacy preservation and data augmentation.

Contribution

The study presents Med-LSDM, a novel 3D semantic image synthesis method leveraging latent diffusion and a guiding mechanism, specifically designed for medical volumetric data.

Findings

01

Achieves a 3D-FID score of 0.0054 on the Duke Breast dataset.

02

Produces synthetic images whose Dice scores are close to those of real data.

03

Demonstrates effective data augmentation for medical imaging.
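
Finding 02 is stated in terms of Dice scores, but this page does not say how they were computed. As a point of reference only, here is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), for a single label over two flattened label volumes. The function name `dice_score`, the empty-mask convention, and the toy volumes are assumptions made for illustration, not the authors' evaluation code.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int = 1) -> float:
    """Dice coefficient for one label over two flattened label volumes."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")
    # Count voxels carrying the label in each volume, and in both at once.
    pred_hits = sum(1 for p in pred if p == label)
    target_hits = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_hits + target_hits
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2.0 * overlap / denom


# Hypothetical toy 2x2x2 volumes, flattened to 8 voxels each.
predicted = [0, 1, 1, 0, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0, 0, 0]
print(dice_score(predicted, reference))
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they do not overlap at all.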

Abstract

In the medical domain, acquiring large datasets is challenging due to both accessibility issues and stringent privacy regulations. Consequently, data availability and privacy protection are major obstacles to applying machine learning in medical imaging. To address this, our study proposes the Med-LSDM (Latent Semantic Diffusion Model), which operates directly in the 3D domain and leverages de-identified semantic maps to generate synthetic data as a method of privacy preservation and data augmentation. Unlike many existing methods that focus on generating 2D slices, Med-LSDM is designed specifically for 3D semantic image synthesis, making it well-suited for applications requiring full volumetric data. Med-LSDM incorporates a guiding mechanism that controls the 3D image generation process by applying a diffusion model within the latent space of a pre-trained VQ-GAN. By operating in the…
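
The abstract says the diffusion model runs in the latent space of a pre-trained VQ-GAN, but it is cut off before giving details. To make the idea of diffusing a latent volume concrete, the sketch below shows the standard DDPM forward (noising) step, z_t = sqrt(ᾱ_t) z_0 + sqrt(1 - ᾱ_t) ε, applied to a flattened latent. Everything specific here is an assumption: the linear beta schedule, the number of steps, the 4x4x4 latent size, and all function names are hypothetical. The VQ-GAN encoder, the semantic-map conditioning, and the learned denoiser are omitted.

```python
import math
import random
from typing import List


def linear_beta_schedule(steps: int, beta_start: float = 1e-4,
                         beta_end: float = 2e-2) -> List[float]:
    """Linearly spaced noise variances (an assumed schedule, not the paper's)."""
    if steps == 1:
        return [beta_start]
    return [beta_start + (beta_end - beta_start) * i / (steps - 1)
            for i in range(steps)]


def alpha_bar(betas: List[float], t: int) -> float:
    """Cumulative product of (1 - beta) up to and including step t (0-indexed)."""
    prod = 1.0
    for b in betas[: t + 1]:
        prod *= 1.0 - b
    return prod


def noise_latent(z0: List[float], t: int, betas: List[float],
                 rng: random.Random) -> List[float]:
    """Sample z_t from q(z_t | z_0) for a flattened latent volume."""
    a_bar = alpha_bar(betas, t)
    signal, sigma = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [signal * z + sigma * rng.gauss(0.0, 1.0) for z in z0]


# Stand-in for a 4x4x4 latent volume produced by an encoder (hypothetical values).
z0 = [0.5] * 64
betas = linear_beta_schedule(1000)
rng = random.Random(0)
z_t = noise_latent(z0, 999, betas, rng)
print(len(z_t), alpha_bar(betas, 0), alpha_bar(betas, 999))
```

Under this assumed schedule, ᾱ_t shrinks toward zero as t grows, so late steps are almost pure Gaussian noise. A generative model of this kind is trained to reverse that process, and working on a compressed latent rather than the full-resolution volume keeps the tensors small.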
