MedCondDiff: Lightweight, Robust, Semantically Guided Diffusion for Medical Image Segmentation
Ruirui Huang, Jiacheng Li

TL;DR
MedCondDiff is a lightweight, robust diffusion-based framework for multi-organ medical image segmentation that leverages semantic priors for improved performance and efficiency across various imaging modalities.
Contribution
Introduces MedCondDiff, a diffusion model whose denoising process is conditioned on semantic priors extracted by a Pyramid Vision Transformer (PVT) backbone, with the aim of making medical image segmentation more efficient and more robust.
Findings
Achieves competitive segmentation performance across multiple organs and modalities.
Reduces inference time and VRAM usage compared to conventional diffusion models.
Demonstrates robustness and efficiency in diverse medical imaging tasks.
Abstract
We introduce MedCondDiff, a diffusion-based framework for multi-organ medical image segmentation that is efficient and anatomically grounded. The model conditions the denoising process on semantic priors extracted by a Pyramid Vision Transformer (PVT) backbone, yielding a semantically guided and lightweight diffusion architecture. This design improves robustness while reducing both inference time and VRAM usage compared to conventional diffusion models. Experiments on multi-organ, multi-modality datasets demonstrate that MedCondDiff delivers competitive performance across anatomical regions and imaging modalities, underscoring the potential of semantically guided diffusion models as an effective class of architectures for medical imaging tasks.
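The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of how a denoising loop can be conditioned on a semantic prior, assuming a standard DDPM ancestral sampler with a linear noise schedule. The names `extract_prior` and `predict_noise` are hypothetical stand-ins introduced for illustration: the first replaces the PVT backbone with a simple intensity rescaling, and the second replaces the learned denoising network with a closed-form rule. Computing the prior once and reusing it at every step is likewise an assumption of this sketch, not a detail confirmed by the paper.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule plus cumulative products of alpha = 1 - beta."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bars.append(prod)
    return betas, alpha_bars


def extract_prior(image):
    """Hypothetical stand-in for the PVT backbone (assumption, not the paper's
    model): rescale intensities to [-1, 1]. Requires a non-constant image."""
    lo, hi = min(image), max(image)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in image]


def predict_noise(x_t, t, prior, alpha_bars):
    """Hypothetical stand-in for a conditional denoiser eps(x_t, t, prior).

    It guesses the clean mask from the prior alone, then solves the forward
    relation x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps for eps.
    A real model would be a trained network, not this closed-form rule.
    """
    a = alpha_bars[t]
    x0_guess = [1.0 if p > 0.0 else -1.0 for p in prior]
    return [(x - math.sqrt(a) * g) / math.sqrt(1.0 - a)
            for x, g in zip(x_t, x0_guess)]


def sample_mask(image, num_steps=50, seed=0):
    """Run the DDPM reverse process on a 1-D toy image; return a binary mask."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    prior = extract_prior(image)              # computed once, reused each step
    x = [rng.gauss(0.0, 1.0) for _ in image]  # start from Gaussian noise
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t, prior, alpha_bars)
        alpha = 1.0 - betas[t]
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * e) / math.sqrt(alpha) for xi, e in zip(x, eps)]
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on last step
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return [1 if v > 0.0 else 0 for v in x]


toy_image = [0.1, 0.2, 0.9, 0.8, 0.15]  # made-up intensities for illustration
mask = sample_mask(toy_image)
```

In this toy setting the stand-in denoiser depends only on the prior, so the sampled mask is determined by the conditioning signal and does not change with the random seed. A trained network would instead combine the noisy mask, the timestep, and the prior features.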
Taxonomy
Topics: Medical Image Segmentation Techniques · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
