ProGiDiff: Prompt-Guided Diffusion-Based Medical Image Segmentation

Yuan Lin; Murong Xu; Marc Hölle; Chinmay Prabhakar; Andreas Maier; Vasileios Belagiannis; Bjoern Menze; Suprosanna Shit

arXiv:2601.16060 · cs.CV · January 23, 2026

Open Access

TL;DR

ProGiDiff introduces a prompt-guided diffusion framework for medical image segmentation that leverages pre-trained models, enabling multi-class segmentation, human interaction, and cross-modality adaptation with strong experimental results.

Contribution

The paper presents a novel conditioning mechanism for diffusion models, allowing effective medical image segmentation guided by natural language prompts, adaptable across modalities with minimal training.

Findings

01 Outperforms previous segmentation methods on CT images

02 Supports multi-class segmentation via natural language prompts

03 Enables cross-modality adaptation with few-shot learning

Abstract

Widely adopted medical image segmentation methods, although efficient, are primarily deterministic and remain poorly amenable to natural language prompts. As a result, they cannot estimate multiple proposals, support human interaction, or adapt across modalities. Recently, text-to-image diffusion models have shown potential to bridge this gap. However, training them from scratch requires a large dataset, which is a limitation for medical image segmentation. Furthermore, they are often limited to binary segmentation and cannot be conditioned on a natural language prompt. To this end, we propose a novel framework called ProGiDiff that leverages existing image generation models for medical image segmentation. Specifically, we propose a ControlNet-style conditioning mechanism with a custom encoder, suitable for image conditioning, to steer a pre-trained diffusion model to output…
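
As a rough illustration of what "ControlNet-style conditioning" generally means, the sketch below shows a frozen pre-trained block paired with a trainable copy whose output passes through a zero-initialized projection. It is a toy under stated assumptions, not ProGiDiff's implementation: dense layers stand in for the convolutional blocks of a real diffusion model, and the names `ControlledBlock`, `cond_encoder`, and `zero_proj` are hypothetical.

```python
import random


def linear(weights, x):
    """Apply a dense layer (a list of weight rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def zero_matrix(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


class ControlledBlock:
    """Toy stand-in for one pre-trained denoiser block plus a control branch."""

    def __init__(self, dim, cond_dim, seed=0):
        rng = random.Random(seed)
        # Pre-trained weights: kept frozen (never updated during fine-tuning).
        self.frozen = random_matrix(dim, dim, rng)
        # Hypothetical encoder mapping the conditioning image into feature space.
        self.cond_encoder = random_matrix(dim, cond_dim, rng)
        # Trainable copy of the frozen block, as in ControlNet-style designs.
        self.control = [row[:] for row in self.frozen]
        # Zero-initialized projection: the control branch contributes nothing at start.
        self.zero_proj = zero_matrix(dim, dim)

    def base(self, x):
        """Output of the frozen pre-trained block alone."""
        return linear(self.frozen, x)

    def forward(self, x, cond):
        """Frozen output plus the projected residual from the control branch."""
        cond_feat = linear(self.cond_encoder, cond)
        control_in = [a + b for a, b in zip(x, cond_feat)]
        residual = linear(self.zero_proj, linear(self.control, control_in))
        return [a + b for a, b in zip(self.base(x), residual)]


block = ControlledBlock(dim=4, cond_dim=3)
x = [0.5, -1.0, 2.0, 0.25]      # toy latent features
cond = [1.0, 0.0, -1.0]         # toy conditioning-image features

# Before any training, the controlled block reproduces the frozen model exactly.
print(block.forward(x, cond) == block.base(x))
```

Because the projection starts at zero, fine-tuning begins from the unmodified pre-trained behaviour and the conditioning signal is learned gradually, which is one common argument for this design when training data is scarce.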

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications