PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions

Mahesh Bhosale; Abdul Wasi; Yuanhao Zhai; Yunjie Tian; Samuel Border; Nan Xi; Pinaki Sarder; Junsong Yuan; David Doermann; Xuan Gong

arXiv:2506.23440 · cs.CV · July 1, 2025


TL;DR

PathDiff is a diffusion-based model that learns from unpaired text and mask data to synthesize histopathology images, giving detailed control over both image semantics and spatial structure for improved data augmentation.

Contribution

PathDiff introduces a diffusion framework that learns from unpaired mask-text data, improving control and quality in histopathology image synthesis.

Findings

01. Outperforms existing methods in image quality and alignment

02. Improves data augmentation for nuclei segmentation and classification

03. Enhances control over structural and semantic features

Abstract

Diffusion-based generative models have shown promise in synthesizing histopathology images to address data scarcity caused by privacy constraints. Diagnostic text reports provide high-level semantic descriptions, and masks offer fine-grained spatial structures essential for representing distinct morphological regions. However, public datasets lack paired text and mask data for the same histopathological images, limiting their joint use in image generation. This constraint restricts the ability to fully exploit the benefits of combining both modalities for enhanced control over semantics and spatial details. To overcome this, we propose PathDiff, a diffusion framework that effectively learns from unpaired mask-text data by integrating both modalities into a unified conditioning space. PathDiff allows precise control over structural and contextual features, generating high-quality,…
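
To make the idea of a unified conditioning space concrete, here is a minimal sketch of one common way to handle a missing modality: substitute a placeholder ("null") embedding for whichever condition is absent, so text-only and mask-only samples produce conditioning vectors of the same shape. This is an illustrative assumption, not PathDiff's actual architecture; `EMBED_DIM`, `NULL_TEXT`, `NULL_MASK`, and `build_condition` are hypothetical names, and the embeddings are toy values.

```python
from typing import List, Optional

EMBED_DIM = 4  # hypothetical embedding size, chosen only for illustration

# Hypothetical placeholder ("null") embeddings used when a modality is absent.
NULL_TEXT = [0.0] * EMBED_DIM
NULL_MASK = [0.0] * EMBED_DIM


def build_condition(text_emb: Optional[List[float]],
                    mask_emb: Optional[List[float]]) -> List[float]:
    """Concatenate text and mask embeddings into one conditioning vector.

    A missing modality is replaced by its null embedding, so every sample
    yields a vector of the same length regardless of which condition it has.
    """
    if text_emb is None and mask_emb is None:
        raise ValueError("at least one condition is required")
    text_part = NULL_TEXT if text_emb is None else text_emb
    mask_part = NULL_MASK if mask_emb is None else mask_emb
    if len(text_part) != EMBED_DIM or len(mask_part) != EMBED_DIM:
        raise ValueError("embeddings must have length EMBED_DIM")
    return list(text_part) + list(mask_part)


# Unpaired data: one sample carries only text, another only a mask.
text_only = build_condition([0.1, 0.2, 0.3, 0.4], None)
mask_only = build_condition(None, [1.0, 0.0, 1.0, 0.0])
# When both conditions are available, they are supplied together.
joint = build_condition([0.1, 0.2, 0.3, 0.4], [1.0, 0.0, 1.0, 0.0])
```

Under this assumption, a denoising network that consumes the vector never needs a paired text-mask example during training, yet it can accept both conditions at once when they are provided.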
