CytoDiff: AI-Driven Cytomorphology Image Synthesis for Medical Diagnostics
Jan Carreras Boada, Rao Muhammad Umer, Carsten Marr

TL;DR
CytoDiff leverages a diffusion model fine-tuned with LoRA to generate high-quality synthetic white blood cell images, significantly improving classifier accuracy in privacy-constrained, imbalanced biomedical datasets.
Contribution
Introduces CytoDiff, a novel diffusion-based synthetic image generator for biomedical data, enhancing classification performance with limited and imbalanced datasets.
Findings
Synthetic images increased classifier accuracy from 27% to 78%.
Synthetic data improved CLIP classifier accuracy from 62% to 77%.
Demonstrates synthetic data as a valuable tool for privacy-preserving biomedical ML.
Abstract
Biomedical datasets are often constrained by stringent privacy requirements and frequently suffer from severe class imbalance. Both constraints hinder the development of accurate machine learning models. While generative AI offers a promising solution, producing synthetic images of sufficient quality for training robust classifiers remains challenging. This work addresses the classification of individual white blood cells, a critical task in diagnosing hematological malignancies such as acute myeloid leukemia (AML). We introduce CytoDiff, a Stable Diffusion model that is fine-tuned with LoRA weights, guided by few-shot samples, and generates high-fidelity synthetic white blood cell images. Our approach demonstrates substantial improvements in classifier performance when training data is limited. Using a small, highly imbalanced real dataset, the addition of 5,000 synthetic images per class…
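To make the class-imbalance point concrete, here is a minimal sketch of how adding a fixed number of synthetic images to every class changes the balance of a training set. The class names and real-image counts are hypothetical and chosen only for illustration; they are not taken from the paper. Only the figure of 5,000 synthetic images per class comes from the abstract. The helper names `augment_with_synthetic` and `imbalance_ratio` are likewise assumptions made for this example.

```python
from collections import Counter


def augment_with_synthetic(real_labels, synthetic_per_class):
    """Return per-class image counts after adding a fixed number of
    synthetic images to every class present in the real data."""
    real_counts = Counter(real_labels)
    return {cls: n + synthetic_per_class for cls, n in real_counts.items()}


def imbalance_ratio(counts):
    """Largest class size divided by smallest class size.
    A value of 1.0 means the classes are perfectly balanced."""
    return max(counts.values()) / min(counts.values())


# Hypothetical, illustrative real-image counts (NOT from the paper).
real_labels = (
    ["neutrophil"] * 900
    + ["lymphocyte"] * 90
    + ["myeloblast"] * 10
)

real_counts = dict(Counter(real_labels))
# 5,000 synthetic images per class, the figure quoted in the abstract.
mixed_counts = augment_with_synthetic(real_labels, synthetic_per_class=5000)

print("real only:", real_counts, "ratio:", imbalance_ratio(real_counts))
print("real + synthetic:", mixed_counts, "ratio:", round(imbalance_ratio(mixed_counts), 3))
```

In this toy setting the rarest class grows from 10 to 5,010 images, so the ratio between the largest and smallest class drops from 90 to below 1.2. Whether the extra images actually help a classifier depends on their fidelity, which is the problem the paper targets.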
Taxonomy
Topics: Digital Imaging for Blood Diseases · AI in cancer detection · Cell Image Analysis Techniques
Methods: Diffusion · Focus
