Clinically-guided Data Synthesis for Laryngeal Lesion Detection

Chiara Baldini; Kaisar Kushibar; Richard Osuala; Simone Balocco; Oliver Diaz; Karim Lekadir; Leonardo S. Mattos

arXiv:2508.06182 · eess.IV · August 11, 2025

TL;DR

This paper presents a clinically guided data synthesis method that uses a Latent Diffusion Model with a ControlNet adapter to generate realistic laryngeal endoscopic image-annotation pairs, which augment the training data for lesion detection systems.

Contribution

The paper introduces a clinically guided image synthesis approach based on diffusion models to address data scarcity in laryngeal lesion detection. Training with the synthetic images improves detection performance, and experts judged the images to be realistic.

Findings

01

Adding synthetic data increased the detection rate by 9% in internal testing

02

Adding synthetic data increased the detection rate by 22.1% on external data

03

Experts rated synthetic images as highly realistic

Abstract

Although computer-aided diagnosis (CADx) and detection (CADe) systems have made significant progress in various medical domains, their application is still limited in specialized fields such as otorhinolaryngology. In the latter, current assessment methods heavily depend on operator expertise, and the high heterogeneity of lesions complicates diagnosis, with biopsy persisting as the gold standard despite its substantial costs and risks. A critical bottleneck for specialized endoscopic CADx/e systems is the lack of well-annotated datasets with sufficient variability for real-world generalization. This study introduces a novel approach that exploits a Latent Diffusion Model (LDM) coupled with a ControlNet adapter to generate laryngeal endoscopic image-annotation pairs, guided by clinical observations. The method addresses data scarcity by conditioning the diffusion process to produce…

Taxonomy

Topics: Voice and Speech Disorders · Head and Neck Cancer Studies