SynDocDis: A Metadata-Driven Framework for Generating Synthetic Physician Discussions Using Large Language Models
Beny Rubinstein, Sergio Matos

TL;DR
SynDocDis is a new framework that uses structured prompts and de-identified metadata to generate realistic physician discussions, addressing privacy concerns and filling a gap in synthetic medical dialogue generation.
Contribution
It introduces a novel approach combining structured prompting with privacy-preserving metadata to synthesize physician-to-physician dialogues for medical AI applications.
Findings
Physicians rated communication effectiveness at 4.4/5.
Medical content quality scored 4.1/5.
The framework achieved 91% clinical relevance with strong privacy preservation.
Abstract
Physician-physician discussions of patient cases represent a rich source of clinical knowledge and reasoning that could feed AI agents to enrich and even participate in subsequent interactions. However, privacy regulations and ethical considerations severely restrict access to such data. While synthetic data generation using Large Language Models offers a promising alternative, existing approaches primarily focus on patient-physician interactions or structured medical records, leaving a significant gap in physician-to-physician communication synthesis. We present SynDocDis, a novel framework that combines structured prompting techniques with privacy-preserving de-identified case metadata to generate clinically accurate physician-to-physician dialogues. Evaluation by five practicing physicians in nine oncology and hepatology scenarios demonstrated exceptional communication effectiveness…
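The abstract describes combining structured prompting with de-identified case metadata but does not give the prompt format here. As a rough illustration only, the idea might be sketched as below. The `CaseMetadata` fields, the `build_prompt` helper, and the section labels are all hypothetical and are not taken from the paper; no LLM call is made, the code only assembles the prompt text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class CaseMetadata:
    """De-identified case descriptors. Every field name here is hypothetical."""
    specialty: str
    age_band: str  # a coarse band instead of an exact age or birth date
    diagnosis_category: str
    key_findings: list[str] = field(default_factory=list)
    discussion_goal: str = "agree on next steps"


def build_prompt(case: CaseMetadata, turns: int = 6) -> str:
    """Assemble a sectioned prompt asking for a physician-to-physician dialogue."""
    findings = "\n".join(f"- {item}" for item in case.key_findings) or "- none provided"
    return "\n".join([
        "ROLE: You are simulating two physicians discussing a case.",
        f"SPECIALTY: {case.specialty}",
        f"PATIENT AGE BAND: {case.age_band}",
        f"DIAGNOSIS CATEGORY: {case.diagnosis_category}",
        "KEY FINDINGS:",
        findings,
        f"GOAL: {case.discussion_goal}",
        f"FORMAT: {turns} alternating turns, labeled 'Physician A' and 'Physician B'.",
        "CONSTRAINT: Do not invent names, dates, locations, or other identifiers.",
    ])


# Illustrative input only; this is not a case from the paper's evaluation.
example = CaseMetadata(
    specialty="hepatology",
    age_band="60-69",
    diagnosis_category="hepatocellular carcinoma",
    key_findings=["elevated AFP", "single lesion on imaging"],
)
print(build_prompt(example, turns=4))
```

Keeping the metadata in a typed record rather than free text makes it easier to check that only coarse, non-identifying fields ever reach the prompt.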
