Synthetic Patient-Physician Dialogue Generation from Clinical Notes Using LLM
Trisha Das, Dina Albassam, Jimeng Sun

TL;DR
This paper introduces SynDial, a method using a single large language model with iterative zero-shot prompting and feedback to generate high-quality synthetic patient-physician dialogues from clinical notes, addressing privacy concerns.
Contribution
It presents a novel iterative LLM-based approach for synthetic dialogue generation that improves extractiveness and factuality, surpassing baseline methods.
Findings
Generated dialogues have higher extractiveness due to the feedback loop.
Synthetic dialogues outperform baselines in factuality metrics.
Diversity scores are comparable to those of GPT-4.
Abstract
Medical dialogue systems (MDS) enhance patient-physician communication, improve healthcare accessibility, and reduce costs. However, acquiring suitable data to train these systems poses significant challenges. Privacy concerns prevent the use of real conversations, necessitating synthetic alternatives. Synthetic dialogue generation from publicly available clinical notes offers a promising solution to this issue, providing realistic data while safeguarding privacy. Our approach, SynDial, uses a single LLM iteratively with zero-shot prompting and a feedback loop to generate and refine high-quality synthetic dialogues. The feedback consists of weighted evaluation scores for similarity and extractiveness. The iterative process ensures dialogues meet predefined thresholds, achieving superior extractiveness as a result of the feedback loop. Additionally, evaluation shows that the generated…
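The generate-score-refine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`refine`, `similarity`, `extractiveness`), the token-overlap scoring proxies, the equal weights, the threshold of 0.6, and the feedback wording are all assumptions made for this example. The `generate` callable stands in for a zero-shot LLM call and is supplied by the caller.

```python
from typing import Callable, Optional, Tuple


def tokens(text: str) -> list:
    """Lowercase whitespace tokenization (a deliberately crude stand-in)."""
    return text.lower().split()


def extractiveness(note: str, dialogue: str) -> float:
    """Toy proxy: fraction of dialogue tokens that also occur in the note."""
    d = tokens(dialogue)
    if not d:
        return 0.0
    n = set(tokens(note))
    return sum(t in n for t in d) / len(d)


def similarity(note: str, dialogue: str) -> float:
    """Toy proxy: Jaccard overlap between the two token sets."""
    a, b = set(tokens(note)), set(tokens(dialogue))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def refine(
    note: str,
    generate: Callable[[str, Optional[str]], str],  # hypothetical LLM wrapper
    w_sim: float = 0.5,       # assumed weight, not from the paper
    w_ext: float = 0.5,       # assumed weight, not from the paper
    threshold: float = 0.6,   # assumed threshold, not from the paper
    max_iters: int = 5,
) -> Tuple[str, float]:
    """Regenerate until the weighted score meets the threshold or iterations run out."""
    feedback: Optional[str] = None
    best, best_score = "", -1.0
    for _ in range(max_iters):
        dialogue = generate(note, feedback)
        sim = similarity(note, dialogue)
        ext = extractiveness(note, dialogue)
        score = w_sim * sim + w_ext * ext
        if score > best_score:
            best, best_score = dialogue, score
        if score >= threshold:
            break
        # Scores are fed back into the next prompt as plain text.
        feedback = (
            f"similarity={sim:.2f}, extractiveness={ext:.2f}; "
            "stay closer to the clinical note."
        )
    return best, best_score
```

In practice `generate` would wrap a call to whichever LLM is used, and the toy overlap functions would be replaced by the evaluation metrics the paper actually reports; the loop structure is the only part this sketch aims to convey.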
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
