Generation of Synthetic Clinical Text: A Systematic Review
Basel Alshaikhdeeb, Ahmed Abdelmonem Hemedan, Soumyabrata Ghosh, Irina Balaur, and Venkata Satagopam

TL;DR
This systematic review analyzes methods, purposes, and evaluation techniques for generating synthetic clinical text, highlighting transformer-based models' prominence, their utility in NLP tasks, and ongoing privacy concerns.
Contribution
It provides a comprehensive overview of recent advances in synthetic clinical text generation, emphasizing the dominant techniques, evaluation metrics, and practical applications.
Findings
Transformer architectures, especially GPTs, are predominant in synthetic text generation.
Synthetic medical text supports data augmentation and improves NLP task performance.
Privacy remains a challenge, requiring further assessment to prevent sensitive information leakage.
Abstract
Generating synthetic clinical text is an effective solution to common clinical NLP problems such as sparsity and privacy. This paper presents a systematic review of synthetic medical free-text generation, based on a quantitative analysis of three research questions concerning (i) the purpose of generation, (ii) the techniques, and (iii) the evaluation methods. We searched the PubMed, ScienceDirect, Web of Science, Scopus, IEEE, Google Scholar, and arXiv databases for publications on generating synthetic unstructured medical free-text. We identified 94 relevant articles out of the 1,398 collected. The generation of synthetic medical text has received considerable attention from 2018 onwards, with the main purposes being text augmentation, assistive writing, corpus building, privacy preservation, annotation, and usefulness.…
Taxonomy
Topics: Biomedical Text Mining and Ontologies
