Synthetic medical data generation: state of the art and application to trauma mechanism classification
Océane Doremus, Ariel Guerra-Adames, Marta Avalos-Fernandez, Vianney Jouhet, Cédric Gil-Jardiné, Emmanuel Lagarde

TL;DR
This paper reviews current machine learning techniques for creating synthetic medical data, emphasizing their use in trauma mechanism classification, and proposes a method for generating high-quality combined tabular and textual synthetic records.
Contribution
It introduces a novel methodology for generating high-quality synthetic medical records that integrate tabular and unstructured text data for trauma classification.
Findings
Review of state-of-the-art data generation methods
Proposed method for high-quality synthetic medical records
Application to trauma mechanism classification
Abstract
Faced with the challenges of patient confidentiality and scientific reproducibility, research on machine learning for health is turning towards the creation of synthetic medical databases. This article presents a brief overview of state-of-the-art machine learning methods for generating synthetic tabular and textual data, focusing on their application to the automatic classification of trauma mechanisms, followed by our proposed methodology for generating high-quality synthetic medical records that combine tabular and unstructured text data.
