Zero-shot generation of synthetic neurosurgical data with large language models
Austin A. Barr, Eddie Guo, Emre Sezgin

TL;DR
This study demonstrates that GPT-4o can generate high-fidelity synthetic neurosurgical data in a zero-shot manner, effectively augmenting real data for machine learning tasks while maintaining privacy and data utility.
Contribution
It introduces the use of GPT-4o for zero-shot synthetic data generation in neurosurgery, matching or exceeding CTGAN, a conventional tabular generative model, without fine-tuning.
Findings
GPT-4o datasets matched or exceeded CTGAN performance
Synthetic data maintained high fidelity to real data
ML classifiers trained on synthetic data achieved comparable performance when evaluated on real-world data
Abstract
Clinical data is fundamental to advancing neurosurgical research, but access is often constrained by data availability, small sample sizes, privacy regulations, and resource-intensive preprocessing and de-identification procedures. Synthetic data offers a potential solution to the challenges of accessing and using real-world data (RWD). This study evaluates zero-shot generation of synthetic neurosurgical data with a large language model (LLM), GPT-4o, benchmarked against the conditional tabular generative adversarial network (CTGAN). Synthetic datasets were compared to real-world neurosurgical data to assess fidelity (means, proportions, distributions, and bivariate correlations), utility (ML classifier performance on RWD), and privacy (duplication of records from RWD). The GPT-4o-generated datasets matched or exceeded CTGAN performance, despite no…
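The fidelity and privacy checks named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the column names (`age`, `los_days`), and the toy values are hypothetical, chosen only to show how a gap in means, a gap in bivariate correlations, and exact-record duplication might be computed with pandas.

```python
import numpy as np
import pandas as pd


def mean_gaps(real: pd.DataFrame, synth: pd.DataFrame, cols: list) -> pd.Series:
    """Absolute difference between real and synthetic column means (fidelity)."""
    return (real[cols].mean() - synth[cols].mean()).abs()


def correlation_gap(real: pd.DataFrame, synth: pd.DataFrame, cols: list) -> float:
    """Mean absolute difference between pairwise Pearson correlations (fidelity)."""
    diff = (real[cols].corr() - synth[cols].corr()).to_numpy()
    upper = np.triu_indices(len(cols), k=1)  # each variable pair counted once
    return float(np.abs(diff[upper]).mean())


def duplicate_fraction(real: pd.DataFrame, synth: pd.DataFrame) -> float:
    """Fraction of synthetic rows that exactly match some real row (privacy)."""
    cols = list(real.columns)
    real_unique = real[cols].drop_duplicates()
    merged = synth[cols].merge(real_unique, on=cols, how="left", indicator=True)
    return float((merged["_merge"] == "both").mean())


# Toy data with hypothetical columns; not taken from the study.
real = pd.DataFrame({"age": [50, 60, 70, 80], "los_days": [2, 4, 6, 8]})
synth = pd.DataFrame({"age": [50, 65, 70, 90], "los_days": [2, 5, 6, 9]})

gaps = mean_gaps(real, synth, ["age", "los_days"])
corr_gap = correlation_gap(real, synth, ["age", "los_days"])
dup = duplicate_fraction(real, synth)
print(gaps, corr_gap, dup, sep="\n")
```

Smaller gaps indicate closer agreement with the real data, and a duplicate fraction near zero indicates that few real records are reproduced verbatim. The study's actual metrics and thresholds may differ from this sketch.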
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Lung Cancer Diagnosis and Treatment
