Can I trust my fake data -- A comprehensive quality assessment framework for synthetic tabular data in healthcare
Vibeke Binz Vallevik, Aleksandar Babic, Serena Elizabeth Marshall, Severin Elvatun, Helga Brøgger, Sharmini Alagaratnam, Bjørn Edwin, Narasimha Raghavan Veeraragavan, Anne Kjersti Befring, Jan Franz Nygård

TL;DR
This paper develops a comprehensive quality assessment framework for synthetic tabular healthcare data, addressing existing gaps by including fairness and carbon footprint metrics to ensure trustworthy AI applications.
Contribution
It introduces a new conceptual framework that aligns diverse quality metrics and expands the quality dimensions considered, and it benchmarks the framework on real healthcare data.
Findings
The framework includes fairness and carbon footprint metrics.
The framework is benchmarked on Dutch National Cancer Registry data.
The work focuses on transparency and safety in the use of synthetic data.
Abstract
Ensuring safe adoption of AI tools in healthcare hinges on access to sufficient data for training, testing and validation. In response to privacy concerns and regulatory requirements, using synthetic data (SD) has been suggested. SD is created by training a generator on real data to produce a dataset with similar statistical properties. Competing metrics with differing taxonomies for quality evaluation have been suggested, resulting in a complex landscape. Optimising quality entails balancing considerations that make the data fit for use, yet relevant dimensions are left out of existing frameworks. We performed a comprehensive literature review on the use of quality evaluation metrics on SD within the scope of tabular healthcare data and SD made using deep generative methods. Based on this and the collective team experiences, we developed a conceptual framework for quality…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Ethics in Clinical Research · Health Systems, Economic Evaluations, Quality of Life
