An ELIXIR scoping review on domain-specific evaluation metrics for synthetic data in life sciences
Styliani-Christina Fragkouli, Somya Iqbal, Lisa Crossman, Barbara Gravel, Nagat Masued, Mark Onders, Devesh Haseja, Alex Stikkelman, Alfonso Valencia, Tom Lenaerts, Fotis Psomopoulos, Pilib Ó Broin, Núria Queralt-Rosinach, and Davide Cirillo

TL;DR
This paper reviews current evaluation metrics for synthetic data in life sciences, highlighting the need for standardized, robust methods to improve trust, validation, and application across various domains.
Contribution
It systematically analyzes existing practices for assessing synthetic data in life sciences and emphasizes the necessity for standardized evaluation metrics.
Findings
Evaluation practices are often inconsistent across domains.
Systematic evaluation of synthetic data is frequently overlooked.
Standardized metrics are urgently needed for better validation.
Abstract
Synthetic data has emerged as a powerful resource in life sciences, offering solutions for data scarcity, privacy protection and accessibility constraints. By creating artificial datasets that mirror the characteristics of real data, it allows researchers to develop and validate computational methods in controlled environments. Despite its promise, the adoption of synthetic data in life sciences hinges on rigorous evaluation metrics designed to assess its fidelity and reliability. To explore the current landscape of synthetic data evaluation metrics in several life sciences domains, the ELIXIR Machine Learning Focus Group performed a systematic review of the scientific literature following the PRISMA guidelines. Six critical domains were examined to identify current practices for assessing synthetic data. Findings reveal that, while generation methods are rapidly evolving, systematic…
