Beyond Real Faces: Synthetic Datasets Can Achieve Reliable Recognition Performance without Privacy Compromise
Paweł Borsukiewicz, Fadi Boutros, Iyiola E. Olatunji, Charles Beumier, Wendkûuni C. Ouedraogo, Jacques Klein, Tegawendé F. Bissyandé

TL;DR
Synthetic facial datasets can achieve recognition performance comparable to real data while preserving privacy, offering a viable and ethically preferable alternative for facial recognition systems.
Contribution
This study provides the first comprehensive empirical evaluation of synthetic facial datasets, demonstrating their effectiveness and ethical advantages in recognition tasks.
Findings
The best synthetic datasets achieve over 95% accuracy.
Synthetic data maintains intra-class variability and identity separability.
Synthetic datasets exhibit limited bias and allow for bias mitigation.
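Intra-class variability and identity separability are often examined by comparing face embeddings of the same identity (genuine pairs) against embeddings of different identities (impostor pairs). The sketch below illustrates that general idea with cosine similarity; it is not the paper's evaluation protocol. The function names, the toy 2-D embeddings, and the "gap" summary are all assumptions made for illustration, and in practice the embeddings would come from a trained face recognition model.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between row vectors."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T


def genuine_impostor_stats(embeddings, labels):
    """Summarise same-identity and different-identity similarities.

    Hypothetical helper: a lower genuine mean (or higher genuine std)
    suggests more intra-class variability, while a larger gap between
    the genuine and impostor means suggests better separability.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    sims = cosine_similarity_matrix(embeddings)

    same_identity = labels[:, None] == labels[None, :]
    # Keep each unordered pair once and drop self-comparisons.
    upper = np.triu(np.ones_like(same_identity, dtype=bool), k=1)

    genuine = sims[same_identity & upper]
    impostor = sims[~same_identity & upper]
    return {
        "genuine_mean": float(genuine.mean()),
        "genuine_std": float(genuine.std()),
        "impostor_mean": float(impostor.mean()),
        "gap": float(genuine.mean() - impostor.mean()),
    }


# Toy data (assumed): two identities with two samples each.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
stats = genuine_impostor_stats(toy_embeddings, toy_labels)
print(stats)
```

On the toy data the genuine pairs are far more similar than the impostor pairs, which is the pattern a well-separated dataset would be expected to show. Real evaluations would use many identities and a verification benchmark rather than a single summary gap.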
Abstract
The deployment of facial recognition systems has created an ethical dilemma: achieving high accuracy requires massive datasets of real faces collected without consent, leading to dataset retractions and potential legal liabilities under regulations like GDPR. While synthetic facial data presents a promising privacy-preserving alternative, the field lacks comprehensive empirical evidence of its viability. This study addresses this critical gap through extensive evaluation of synthetic facial recognition datasets. We present a systematic literature review identifying 25 synthetic facial recognition datasets (2018-2025), combined with rigorous experimental validation. Our methodology examines seven key requirements for privacy-preserving synthetic data: identity leakage prevention, intra-class variability, identity separability, dataset scale, ethical data sourcing, bias mitigation, and…
Taxonomy
Topics: Face recognition and analysis · Biometric Identification and Security · Privacy-Preserving Technologies in Data
