Massively Annotated Datasets for Assessment of Synthetic and Real Data in Face Recognition
Pedro C. Neto, Rafael M. Mamede, Carolina Albuquerque, Tiago Gonçalves, Ana F. Sequeira

TL;DR
This paper compares real and synthetic face recognition datasets using a massive attribute classifier to analyze distribution differences, highlighting challenges in synthetic data's ability to replicate real data for training models.
Contribution
It introduces a large attribute annotation framework to study distribution drift between real and synthetic datasets in face recognition.
Findings
Differences exist between real and synthetic datasets based on attribute distributions.
Real datasets can explain synthetic data distributions, but not vice versa.
Synthetic data still lags behind real data in matching distribution characteristics.
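The asymmetry in the second finding can be illustrated with a minimal sketch. It assumes each image carries a vector of binary attribute annotations and compares per-attribute frequencies between two datasets with a Kullback-Leibler divergence. This summary does not state which measure the paper uses, so the choice of KL divergence, the function names, and the toy annotations below are all assumptions made for illustration only.

```python
import math

def attribute_frequencies(annotations):
    """Fraction of images in which each binary attribute is present."""
    num_attributes = len(annotations[0])
    counts = [0] * num_attributes
    for row in annotations:
        for i, value in enumerate(row):
            counts[i] += value
    return [c / len(annotations) for c in counts]

def bernoulli_kl(p, q, eps=1e-6):
    """KL(Bernoulli(p) || Bernoulli(q)), clamped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def dataset_kl(freq_p, freq_q):
    """Sum of per-attribute divergences (treats attributes as independent)."""
    return sum(bernoulli_kl(p, q) for p, q in zip(freq_p, freq_q))

# Hypothetical toy annotations: rows are images, columns are two attributes.
real = [[1, 0], [0, 1], [1, 1], [0, 0]]
synthetic = [[1, 0], [1, 0], [1, 0], [1, 1]]

real_freq = attribute_frequencies(real)
synthetic_freq = attribute_frequencies(synthetic)

# KL divergence is not symmetric, so the two directions generally differ.
print("KL(real || synthetic):", dataset_kl(real_freq, synthetic_freq))
print("KL(synthetic || real):", dataset_kl(synthetic_freq, real_freq))
```

Because the measure is directional, one dataset can cover the attribute values seen in another without the reverse holding, which is the kind of one-way relationship the finding describes.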
Abstract
Face recognition applications have grown in parallel with the size of datasets, complexity of deep learning models and computational power. However, while deep learning models evolve to become more capable and computational power keeps increasing, the datasets available are being retracted and removed from public access. Privacy and ethical concerns are relevant topics within these domains. Through generative artificial intelligence, researchers have put efforts into the development of completely synthetic datasets that can be used to train face recognition systems. Nonetheless, the recent advances have not been sufficient to achieve performance comparable to the state-of-the-art models trained on real data. To study the drift between the performance of models trained on real and synthetic datasets, we leverage a massive attribute classifier (MAC) to create annotations for four…
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Image Processing and 3D Reconstruction
