Enhancing Domain Diversity in Synthetic Data Face Recognition with Dataset Fusion
Anjith George, Sebastien Marcel

TL;DR
This paper proposes combining synthetic face datasets generated by different models to improve diversity and reduce artifacts, leading to better face recognition performance without using real-world data.
Contribution
It introduces a dataset fusion method using diverse synthetic data generators to enhance domain diversity and model robustness in face recognition.
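The fusion step can be illustrated with a minimal sketch. This summary does not spell out the paper's exact procedure, so the code assumes that fusion means concatenating the samples of each generator's dataset while remapping identity labels, so that identities from different generators never share a class index. The function name `fuse_datasets`, the `(path, label)` sample layout, and the file paths are hypothetical.

```python
from typing import List, Sequence, Tuple

# One sample: (image path, identity label local to its source dataset).
Sample = Tuple[str, int]


def fuse_datasets(datasets: Sequence[Sequence[Sample]]) -> List[Sample]:
    """Concatenate datasets, remapping labels so identities never collide.

    Assumption: identities from different generators are distinct people,
    so each dataset gets its own contiguous block of class indices.
    """
    fused: List[Sample] = []
    offset = 0
    for samples in datasets:
        # Map this dataset's local labels onto offset, offset + 1, ...
        local_ids = sorted({label for _, label in samples})
        remap = {label: offset + i for i, label in enumerate(local_ids)}
        fused.extend((path, remap[label]) for path, label in samples)
        offset += len(local_ids)
    return fused


# Hypothetical toy data standing in for two generator-specific datasets.
gen_a = [("a/0001.png", 0), ("a/0002.png", 0), ("a/0003.png", 1)]
gen_b = [("b/0001.png", 0), ("b/0002.png", 5)]

fused = fuse_datasets([gen_a, gen_b])
num_identities = len({label for _, label in fused})
```

Here the two identities from the first generator keep indices 0 and 1, and the two from the second are shifted to 2 and 3, so a classification head trained on the fused set would need `num_identities` output classes.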
Findings
Improved recognition accuracy on standard benchmarks.
Reduced model-specific artifacts and biases.
Enhanced diversity in pose, lighting, and demographics.
Abstract
While the accuracy of face recognition systems has improved significantly in recent years, the datasets used to train these models are often collected through web crawling without the explicit consent of users, raising ethical and privacy concerns. To address this, many recent approaches have explored the use of synthetic data for training face recognition models. However, these models typically underperform compared to those trained on real-world data. A common limitation is that a single generator model is often used to create the entire synthetic dataset, leading to model-specific artifacts that may cause overfitting to the generator's inherent biases. In this work, we propose a solution by combining two state-of-the-art synthetic face datasets generated using architecturally distinct backbones. This fusion reduces model-specific artifacts, enhances diversity in pose,…
Taxonomy
Topics: Face recognition and analysis
