SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes
Georgia Baltsou, Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos

TL;DR
This paper introduces SDFD, a synthetic face image dataset with diverse attributes created using a prompt-based text-to-image model, enhancing the robustness of face analysis systems by capturing broader facial diversity.
Contribution
The work presents a novel prompt formulation strategy for generating a comprehensive synthetic face dataset with diverse attributes, expanding beyond the traditional focus on demographics.
Findings
The dataset is more challenging for classification tasks than existing datasets.
Synthetic images are high-quality and realistic.
The dataset is smaller than existing datasets but more diverse.
Abstract
AI systems rely on extensive training on large datasets to address various tasks. However, image-based systems, particularly those used for demographic attribute prediction, face significant challenges. Many current face image datasets primarily focus on demographic factors such as age, gender, and skin tone, overlooking other crucial facial attributes like hairstyle and accessories. This narrow focus limits the diversity of the data and consequently the robustness of AI systems trained on them. This work aims to address this limitation by proposing a methodology for generating synthetic face image datasets that capture a broader spectrum of facial diversity. Specifically, our approach integrates a systematic prompt formulation strategy, encompassing not only demographics and biometrics but also non-permanent traits like make-up, hairstyle, and accessories. These prompts guide a…
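The prompt formulation idea described in the abstract can be illustrated with a minimal sketch. It assumes prompts are built by combining one value from each attribute category. The attribute lists, the template wording, and the helper name `build_prompts` are hypothetical placeholders and are not taken from the paper, whose actual attribute values and text-to-image model are not specified in this summary.

```python
from itertools import product

# Hypothetical attribute values for illustration only; the paper's real
# attribute lists are not given in this summary.
ATTRIBUTES = {
    "age": ["young", "middle-aged", "elderly"],
    "gender": ["man", "woman"],
    "skin_tone": ["light", "medium", "dark"],
    "hairstyle": ["short hair", "long curly hair"],
    "accessory": ["wearing glasses", "wearing a hat", "no accessories"],
}


def build_prompts(attributes):
    """Return one prompt per combination of attribute values.

    Each entry keeps the attribute values alongside the prompt text so the
    generated image can later be labelled with them.
    """
    keys = list(attributes)
    prompts = []
    for combo in product(*(attributes[k] for k in keys)):
        values = dict(zip(keys, combo))
        # Assumed template; any text-to-image prompt wording could be used.
        text = (
            f"A portrait photo of a person: {values['age']} {values['gender']}, "
            f"{values['skin_tone']} skin, {values['hairstyle']}, "
            f"{values['accessory']}"
        )
        prompts.append({"attributes": values, "prompt": text})
    return prompts


prompts = build_prompts(ATTRIBUTES)
```

Enumerating the full Cartesian product is only one possible design: it gives every attribute combination equal weight, which keeps the set balanced, but the number of prompts grows multiplicatively with each added category, so a real pipeline might sample combinations instead.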
Taxonomy
Topics: Face recognition and analysis
Methods: Sparse Evolutionary Training · Focus
