On the use of automatically generated synthetic image datasets for benchmarking face recognition

Laurent Colbois; Tiago de Freitas Pereira; Sébastien Marcel

arXiv:2106.04215·cs.CV·October 4, 2021

TL;DR

This paper explores using GAN-generated synthetic face datasets for benchmarking face recognition systems, demonstrating that synthetic data can effectively replace real datasets in evaluation tasks.

Contribution

The study introduces a method to generate synthetic face datasets with controlled variations and validates their effectiveness for benchmarking face recognition systems.

Findings

01 Synthetic identities are distinct from GAN training data.

02 Benchmarking on synthetic datasets yields similar error rates to real datasets.

03 Synthetic datasets can reliably substitute real data for face recognition evaluation.
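
Finding 02 rests on comparing error rates between a real and a synthetic benchmark. One common way to express such error rates in biometrics is the false match rate (FMR) and false non-match rate (FNMR) at a fixed decision threshold. The following is a minimal sketch of that comparison, not the authors' evaluation code: the function name `error_rates`, the score lists, and the threshold are all hypothetical, and it assumes similarity scores where higher means "more likely the same identity".

```python
def error_rates(genuine, impostor, threshold):
    """Return (FMR, FNMR) for similarity scores at a decision threshold.

    genuine:  scores of same-identity comparisons (hypothetical data)
    impostor: scores of different-identity comparisons (hypothetical data)
    """
    # False match: an impostor pair is accepted (score at or above threshold).
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    # False non-match: a genuine pair is rejected (score below threshold).
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr


# Made-up scores for illustration only; these are NOT results from the paper.
real_genuine = [0.91, 0.85, 0.40, 0.77]
real_impostor = [0.10, 0.35, 0.62, 0.20]
synth_genuine = [0.88, 0.45, 0.81, 0.79]
synth_impostor = [0.15, 0.30, 0.58, 0.66]

threshold = 0.5  # assumed operating point
real = error_rates(real_genuine, real_impostor, threshold)
synth = error_rates(synth_genuine, synth_impostor, threshold)

print("real  (FMR, FNMR):", real)
print("synth (FMR, FNMR):", synth)
```

In a benchmark comparison of this kind, the two tuples would be computed for the same face recognition system on both datasets and then compared; how close they must be to count as "similar" is a choice this sketch does not make.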

Abstract

The availability of large-scale face datasets has been key in the progress of face recognition. However, due to licensing issues or copyright infringement, some datasets are not available anymore (e.g. MS-Celeb-1M). Recent advances in Generative Adversarial Networks (GANs), to synthesize realistic face images, provide a pathway to replace real datasets by synthetic datasets, both to train and benchmark face recognition (FR) systems. The work presented in this paper provides a study on benchmarking FR systems using a synthetic dataset. First, we introduce the proposed methodology to generate a synthetic dataset, without the need for human intervention, by exploiting the latent structure of a StyleGAN2 model with multiple controlled factors of variation. Then, we confirm that (i) the generated synthetic identities are not data subjects from the GAN's training dataset, which is verified on…

Code & Models

Repositories

https://gitlab.idiap.ch/bob/bob.paper.ijcb2021_synthetic_dataset (official)

Taxonomy

Methods: R1 Regularization · Path Length Regularization · Convolution · Weight Demodulation