MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition

Yandong Guo; Lei Zhang; Yuxiao Hu; Xiaodong He; Jianfeng Gao

arXiv:1607.08221 · cs.CV · July 28, 2016 · 162 cites

TL;DR

This paper introduces a large-scale face recognition benchmark with a dataset of 10 million images for recognizing one million celebrities, facilitating research in large-scale face recognition and related applications.

Contribution

It provides the first large-scale benchmark dataset and evaluation protocol for recognizing one million celebrities, advancing the scale and scope of face recognition research.

Findings

1. Baseline results demonstrate the feasibility of large-scale face recognition.

2. The dataset enables development of more accurate and scalable recognition models.

3. The benchmark supports disambiguation and real-world application testing.

Abstract

In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and linking them to corresponding entity keys in a knowledge base. More specifically, we propose a benchmark task to recognize one million celebrities from their face images, using all the face images of each individual that can be collected from the web as training data. The rich information provided by the knowledge base helps with disambiguation, improves recognition accuracy, and contributes to various real-world applications, such as image captioning and news video analysis. For this task, we design and provide a concrete measurement set, an evaluation protocol, and training data. We also present our experimental setup in detail and report promising baseline results. Our benchmark task could lead to one of the largest classification problems in computer…

Taxonomy

Topics: Face recognition and analysis · Face and Expression Recognition · Advanced Image and Video Retrieval Techniques