VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge

Joon Son Chung; Arsha Nagrani; Ernesto Coto; Weidi Xie; Mitchell McLaren; Douglas A Reynolds; Andrew Zisserman

arXiv:1912.02522·cs.SD·December 6, 2019·48 cites

Open Access

TL;DR

The VoxSRC 2019 challenge evaluated speaker recognition systems on unconstrained YouTube data, providing a public dataset with ground truth annotation, standardised evaluation software, and baselines for real-world speaker identification.

Contribution

First comprehensive challenge dataset and evaluation framework for speaker recognition in unconstrained 'in the wild' conditions.

Findings

01

Baseline systems established for the challenge

02

Performance metrics and results shared for comparison

03

Discussion on challenges and future directions in speaker recognition

Abstract

The VoxCeleb Speaker Recognition Challenge 2019 aimed to assess how well current speaker recognition technology is able to identify speakers in unconstrained or 'in the wild' data. It consisted of: (i) a publicly available speaker recognition dataset from YouTube videos together with ground truth annotation and standardised evaluation software; and (ii) a public challenge and workshop held at Interspeech 2019 in Graz, Austria. This paper outlines the challenge and provides its baselines, results and discussions.

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing