MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification

Ladislav Mošner; Oldřich Plchot; Lukáš Burget; Jan Černocký

arXiv:2111.06458·eess.AS·November 15, 2021

TL;DR

This paper introduces MultiSV, a comprehensive multi-channel speaker verification dataset created through data simulation, enabling improved training and evaluation of multi-channel speaker verification systems in complex environments.

Contribution

The paper presents the MultiSV dataset, a new benchmark for multi-channel speaker verification, along with detailed data creation recipes and baseline system results.

Findings

01 The MultiSV dataset facilitates training and evaluation in complex acoustic environments.

02 Baseline systems using neural network-based beamforming show promising results.

03 The dataset supports experiments with dereverberation, denoising, and speech enhancement.

Abstract

Motivated by the unconsolidated data situation and the lack of a standard benchmark in the field, we complement our previous efforts and present a comprehensive corpus designed for training and evaluating text-independent multi-channel speaker verification systems. It can also be readily used for experiments with dereverberation, denoising, and speech enhancement. We tackled the ever-present problem of the lack of multi-channel training data by utilizing data simulation on top of clean parts of the VoxCeleb dataset. The development and evaluation trials are based on a retransmitted Voices Obscured in Complex Environmental Settings (VOiCES) corpus, which we modified to provide multi-channel trials. We publish full recipes that create the dataset from public sources as the MultiSV corpus, and we provide results with two of our multi-channel speaker verification systems with neural…
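
The abstract does not spell out how the simulation is done, so the following is only a generic sketch of how multi-channel training data can be simulated from clean single-channel speech: the clean signal is convolved with one room impulse response (RIR) per microphone, and noise is added at a chosen signal-to-noise ratio. The function name `simulate_multichannel`, its parameters, and the SNR convention (power averaged over all channels) are assumptions made for illustration and are not taken from the MultiSV recipes.

```python
import numpy as np


def simulate_multichannel(clean, rirs, noise, snr_db):
    """Hypothetical helper: turn mono speech into a noisy multi-channel signal.

    clean : (T,) clean single-channel speech
    rirs  : (C, L) one room impulse response per microphone (assumed given)
    noise : (C, N) multi-channel noise with N >= T, assumed not all-zero
    snr_db: target SNR in dB, measured on power averaged over all channels
    Returns an array of shape (C, T).
    """
    clean = np.asarray(clean, dtype=np.float64)
    rirs = np.asarray(rirs, dtype=np.float64)
    n = clean.shape[0]

    # Reverberate: convolve the source with each microphone's RIR and
    # trim the convolution tail so every channel keeps the input length.
    reverberant = np.stack([np.convolve(clean, h)[:n] for h in rirs])

    # Crop the noise to the same length as the speech.
    noise = np.asarray(noise, dtype=np.float64)[:, :n]

    # Choose a noise gain so that
    # speech_power / (gain**2 * noise_power) == 10 ** (snr_db / 10).
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))

    return reverberant + gain * noise
```

A single gain is applied to all noise channels so that the inter-channel level differences of the noise, which a beamformer may rely on, are left unchanged.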

Code & Models

Repositories

lamomal/multisv (official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing