MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification
Ladislav Mošner, Oldřich Plchot, Lukáš Burget, Jan Černocký

TL;DR
This paper introduces MultiSV, a multi-channel speaker verification corpus whose training data is simulated on top of clean VoxCeleb recordings and whose development and evaluation trials are derived from the retransmitted VOiCES corpus, enabling training and evaluation of multi-channel speaker verification systems in complex acoustic environments.
Contribution
The paper presents the MultiSV dataset, a new benchmark for multi-channel speaker verification, along with detailed data creation recipes and baseline system results.
Findings
The MultiSV dataset facilitates training and evaluation in complex acoustic environments.
Baseline systems using neural network-based beamforming show promising results.
The dataset supports experiments with dereverberation, denoising, and speech enhancement.
Abstract
Motivated by the unconsolidated data situation and the lack of a standard benchmark in the field, we complement our previous efforts and present a comprehensive corpus designed for training and evaluating text-independent multi-channel speaker verification systems. It can also be readily used for experiments with dereverberation, denoising, and speech enhancement. We tackled the ever-present problem of the lack of multi-channel training data by utilizing data simulation on top of clean parts of the VoxCeleb dataset. The development and evaluation trials are based on a retransmitted Voices Obscured in Complex Environmental Settings (VOiCES) corpus, which we modified to provide multi-channel trials. We publish full recipes that create the dataset from public sources as the MultiSV corpus, and we provide results with two of our multi-channel speaker verification systems with neural…
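The abstract mentions simulating multi-channel training data on top of clean single-channel speech. The sketch below illustrates one common way such simulation can be done: convolve a clean utterance with one room impulse response (RIR) per microphone, then add noise scaled to a target signal-to-noise ratio. It is not the MultiSV recipe itself; the function name simulate_multichannel, its parameters, and the choice to measure SNR on the reverberant signal averaged over channels are all assumptions made for illustration.

```python
import numpy as np


def simulate_multichannel(clean, rirs, noise, snr_db):
    """Hypothetical helper: reverberate mono speech per microphone and add noise.

    clean : (n_samples,) clean single-channel speech
    rirs  : (n_mics, rir_len) one room impulse response per microphone
    noise : (n_mics, >= n_samples) multi-channel noise recording
    snr_db: target SNR in dB, measured on the reverberant speech (assumption)
    """
    n = clean.shape[0]
    # Reverberate: one convolution per microphone, truncated to the input length.
    reverberant = np.stack([np.convolve(clean, h)[:n] for h in rirs])
    noise = noise[:, :n]
    # Scale the noise so the speech-to-noise power ratio matches snr_db.
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    # Return the mixture plus its components, which is convenient when the
    # reverberant or noise-free signal is needed as an enhancement target.
    return reverberant + scaled_noise, reverberant, scaled_noise
```

Returning the components alongside the mixture is a design choice for this sketch: it lets the same simulated example serve speaker verification training as well as denoising or dereverberation experiments, where a reference signal is required.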
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing
