The DKU System for the Speaker Recognition Task of the 2019 VOiCES from a Distance Challenge
Danwei Cai, Xiaoyi Qin, Weicheng Cai, Ming Li

TL;DR
This paper describes the DKU system for far-field speaker verification in the 2019 VOiCES from a Distance Challenge, built on a residual neural network trained with angular softmax loss, cosine similarity scoring, and score normalization.
Contribution
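The paper covers the full far-field speaker verification pipeline: data pre-processing, short-term spectral feature representation, utterance-level speaker modeling, back-end scoring, and score normalization. Its best single system is a residual neural network trained with angular softmax loss, and the submitted primary system reaches 0.3532 minDCF and 4.96% EER on the evaluation set.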
Findings
The best single system achieved 0.3668 minDCF and 5.58% EER on the evaluation set using simple cosine similarity scoring.
The weighted prediction error (WPE) algorithms further improved performance.
The submitted primary system obtained 0.3532 minDCF and 4.96% EER on the evaluation set.
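For readers unfamiliar with the EER figures quoted above, the following is a minimal sketch of how an equal error rate can be approximated from trial scores. It is not the challenge's official scoring tool; the function name `equal_error_rate` and the toy scores are illustrative assumptions, and the threshold sweep is a simple approximation rather than an interpolated ROC computation.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration only, not the official scorer.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))

    # Miss rate: fraction of target trials scored below the threshold.
    fnr = np.array([(target_scores < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials at or above the threshold.
    fpr = np.array([(nontarget_scores >= t).mean() for t in thresholds])

    # The EER is where the two error rates cross; take the closest point.
    idx = np.argmin(np.abs(fnr - fpr))
    return (fnr[idx] + fpr[idx]) / 2.0

# Toy scores (made up for illustration): one target trial scores below
# one of the non-target trials.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
print(f"EER = {eer:.2%}")
```

A lower EER means the miss rate and false-alarm rate balance at a smaller error, so the drop from 5.58% to 4.96% reported above is an improvement.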
Abstract
In this paper, we present the DKU system for the speaker recognition task of the VOiCES from a Distance Challenge 2019. We investigate the whole system pipeline for far-field speaker verification, including data pre-processing, short-term spectral feature representation, utterance-level speaker modeling, back-end scoring, and score normalization. Our best single system employs a residual neural network trained with angular softmax loss, and the weighted prediction error algorithms can further improve performance. The best single system achieves 0.3668 minDCF and 5.58% EER on the evaluation set using simple cosine similarity scoring. Finally, the submitted primary system obtains 0.3532 minDCF and 4.96% EER on the evaluation set.
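The back-end steps named in the abstract, cosine similarity scoring followed by score normalization, can be sketched as below. This is a minimal illustration under stated assumptions: the abstract does not say which normalization variant is used, so the sketch picks symmetric normalization (S-norm) as an example, and the function names, the random embeddings, and the cohort are all hypothetical.

```python
import numpy as np

def cosine_score(enroll, test):
    """Cosine similarity between two speaker embeddings."""
    enroll = np.asarray(enroll, dtype=float)
    test = np.asarray(test, dtype=float)
    return float(enroll @ test / (np.linalg.norm(enroll) * np.linalg.norm(test)))

def s_norm(raw, enroll, test, cohort):
    """Symmetric score normalization (assumed variant, not confirmed by the source).

    The raw score is z-normalized against the cohort scores of the
    enrollment embedding and of the test embedding, then averaged.
    """
    enroll_cohort = np.array([cosine_score(enroll, c) for c in cohort])
    test_cohort = np.array([cosine_score(test, c) for c in cohort])
    z_enroll = (raw - enroll_cohort.mean()) / enroll_cohort.std()
    z_test = (raw - test_cohort.mean()) / test_cohort.std()
    return 0.5 * (z_enroll + z_test)

# Random stand-ins for embeddings; real embeddings would come from the
# speaker network, which is outside the scope of this sketch.
rng = np.random.default_rng(0)
enroll_emb = rng.normal(size=16)
test_emb = rng.normal(size=16)
cohort_embs = rng.normal(size=(50, 16))

raw_score = cosine_score(enroll_emb, test_emb)
normalized = s_norm(raw_score, enroll_emb, test_emb, cohort_embs)
```

Normalizing against a cohort shifts and scales each trial's score by how the two embeddings behave against impostor speakers, which helps make a single decision threshold usable across recording conditions.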
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
