The DKU System for the Speaker Recognition Task of the 2019 VOiCES from a Distance Challenge

Danwei Cai; Xiaoyi Qin; Weicheng Cai; Ming Li

arXiv:1907.02194·eess.AS·July 5, 2019·Interspeech·1 cites

TL;DR

This paper introduces the DKU system for far-field speaker recognition in the 2019 VOiCES challenge, utilizing neural networks and score normalization to improve verification accuracy.

Contribution

The paper presents a complete pipeline for far-field speaker verification, including a residual neural network trained with angular softmax loss and score normalization. The submitted primary system reaches 0.3532 minDCF and 4.96% EER on the evaluation set.

Findings

01

The best single system achieved 0.3668 minDCF and 5.58% EER on the evaluation set using simple cosine similarity scoring.

02

Applying the weighted prediction error (WPE) algorithm further improved performance.

03

The submitted primary system obtained 0.3532 minDCF and 4.96% EER on the evaluation set.

Abstract

In this paper, we present the DKU system for the speaker recognition task of the VOiCES from a Distance Challenge 2019. We investigate the whole system pipeline for far-field speaker verification, including data pre-processing, short-term spectral feature representation, utterance-level speaker modeling, back-end scoring, and score normalization. Our best single system employs a residual neural network trained with angular softmax loss, and the weighted prediction error (WPE) algorithm further improves performance. This best single system achieves 0.3668 minDCF and 5.58% EER on the evaluation set using simple cosine similarity scoring. Finally, the submitted primary system obtains 0.3532 minDCF and 4.96% EER on the evaluation set.
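To make the scoring and metric terms in the abstract concrete, the following is a minimal sketch of cosine similarity scoring between two speaker embeddings and of an approximate equal error rate (EER) computed from a set of trial scores. It is not the authors' code: the function names `cosine_score` and `equal_error_rate`, the toy vectors, and the toy trial scores are all hypothetical, and the embeddings in the paper come from a trained neural network rather than hand-written lists.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping a threshold over all observed scores.

    Returns the mean of the false-rejection and false-acceptance rates
    at the threshold where the two rates are closest to each other.
    """
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # A trial is accepted when its score is >= threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(frr - far)
        if gap < best_gap:
            best_gap, eer = gap, (frr + far) / 2
    return eer


# Toy, made-up embeddings (assumption: real ones are high-dimensional).
enroll = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # parallel to the enrollment vector
orthogonal = [0.0, 1.0, 0.0]      # shares no direction with it

# Toy, made-up trial scores for same-speaker and different-speaker pairs.
targets = [0.9, 0.8, 0.4]
nontargets = [0.5, 0.2, 0.1]
toy_eer = equal_error_rate(targets, nontargets)
```

Cosine scoring ignores vector length, so `same_direction` scores as a near-perfect match despite its larger norm. The EER sweep here is a coarse approximation suited to small score lists; evaluation toolkits typically interpolate along the detection error trade-off curve instead.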

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing