A comparative study of several parameterizations for speaker recognition
Marcos Faundez-Zanuy

TL;DR
This paper systematically compares several parameterizations for speaker recognition and shows that combining them improves robustness under mismatched recording sessions, microphones, and languages.
Contribution
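It provides an exhaustive robustness analysis of several parameterizations, evaluated with two methods (vector quantization, and covariance matrices with an arithmetic-harmonic sphericity measure), and shows that combining parameterizations improves speaker recognition robustness.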
Findings
Combining parameterizations improves robustness in speaker verification and identification.
Two methods are evaluated: vector quantization, and covariance matrices with an arithmetic-harmonic sphericity measure.
The improvement holds across the mismatch conditions studied: different recording sessions, microphones, and languages.
Abstract
This paper presents an exhaustive study of the robustness of several parameterizations in speaker verification and identification tasks. We have studied several mismatch conditions: different recording sessions, different microphones, and different languages (the data were obtained from a bilingual set of speakers). This study reveals that combining several parameterizations can improve robustness in all scenarios for both tasks, identification and verification. In addition, two different methods have been evaluated: vector quantization, and covariance matrices with an arithmetic-harmonic sphericity measure.
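The two methods named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the commonly used definition of the arithmetic-harmonic sphericity measure between two p x p covariance matrices X and Y, log(tr(Y X^-1) * tr(X Y^-1)) - 2 log(p), and, for vector quantization, the mean squared Euclidean distance from each feature frame to its nearest codeword. The function names and the random test data are hypothetical.

```python
import numpy as np

def ahs_distance(cov_x, cov_y):
    """Arithmetic-harmonic sphericity measure between two covariance matrices.

    Assumed definition: log(tr(Y X^-1) * tr(X Y^-1)) - 2*log(p).
    It is zero when the matrices are proportional and positive otherwise.
    """
    p = cov_x.shape[0]
    # tr(X^-1 Y) equals tr(Y X^-1); linear solves avoid explicit inverses.
    t1 = np.trace(np.linalg.solve(cov_x, cov_y))
    t2 = np.trace(np.linalg.solve(cov_y, cov_x))
    return np.log(t1 * t2) - 2.0 * np.log(p)

def vq_distortion(frames, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    # Pairwise squared distances, shape (n_frames, n_codewords).
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical feature frames (rows) for a reference and a test utterance.
    ref = rng.normal(size=(500, 4))
    test = rng.normal(size=(500, 4)) * np.array([1.0, 2.0, 3.0, 4.0])
    print("AHS:", ahs_distance(np.cov(ref, rowvar=False),
                               np.cov(test, rowvar=False)))
    # Hypothetical codebook: here simply the first 8 reference frames.
    print("VQ distortion:", vq_distortion(test, ref[:8]))
```

In both cases a smaller value indicates a closer match between the test utterance and the speaker model. In identification the model with the smallest value would be selected, and in verification the value would be compared against a threshold.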
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
