Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework

Nirmalya Sen; Md Sahidullah (MULTISPEECH); Hemant Patil (DA-IICT); Shyamal Kumar das Mandal (IIT Kharagpur); Sreenivasa Krothapalli Rao (IIT Kharagpur); Tapan Kumar Basu (IIT Kharagpur)

arXiv:2105.11728·cs.LG·May 26, 2021

TL;DR

This paper experimentally reviews GMM-SVM based speaker recognition under duration variability, focusing on the role of utterance partitioning, and compares the GMM-SVM classifier with its precursor, GMM-UBM. It reports that utterance partitioning improves performance in certain duration conditions but does not solve data imbalance.

Contribution

It offers a detailed analysis of utterance partitioning effects within GMM-SVM systems and clarifies its limitations and conditions for usefulness, contrasting with prior assumptions.

Findings

01

Utterance partitioning does not solve data imbalance in GMM-SVM.

02

Partitioning improves performance in certain duration conditions.

03

Parameter choices significantly affect recognition accuracy.

Abstract

The performance of a speaker recognition system is highly dependent on the amount of speech used in enrollment and test. This work presents a detailed experimental review and analysis of the GMM-SVM based speaker recognition system in the presence of duration variability. It also compares the performance of the GMM-SVM classifier with that of its precursor, the Gaussian mixture model-universal background model (GMM-UBM) classifier, in the presence of duration variability. The goal of this work is not to propose a new algorithm for improving speaker recognition performance in the presence of duration variability. Instead, the main focus is on utterance partitioning (UP), a commonly used strategy to compensate for duration variability. We have analysed in detail the impact of training utterance partitioning on speaker recognition performance under GMM-SVM…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing