Performance Evaluation of Statistical Approaches for Text Independent Speaker Recognition Using Source Feature
R. Rajeswara Rao, V. Kamakshi Prasad, A. Nagesh

TL;DR
This study evaluates statistical methods like GMMs and HMMs for text-independent speaker recognition using source features, showing HMMs outperform GMMs in accuracy on the TIMIT database.
Contribution
It compares the effectiveness of GMM and ergodic HMM approaches in speaker recognition using excitation features, highlighting the superiority of HMMs.
Findings
HMMs achieve near 100% accuracy in speaker recognition.
Excitation features contain significant speaker-specific information.
HMMs outperform GMMs in recognition accuracy.
Abstract
This paper presents a performance evaluation of statistical approaches for a text-independent speaker recognition system using a source feature. The linear prediction (LP) residual is used as a representation of the excitation information in speech. The speaker-specific information in the excitation of voiced speech is captured using statistical approaches such as Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs). The decrease in error during training, together with close to 100 percent accuracy in recognizing speakers during the testing phase, demonstrates that the excitation component of speech contains speaker-specific information and that this information is captured more effectively by a continuous ergodic HMM than by a GMM. The performance of the speaker recognition system is evaluated for the GMM and a 2-state ergodic HMM with different numbers of mixture components and different test speech durations. We demonstrate the speaker recognition…
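The abstract uses the LP residual as the excitation (source) feature. As a rough illustration of what that means, the sketch below computes an LP residual for one frame with the autocorrelation method and the Levinson-Durbin recursion. It is a minimal sketch, not the authors' implementation: the function names, the frame used in the example, and the prediction order are all assumptions, and practical details such as windowing, pre-emphasis, and voiced-frame selection are left out.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no window applied)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations for predictor coefficients.

    Returns (coeffs, err), where coeffs[k-1] is a_k in the predictor
    s_hat[n] = sum_k a_k * s[n-k], and err is the final prediction error.
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate (e.g. all-zero) frame
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lp_residual(frame, order):
    """LP residual e[n] = s[n] - sum_k a_k * s[n-k] for one frame.

    `order` is an illustrative choice; the excerpt does not state the
    order used in the paper.
    """
    r = autocorrelation(frame, order)
    coeffs, _ = levinson_durbin(r, order)
    residual = []
    for n in range(len(frame)):
        pred = sum(coeffs[k] * frame[n - 1 - k]
                   for k in range(order) if n - 1 - k >= 0)
        residual.append(frame[n] - pred)
    return residual, coeffs


# Hypothetical test frame: a decaying exponential, which a first-order
# predictor can model almost exactly.
frame = [0.9 ** n for n in range(200)]
residual, coeffs = lp_residual(frame, order=1)
```

For this synthetic frame the single predictor coefficient comes out very close to 0.9, and the residual is nearly zero after the first sample. On real voiced speech, the residual is what remains after the vocal-tract envelope has been predicted away, which is why it serves as a representation of the excitation.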
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
