Novel Cascaded Gaussian Mixture Model-Deep Neural Network Classifier for Speaker Identification in Emotional Talking Environments
Ismail Shahin, Ali Bou Nassif, Shibani Hamsa

TL;DR
This paper introduces a cascaded Gaussian Mixture Model-Deep Neural Network (GMM-DNN) classifier that improves speaker identification accuracy in emotional talking environments, outperforming classical classifiers such as Multilayer Perceptron (MLP) and Support Vector Machine (SVM) on two speech datasets.
Contribution
The study proposes and evaluates a new cascaded GMM-DNN classifier specifically designed for speaker identification in emotional speech, demonstrating superior performance over classical methods.
Findings
Cascaded GMM-DNN outperforms MLP and SVM classifiers.
Improved speaker identification accuracy across emotions on two speech datasets: the Emirati speech database and SUSAS.
Performance comparable to human subjective assessment.
Abstract
This research presents an effective approach to enhancing text-independent speaker identification performance in emotional talking environments, based on a novel classifier called the cascaded Gaussian Mixture Model-Deep Neural Network (GMM-DNN). Our current work focuses on proposing, implementing, and evaluating this classifier for speaker identification in emotional talking environments. The results show that the cascaded GMM-DNN classifier improves speaker identification performance across various emotions on two distinct speech databases: the Emirati speech database (an Arabic dataset from the United Arab Emirates) and the Speech Under Simulated and Actual Stress (SUSAS) English dataset. The proposed classifier outperforms classical classifiers such as Multilayer Perceptron (MLP) and Support Vector Machine (SVM) in…
