Emotional Speaker Identification using a Novel Capsule Nets Model

Ali Bou Nassif; Ismail Shahin; Ashraf Elnagar; Divya Velayudhan; Adi Alhudhaif; Kemal Polat

arXiv:2201.02994·cs.SD·January 11, 2022

TL;DR

This paper introduces a novel CapsNet-based model for emotional speaker identification, demonstrating faster training and improved accuracy over existing methods across multiple speech databases.

Contribution

The study proposes a new CapsNet architecture tailored for emotional speaker recognition, addressing CNN limitations in capturing spatial feature relationships.

Findings

01. CapsNet model trains faster than baseline models

02. CapsNet achieves higher accuracy in emotional speaker identification

03. Routing algorithm iterations impact performance significantly
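
To make the role of routing iterations concrete, here is a minimal sketch of routing between two capsule layers. It assumes the widely used dynamic routing-by-agreement scheme for capsule networks; this summary does not say which routing variant the paper uses, so this should not be read as the authors' implementation. The names `squash`, `softmax`, `route`, `u_hat`, and `num_iterations` are illustrative choices, not taken from the paper.

```python
import math


def squash(v):
    """Shrink vector v so its length lies in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0 for _ in v]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(u_hat, num_iterations=3):
    """Routing-by-agreement (assumed variant, see lead-in).

    u_hat[i][j] is the prediction vector that input capsule i makes
    for output capsule j. Returns the output capsule vectors and the
    coupling coefficients used in the final iteration.
    """
    if num_iterations < 1:
        raise ValueError("num_iterations must be at least 1")
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(num_iterations):
        # Each input capsule distributes its output across the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit where a prediction agrees with the output (dot product).
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Toy input: three input capsules, two output capsules, 2-D vectors.
u_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[1.0, 0.0], [-0.1, 0.0]],
    [[0.9, 0.1], [0.0, 0.1]],
]
v1, c1 = route(u_hat, num_iterations=1)
v3, c3 = route(u_hat, num_iterations=3)
```

With a single iteration the coupling coefficients stay uniform, so every input capsule contributes equally to every output capsule. Additional iterations shift weight toward the output capsule whose vector the predictions agree with, which is one way the iteration count can change a model's behaviour.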

Abstract

Speaker recognition systems are widely used in various applications to identify a person by their voice; however, the high degree of variability in speech signals makes this a challenging task. Dealing with emotional variations is very difficult because emotions alter the voice characteristics of a person; thus, the acoustic features differ from those used to train models in a neutral environment. Therefore, speaker recognition models trained on neutral speech fail to correctly identify speakers under emotional stress. Although considerable advancements in speaker identification have been made using convolutional neural networks (CNN), CNNs cannot exploit the spatial association between low-level features. Inspired by the recent introduction of capsule networks (CapsNets), which are based on deep learning to overcome the inadequacy of CNNs in preserving the pose relationship between…
