Objective Human Affective Vocal Expression Detection and Automatic Classification with Stochastic Models and Learning Systems
V. Vieira, R. Coelho, F. Assis

TL;DR
This study introduces two novel affective vocal features, the Hilbert-Huang-Hurst Coefficients (HHHC) and the index of non-stationarity (INS) vector, which improve speech emotion classification accuracy across multiple languages and outperform existing acoustic features.
Contribution
The paper proposes two new affective vocal features, HHHC and INS, and demonstrates their effectiveness in emotion classification, where they outperform state-of-the-art acoustic features.
Findings
HHHC significantly improves emotion classification accuracy.
The $eta$-GMM classifier outperforms other stochastic and machine learning classifiers.
HHHC and INS are effective as complementary features for existing feature sets.
Abstract
This paper presents a broad analysis of affective vocal expression classification systems. In this study, state-of-the-art acoustic features are compared to two novel affective vocal prints for the detection of emotional states: the Hilbert-Huang-Hurst Coefficients (HHHC) and the vector of index of non-stationarity (INS). HHHC is proposed here as a nonlinear vocal source feature vector that represents the affective states according to their effects on the speech production mechanism. Emotional states are highlighted by an empirical mode decomposition (EMD) based method, which exploits the non-stationarity of the affective acoustic variations. Hurst coefficients (closely related to the excitation source) are then estimated from the decomposition process to compose the feature vector. Additionally, the INS vector is introduced as dynamic information to the HHHC feature. The proposed…
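As a rough illustration of the Hurst-estimation step mentioned in the abstract, the sketch below computes one Hurst exponent per decomposition mode and stacks them into a vector. Several things here are assumptions rather than the authors' method: the EMD step is not implemented (the modes are taken as given input), the estimator is a generic aggregated-variance one chosen for simplicity, and the names `hurst_aggregated_variance` and `hurst_vector` are hypothetical.

```python
import math


def hurst_aggregated_variance(x, block_sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate a Hurst exponent from the decay of block-mean variance.

    For a stationary series, Var(block mean) scales roughly as m**(2H - 2),
    so H = 1 + slope / 2, with the slope taken from a log-log fit.
    This is a generic estimator, assumed here for illustration only.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue  # too few blocks to form a sample variance
        means = [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((v - mu) ** 2 for v in means) / (n_blocks - 1)
        if var > 0.0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("series too short or constant for a log-log fit")
    # Ordinary least-squares slope of log(variance) against log(block size).
    mx = sum(log_m) / len(log_m)
    my = sum(log_var) / len(log_var)
    num = sum((a - mx) * (b - my) for a, b in zip(log_m, log_var))
    den = sum((a - mx) ** 2 for a in log_m)
    return 1.0 + (num / den) / 2.0


def hurst_vector(modes):
    """Return one Hurst estimate per mode (modes: list of equal-role series).

    The modes are assumed to come from a separate EMD step not shown here.
    """
    return [hurst_aggregated_variance(mode) for mode in modes]
```

With this estimator, uncorrelated noise should give a value near 0.5, while a strongly persistent series such as a random walk pushes the estimate toward 1.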
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Blind Source Separation Techniques
