Objective Human Affective Vocal Expression Detection and Automatic Classification with Stochastic Models and Learning Systems

V. Vieira; R. Coelho; F. Assis

arXiv:1910.01967·eess.AS·October 7, 2019

Open Access

TL;DR

This study introduces two novel affective vocal features, the Hilbert-Huang-Hurst Coefficients (HHHC) and the index of non-stationarity (INS) vector, which improve speech emotion classification accuracy across multiple languages and outperform existing features and classifiers.

Contribution

The paper proposes two new affective vocal features, HHHC and INS, and demonstrates their effectiveness in emotion classification, outperforming state-of-the-art features and classifiers.

Findings

01

HHHC significantly improves emotion classification accuracy.

02

The $\alpha$-GMM classifier outperforms other stochastic and machine learning classifiers.

03

HHHC and INS are effective as complementary features for existing feature sets.

Abstract

This paper presents a widespread analysis of affective vocal expression classification systems. In this study, state-of-the-art acoustic features are compared to two novel affective vocal prints for the detection of emotional states: the Hilbert-Huang-Hurst Coefficients (HHHC) and the vector of index of non-stationarity (INS). HHHC is here proposed as a nonlinear vocal source feature vector that represents the affective states according to their effects on the speech production mechanism. Emotional states are highlighted by the empirical mode decomposition (EMD) based method, which exploits the non-stationarity of the affective acoustic variations. Hurst coefficients (closely related to the excitation source) are then estimated from the decomposition process to compose the feature vector. Additionally, the INS vector is introduced as dynamic information to the HHHC feature. The proposed…
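
As a rough illustration of the Hurst-coefficient step mentioned in the abstract, the sketch below estimates a Hurst exponent from a one-dimensional series. It is a minimal sketch under stated assumptions, not the authors' method: it skips the EMD stage entirely, it uses the aggregated-variance estimator (the text above does not say which estimator the paper uses), and the function name `hurst_aggregated_variance`, the block sizes, and the synthetic test signals are all hypothetical choices made for this example.

```python
import math
import random


def hurst_aggregated_variance(x, block_sizes=(4, 8, 16, 32, 64, 128)):
    """Estimate the Hurst exponent H with the aggregated-variance method.

    Assumption for this sketch: the variance of block means of size m
    scales as m**(2H - 2), so H = 1 + slope / 2, where slope is the
    least-squares slope of log(variance) against log(m).
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue  # not enough blocks to compute a variance
        means = [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((v - mu) ** 2 for v in means) / (n_blocks - 1)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("series too short for the chosen block sizes")
    # Ordinary least-squares slope of log_var versus log_m.
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_m, log_var))
    den = sum((a - mean_x) ** 2 for a in log_m)
    return 1.0 + (num / den) / 2.0


# Synthetic signals (hypothetical stand-ins for decomposed speech components).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # uncorrelated samples
walk = []
total = 0.0
for v in noise:
    total += v
    walk.append(total)  # cumulative sum: strongly persistent series

h_noise = hurst_aggregated_variance(noise)
h_walk = hurst_aggregated_variance(walk)
print(f"H (white noise) = {h_noise:.2f}, H (random walk) = {h_walk:.2f}")
```

Uncorrelated noise should yield an estimate near 0.5, while the persistent cumulative-sum series should yield a clearly larger value. In a pipeline like the one the abstract describes, an estimator of this kind would presumably be applied to each component produced by the decomposition, with the resulting values stacked into a feature vector.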

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Blind Source Separation Techniques