EigenEmo: Spectral Utterance Representation Using Dynamic Mode Decomposition for Speech Emotion Classification

Shuiyang Mao; P. C. Ching; Tan Lee

arXiv:2008.06665 · eess.AS · August 18, 2020 · 1 citation

TL;DR

EigenEmo applies Dynamic Mode Decomposition (DMD), a spectral decomposition method originating in fluid dynamics, to utterance-level Emotion Profiles in order to capture the intrinsic dynamics of emotional speech, and reports promising emotion classification results.

Contribution

This work applies Dynamic Mode Decomposition to emotion flow representations, providing a new spectral approach for speech emotion classification.

Findings

01. EigenEmo achieves promising classification results.

02. Concatenating EigenEmo features with simple averages improves performance.

03. The method captures fundamental transition dynamics of emotional speech.

Abstract

Human emotional speech is, by its very nature, a variant signal. This results in dynamics intrinsic to automatic emotion classification based on speech. In this work, we explore a spectral decomposition method stemming from fluid dynamics, known as Dynamic Mode Decomposition (DMD), to computationally represent and analyze the global utterance-level dynamics of emotional speech. Specifically, segment-level emotion-specific representations are first learned through an Emotion Distillation process. This forms a multi-dimensional signal of emotion flow for each utterance, called Emotion Profiles (EPs). The DMD algorithm is then applied to the resultant EPs to capture the eigenfrequencies, and hence the fundamental transition dynamics of the emotion flow. Evaluation experiments using the proposed approach, which we call EigenEmo, show promising results. Moreover, due to the positive…
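
As an illustration of the DMD step described in the abstract, the following is a minimal sketch of the standard "exact DMD" algorithm applied to a multi-dimensional time signal. It is not the authors' implementation: the function name `dmd_eigenfrequencies`, the `(channels x time)` array layout, the `dt` and `rank` parameters, and the synthetic two-channel input standing in for an Emotion Profile are all assumptions made for this example.

```python
import numpy as np

def dmd_eigenfrequencies(ep, dt=1.0, rank=None):
    """Exact DMD of a (channels x time) array (hypothetical helper).

    Returns the DMD eigenvalues, their oscillation frequencies
    (cycles per unit of dt), growth rates, and the DMD modes.
    """
    ep = np.asarray(ep, dtype=float)
    # Snapshot pairs: X2 is X1 shifted forward by one time step.
    X1, X2 = ep[:, :-1], ep[:, 1:]

    # Reduced SVD of the first snapshot matrix, optionally truncated.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    if rank is not None:
        U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]

    # Project the best-fit linear operator (X2 ~ A X1) onto the SVD basis.
    V_sinv = Vh.conj().T @ np.diag(1.0 / s)
    A_tilde = U.conj().T @ X2 @ V_sinv

    # Eigen-decomposition of the reduced operator gives the DMD spectrum.
    eigvals, W = np.linalg.eig(A_tilde)
    eigvals = eigvals.astype(complex)
    modes = X2 @ V_sinv @ W

    # Continuous-time exponents: imaginary part -> frequency, real -> growth.
    omega = np.log(eigvals) / dt
    freqs = omega.imag / (2.0 * np.pi)
    growth = omega.real
    return eigvals, freqs, growth, modes

# Synthetic stand-in for an Emotion Profile: two channels rotating
# at f0 cycles per segment (an assumption, not data from the paper).
f0 = 0.1
t = np.arange(50)
toy_ep = np.vstack([np.cos(2 * np.pi * f0 * t), np.sin(2 * np.pi * f0 * t)])
eigvals, freqs, growth, modes = dmd_eigenfrequencies(toy_ep, rank=2)
```

For this purely oscillatory toy signal, the recovered frequencies come out as a conjugate pair at plus and minus `f0` with growth rates near zero. How the eigenvalues and modes are turned into a fixed-length utterance feature is not specified in the text above, so that step is left out of the sketch.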

Taxonomy

Topics: Model Reduction and Neural Networks · Quantum, superfluid, helium dynamics · Speech Recognition and Synthesis