Learning Fine-Grained Cross Modality Excitement for Speech Emotion Recognition

Hang Li; Wenbiao Ding; Zhongqin Wu; Zitao Liu

arXiv:2010.12733·cs.SD·July 16, 2021

TL;DR

This paper introduces a multimodal deep learning approach for fine-grained speech emotion recognition. It combines a temporal alignment mean-max pooling mechanism with a cross modality excitement module to improve prediction accuracy on two real-world datasets.

Contribution

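The paper proposes a model with two new components: a temporal alignment mean-max pooling mechanism, which captures the fine-grained emotions implied in each utterance, and a cross modality excitement module, which recalibrates each modality's embeddings using the aligned latent features of the other modality.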
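
The pooling component can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that acoustic frames have already been aligned to word-level spans, and that each span is summarized by concatenating the element-wise mean and max of its frames. The names `mean_max_pool`, `aligned_mean_max_pool`, `frames`, and `spans` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def mean_max_pool(frames: List[Vector]) -> Vector:
    """Concatenate the element-wise mean and max of a non-empty list of frames."""
    if not frames:
        raise ValueError("need at least one frame to pool")
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    peak = [max(f[d] for f in frames) for d in range(dim)]
    return mean + peak


def aligned_mean_max_pool(
    frames: List[Vector], spans: List[Tuple[int, int]]
) -> List[Vector]:
    """Pool frames[start:end] (end exclusive) for each aligned span.

    Assumption: `spans` holds one (start, end) frame range per word,
    produced by some external alignment step not shown here.
    """
    return [mean_max_pool(frames[start:end]) for start, end in spans]


# Three 2-dimensional acoustic frames; the first word covers frames 0-1,
# the second word covers frame 2.
frames = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.0]]
spans = [(0, 2), (2, 3)]
pooled = aligned_mean_max_pool(frames, spans)
```

Each pooled vector has twice the frame dimension: the mean half keeps the average level of the span, while the max half keeps its strongest activation, which is one plausible way to retain short, subtle cues that averaging alone would smooth out.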

Findings

01 Outperforms baseline models in prediction accuracy

02 Effective in capturing subtle emotions in speech

03 Model components significantly improve results

Abstract

Speech emotion recognition is a challenging task because emotional expression is complex, multimodal and fine-grained. In this paper, we propose a novel multimodal deep learning approach to perform fine-grained emotion recognition from real-life speech. We design a temporal alignment mean-max pooling mechanism to capture the subtle and fine-grained emotions implied in every utterance. In addition, we propose a cross modality excitement module to conduct sample-specific adjustment on cross modality embeddings and adaptively recalibrate the corresponding values by their aligned latent features from the other modality. Our proposed model is evaluated on two well-known real-world speech emotion recognition datasets. The results demonstrate that our approach is superior on the prediction tasks for multimodal speech utterances, and it outperforms a wide range of baselines in terms of…
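
The recalibration idea in the abstract can be sketched as a gating step. This is a minimal illustration under stated assumptions, not the authors' module: it assumes the gate is a sigmoid of a single linear map applied to the aligned embedding from the other modality, and that the gate rescales the target embedding element-wise. The names `excite`, `target`, `source`, `weight`, and `bias` are hypothetical, and the weights would be learned in a real model.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def excite(target: Vector, source: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Rescale `target` element-wise by a gate computed from `source`.

    Assumption: gate = sigmoid(weight @ source + bias), with one row of
    `weight` per dimension of `target`.
    """
    if len(weight) != len(target) or len(bias) != len(target):
        raise ValueError("weight and bias must have one entry per target dimension")
    gate = [
        sigmoid(sum(w * s for w, s in zip(row, source)) + b)
        for row, b in zip(weight, bias)
    ]
    return [g * t for g, t in zip(gate, target)]


# A 2-dimensional text embedding gated by a 3-dimensional aligned audio embedding.
text_vec = [2.0, -4.0]
audio_vec = [1.0, 1.0, 1.0]
zero_weight = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
zero_bias = [0.0, 0.0]
# With all-zero parameters every gate value is sigmoid(0) = 0.5.
halved = excite(text_vec, audio_vec, zero_weight, zero_bias)
```

Because the gate depends on the other modality's features for the same aligned position, the scaling differs per sample, which matches the "sample-specific adjustment" described above. Applying the same function with the roles of `target` and `source` swapped would give the audio-side counterpart.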

Code & Models

Repositories

tal-ai/FG_CME (PyTorch, official)