Learning Spontaneity to Improve Emotion Recognition In Speech

Karttikeya Mangalam; Tanaya Guha

arXiv:1712.04753 · eess.AS · June 15, 2018

TL;DR

This paper examines whether detecting spontaneity in speech can improve emotion recognition. It proposes a hierarchical model that detects spontaneity before recognizing emotion and a multitask model that learns both jointly, and it reports state-of-the-art results on the IEMOCAP dataset.

Contribution

The paper uses spontaneity detection as an auxiliary task for speech emotion recognition and reports state-of-the-art performance with it.

Findings

01

Adding spontaneity detection as an auxiliary task improves emotion recognition over systems that are unaware of spontaneity.

02

Both the hierarchical model and the multitask model outperform the baselines.

03

The reported accuracy for 4-class emotion recognition on IEMOCAP is 69.1%.

Abstract

We investigate the effect and usefulness of spontaneity (i.e., whether a given speech is spontaneous or not) in speech in the context of emotion recognition. We hypothesize that emotional content in speech is interrelated with its spontaneity, and use spontaneity classification as an auxiliary task to the problem of emotion recognition. We propose two supervised learning settings that utilize spontaneity to improve speech emotion recognition: a hierarchical model that performs spontaneity detection before performing emotion recognition, and a multitask learning model that jointly learns to recognize both spontaneity and emotion. Through various experiments on the well-known IEMOCAP database, we show that by using spontaneity detection as an additional task, significant improvement can be achieved over emotion recognition systems that are unaware of spontaneity. We achieve…
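
The two settings named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the classifiers, features, or training objective. The sketch assumes that the hierarchical model routes each utterance to a separate emotion classifier depending on the predicted spontaneity, and that joint learning is done with a weighted sum of per-task cross-entropy losses, which is one common multitask formulation. All names here (`hierarchical_predict`, `multitask_loss`, `aux_weight`) are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Features = Sequence[float]


def hierarchical_predict(
    features: Features,
    spontaneity_clf: Callable[[Features], bool],
    emotion_clf_spontaneous: Callable[[Features], str],
    emotion_clf_scripted: Callable[[Features], str],
) -> Tuple[bool, str]:
    """Detect spontaneity first, then recognize emotion.

    Assumption: the spontaneity decision selects which emotion
    classifier handles the utterance.
    """
    is_spontaneous = spontaneity_clf(features)
    emotion_clf = emotion_clf_spontaneous if is_spontaneous else emotion_clf_scripted
    return is_spontaneous, emotion_clf(features)


def cross_entropy(probs: Sequence[float], target_index: int) -> float:
    """Negative log-likelihood of the target class, clamped to avoid log(0)."""
    return -math.log(max(probs[target_index], 1e-12))


def multitask_loss(
    emotion_probs: Sequence[float],
    emotion_target: int,
    spontaneity_probs: Sequence[float],
    spontaneity_target: int,
    aux_weight: float = 0.5,
) -> float:
    """Joint objective: emotion loss plus a weighted auxiliary spontaneity loss.

    Assumption: aux_weight is a hypothetical hyperparameter, not a value
    taken from the paper.
    """
    emotion_loss = cross_entropy(emotion_probs, emotion_target)
    spontaneity_loss = cross_entropy(spontaneity_probs, spontaneity_target)
    return emotion_loss + aux_weight * spontaneity_loss
```

In the hierarchical sketch an error in the spontaneity decision propagates to the emotion stage, whereas the joint loss lets both tasks shape a shared model. The sketch is only meant to make that structural difference concrete.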
