Deep Learning of Segment-Level Feature Representation for Speech Emotion Recognition in Conversations

Jiachen Luo; Huy Phan; Joshua Reiss

arXiv:2302.02419·cs.CL·February 7, 2023·1 cites

TL;DR

This paper introduces a conversational speech emotion recognition approach that combines segment-based audio features with an attentive bi-directional GRU to model contextual and speaker-dependent emotional cues in dialogues.

Contribution

The paper proposes a method that combines pretrained VGGish segment features with an attentive bi-directional GRU for emotion recognition in conversations.

Findings

01 Outperforms state-of-the-art methods on the MELD dataset.

02 Captures contextual and speaker-sensitive emotional information.

03 Demonstrates robustness in dynamic conversational settings.

Abstract

Accurately detecting emotions in conversation is a necessary yet challenging task due to the complexity of emotions and dynamics in dialogues. The emotional state of a speaker can be influenced by many different factors, such as interlocutor stimulus, dialogue scene, and topic. In this work, we propose a conversational speech emotion recognition method to deal with capturing attentive contextual dependency and speaker-sensitive interactions. First, we use a pretrained VGGish model to extract segment-based audio representation in individual utterances. Second, an attentive bi-directional gated recurrent unit (GRU) models contextual-sensitive information and explores intra- and inter-speaker dependencies jointly in a dynamic manner. The experiments conducted on the standard conversational dataset MELD demonstrate the effectiveness of the proposed method when compared against state-of…
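
The two stages described in the abstract can be illustrated with a small, untrained toy in plain Python. This is only a sketch under stated assumptions, not the authors' implementation: random vectors stand in for VGGish segment embeddings (real VGGish embeddings are 128-dimensional), segments are mean-pooled into one vector per utterance (the pooling choice is an assumption), attention is plain scaled dot-product, and speaker-specific modelling is omitted. All names (`GRUCell`, `bi_gru`, `attend`, `mean_pool`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """GRU cell (reset gate applied to h before the recurrent matrix); untrained weights."""

    def __init__(self, in_dim, hid_dim, rng):
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hid_dim, in_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hid_dim, hid_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def run_gru(cell, inputs, hid_dim):
    h = [0.0] * hid_dim
    states = []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states


def bi_gru(fwd, bwd, inputs, hid_dim):
    """Concatenate forward and backward hidden states for every utterance."""
    f = run_gru(fwd, inputs, hid_dim)
    b = run_gru(bwd, inputs[::-1], hid_dim)[::-1]
    return [fi + bi for fi, bi in zip(f, b)]  # list concatenation, not addition


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Scaled dot-product attention of one utterance state over all states."""
    scale = math.sqrt(len(query))
    scores = [sum(q * s for q, s in zip(query, st)) / scale for st in states]
    weights = softmax(scores)
    context = [sum(w * st[d] for w, st in zip(weights, states)) for d in range(len(query))]
    return context, weights


def mean_pool(segments):
    """Average segment embeddings into one utterance vector (assumed pooling)."""
    return [sum(col) / len(segments) for col in zip(*segments)]


rng = random.Random(0)
EMB_DIM, HID_DIM = 8, 4  # toy sizes; real VGGish embeddings are 128-D

# Stand-in for VGGish output: four utterances with 3, 1, 4 and 2 segments each.
dialogue = [[[rng.gauss(0, 1) for _ in range(EMB_DIM)] for _ in range(n_seg)]
            for n_seg in (3, 1, 4, 2)]
utterances = [mean_pool(segs) for segs in dialogue]

fwd, bwd = GRUCell(EMB_DIM, HID_DIM, rng), GRUCell(EMB_DIM, HID_DIM, rng)
states = bi_gru(fwd, bwd, utterances, HID_DIM)

# Context vector for the third utterance, attending over the whole dialogue.
context, weights = attend(states[2], states)
```

In a trained model, `context` (possibly concatenated with `states[2]`) would feed an emotion classifier; here the weights are random, so only the shapes and the data flow are meaningful.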

Taxonomy

Topics: Emotion and Mood Recognition