Detecting Emotion Carriers by Combining Acoustic and Lexical Representations

Sebastian P. Bayerl, Aniruddha Tammewar, Korbinian Riedhammer, and Giuseppe Riccardi

arXiv:2112.06603·cs.CL·December 14, 2021

Open Access

TL;DR

This paper proposes a method to detect emotion carriers in spoken narratives by combining acoustic and lexical representations using neural networks and fusion techniques, aiming to enhance emotional understanding in dialogue systems.

Contribution

It introduces a novel approach that leverages word-based acoustic and textual embeddings with fusion strategies to identify emotion carriers in spoken narratives.
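To make the two fusion strategies concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the embedding sizes, the toy posterior scores, the equal weighting, and the 0.5 decision threshold are all assumptions chosen for illustration. Early fusion joins the per-word acoustic and lexical vectors before any classifier sees them, while late fusion combines the per-word scores that separate acoustic and lexical classifiers have already produced.

```python
import numpy as np


def early_fusion(acoustic, lexical):
    """Concatenate per-word embeddings along the feature axis.

    acoustic: array of shape (n_words, d_acoustic)
    lexical:  array of shape (n_words, d_lexical)
    returns:  array of shape (n_words, d_acoustic + d_lexical)
    """
    return np.concatenate([acoustic, lexical], axis=-1)


def late_fusion(p_acoustic, p_lexical, weight=0.5):
    """Weighted average of per-word posteriors from two separate models.

    `weight` is the share given to the acoustic model (hypothetical default).
    """
    return weight * p_acoustic + (1.0 - weight) * p_lexical


# Toy input: 4 words, with made-up embedding sizes (8 acoustic, 6 lexical).
rng = np.random.default_rng(0)
acoustic = rng.normal(size=(4, 8))
lexical = rng.normal(size=(4, 6))

# Early fusion: one joint feature vector per word, fed to a single classifier.
fused = early_fusion(acoustic, lexical)

# Late fusion: each modality has already been scored on its own.
# These posteriors are invented numbers, not results from the paper.
p_acoustic = np.array([0.2, 0.9, 0.4, 0.1])
p_lexical = np.array([0.4, 0.7, 0.8, 0.1])
p_fused = late_fusion(p_acoustic, p_lexical)

# Tag a word as part of an emotion carrier if the fused score passes 0.5.
is_carrier = p_fused >= 0.5
```

In this toy run the third word is tagged as a carrier only because the lexical score pulls the average above the threshold, which shows how one modality can compensate for the other under late fusion.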

Findings

01 Late fusion improves detection accuracy significantly.

02 ResNet-based acoustic embeddings enhance emotion carrier detection.

03 Combining acoustic and lexical features outperforms lexical-only methods.

Abstract

Personal narratives (PN) - spoken or written - are recollections of facts, people, events, and thoughts from one's own experience. Emotion recognition and sentiment analysis tasks are usually defined at the utterance or document level. However, in this work, we focus on Emotion Carriers (EC) defined as the segments (speech or text) that best explain the emotional state of the narrator ("loss of father", "made me choose"). Once extracted, such EC can provide a richer representation of the user state to improve natural language understanding and dialogue modeling. In previous work, it has been shown that EC can be identified using lexical features. However, spoken narratives should provide a richer description of the context and the users' emotional state. In this paper, we leverage word-based acoustic and textual embeddings as well as early and late fusion techniques for the detection of…

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Speech Recognition and Synthesis · Emotion and Mood Recognition