Analyzing the Influence of Dataset Composition for Emotion Recognition

A. Sutherland; S. Magg; C. Weber; S. Wermter

arXiv:2103.03700 · cs.LG · March 8, 2021 · 1 citation

Open Access

TL;DR

This paper investigates how the method of data collection affects the composition and emotion recognition accuracy of two multimodal datasets, highlighting implications for human-robot interaction research.

Contribution

The paper analyzes how dataset composition affects emotion recognition performance and emphasizes the importance of data collection methodology.

Findings

01

Dataset composition impacts generalization performance.

02

The composition of the full IEMOCAP dataset negatively influences generalization performance compared with the OMG-Emotion Behavior dataset.

03

Dataset composition may have an impact on human-robot interaction (HRI) experiments.

Abstract

Recognizing emotions from text in multimodal architectures has yielded promising results, surpassing video and audio modalities under certain circumstances. However, the method by which multimodal data is collected can be significant for recognizing emotional features in language. In this paper, we address the influence data collection methodology has on two multimodal emotion recognition datasets, the IEMOCAP dataset and the OMG-Emotion Behavior dataset, by analyzing textual dataset compositions and emotion recognition accuracy. Experiments with the full IEMOCAP dataset indicate that the composition negatively influences generalization performance when compared to the OMG-Emotion Behavior dataset. We conclude by discussing the impact this may have on HRI experiments.

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Speech Recognition and Synthesis