MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge
Xin Jing, Jiadong Wang, Iosif Tsangko, Andreas Triantafyllopoulos, Björn W. Schuller

TL;DR
This paper introduces MELT, a novel approach that leverages GPT-4o's embedded knowledge to automatically annotate multimodal emotion datasets using only textual cues, reducing reliance on costly human annotation.
Contribution
The study demonstrates that GPT-4o can effectively annotate multimodal emotion data without multimodal inputs, enabling scalable and consistent dataset creation for speech emotion recognition.
Findings
GPT-4o-generated annotations improve SER performance.
MELT dataset enhances model training and evaluation.
Subjective experiments confirm annotation quality.
Abstract
Although speech emotion recognition (SER) has advanced significantly with deep learning, annotation remains a major hurdle. Human annotation is not only costly but also subject to inconsistencies: annotators often have different preferences and may lack the necessary contextual knowledge, which can lead to varied and inaccurate labels. Meanwhile, Large Language Models (LLMs) have emerged as a scalable alternative for annotating text data. However, the potential of LLMs to perform emotional speech data annotation without human supervision has yet to be thoroughly investigated. To address these problems, we apply GPT-4o to annotate a multimodal dataset collected from the sitcom Friends, using only textual cues as inputs. By crafting structured text prompts, our methodology capitalizes on the knowledge GPT-4o has accumulated during its training, showcasing that it can generate accurate and…
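As a rough illustration of the text-only annotation idea in the abstract, the sketch below builds a structured prompt for one utterance and validates a model reply. It is not the authors' prompt or pipeline: the label set `EMOTIONS`, the functions `build_prompt` and `parse_label`, the prompt wording, and the JSON reply format are all assumptions made for this example, and no model is actually called.

```python
import json

# Hypothetical label set for illustration only; the paper's labels may differ.
EMOTIONS = ["neutral", "joy", "sadness", "anger", "fear", "surprise", "disgust"]


def build_prompt(season, episode, speaker, utterance):
    """Assemble a text-only annotation prompt for a single utterance.

    Only textual cues are used: show name, episode position, speaker, and
    the transcript line. No audio or video features are passed in.
    """
    return (
        "You are annotating dialogue from the sitcom Friends.\n"
        f"Season {season}, episode {episode}.\n"
        f"Speaker: {speaker}\n"
        f'Utterance: "{utterance}"\n'
        f"Choose exactly one emotion from: {', '.join(EMOTIONS)}.\n"
        'Reply with JSON of the form {"emotion": "<label>"}.'
    )


def parse_label(response_text):
    """Return a validated emotion label, or None if the reply is unusable."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    label = str(data.get("emotion", "")).strip().lower()
    return label if label in EMOTIONS else None


if __name__ == "__main__":
    prompt = build_prompt(1, 1, "Ross", "Hi.")
    print(prompt)
    # A reply string like this would come back from whichever LLM is queried.
    print(parse_label('{"emotion": "sadness"}'))
```

Rejecting any reply that falls outside the fixed label set keeps the resulting annotations consistent, which is the property the abstract contrasts with human annotator variability.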
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Web Data Mining and Analysis
