Can Large Language Models Aid in Annotating Speech Emotional Data? Uncovering New Frontiers
Siddique Latif, Muhammad Usama, Mohammad Ibrahim Malik, and Björn W. Schuller

TL;DR
This paper explores how large language models like ChatGPT can assist in annotating speech emotional data, potentially improving speech emotion recognition by augmenting datasets and addressing data scarcity issues.
Contribution
It demonstrates the feasibility and benefits of using LLMs for speech emotion data annotation, including data augmentation techniques to enhance SER performance.
Findings
LLMs can effectively annotate speech emotion data in various scenarios.
Data augmentation with LLM-annotated samples improves SER accuracy.
SER performance varies between single-shot and few-shot annotation settings.
Abstract
Despite recent advancements in speech emotion recognition (SER) models, state-of-the-art deep learning (DL) approaches face the challenge of the limited availability of annotated data. Large language models (LLMs) have revolutionised our understanding of natural language, introducing emergent properties that broaden comprehension in language, speech, and vision. This paper examines the potential of LLMs to annotate abundant speech data, aiming to enhance the state-of-the-art in SER. We evaluate this capability across various settings using publicly available speech emotion classification datasets. Leveraging ChatGPT, we experimentally demonstrate the promising role of LLMs in speech emotion data annotation. Our evaluation encompasses single-shot and few-shot scenarios, revealing performance variability in SER. Notably, we achieve improved results through data augmentation,…
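The single-shot and few-shot annotation settings mentioned in the abstract can be illustrated with a small prompt-building sketch. This is not the authors' pipeline: it assumes the LLM is given a text transcript of each utterance, the label set `LABELS` is hypothetical, and `llm` stands for any user-supplied function that maps a prompt string to a response string (no specific API is implied).

```python
import re
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical label set; the datasets used in the paper may define others.
LABELS = ("angry", "happy", "neutral", "sad")


def build_prompt(
    transcript: str,
    examples: Sequence[Tuple[str, str]] = (),
    labels: Sequence[str] = LABELS,
) -> str:
    """Build an annotation prompt with zero, one, or several labelled examples."""
    lines = [
        f"Classify the emotion of the utterance as one of: {', '.join(labels)}.",
        "Answer with the label only.",
        "",
    ]
    # One example gives a single-shot prompt; several give a few-shot prompt.
    for text, label in examples:
        lines += [f"Utterance: {text}", f"Emotion: {label}", ""]
    lines += [f"Utterance: {transcript}", "Emotion:"]
    return "\n".join(lines)


def parse_label(response: str, labels: Sequence[str] = LABELS) -> Optional[str]:
    """Return the first known label found as a whole word, or None."""
    for token in re.findall(r"[a-z]+", response.lower()):
        if token in labels:
            return token
    return None


def annotate(
    transcripts: Sequence[str],
    llm: Callable[[str], str],
    examples: Sequence[Tuple[str, str]] = (),
) -> list:
    """Label each transcript with the supplied (assumed) LLM callable."""
    return [parse_label(llm(build_prompt(t, examples))) for t in transcripts]
```

Calling `annotate(transcripts, llm, examples=[("This is great", "happy")])` would correspond to a single-shot run under these assumptions. Utterances for which `parse_label` returns `None` could be dropped before the LLM-labelled samples are used to augment a training set.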
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Emotion and Mood Recognition
