EmoGator: A New Open Source Vocal Burst Dataset with Baseline Machine Learning Classification Methodologies
Fred W. Buhl

TL;DR
EmoGator is a large, open-source dataset of vocal bursts with 32,130 samples across 30 emotion categories, enabling improved emotion recognition research and classifier development.
Contribution
The paper introduces EmoGator, a new extensive dataset of vocal bursts with emotion labels, and discusses baseline classification methodologies for emotion recognition.
Findings
Dataset contains 32,130 samples from 357 speakers.
Multiple classifier approaches are evaluated for emotion recognition.
Dataset is publicly available for research use.
Abstract
Vocal Bursts -- short, non-speech vocalizations that convey emotions, such as laughter, cries, sighs, moans, and groans -- are an often-overlooked aspect of speech emotion recognition, but an important aspect of human vocal communication. One barrier to the study of these interesting vocalizations is a lack of large datasets. I am pleased to introduce the EmoGator dataset, which consists of 32,130 samples from 357 speakers, 16.9654 hours of audio; each sample is classified into one of 30 distinct emotion categories by the speaker. Several different approaches to constructing classifiers to identify emotion categories will be discussed, and directions for future research will be suggested. The dataset is available for download from https://github.com/fredbuhl/EmoGator.
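The headline figures in the abstract imply a few simple averages that help put the dataset's scale in perspective. The sketch below derives them by plain arithmetic. The function name `dataset_stats` and the dictionary keys are hypothetical, chosen only for illustration, and the results are averages: the abstract does not state how samples are actually distributed across speakers or categories.

```python
# Figures quoted in the abstract.
NUM_SAMPLES = 32_130
NUM_SPEAKERS = 357
NUM_CATEGORIES = 30
TOTAL_HOURS = 16.9654


def dataset_stats(samples, speakers, categories, hours):
    """Return simple averages derived from the headline figures.

    These are averages only; the true per-speaker and per-category
    counts are an assumption-free unknown here.
    """
    return {
        "samples_per_speaker": samples / speakers,
        "samples_per_category": samples / categories,
        "mean_clip_seconds": hours * 3600 / samples,
    }


stats = dataset_stats(NUM_SAMPLES, NUM_SPEAKERS, NUM_CATEGORIES, TOTAL_HOURS)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

On these figures the averages come to 90 samples per speaker, 1,071 samples per category, and a mean clip length of roughly 1.9 seconds, which is consistent with the description of vocal bursts as short vocalizations.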
Taxonomy
Topics: Speech Recognition and Synthesis · Emotion and Mood Recognition · Speech and Audio Processing
Methods: Attention Model
