EmoGator: A New Open Source Vocal Burst Dataset with Baseline Machine Learning Classification Methodologies

Fred W. Buhl

arXiv:2301.00508 · cs.SD · April 7, 2023 · 1 citation

TL;DR

EmoGator is a large, open-source dataset of vocal bursts with 32,130 samples across 30 emotion categories, enabling improved emotion recognition research and classifier development.

Contribution

The paper introduces EmoGator, a new extensive dataset of vocal bursts with emotion labels, and discusses baseline classification methodologies for emotion recognition.

Findings

01 Dataset contains 32,130 samples from 357 speakers.

02 Multiple classifier approaches are evaluated for emotion recognition.

03 Dataset is publicly available for research use.

Abstract

Vocal bursts (short, non-speech vocalizations that convey emotions, such as laughter, cries, sighs, moans, and groans) are an often-overlooked aspect of speech emotion recognition, but an important aspect of human vocal communication. One barrier to the study of these interesting vocalizations is a lack of large datasets. I am pleased to introduce the EmoGator dataset, which consists of 32,130 samples from 357 speakers, totaling 16.9654 hours of audio, with each sample classified into one of 30 distinct emotion categories by the speaker. Several different approaches to constructing classifiers that identify emotion categories will be discussed, and directions for future research will be suggested. The dataset is available for download from https://github.com/fredbuhl/EmoGator.
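
The headline figures in the abstract imply a few derived quantities that help when sizing experiments. The sketch below only does arithmetic on the numbers quoted above. The variable names are illustrative. The per-category figure assumes samples are spread evenly across speakers and categories, and the chance-level accuracy assumes balanced classes; the abstract states neither.

```python
# Figures quoted in the abstract.
NUM_SAMPLES = 32_130
NUM_SPEAKERS = 357
NUM_CATEGORIES = 30
TOTAL_HOURS = 16.9654

# Average number of samples contributed by each speaker.
samples_per_speaker = NUM_SAMPLES / NUM_SPEAKERS

# Assumption (not stated in the abstract): samples are spread evenly
# across the 30 categories for every speaker.
samples_per_speaker_per_category = samples_per_speaker / NUM_CATEGORIES

# Mean clip length in seconds.
mean_duration_s = TOTAL_HOURS * 3600 / NUM_SAMPLES

# Assumption: balanced classes, so uniform guessing gives 1/30 accuracy.
chance_accuracy = 1 / NUM_CATEGORIES

print(f"samples per speaker:              {samples_per_speaker:.1f}")
print(f"samples per speaker per category: {samples_per_speaker_per_category:.1f}")
print(f"mean clip duration (s):           {mean_duration_s:.2f}")
print(f"chance-level accuracy:            {chance_accuracy:.3%}")
```

Under these assumptions, each speaker contributes about 90 clips of roughly 1.9 seconds each, and a 30-way classifier must clear about 3.3% accuracy to beat uniform guessing.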

Code & Models

Repositories

fredbuhl/emogator
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Emotion and Mood Recognition · Speech and Audio Processing

Methods: Attention Model