LSSED: a large-scale dataset and benchmark for speech emotion recognition

Weiquan Fan; Xiangmin Xu; Xiaofen Xing; Weidong Chen; Dongyan Huang

arXiv:2102.01754 · cs.SD · February 4, 2021 · 1 citation

TL;DR

This paper introduces LSSED, a large-scale English speech emotion dataset with data from 820 subjects, along with pre-trained models to advance speech emotion recognition and related applications like mental health analysis.

Contribution

It provides a large-scale, real-world speech emotion dataset and pre-trained models to facilitate research and transfer learning in speech emotion recognition.

Findings

01 Large-scale dataset improves model performance

02 Pre-trained models enhance downstream task accuracy

03 Dataset and models promote research in emotion recognition

Abstract

Speech emotion recognition is a vital contributor to the next generation of human-computer interaction (HCI). However, existing small-scale databases have limited the development of related research. In this paper, we present LSSED, a challenging large-scale English speech emotion dataset, with data collected from 820 subjects to simulate a real-world distribution. In addition, we release some pre-trained models based on LSSED, which can not only promote the development of speech emotion recognition but can also be transferred to related downstream tasks, such as mental health analysis, where data is extremely difficult to collect. Finally, our experiments show the necessity of large-scale datasets and the effectiveness of pre-trained models. The dataset will be released at https://github.com/tobefans/LSSED.

Code & Models

Repositories

tobefans/LSSED
PyTorch · Official

Taxonomy

Topics: Emotion and Mood Recognition · Speech Recognition and Synthesis · Speech and dialogue systems