Curriculum Learning for Speech Emotion Recognition from Crowdsourced Labels

Reza Lotfian; Carlos Busso

arXiv:1805.10339·eess.AS·March 17, 2022

TL;DR

This paper proposes a curriculum learning approach for speech emotion recognition that uses human annotation disagreement as a difficulty measure, leading to improved training efficiency and accuracy.

Contribution

It introduces a novel method to define curriculum difficulty based on inter-evaluator disagreement in crowdsourced labels for speech emotion recognition.

Findings

01. A curriculum ordered by evaluator disagreement improves model performance.

02. The curriculum yields significant gains over training without a curriculum.

03. The approach applies to regression, binary, and multi-class emotion recognition tasks.

Abstract

This study introduces a method to design a curriculum for machine-learning to maximize the efficiency during the training process of deep neural networks (DNNs) for speech emotion recognition. Previous studies in other machine-learning problems have shown the benefits of training a classifier following a curriculum where samples are gradually presented in increasing level of difficulty. For speech emotion recognition, the challenge is to establish a natural order of difficulty in the training set to create the curriculum. We address this problem by assuming that ambiguous samples for humans are also ambiguous for computers. Speech samples are often annotated by multiple evaluators to account for differences in emotion perception across individuals. While some sentences with clear emotional content are consistently annotated, sentences with more ambiguous emotional content present…
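The idea of ordering training samples by how much their annotators disagree can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the disagreement measure (Shannon entropy of each sample's label votes), the cumulative staging scheme, and the names `disagreement`, `curriculum_stages`, and the toy `annotations` dictionary are all hypothetical. The paper may define difficulty and schedule the curriculum differently.

```python
import math
from collections import Counter


def disagreement(labels):
    """Shannon entropy (in bits) of one sample's annotator labels.

    Assumption: entropy stands in for inter-evaluator disagreement;
    the paper may use a different measure.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def curriculum_stages(annotations, num_stages):
    """Return cumulative training sets, lowest-disagreement samples first.

    Assumption: each stage keeps all earlier (easier) samples and adds
    the next slice of harder ones.
    """
    # Sort sample ids from most consistent to most ambiguous annotation.
    ordered = sorted(annotations, key=lambda sid: disagreement(annotations[sid]))
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = math.ceil(len(ordered) * k / num_stages)
        stages.append(ordered[:cutoff])
    return stages


# Hypothetical crowdsourced labels: sample id -> labels from its evaluators.
annotations = {
    "utt_a": ["happy", "happy", "happy"],
    "utt_b": ["happy", "sad", "happy"],
    "utt_c": ["angry", "sad", "neutral"],
    "utt_d": ["sad", "sad", "sad", "sad"],
}

stages = curriculum_stages(annotations, num_stages=2)
for i, stage in enumerate(stages, start=1):
    print(f"stage {i}: {stage}")
```

In a training loop, a model would be fit on each stage in turn, so that the consistently labeled utterances are seen before the ambiguous ones. For a regression task, the standard deviation of the evaluators' scores could replace entropy in this sketch.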
