INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition

Andreas Triantafyllopoulos; Anton Batliner; Simon Rampp; Manuel Milling; Björn Schuller

arXiv:2406.06401·cs.CL·April 11, 2025

TL;DR

This paper revisits the INTERSPEECH 2009 Emotion Challenge, benchmarking recent deep learning models against the historical results. It finds that the benchmark remains challenging and that newer methods do not consistently outperform older ones.

Contribution

It provides a comprehensive evaluation of recent deep learning models on a historic benchmark, highlighting the slow and non-monotonic progress in speech emotion recognition.

Findings

01. Most models perform close to the baseline
02. Hyperparameter tuning marginally improves results
03. Recent methods do not consistently outperform older approaches

Abstract

We revisit the INTERSPEECH 2009 Emotion Challenge -- the first ever speech emotion recognition (SER) challenge -- and evaluate a series of deep learning models that are representative of the major advances in SER research in the time since then. We start by training each model using a fixed set of hyperparameters, and further fine-tune the best-performing models of that initial setup with a grid search. Results are always reported on the official test set with a separate validation set only used for early stopping. Most models score below or close to the official baseline, while they marginally outperform the original challenge winners after hyperparameter tuning. Our work illustrates that, despite recent progress, FAU-AIBO remains a very challenging benchmark. An interesting corollary is that newer methods do not consistently outperform older ones, showing that progress towards…
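The protocol described in the abstract (a hyperparameter grid, a validation set used only for early stopping, and scores reported on the test set) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the patience value, and the grid contents are all hypothetical. The scoring function computes unweighted average recall (UAR), the primary measure of the 2009 challenge, which the summary above does not name.

```python
from itertools import product


def uar(y_true, y_pred):
    """Unweighted average recall: the mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def select_epoch_by_early_stopping(val_scores, patience=3):
    """Return the epoch whose checkpoint would be kept.

    Scans per-epoch validation scores and stops once `patience`
    consecutive epochs fail to improve on the best score so far.
    The patience value is an assumption for illustration.
    """
    best_epoch, best, stale = 0, float("-inf"), 0
    for epoch, score in enumerate(val_scores):
        if score > best:
            best_epoch, best, stale = epoch, score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch


def expand_grid(grid):
    """Yield one dict per combination in a hyperparameter grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


# Hypothetical grid; the paper's actual search space is not given here.
grid = {"learning_rate": [1e-3, 1e-4], "batch_size": [16, 32]}
configs = list(expand_grid(grid))
```

Because UAR averages recall over classes rather than over samples, a model that always predicts the majority class gains nothing from class imbalance: `uar([0, 0, 0, 1], [0, 0, 0, 0])` is 0.5 even though accuracy is 0.75.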

Code & Models

Repositories

ATriantafyllopoulos/is24-interspeech09-ser-revisited (Official)

Taxonomy

Topics: Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training