EmoTale: An Enacted Speech-emotion Dataset in Danish

Maja J. Hjuler; Harald V. Skat-Rørdam; Line H. Clemmensen; Sneha Das

arXiv:2508.14548 · cs.CL · August 21, 2025

TL;DR

EmoTale is a new corpus of Danish and English speech with enacted emotion annotations. Speech emotion recognition (SER) models built on it show that self-supervised speech embeddings outperform hand-crafted openSMILE features.

Contribution

This paper introduces EmoTale, a Danish and English speech corpus with enacted emotion annotations, and validates it by measuring the predictive power of SER models built on self-supervised speech embeddings and openSMILE features.

Findings

01

Self-supervised speech embeddings outperform handcrafted features.

02

The best SER model achieves 64.1% UAR on EmoTale.

03

EmoTale's predictive power is comparable to that of existing Danish emotional speech data.

Abstract

While multiple emotional speech corpora exist for commonly spoken languages, there is a lack of functional datasets for smaller (spoken) languages, such as Danish. To our knowledge, Danish Emotional Speech (DES), published in 1997, is the only other database of Danish emotional speech. We present EmoTale, a corpus comprising Danish and English speech recordings with their associated enacted emotion annotations. We demonstrate the validity of the dataset by investigating and presenting its predictive power using speech emotion recognition (SER) models. We develop SER models for EmoTale and the reference datasets using self-supervised speech model (SSLM) embeddings and the openSMILE feature extractor. We find the embeddings superior to the hand-crafted features. The best model achieves an unweighted average recall (UAR) of 64.1% on the EmoTale corpus using leave-one-speaker-out…
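
The abstract reports results as unweighted average recall (UAR) under a leave-one-speaker-out protocol. As a rough illustration of what those two terms mean, the sketch below computes UAR as the mean of per-class recalls and generates one train/test split per held-out speaker. It is not the authors' evaluation code: the function names, the toy labels, and the speaker IDs are all hypothetical.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every emotion class counts equally
    regardless of how many utterances it has."""
    total = defaultdict(int)    # utterances per true class
    correct = defaultdict(int)  # correctly predicted utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def leave_one_speaker_out(speakers):
    """Yield (held_out_speaker, train_indices, test_indices), one fold per
    speaker, so no speaker appears in both the train and test split."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield held_out, train, test


# Toy labels (hypothetical, not taken from EmoTale).
y_true = ["angry", "angry", "angry", "happy", "sad", "sad"]
y_pred = ["angry", "angry", "happy", "happy", "sad", "angry"]
uar = unweighted_average_recall(y_true, y_pred)

# Hypothetical speaker IDs, one per utterance.
folds = list(leave_one_speaker_out(["s1", "s2", "s1", "s3"]))
```

On the toy labels the per-class recalls are 2/3, 1, and 1/2, so UAR is their plain mean, while ordinary accuracy would weight the larger "angry" class more heavily. This is why UAR is a common choice when emotion classes are imbalanced.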

Taxonomy

Topics: Emotion and Mood Recognition · Digital Communication and Language · Speech and dialogue systems