Cross-Corpus Validation of Speech Emotion Recognition in Urdu using Domain-Knowledge Acoustic Features

Unzela Talpur; Zafi Sherhan Syed; Muhammad Shehram Shah Syed; Abbas Shah Syed

arXiv:2510.26823·cs.SD·November 3, 2025

Open Access

TL;DR

This study evaluates the generalization of speech emotion recognition models for Urdu across multiple datasets using domain-knowledge acoustic features, highlighting the importance of cross-corpus validation for low-resource languages.

Contribution

It introduces a cross-corpus evaluation framework for Urdu SER and demonstrates the limitations of self-corpus validation in assessing model robustness.
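One way to picture the difference between self-corpus and cross-corpus validation is to list which corpus is used for training and which for testing. The sketch below is only an illustration: the corpus names are hypothetical placeholders, and the pairwise train-on-one, test-on-another scheme is an assumption, since the summary does not say how the three datasets are combined.

```python
from itertools import product

# Hypothetical placeholder names; the summary does not name the three Urdu corpora.
corpora = ["corpus_A", "corpus_B", "corpus_C"]


def evaluation_pairs(names):
    """Return (self_pairs, cross_pairs) as lists of (train_corpus, test_corpus).

    Self-corpus pairs train and test on the same corpus (in practice with a
    held-out split inside that corpus). Cross-corpus pairs test on a corpus
    that was never seen during training.
    """
    self_pairs = [(name, name) for name in names]
    cross_pairs = [
        (train, test)
        for train, test in product(names, repeat=2)
        if train != test
    ]
    return self_pairs, cross_pairs


self_pairs, cross_pairs = evaluation_pairs(corpora)

for train, test in cross_pairs:
    print(f"train on {train}, test on {test}")
```

With three corpora this gives three self-corpus settings and six cross-corpus settings. Pooling two corpora for training and testing on the third would be an equally plausible design.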

Findings

01 Cross-corpus evaluation yields UAR up to 13% lower than self-corpus validation.

02 Domain-knowledge acoustic features effectively represent speech signals for Urdu SER.

03 Cross-corpus validation provides a more realistic assessment of model performance.

Abstract

Speech Emotion Recognition (SER) is a key affective computing technology that enables emotionally intelligent artificial intelligence. While SER is challenging in general, it is particularly difficult for low-resource languages such as Urdu. This study investigates Urdu SER in a cross-corpus setting, an area that has remained largely unexplored. We employ a cross-corpus evaluation framework across three different Urdu emotional speech datasets to test model generalization. Two standard domain-knowledge-based acoustic feature sets, eGeMAPS and ComParE, are used to represent speech signals as feature vectors, which are then passed to Logistic Regression and Multilayer Perceptron classifiers. Classification performance is assessed using unweighted average recall (UAR) whilst considering class-label imbalance. Results show that self-corpus validation often overestimates performance, with UAR…
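As a minimal sketch of why UAR suits imbalanced emotion labels, the example below computes it as the plain mean of per-class recalls and compares it with accuracy. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally regardless of size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up, imbalanced toy labels: a model that always predicts the majority class.
y_true = ["neutral"] * 8 + ["angry"] * 2
y_pred = ["neutral"] * 10

uar = unweighted_average_recall(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(f"accuracy = {acc:.2f}, UAR = {uar:.2f}")
```

In this toy case the majority-class predictor reaches 0.80 accuracy but only 0.50 UAR, because recall on the minority class is zero.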

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Music and Audio Processing