Cross-Corpus Multilingual Speech Emotion Recognition: Amharic vs. Other Languages
Ephrem Afele Retta, Richard Sutcliffe, Jabar Mahmood, Michael Abebe Berwo, Eiad Almekhlafi, Sajjad Ahmed Khan, Shehzad Ashraf Chaudhry, Mustafa Mhamed, Jun Feng

TL;DR
This study explores cross-lingual and multilingual speech emotion recognition, demonstrating that training on multiple languages enhances performance for low-resource languages like Amharic.
Contribution
The paper analyses cross-lingual SER across multiple classifiers and datasets, highlighting the effectiveness of multilingual training for low-resource languages.
Findings
Multilingual training improves Amharic SER accuracy.
English and German are effective source languages for Amharic.
Using multiple non-Amharic languages yields better results than single-source training.
Abstract
In a conventional speech emotion recognition (SER) task, a classifier for a given language is trained on a pre-existing dataset for that same language. However, where training data for a language does not exist, data from other languages can be used instead. We experiment with cross-lingual and multilingual SER, working with Amharic, English, German and Urdu. For Amharic, we use our own publicly available Amharic Speech Emotion Dataset (ASED). For English, German and Urdu we use the existing RAVDESS, EMO-DB and URDU datasets. We followed previous research in mapping labels for all datasets to just two classes, positive and negative. Thus we can compare performance on different languages directly, and combine languages for training and testing. In Experiment 1, monolingual SER trials were carried out using three classifiers, AlexNet, VGGE (a proposed variant of VGG), and ResNet50.…
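The two-class label mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not list which emotions fall into which class, so the POSITIVE and NEGATIVE sets below are assumptions made for illustration only, as are the function names to_binary and pool_corpora and the toy labels. The sketch shows how a shared positive/negative scheme lets labels from different corpora be pooled into one training set.

```python
from collections import Counter
from typing import Dict, List, Tuple

# ASSUMPTION: an illustrative valence split. The abstract only says labels
# are mapped to "positive" and "negative"; this exact assignment is not
# taken from the paper.
POSITIVE = {"happy", "neutral", "calm", "surprised"}
NEGATIVE = {"angry", "sad", "fearful", "disgust", "bored"}


def to_binary(label: str) -> str:
    """Map a corpus-specific emotion label to 'positive' or 'negative'."""
    key = label.strip().lower()
    if key in POSITIVE:
        return "positive"
    if key in NEGATIVE:
        return "negative"
    raise ValueError(f"no binary mapping defined for label: {label!r}")


def pool_corpora(corpora: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Combine labels from several languages under the shared binary scheme."""
    return [
        (language, to_binary(label))
        for language, labels in corpora.items()
        for label in labels
    ]


# Toy labels for illustration only; these are not drawn from any dataset.
toy_corpora = {
    "amharic": ["happy", "sad", "angry"],
    "german": ["bored", "neutral"],
}

pooled = pool_corpora(toy_corpora)
class_counts = Counter(binary for _, binary in pooled)
print(class_counts)
```

Raising an error on unmapped labels, rather than silently dropping them, makes it obvious when a corpus uses an emotion name the mapping does not cover.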
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Speech Recognition and Synthesis
