Unsupervised Cross-Lingual Speech Emotion Recognition Using Pseudo Multilabel

Jin Li; Nan Yan; Lan Wang

arXiv:2108.08663·eess.AS·October 8, 2021·ASRU

TL;DR

This paper introduces an unsupervised cross-lingual speech emotion recognition method using pseudo multilabels and external memory, significantly improving accuracy across multiple low-resource languages.

Contribution

It proposes a novel neural network approach with external memory and pseudo multilabel generation for cross-lingual SER, addressing domain differences without labeled target data.

Findings

01 Significant accuracy improvements on Urdu, Skropus, ShEMO, and EMO-DB datasets.

02 Effective cross-lingual transfer without target domain labels.

03 Code availability facilitates further research.

Abstract

Speech Emotion Recognition (SER) in a single language has achieved remarkable results through deep learning approaches in the last decade. However, cross-lingual SER remains a challenge in real-world applications because the source and target domain distributions differ greatly. To address this issue, we propose an unsupervised cross-lingual Neural Network with Pseudo Multilabel (NNPM) that is trained to learn the emotion similarities between source domain features inside an external memory that is adjusted to identify emotion in cross-lingual databases. NNPM leverages the external memory to store source domain features and generates a pseudo multilabel for each target domain sample by computing the similarities between the external memory and the target domain features. We evaluate our approach on speech emotion databases in multiple languages.…
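The abstract's core idea, comparing a target domain feature against source features held in an external memory to obtain a pseudo multilabel, can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the top-k selection, the multi-hot label format, and the names `cosine`, `pseudo_multilabel`, `memory_feats`, and `memory_labels` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pseudo_multilabel(target_feat, memory_feats, memory_labels, num_classes, k=2):
    """Return a multi-hot vector marking the emotion classes of the k memory
    entries most similar to the target feature.

    Assumption: top-k cosine neighbours define the pseudo multilabel; the
    paper may use a different similarity measure or selection rule.
    """
    sims = [cosine(target_feat, m) for m in memory_feats]
    # Indices of memory entries, most similar first, truncated to k.
    top_k = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    label = [0] * num_classes
    for i in top_k:
        label[memory_labels[i]] = 1
    return label


# Toy external memory: four source features with hypothetical emotion class ids.
memory_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
memory_labels = [0, 1, 2, 2]

# An unlabeled target feature receives the classes of its two nearest entries.
print(pseudo_multilabel([1.0, 0.05], memory_feats, memory_labels, num_classes=3))
# → [1, 1, 0]
```

In this toy setup the target feature lies closest to memory entries 0 and 1, so classes 0 and 1 are both switched on. A sample can therefore carry more than one candidate emotion without any target domain annotation.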

Code & Models

Repositories

happyjin/nnpm (Official)

Taxonomy

Topics: Emotion and Mood Recognition · Speech and Audio Processing · Sentiment Analysis and Opinion Mining