# Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition

**Authors:** Siddique Latif, Junaid Qadir, and Muhammad Bilal

arXiv: 1907.06083 · 2020-07-29

## TL;DR

This paper introduces a GAN-based unsupervised domain adaptation model for cross-lingual speech emotion recognition (SER). The model learns language-invariant representations without labeled target-language data and improves on the baseline cross-lingual SER performance for all four evaluated datasets.

## Contribution

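The paper presents a GAN-based approach for learning language-invariant features in speech emotion recognition (SER) without target-language labels. It addresses a key challenge in cross-lingual SER: the mismatch between the distributions of training and test data, which grows when the two come from different languages.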
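To make the idea of adversarially learned language-invariant features concrete, the sketch below computes the two opposing losses of a generic adversarial domain adaptation setup. It is an illustration under stated assumptions, not the authors' architecture or objective, which this summary does not specify. The function names (`bce`, `discriminator_loss`, `encoder_adversarial_loss`) and the convention of labeling source-language features 1 and target-language features 0 are hypothetical choices made for this example.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def discriminator_loss(d_source, d_target):
    """Loss for a language discriminator (assumed setup, not from the paper).

    d_source / d_target are the discriminator's predicted probabilities that
    a feature vector comes from the source language. The discriminator is
    rewarded for predicting 1 on source features and 0 on target features.
    """
    losses = [bce(p, 1) for p in d_source] + [bce(p, 0) for p in d_target]
    return sum(losses) / len(losses)


def encoder_adversarial_loss(d_target):
    """Loss for the feature encoder on unlabeled target-language data.

    The encoder is rewarded when the discriminator mistakes target features
    for source features (label 1). No emotion labels are needed here, only
    the knowledge of which language each utterance belongs to.
    """
    return sum(bce(p, 1) for p in d_target) / len(d_target)
```

When the discriminator outputs 0.5 for every input, it can no longer tell the languages apart and both losses equal ln 2 (about 0.693). In a full system of this kind, an emotion classifier would be trained on the labeled source-language features alongside these two losses.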

## Key findings

- Improves on the baseline cross-lingual SER performance on four emotional datasets in different languages.
- Effective for low-resource languages such as Urdu, for which labeled data is difficult to find.
- Requires no labeled target-language data.

## Abstract

Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by differences in the distributions of training and test data. These differences become more apparent when training and test data belong to different languages, which causes a significant performance gap between the validation and test scores. It is imperative to build more robust models that are suitable for practical applications of SER systems. Therefore, in this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of GANs is motivated by their great success in learning the underlying data distribution. The proposed model is designed so that it can learn language-invariant representations without requiring target-language data labels. We evaluate our proposed model on four emotional datasets in different languages, including an Urdu-language dataset, so as to also cover languages for which labelled data is difficult to find and which have not been studied much by the mainstream community. Our results show that our proposed model can significantly improve the baseline cross-lingual SER performance for all the considered datasets, including the non-mainstream Urdu-language data, without requiring any labels.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06083/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1907.06083/full.md

## References

52 references — full list in the complete paper: https://tomesphere.com/paper/1907.06083/full.md

---
Source: https://tomesphere.com/paper/1907.06083