Adversarial Auto-encoders for Speech Based Emotion Recognition

Saurabh Sahu; Rahul Gupta; Ganesh Sivaraman; Wael AbdAlmageed; Carol Espy-Wilson

arXiv:1806.02146·stat.ML·June 7, 2018

TL;DR

This paper explores the use of adversarial autoencoders for speech emotion recognition, focusing on compressing emotional features and generating synthetic samples to improve classifier training.

Contribution

It demonstrates the effectiveness of adversarial autoencoders in encoding emotional speech features and generating synthetic data for emotion recognition tasks.

Findings

01 Effective encoding of high-dimensional emotional features with minimal discriminability loss
02 Successful generation of synthetic emotional speech samples
03 Potential for improved emotion recognition classifier training

Abstract

Recently, generative adversarial networks and adversarial autoencoders have gained a lot of attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition. They map the autoencoder's bottleneck layer output (termed code vectors) to different noise probability distribution functions (PDFs), which can be further regularized to cluster based on class information. In addition, they allow the generation of synthetic samples by sampling the code vectors from the mapped PDFs. Inspired by these properties, we investigate the application of adversarial autoencoders to the domain of emotion recognition. Specifically, we conduct experiments on the following two aspects: (i) their ability to encode high-dimensional feature vector representations for emotional utterances into a compressed space (with a minimal loss of…
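
The sampling idea described in the abstract can be illustrated with a rough sketch: draw a code vector from a class-specific noise distribution, then pass it through a decoder to get a synthetic feature vector. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 2-D code space, the Gaussian component per class, the class names, the hand-set means, and the toy linear decoder (a trained adversarial autoencoder would use a learned neural decoder instead).

```python
import random

# Hypothetical class-conditional prior: one 2-D Gaussian mean per emotion label.
# These labels and means are illustrative assumptions, not values from the paper.
CLASS_MEANS = {"angry": [3.0, 0.0], "sad": [-3.0, 0.0]}

# Hypothetical toy decoder parameters mapping a 2-D code to a 3-D feature vector.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
BIAS = [0.0, 0.0, 0.0]


def sample_code(label, rng, std=1.0):
    """Draw a code vector from the Gaussian component assigned to `label`."""
    return [rng.gauss(mean, std) for mean in CLASS_MEANS[label]]


def decode(code, weights=WEIGHTS, bias=BIAS):
    """Toy linear decoder standing in for a trained decoder network."""
    return [
        sum(w * c for w, c in zip(row, code)) + b
        for row, b in zip(weights, bias)
    ]


def generate_synthetic(label, n, seed=0):
    """Generate `n` synthetic feature vectors for one class label."""
    rng = random.Random(seed)  # seeded so repeated runs give the same samples
    return [decode(sample_code(label, rng)) for _ in range(n)]


if __name__ == "__main__":
    for features in generate_synthetic("angry", 3):
        print(features)
```

In this sketch the class label only selects which component of the prior is sampled; the decoder itself is shared across classes, which is one common way to set up class-conditioned generation.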
