Modeling Feature Representations for Affective Speech using Generative Adversarial Networks
Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson

TL;DR
This paper explores the use of several GAN architectures to generate realistic feature vectors for different emotions in speech, with the aim of improving emotion recognition, especially in low-resource scenarios.
Contribution
It introduces novel GAN-based methods for generating emotion-specific speech features and evaluates their effectiveness in enhancing emotion recognition models.
Findings
GAN-generated features are realistic and diverse.
Synthetic data improves emotion recognition accuracy in low-resource settings.
Proposed metrics effectively assess GAN performance in this context.
Abstract
Emotion recognition is a classic field of research with a typical setup extracting features and feeding them through a classifier for prediction. On the other hand, generative models jointly capture the distributional relationship between emotions and the feature profiles. Relatively recently, Generative Adversarial Networks (GANs) have surfaced as a new class of generative models and have shown considerable success in modeling distributions in the fields of computer vision and natural language understanding. In this work, we experiment with variants of GAN architectures to generate feature vectors corresponding to an emotion in two ways: (i) A generator is trained with samples from a mixture prior. Each mixture component corresponds to an emotional class and can be sampled to generate features from the corresponding emotion. (ii) A one-hot vector corresponding to an emotion can be…
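As a rough illustration of approach (i) only, the sketch below draws latent vectors from a mixture prior in which each component is tied to one emotion class. The abstract does not give the component family, the latent size, or the emotion labels, so the Gaussian components, the circular placement of their means, and the names `EMOTIONS`, `LATENT_DIM`, and `sample_mixture_prior` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np

# Hypothetical emotion labels and latent size; the abstract does not specify them.
EMOTIONS = ["angry", "happy", "neutral", "sad"]
LATENT_DIM = 2


def sample_mixture_prior(emotion_idx, n, rng, radius=4.0, scale=1.0):
    """Draw n latent vectors from the mixture component assigned to one emotion.

    Assumption: each component is an isotropic Gaussian whose mean sits on a
    circle of the given radius, one evenly spaced position per emotion.
    """
    angle = 2.0 * np.pi * emotion_idx / len(EMOTIONS)
    mean = np.zeros(LATENT_DIM)
    mean[0] = radius * np.cos(angle)
    mean[1] = radius * np.sin(angle)
    # Broadcasting adds the component mean to every noise row.
    return mean + scale * rng.standard_normal((n, LATENT_DIM))


rng = np.random.default_rng(0)
# Latent samples for the (hypothetical) "happy" component, index 1.
z_happy = sample_mixture_prior(EMOTIONS.index("happy"), 5, rng)
```

In a setup like this, a trained generator would map `z_happy` to feature vectors for the corresponding emotion; the generator itself is left out because the abstract gives no architectural details.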
Taxonomy
Methods: Convolution
