Few-shot Learning in Emotion Recognition of Spontaneous Speech Using a Siamese Neural Network with Adaptive Sample Pair Formation

Kexin Feng; Theodora Chaspari

arXiv:2109.02915 · cs.LG · September 8, 2021

TL;DR

This paper introduces a few-shot learning method that uses a Siamese neural network to recognize emotions in spontaneous speech from limited labeled data, outperforming fine-tuning and adversarial adaptation.

Contribution

It presents a metric learning approach for emotion recognition in spontaneous speech that remains effective with few labeled samples and outperforms fine-tuning and adversarial adaptation.

Findings

01. Effective emotion recognition with few labeled samples.

02. Superior performance over fine-tuning and adversarial adaptation.

03. Feasibility demonstrated across four datasets.

Abstract

Speech-based machine learning (ML) has been heralded as a promising solution for tracking prosodic and spectrotemporal patterns in real life that are indicative of emotional changes, providing a valuable window into one's cognitive and mental state. Yet, the scarcity of labelled data in ambulatory studies prevents the reliable training of ML models, which usually rely on "data-hungry" distribution-based learning. Leveraging the abundance of labelled speech data from acted emotions, this paper proposes a few-shot learning approach for automatically recognizing emotion in spontaneous speech from a small number of labelled samples. Few-shot learning is implemented via a metric learning approach through a Siamese neural network, which models the relative distance between samples rather than relying on learning absolute patterns of the corresponding distributions of each emotion. Results…
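
The distance-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single tanh layer, the contrastive loss with a fixed margin, the nearest-sample decision rule, and every name and number below (`embed`, `distance`, `contrastive_loss`, `classify`, the toy features) are assumptions made for illustration, and the adaptive sample pair formation named in the title is not modelled.

```python
import math

def embed(x, weights):
    """Shared embedding: one linear layer plus tanh; both branches reuse the same weights."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def distance(a, b, weights):
    """Euclidean distance between the embeddings of two samples."""
    ea, eb = embed(a, weights), embed(b, weights)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(ea, eb)))

def contrastive_loss(d, same_label, margin=1.0):
    """Pull same-emotion pairs together; push different-emotion pairs at least `margin` apart."""
    if same_label:
        return d ** 2
    return max(0.0, margin - d) ** 2

def classify(query, support, weights):
    """Assign the query the emotion of the closest labelled support sample."""
    return min(support, key=lambda item: distance(query, item[0], weights))[1]

# Toy 2-D "acoustic features" and identity weights (illustrative values only).
weights = [[1.0, 0.0], [0.0, 1.0]]
support = [([0.9, 0.1], "happy"), ([0.1, 0.9], "sad")]
prediction = classify([0.8, 0.2], support, weights)
```

In this sketch the few labelled spontaneous-speech samples play the role of `support`: a new utterance is compared against them by distance, so no per-emotion distribution has to be estimated from the scarce target data.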
