TL;DR
This paper introduces F-MAML, a modified meta-learning algorithm designed to improve few-shot multilingual speech emotion recognition, especially in scenarios with limited data and less common languages.
Contribution
The paper proposes F-MAML, an adaptation of MAML that handles fixed and new classes simultaneously, enhancing few-shot speech emotion recognition across multiple languages.
Findings
F-MAML outperforms the original MAML on the EmoFilm dataset.
The approach effectively handles multilingual emotion recognition with limited data.
F-MAML demonstrates improved accuracy in few-shot scenarios.
Abstract
In this paper, we analyze the feasibility of applying few-shot learning to the speech emotion recognition (SER) task. Current speech emotion recognition models work exceptionally well but fail when the input is multilingual. Moreover, such models perform well only when the training corpus is vast. The availability of a large training corpus is a significant problem when the chosen language is obscure or not very popular. We attempt to solve this challenge of multilingualism and lack of available data by turning it into a few-shot learning problem. We suggest relaxing the assumption that all N classes in an N-way K-shot problem be new and define an N+F way problem, where N and F are the number of emotion classes and predefined fixed classes, respectively. We propose this modification to the Model-Agnostic Meta-Learning (MAML) algorithm…
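The N+F way setup described in the abstract can be illustrated with a small episode sampler. This is a minimal sketch under one reading of the abstract, not the authors' procedure: it assumes that the F fixed classes appear in every episode while the N remaining classes are drawn at random per episode. The function name `sample_episode`, its parameters, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict


def sample_episode(labels, fixed_classes, n_new, k_shot, q_query, rng):
    """Sample one (N+F)-way K-shot episode (illustrative assumption only).

    labels: one class label per example; the list index is the example id.
    fixed_classes: the F classes included in every episode.
    n_new: N, the number of classes drawn at random for this episode.
    Returns (support, query), each a list of (example_id, label) pairs.
    """
    # Group example ids by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Draw N classes from those that are not fixed.
    candidates = sorted(c for c in by_class if c not in fixed_classes)
    new_classes = rng.sample(candidates, n_new)
    episode_classes = list(fixed_classes) + new_classes

    # For each class, take K support examples and q_query disjoint query examples.
    support, query = [], []
    for c in episode_classes:
        picked = rng.sample(by_class[c], k_shot + q_query)
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query


# Toy data: five emotion labels, six examples each (made up for illustration).
labels = ["angry", "sad", "happy", "fear", "neutral"] * 6
rng = random.Random(0)
support, query = sample_episode(
    labels, fixed_classes=["neutral"], n_new=2, k_shot=3, q_query=2, rng=rng
)
```

With N = 2, F = 1, and K = 3, each episode holds (2 + 1) × 3 support examples, and the fixed class is always among them. A MAML-style inner loop would adapt on `support` and compute the meta-loss on `query`.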
Taxonomy
Methods: Model-Agnostic Meta-Learning
