Privacy-preserving Speech Emotion Recognition through Semi-Supervised Federated Learning
Vasileios Tsouvalas, Tanir Ozcelebi, Nirvana Meratnia

TL;DR
This paper introduces a privacy-preserving speech emotion recognition method that combines federated learning with self-training, enabling effective emotion recognition from limited labeled data while protecting user privacy.
Contribution
To the best of the authors' knowledge, this is the first work to integrate self-training with federated learning for speech emotion recognition, improving data efficiency while preserving privacy.
Findings
Achieves generalizable SER models with as little as 10% labeled data.
Improves recognition accuracy by 8.67% over fully-supervised federated models.
Performs well under highly non-i.i.d. data distributions.
Abstract
Speech Emotion Recognition (SER) refers to the recognition of human emotions from natural speech. If done accurately, it can offer a number of benefits in building human-centered, context-aware intelligent systems. Existing SER approaches are largely centralized and do not consider users' privacy. Federated Learning (FL) is a distributed machine learning paradigm that deals with the decentralization of privacy-sensitive personal data. In this paper, we present a privacy-preserving and data-efficient SER approach based on FL. To the best of our knowledge, this is the first federated SER approach that uses self-training in conjunction with federated learning to exploit both labeled and unlabeled on-device data. Our experimental evaluations on the IEMOCAP dataset show that our federated approach can learn generalizable SER models even under low availability of…
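The combination of self-training and federated learning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.9 confidence threshold, and the sample-count-weighted averaging (in the style of FedAvg) are all assumptions made for illustration, since this summary does not specify those details. Each client would pseudo-label its confident unlabeled samples, train locally on labeled plus pseudo-labeled data, and send only model parameters to the server.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Pick unlabeled samples whose top predicted class is confident enough.

    `probs[i]` holds the model's class probabilities for unlabeled sample i.
    Returns (sample_index, predicted_class) pairs. The threshold value is an
    assumption for this sketch, not a value taken from the paper.
    """
    selected = []
    for i, p in enumerate(probs):
        best = max(range(len(p)), key=lambda k: p[k])
        if p[best] >= threshold:
            selected.append((i, best))
    return selected


def fed_avg(
    client_weights: Sequence[Sequence[float]], client_sizes: Sequence[int]
) -> List[float]:
    """Average flat parameter vectors, weighted by each client's sample count.

    Assumed aggregation rule (FedAvg-style); only parameters leave the
    device, never the raw audio.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


# Hypothetical example: three unlabeled utterances, three emotion classes.
unlabeled_probs = [[0.95, 0.03, 0.02], [0.40, 0.35, 0.25], [0.05, 0.05, 0.90]]
pseudo = select_pseudo_labels(unlabeled_probs, threshold=0.9)

# Hypothetical example: two clients with 1 and 3 training samples.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this sketch the second utterance is left unlabeled because no class reaches the threshold, and the client with more samples pulls the global parameters further toward its own.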
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Indoor and Outdoor Localization Technologies
