Privacy-preserving Speech Emotion Recognition through Semi-Supervised Federated Learning

Vasileios Tsouvalas; Tanir Ozcelebi; Nirvana Meratnia

arXiv:2202.02611·cs.LG·February 8, 2022·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces a novel privacy-preserving speech emotion recognition method using federated learning combined with self-training, enabling effective emotion detection with limited labeled data while protecting user privacy.

Contribution

To the authors' knowledge, this is the first approach to combine self-training with federated learning for speech emotion recognition, improving data efficiency while preserving privacy.

Findings

01. Achieves generalizable SER models with as little as 10% labeled data.

02. Improves recognition accuracy by 8.67% over fully-supervised federated models.

03. Performs well under highly non-i.i.d. data distributions.

Abstract

Speech Emotion Recognition (SER) refers to the recognition of human emotions from natural speech. If done accurately, it can offer a number of benefits in building human-centered context-aware intelligent systems. Existing SER approaches are largely centralized, without considering users' privacy. Federated Learning (FL) is a distributed machine learning paradigm dealing with decentralization of privacy-sensitive personal data. In this paper, we present a privacy-preserving and data-efficient SER approach by utilizing the concept of FL. To the best of our knowledge, this is the first federated SER approach, which utilizes self-training learning in conjunction with federated learning to exploit both labeled and unlabeled on-device data. Our experimental evaluations on the IEMOCAP dataset show that our federated approach can learn generalizable SER models even under low availability of…
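
The combination of self-training and federated learning that the abstract describes can be illustrated with a minimal NumPy sketch. This is not the authors' FedSTAR implementation: the softmax linear classifier, the confidence threshold of 0.9, the learning rate, and the function names `pseudo_label`, `local_update`, `fed_avg`, and `run_round` are all assumptions chosen for illustration. Each client pseudo-labels its own unlabeled samples with the current global model, trains locally on labeled plus pseudo-labeled data, and the server averages the resulting weights, so raw data never leaves the device.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pseudo_label(W, X_unlabeled, threshold):
    """Keep only the unlabeled samples the current model predicts confidently."""
    probs = softmax(X_unlabeled @ W)
    keep = probs.max(axis=1) >= threshold
    return X_unlabeled[keep], probs[keep].argmax(axis=1)

def local_update(W, X_l, y_l, X_u, threshold=0.9, lr=0.1, epochs=5):
    """One client's step: pseudo-label, then train on labeled + pseudo-labeled data."""
    X_p, y_p = pseudo_label(W, X_u, threshold)
    X = np.vstack([X_l, X_p])
    y = np.concatenate([y_l, y_p])
    onehot = np.eye(W.shape[1])[y]
    W = W.copy()  # never modify the global model in place
    for _ in range(epochs):
        # Gradient of the mean cross-entropy loss for a softmax linear model.
        grad = X.T @ (softmax(X @ W) - onehot) / len(X)
        W -= lr * grad
    return W, len(X)

def fed_avg(weights, sizes):
    """Server step: average client models weighted by local sample counts."""
    sizes = np.asarray(sizes, dtype=float)
    return sum(w * s for w, s in zip(weights, sizes)) / sizes.sum()

def run_round(W_global, clients, threshold=0.9):
    """One communication round; each client is a tuple (X_l, y_l, X_u)."""
    results = [local_update(W_global, *c, threshold=threshold) for c in clients]
    weights, sizes = zip(*results)
    return fed_avg(list(weights), list(sizes))
```

Calling `run_round` repeatedly, each time passing in the previous result as `W_global`, gives the usual multi-round training loop. Only model weights and sample counts cross the client boundary in this sketch; the labeled and unlabeled features stay inside `local_update`.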

Code & Models

Repositories

FederatedSTAR/FedSTAR
tf · Official

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Indoor and Outdoor Localization Technologies