Federated Learning for ASR based on Wav2vec 2.0
Tuan Nguyen, Salima Mdhaffar, Natalia Tomashenko, Jean-François Bonastre, Yannick Estève

TL;DR
This paper explores federated learning for training a wav2vec 2.0 based ASR model on TED-LIUM 3 data, achieving competitive word error rates while preserving user privacy and analyzing speaker identity leakage.
Contribution
It demonstrates the effectiveness of federated learning for ASR with wav2vec 2.0 and provides insights into privacy preservation and speaker identity information leakage.
Findings
Achieved a 10.92% word error rate (WER) on the official TED-LIUM 3 test set without using a language model.
Analyzed layer-wise information leakage of speaker identity.
Federated learning can effectively train ASR models while protecting user data.
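The WER figure quoted in the findings is conventionally the word-level edit distance between the reference transcript and the ASR output, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name `word_error_rate` is hypothetical, whitespace tokenization is an assumption, and the paper's actual scoring tool and text normalization are not described in this summary.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()   # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a WER of 10.92% corresponds to roughly 11 word errors per 100 reference words.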
Abstract
This paper presents a study on the use of federated learning to train an ASR model based on a wav2vec 2.0 model pre-trained by self-supervision. Carried out on the well-known TED-LIUM 3 dataset, our experiments show that such a model can obtain, without a language model, a word error rate of 10.92% on the official TED-LIUM 3 test set, without sharing any data from the different users. We also analyze the ASR performance for speakers depending on their participation in the federated learning. Since federated learning was first introduced for privacy purposes, we also measure its ability to protect speaker identity. To do that, we exploit an approach that analyzes the information contained in exchanged models, based on a neural network footprint on an indicator dataset. This analysis is made layer-wise and shows which layers in an exchanged wav2vec 2.0 based model bring the speaker…
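As a rough illustration of how a model can be trained "without sharing any data from the different users", the sketch below shows the server-side step of federated averaging (FedAvg): each client trains locally and sends only its model weights, which the server averages, weighted by each client's number of training samples. This is an assumption made for illustration. The abstract does not state which aggregation rule the authors use, and the names `federated_average` and `Weights`, as well as the flat-list weight representation, are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: layer name -> flattened parameter values.
Weights = Dict[str, List[float]]


def federated_average(client_updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighted by each client's sample count.

    Only model weights reach the server; the clients' audio and
    transcripts never leave their devices.
    """
    total_samples = sum(num_samples for _, num_samples in client_updates)
    if total_samples <= 0:
        raise ValueError("need at least one client with a positive sample count")

    # Start from zeros shaped like the first client's weights.
    first_weights = client_updates[0][0]
    averaged: Weights = {
        name: [0.0] * len(values) for name, values in first_weights.items()
    }

    for weights, num_samples in client_updates:
        share = num_samples / total_samples  # this client's contribution
        for name, values in weights.items():
            for k, value in enumerate(values):
                averaged[name][k] += share * value
    return averaged


if __name__ == "__main__":
    updates = [
        ({"layer": [1.0, 2.0]}, 1),  # client with 1 training sample
        ({"layer": [3.0, 6.0]}, 3),  # client with 3 training samples
    ]
    print(federated_average(updates))
```

Because the exchanged weights are computed from a speaker's own recordings, they can still carry speaker information, which is why the abstract goes on to measure leakage layer by layer.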
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection
