Decoupled Federated Learning for ASR with Non-IID Data
Han Zhu, Jindong Wang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

TL;DR
This paper introduces personalized federated learning approaches for automatic speech recognition that effectively handle non-IID data, reducing word error rate and significantly lowering communication and computation costs.
Contribution
It proposes two novel personalized FL methods for ASR, including a decoupled approach that shifts computation to the server and reduces communication overhead.
Findings
Personalized FL approaches reduce WER by 2.3%-3.4%.
DecoupleFL decreases communication costs by 88.6%.
DecoupleFL reduces client computation by 75%.
Abstract
Automatic speech recognition (ASR) with federated learning (FL) makes it possible to leverage data from multiple clients without compromising privacy. The quality of FL-based ASR can be measured by recognition performance and by communication and computation costs. When data across clients are not independently and identically distributed (non-IID), performance can degrade significantly. In this work, we tackle the non-IID issue in FL-based ASR with personalized FL, which learns a personalized model for each client. Concretely, we propose two types of personalized FL approaches for ASR. First, we adapt personalization-layer-based FL to ASR, which keeps some layers local to learn personalized models. Second, to reduce communication and computation costs, we propose decoupled federated learning (DecoupleFL). On one hand, DecoupleFL moves the computation burden…
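The personalization-layer idea described in the abstract (some layers are averaged across clients, others never leave the client) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer names `encoder` and `personal_head`, the helper functions, the toy parameter values, and the unweighted average are all assumptions made for illustration, and the local training step between rounds is omitted.

```python
from typing import Dict, List

# A model is represented as a mapping from layer name to a flat parameter list.
Params = Dict[str, List[float]]


def average_shared(client_params: List[Params], shared_keys: List[str]) -> Params:
    """Unweighted element-wise average of the shared layers across clients.

    Assumption: FedAvg normally weights clients by data size; equal weights
    are used here only to keep the sketch short.
    """
    n = len(client_params)
    averaged: Params = {}
    for key in shared_keys:
        columns = zip(*(p[key] for p in client_params))
        averaged[key] = [sum(col) / n for col in columns]
    return averaged


def apply_global(local: Params, global_shared: Params) -> Params:
    """Overwrite shared layers with the server average; keep all other layers."""
    updated = {k: list(v) for k, v in local.items()}
    for key, values in global_shared.items():
        updated[key] = list(values)
    return updated


# Hypothetical layer names; which ASR layers stay local is an assumption here.
clients = [
    {"encoder": [1.0, 2.0], "personal_head": [0.1]},
    {"encoder": [3.0, 4.0], "personal_head": [0.9]},
]

# One communication round: only the shared layer is sent and averaged.
global_shared = average_shared(clients, ["encoder"])
clients = [apply_global(c, global_shared) for c in clients]
```

After the round, both clients hold the same `encoder` values, while each `personal_head` keeps its client-specific values because it was never sent to the server.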
Taxonomy
Topics: Speech Recognition and Synthesis · Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting
