Regularized Federated Learning for Privacy-Preserving Dysarthric and Elderly Speech Recognition

Tao Zhong; Mengzhe Geng; Shujie Hu; Guinan Li; Xunying Liu

arXiv:2506.11069 · eess.AS · June 16, 2025

Open Access

TL;DR

This paper investigates regularized federated learning techniques for privacy-preserving recognition of dysarthric and elderly speech. It addresses the data scarcity and speaker heterogeneity that federated training makes worse, and reports statistically significant WER reductions over a FedAvg baseline.

Contribution

The paper systematically investigates parameter-based, embedding-based, and novel loss-based regularization methods within federated learning for dysarthric and elderly speech recognition.

Findings

01 Regularized FL systems outperform the baseline FedAvg system, with WER reductions of up to 0.55% absolute (2.13% relative).

02 Increasing communication frequency moves FL performance closer to that of centralized training.

03 Regularization techniques help handle data heterogeneity and scarcity.

Abstract

Accurate recognition of dysarthric and elderly speech remains challenging to date. While privacy concerns have driven a shift from centralized approaches to federated learning (FL) to ensure data confidentiality, this further exacerbates the challenges of data scarcity, imbalanced data distribution and speaker heterogeneity. To this end, this paper conducts a systematic investigation of regularized FL techniques for privacy-preserving dysarthric and elderly speech recognition, addressing different levels of the FL process by 1) parameter-based, 2) embedding-based and 3) novel loss-based regularization. Experiments on the benchmark UASpeech dysarthric and DementiaBank Pitt elderly speech corpora suggest that regularized FL systems consistently outperform the baseline FedAvg system by statistically significant WER reductions of up to 0.55% absolute (2.13% relative). Further increasing…
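To make the FedAvg baseline and the idea of parameter-based regularization concrete, here is a minimal sketch in plain Python. It is not the paper's method: the abstract does not specify the regularizers used, so the sketch substitutes a FedProx-style proximal penalty as one well-known example of parameter-level regularization. The toy linear model and the names `local_update`, `fedavg`, and `mu` are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Vector = List[float]


def local_update(global_w: Vector, data: List[Tuple[float, float]],
                 mu: float, lr: float = 0.1, steps: int = 20) -> Vector:
    """One client's local training on a toy model y ~ w[0]*x + w[1].

    Assumption: a FedProx-style penalty (mu/2)*||w - global_w||^2 stands in
    for "parameter-based regularization"; mu = 0 gives plain local training.
    """
    w = list(global_w)
    n = len(data)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, y in data:
            err = w[0] * x + w[1] - y
            grad[0] += 2 * err * x / n  # d(MSE)/d(w[0])
            grad[1] += 2 * err / n      # d(MSE)/d(w[1])
        for i in range(2):
            # Proximal term pulls the local weights back toward the global model.
            grad[i] += mu * (w[i] - global_w[i])
            w[i] -= lr * grad[i]
    return w


def fedavg(client_weights: List[Vector], client_sizes: List[int]) -> Vector:
    """FedAvg aggregation: average client weights, weighted by local data size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_round(global_w: Vector, clients: List[List[Tuple[float, float]]],
              mu: float) -> Vector:
    """One communication round: local updates on each client, then aggregation."""
    updates = [local_update(global_w, data, mu) for data in clients]
    return fedavg(updates, [len(data) for data in clients])
```

In this sketch a larger `mu` keeps each client's weights closer to the shared model, which is the usual motivation for proximal regularization when client data are scarce or heterogeneous. Calling `run_round` repeatedly corresponds to additional communication rounds.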

Taxonomy

Topics: Voice and Speech Disorders