Enabling On-Device Training of Speech Recognition Models with Federated Dropout

Dhruv Guliani; Lillian Zhou; Changwan Ryu; Tien-Ju Yang; Harry Zhang; Yonghui Xiao; Francoise Beaufays; Giovanni Motta

arXiv:2110.03634·cs.LG·October 8, 2021

Open Access

TL;DR

This paper applies federated dropout to on-device training of speech recognition models, reducing the communication and computation costs on client devices and allowing the model size to be adjusted dynamically while keeping word error rates low.

Contribution

It uses federated dropout to train smaller client models while a full-size model is trained server-side, and proposes a novel approach that varies the dropout rate applied at each layer. The resulting smaller sub-models reach low word error rates on their own.

Findings

1. Federated dropout effectively reduces communication and computation costs.
2. Layer-wise dropout rate variation improves model training and inference.
3. Smaller sub-models achieve low word error rates independently.

Abstract

Federated learning can be used to train machine learning models on the edge on local data that never leave devices, providing privacy by default. This presents a challenge pertaining to the communication and computation costs associated with clients' devices. These costs are strongly correlated with the size of the model being trained, and are significant for state-of-the-art automatic speech recognition models. We propose using federated dropout to reduce the size of client models while training a full-size model server-side. We provide empirical evidence of the effectiveness of federated dropout, and propose a novel approach to vary the dropout rate applied at each layer. Furthermore, we find that federated dropout enables a set of smaller sub-models within the larger model to independently have low word error rates, making it easier to dynamically adjust the size of the model…
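
As an illustration of the mechanism the abstract describes, here is a minimal sketch of one federated dropout round for a small stack of fully connected layers. It is not the authors' implementation. The function names, the uniform random choice of units, and the per-weight averaging on the server are all assumptions made for this example; the abstract does not specify them. The sketch only shows the two ideas named above: each client receives a smaller sub-model, and each hidden layer can have its own dropout rate.

```python
import random


def sample_kept_units(num_units, dropout_rate, rng):
    """Choose which units survive. Whole units are dropped, not single weights."""
    num_kept = max(1, round(num_units * (1.0 - dropout_rate)))
    return sorted(rng.sample(range(num_units), num_kept))


def make_client_submodel(layers, hidden_rates, rng):
    """Build a smaller client model from the full server model.

    layers[l] is a weight matrix stored as a list of rows with shape
    [units_out][units_in]. hidden_rates holds one dropout rate per hidden
    layer (assumption: inputs and outputs are never dropped).
    """
    sizes = [len(layers[0][0])] + [len(w) for w in layers]
    kept = [list(range(sizes[0]))]
    for size, rate in zip(sizes[1:-1], hidden_rates):
        kept.append(sample_kept_units(size, rate, rng))
    kept.append(list(range(sizes[-1])))
    # Layer l keeps rows for surviving output units and columns for surviving inputs.
    sub = [[[w[r][c] for c in kept[l]] for r in kept[l + 1]]
           for l, w in enumerate(layers)]
    return kept, sub


def aggregate(layers, client_results):
    """Map client sub-models back into the full model.

    Assumption: each weight becomes the mean over the clients that trained it;
    weights that no client received keep their previous server value.
    """
    new_layers = []
    for l, w in enumerate(layers):
        sums = [[0.0] * len(w[0]) for _ in w]
        counts = [[0] * len(w[0]) for _ in w]
        for kept, sub in client_results:
            for i, r in enumerate(kept[l + 1]):
                for j, c in enumerate(kept[l]):
                    sums[r][c] += sub[l][i][j]
                    counts[r][c] += 1
        new_layers.append([
            [sums[r][c] / counts[r][c] if counts[r][c] else w[r][c]
             for c in range(len(w[0]))]
            for r in range(len(w))
        ])
    return new_layers


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy server model: 4 inputs -> 8 hidden -> 8 hidden -> 3 outputs.
    server = [
        [[0.0] * 4 for _ in range(8)],
        [[0.0] * 8 for _ in range(8)],
        [[0.0] * 8 for _ in range(3)],
    ]
    hidden_rates = [0.5, 0.25]  # a different dropout rate for each hidden layer

    results = []
    for _ in range(2):  # two simulated clients
        kept, sub = make_client_submodel(server, hidden_rates, rng)
        # Stand-in for local training: nudge every weight the client received.
        trained = [[[v + 1.0 for v in row] for row in m] for m in sub]
        results.append((kept, trained))

    server = aggregate(server, results)
    full_size = sum(len(w) * len(w[0]) for w in server)
    client_size = sum(len(m) * len(m[0]) for m in results[0][1])
    print("full model weights:", full_size, "client sub-model weights:", client_size)
```

In this toy setup the client holds and transmits fewer weights than the server model, which is the source of the communication and computation savings the abstract refers to. A real speech recognition model would need the same row and column bookkeeping for recurrent or attention layers, which this sketch does not attempt.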

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Traffic Prediction and Management Techniques

Methods: Dropout