Fast Server Learning Rate Tuning for Coded Federated Dropout

Giacomo Verardo, Daniel Barreira, Marco Chiesa, Dejan Kostic, and Gerald Q. Maguire Jr.

arXiv:2201.11036·cs.LG·September 16, 2022

Open Access

TL;DR

This paper uses coding theory to improve federated dropout in federated learning, so that training converges faster and uses less bandwidth while staying close to the accuracy of training without dropout.

Contribution

The paper proposes a coding-theoretic approach to federated dropout that lets each client train a different sub-model, and shows that tuning the server learning rate accelerates convergence without sacrificing accuracy.

Findings

01 Achieves 99.6% of the no-dropout final accuracy on EMNIST

02 Requires 2.43 times less bandwidth

03 Enables faster training convergence

Abstract

In cross-device Federated Learning (FL), clients with low computational power train a common machine model by exchanging parameters via updates instead of potentially private data. Federated Dropout (FD) is a technique that improves the communication efficiency of a FL session by selecting a subset of model parameters to be updated in each training round. However, compared to standard FL, FD produces considerably lower accuracy and faces a longer convergence time. In this paper, we leverage coding theory to enhance FD by allowing different sub-models to be used at each client. We also show that by carefully tuning the server learning rate hyper-parameter, we can achieve higher training speed while also achieving up to the same final accuracy as the no dropout case. For the EMNIST dataset, our mechanism achieves 99.6% of the final accuracy of the no dropout…
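To make the two ideas in the abstract concrete, here is a minimal sketch of one aggregation round in which each client trains a different sub-model and the server scales the aggregated update by a server learning rate. It is an illustration under stated assumptions, not the authors' method: the names `make_mask` and `server_round` are hypothetical, the random mask is a stand-in for the paper's coding-theoretic construction (which this excerpt does not describe), and averaging each coordinate over only the clients that trained it is an assumed aggregation rule.

```python
import random


def make_mask(num_units, keep, rng):
    """Pick `keep` of `num_units` indices at random to form a sub-model.

    Assumption: a random choice stands in for the coded mask construction.
    """
    return sorted(rng.sample(range(num_units), keep))


def server_round(weights, client_updates, server_lr):
    """Aggregate sparse client deltas and apply them with a server learning rate.

    client_updates is a list of (mask, delta) pairs, where delta[i] is the
    update for weights[mask[i]]. Each coordinate is averaged only over the
    clients whose sub-model contained it (an assumed rule).
    """
    total = [0.0] * len(weights)
    count = [0] * len(weights)
    for mask, delta in client_updates:
        for idx, d in zip(mask, delta):
            total[idx] += d
            count[idx] += 1

    new_weights = list(weights)
    for i in range(len(weights)):
        if count[i]:  # coordinates no client trained stay unchanged
            new_weights[i] = weights[i] + server_lr * total[i] / count[i]
    return new_weights


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [0.0] * 8
    # Three clients, each training half of the parameters.
    masks = [make_mask(len(weights), 4, rng) for _ in range(3)]
    updates = [(m, [1.0] * len(m)) for m in masks]
    print(server_round(weights, updates, server_lr=2.0))
```

In this sketch, each client uploads only `keep` values instead of the full parameter vector, which is where the bandwidth saving comes from, while `server_lr` greater than 1 amplifies the averaged update and is the knob the abstract refers to when it mentions tuning the server learning rate.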

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Age of Information Optimization · Stochastic Gradient Optimization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Dropout