Federated Learning with Matched Averaging

Hongyi Wang; Mikhail Yurochkin; Yuekai Sun; Dimitris Papailiopoulos; Yasaman Khazaeni

arXiv:2002.06440 · cs.LG · February 18, 2020 · 102 cites

TL;DR

This paper introduces FedMA, a federated learning algorithm that constructs global neural network models layer-wise by matching and averaging features, leading to improved accuracy and reduced communication costs.

Contribution

FedMA is a novel layer-wise matching and averaging algorithm for federated learning of neural networks, outperforming existing methods on CNNs and LSTMs.

Findings

1. FedMA outperforms state-of-the-art federated learning algorithms.

2. FedMA reduces communication overhead in federated training.

3. Effective for CNNs and LSTMs on real-world datasets.

Abstract

Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud. We propose the Federated Matched Averaging (FedMA) algorithm, designed for federated learning of modern neural network architectures, e.g., convolutional neural networks (CNNs) and LSTMs. FedMA constructs the shared global model in a layer-wise manner by matching and averaging hidden elements (i.e., channels for convolution layers, hidden states for LSTMs, and neurons for fully connected layers) with similar feature extraction signatures. Our experiments indicate that FedMA not only outperforms popular state-of-the-art federated learning algorithms on deep CNN and LSTM architectures trained on real-world datasets, but also reduces the overall communication burden.
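To illustrate why matching before averaging matters, here is a minimal toy sketch for a single fully connected layer shared by two clients. It is not the authors' implementation: the helper `match_and_average` is hypothetical, the similarity measure (squared Euclidean distance between neuron weight vectors) is an assumption, and the exhaustive permutation search stands in for whatever matching procedure the paper actually uses. It also assumes both clients have the same number of neurons.

```python
from itertools import permutations


def match_and_average(client_a, client_b):
    """Align client_b's neurons to client_a's, then average matched pairs.

    Each client is a list of neuron weight vectors (one list of floats per
    neuron). Hypothetical helper for illustration only.
    """
    n = len(client_a)

    def sq_dist(u, v):
        # Assumed similarity measure: squared Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(u, v))

    # Exhaustive search over all permutations: acceptable for a toy layer,
    # but the cost grows factorially with the number of neurons.
    best_perm = min(
        permutations(range(n)),
        key=lambda perm: sum(
            sq_dist(client_a[i], client_b[perm[i]]) for i in range(n)
        ),
    )

    # Average each neuron of client_a with its matched neuron of client_b.
    averaged = [
        [(x + y) / 2 for x, y in zip(client_a[i], client_b[best_perm[i]])]
        for i in range(n)
    ]
    return list(best_perm), averaged


# Two clients learned the same two features, but in swapped neuron order.
client_a = [[1.0, 0.0], [0.0, 1.0]]
client_b = [[0.0, 1.0], [1.0, 0.0]]

perm, matched = match_and_average(client_a, client_b)

# Coordinate-wise averaging without matching, for comparison.
naive = [[(x + y) / 2 for x, y in zip(u, v)] for u, v in zip(client_a, client_b)]

print("permutation:", perm)
print("matched average:", matched)
print("naive average:", naive)
```

In this toy case the matched average recovers both features intact, while the coordinate-wise average blurs them into two identical neurons.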

Code & Models

Repositories

IBM/FedMA
PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Traffic Prediction and Management Techniques · Stochastic Gradient Optimization Techniques

Methods: Sigmoid Activation · Tanh Activation · Convolution · Long Short-Term Memory