Federated Learning with Matched Averaging
Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris Papailiopoulos, Yasaman Khazaeni

TL;DR
This paper introduces FedMA, a federated learning algorithm that constructs global neural network models layer-wise by matching and averaging features, leading to improved accuracy and reduced communication costs.
Contribution
FedMA is a novel layer-wise matching and averaging algorithm for federated learning of neural networks, outperforming existing methods on CNNs and LSTMs.
Findings
FedMA outperforms state-of-the-art federated learning algorithms.
FedMA reduces communication overhead in federated training.
FedMA is effective for both CNNs and LSTMs trained on real-world datasets.
Abstract
Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud. We propose the Federated Matched Averaging (FedMA) algorithm, designed for federated learning of modern neural network architectures, e.g., convolutional neural networks (CNNs) and LSTMs. FedMA constructs the shared global model in a layer-wise manner by matching and averaging hidden elements (i.e., channels for convolution layers, hidden states for LSTMs, and neurons for fully connected layers) with similar feature extraction signatures. Our experiments indicate that FedMA not only outperforms popular state-of-the-art federated learning algorithms on deep CNN and LSTM architectures trained on real-world datasets, but also reduces the overall communication burden.
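The idea of matching hidden elements before averaging can be illustrated with a minimal sketch for one fully connected hidden layer shared by two clients. This is not the paper's procedure. It assumes both clients have the same number of hidden neurons, uses squared Euclidean distance between incoming weight vectors as a stand-in for the "feature extraction signature", and finds the best one-to-one matching by brute force, which is only feasible for tiny layers. The function names (sq_dist, match_neurons, matched_average) and the toy weights are hypothetical.

```python
from itertools import permutations


def sq_dist(u, v):
    """Squared Euclidean distance between two weight vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def match_neurons(ref_rows, other_rows):
    """Return perm such that other_rows[perm[i]] is matched to ref_rows[i].

    Assumption: brute-force search over all permutations (tiny layers only);
    a practical implementation would use an assignment solver instead.
    """
    n = len(ref_rows)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(sq_dist(ref_rows[i], other_rows[p[i]]) for i in range(n)),
    )
    return list(best)


def average(x, y):
    """Element-wise mean of two equally shaped weight matrices (lists of rows)."""
    return [[(a + b) / 2 for a, b in zip(r, s)] for r, s in zip(x, y)]


def matched_average(w_in_a, w_out_a, w_in_b, w_out_b):
    """Align client B's hidden neurons to client A's, then average the weights."""
    perm = match_neurons(w_in_a, w_in_b)
    # Reordering hidden neurons permutes rows of the incoming weights
    # and, consistently, columns of the outgoing weights.
    w_in_b_aligned = [w_in_b[j] for j in perm]
    w_out_b_aligned = [[row[j] for j in perm] for row in w_out_b]
    return average(w_in_a, w_in_b_aligned), average(w_out_a, w_out_b_aligned), perm


# Toy data: client B holds the same network as client A,
# but with hidden neurons 0 and 1 stored in swapped order.
w_in_a = [[1.0, 0.0], [0.0, 1.0]]   # hidden x input
w_out_a = [[2.0, -1.0]]             # output x hidden
w_in_b = [[0.0, 1.0], [1.0, 0.0]]
w_out_b = [[-1.0, 2.0]]

w_in_avg, w_out_avg, perm = matched_average(w_in_a, w_out_a, w_in_b, w_out_b)
naive_w_in = average(w_in_a, w_in_b)  # coordinate-wise averaging without matching

print("matching:", perm)
print("matched average:", w_in_avg, w_out_avg)
print("naive average:", naive_w_in)
```

In this toy case, coordinate-wise averaging blends two different neurons into identical rows, while averaging after matching recovers the network both clients actually agree on. Handling layers of different sizes, convolution channels, or LSTM hidden states would need more than this sketch covers.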
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Traffic Prediction and Management Techniques · Stochastic Gradient Optimization Techniques
Methods: Sigmoid Activation · Tanh Activation · Convolution · Long Short-Term Memory
