Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge
Chaoyang He, Murali Annavaram, Salman Avestimehr

TL;DR
This paper introduces FedGKT, a federated learning method that trains small CNNs on edge devices and periodically transfers knowledge to a large server model, significantly reducing computational and communication costs while maintaining accuracy.
Contribution
FedGKT reformulates federated learning as a group knowledge transfer approach, enabling efficient training of large CNNs on resource-limited edge devices with minimal accuracy loss.
Findings
FedGKT achieves comparable or higher accuracy than FedAvg.
It reduces computation on edge devices by a factor of 9 to 17 compared with FedAvg.
It requires 54 to 105 times fewer parameters in the edge CNN.
Abstract
Scaling up the convolutional neural network (CNN) size (e.g., width, depth, etc.) is known to effectively improve model accuracy. However, the large model size impedes training on resource-constrained edge devices. For instance, federated learning (FL) may place undue burden on the compute capability of edge nodes, even though there is a strong practical need for FL due to its privacy and confidentiality properties. To address the resource-constrained reality of edge devices, we reformulate FL as a group knowledge transfer training algorithm, called FedGKT. FedGKT designs a variant of the alternating minimization approach to train small CNNs on edge nodes and periodically transfer their knowledge by knowledge distillation to a large server-side CNN. FedGKT consolidates several advantages into a single framework: reduced demand for edge computation, lower communication bandwidth for…
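The knowledge distillation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which a student model is trained on a weighted sum of a cross-entropy term on the true label and a KL-divergence term toward the teacher's temperature-softened predictions. The function names, the temperature of 3.0, and the weight alpha are hypothetical choices for illustration and are not taken from the paper; the exact loss used by FedGKT may differ.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Standard cross-entropy of the model's prediction against the true label."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=3.0, alpha=0.5):
    """Weighted sum of a hard-label loss and a soft-label (distillation) loss.

    temperature and alpha are assumed hyperparameters, not values from the paper.
    """
    hard = cross_entropy(student_logits, label)
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = kl_divergence(p_teacher, p_student)
    return alpha * hard + (1 - alpha) * soft

# Hypothetical example: the server model plays the student and the logits
# received from an edge model play the teacher.
edge_logits = [2.0, 0.5, -1.0]
server_logits = [1.5, 0.8, -0.5]
loss = distillation_loss(server_logits, edge_logits, label=0)
```

When the student's logits equal the teacher's, the KL term vanishes and only the weighted cross-entropy remains, so the distillation term penalizes only disagreement between the two models.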
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
Methods: Knowledge Distillation
