FedDKD: Federated Learning with Decentralized Knowledge Distillation
Xinjia Li, Boyu Chen, Wenlian Lu

TL;DR
FedDKD introduces a decentralized knowledge distillation framework for federated learning, improving global model performance and communication efficiency on heterogeneous datasets without sharing raw data.
Contribution
The paper proposes FedDKD, a novel federated learning approach using decentralized knowledge distillation to better align local and global models without data sharing.
Findings
Outperforms state-of-the-art methods on heterogeneous datasets
Requires fewer communication steps for training
Effective in extremely heterogeneous data scenarios
Abstract
The performance of federated learning in neural networks is generally influenced by the heterogeneity of the data distribution. For a well-performing global model, taking a weighted average of the local models, as done by most existing federated learning algorithms, may not guarantee consistency with local models in the space of neural network maps. In this paper, we propose a novel framework of federated learning equipped with the process of decentralized knowledge distillation (FedDKD), i.e., without data on the server. FedDKD introduces a module of decentralized knowledge distillation (DKD) to distill the knowledge of the local models to train the global model by approaching the neural network map average based on the metric of divergence defined in the loss function, rather than only averaging parameters as done in the literature. Numeric experiments on various heterogeneous…
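The contrast the abstract draws between averaging parameters and approaching the average of the neural network maps can be illustrated with a small sketch. This is not the authors' implementation: the linear-softmax model, the KL-based loss, and the names `average_parameters`, `distillation_loss`, and `distillation_step` are assumptions made for illustration only. Each client's term uses only that client's own batch, which is how such a step could be computed without placing data on the server.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # Mean KL(p || q) over a batch of probability vectors.
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def client_coeffs(client_sizes):
    # Weights proportional to local dataset size, summing to 1.
    c = np.asarray(client_sizes, dtype=float)
    return c / c.sum()

def average_parameters(local_weights, client_sizes):
    # FedAvg-style weighted average in parameter space.
    return sum(c * w for c, w in zip(client_coeffs(client_sizes), local_weights))

def distillation_loss(global_w, local_weights, client_batches, client_sizes):
    # Weighted sum of KL(local output || global output), each term
    # evaluated on that client's own data (hypothetical loss choice).
    total = 0.0
    for c, w_k, x_k in zip(client_coeffs(client_sizes), local_weights, client_batches):
        total += c * kl_divergence(softmax(x_k @ w_k), softmax(x_k @ global_w))
    return total

def distillation_step(global_w, local_weights, client_batches, client_sizes, lr=0.1):
    # One gradient step on the loss above. In a decentralized setting each
    # client would compute its own gradient term and send only that.
    grad = np.zeros_like(global_w)
    for c, w_k, x_k in zip(client_coeffs(client_sizes), local_weights, client_batches):
        teacher = softmax(x_k @ w_k)
        student = softmax(x_k @ global_w)
        grad += c * x_k.T @ (student - teacher) / len(x_k)
    return global_w - lr * grad

# Synthetic, illustrative setup: three clients with shifted input distributions.
rng = np.random.default_rng(0)
local_weights = [rng.normal(size=(5, 3)) for _ in range(3)]
client_batches = [rng.normal(loc=m, size=(64, 5)) for m in (-1.0, 0.0, 1.0)]
client_sizes = [64, 64, 64]

w_avg = average_parameters(local_weights, client_sizes)
loss_before = distillation_loss(w_avg, local_weights, client_batches, client_sizes)

w = w_avg.copy()
for _ in range(50):
    w = distillation_step(w, local_weights, client_batches, client_sizes)
loss_after = distillation_loss(w, local_weights, client_batches, client_sizes)

print("divergence at parameter average:", loss_before)
print("divergence after distillation steps:", loss_after)
```

Starting from the parameter average and taking a few distillation steps lowers the divergence between the global model's outputs and the local models' outputs in this toy setting, which is the sense in which a parameter average need not be consistent with the local models in function space.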
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
Methods: Knowledge Distillation
