FedDKD: Federated Learning with Decentralized Knowledge Distillation

Xinjia Li; Boyu Chen; Wenlian Lu

arXiv:2205.00706·cs.LG·May 3, 2022

TL;DR

FedDKD introduces a decentralized knowledge distillation framework for federated learning, improving global model performance and communication efficiency on heterogeneous datasets without sharing raw data.

Contribution

The paper proposes FedDKD, a novel federated learning approach using decentralized knowledge distillation to better align local and global models without data sharing.

Findings

1. Outperforms state-of-the-art methods on heterogeneous datasets
2. Requires fewer communication steps for training
3. Effective in extremely heterogeneous data scenarios

Abstract

The performance of federated learning in neural networks is generally influenced by the heterogeneity of the data distribution. For a well-performing global model, taking a weighted average of the local models, as done by most existing federated learning algorithms, may not guarantee consistency with the local models in the space of neural network maps. In this paper, we propose FedDKD, a novel federated learning framework equipped with a decentralized knowledge distillation process (i.e., one that requires no data on the server). FedDKD introduces a decentralized knowledge distillation (DKD) module that distills the knowledge of the local models into the global model by approaching the neural network map average under the divergence metric defined in the loss function, rather than only averaging parameters as is done in the literature. Numerical experiments on various heterogeneous…
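
The distinction the abstract draws between averaging parameters and averaging neural network maps can be illustrated with a toy example. The sketch below is not the paper's algorithm: the one-hidden-unit model, the two-class logits, the choice of KL divergence, and every function name are assumptions made purely for illustration. It shows two local models whose parameter average outputs zero everywhere, while the average of their maps does not, and it measures the resulting gap with a divergence.

```python
import math


def relu(z):
    return max(0.0, z)


def model(params, x):
    """Toy one-hidden-unit network f(x) = v * relu(w * x); params = (w, v)."""
    w, v = params
    return v * relu(w * x)


def average_parameters(local_params, weights):
    """Weighted average taken coordinate-wise in parameter space."""
    dim = len(local_params[0])
    return tuple(
        sum(wt * p[i] for wt, p in zip(weights, local_params))
        for i in range(dim)
    )


def average_map(local_params, weights, x):
    """Weighted average of the local models' outputs at input x."""
    return sum(wt * model(p, x) for wt, p in zip(weights, local_params))


def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(params, x):
    """Two-class probabilities using [f(x), 0] as logits (illustrative choice)."""
    return softmax([model(params, x), 0.0])


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; q must be positive where p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Two hypothetical local models that respond to opposite signs of the input.
local_params = [(1.0, 1.0), (-1.0, 1.0)]
weights = [0.5, 0.5]
x = 2.0

# Parameter averaging cancels the hidden weights, so the averaged model is flat.
param_avg = average_parameters(local_params, weights)
out_param_avg = model(param_avg, x)

# Averaging the maps keeps the behaviour of both local models.
out_map_avg = average_map(local_params, weights, x)

# Divergence between the averaged local predictions (teacher) and the
# parameter-averaged model (student); KL is an assumed choice of divergence.
teacher = [
    sum(wt * predict(p, x)[k] for wt, p in zip(weights, local_params))
    for k in range(2)
]
student = predict(param_avg, x)
gap = kl_divergence(teacher, student)

print(out_param_avg, out_map_avg, gap)
```

In this toy setting the parameter-averaged model cannot reproduce either local model, and the positive divergence is the kind of quantity a distillation step could reduce by training the global model toward the averaged map.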

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Machine Learning and ELM

Methods: Knowledge Distillation