SFedKD: Sequential Federated Learning with Discrepancy-Aware Multi-Teacher Knowledge Distillation

Haotian Xu; Jinrui Zhou; Xichong Zhang; Mingjun Xiao; He Sun; Yin Xu

arXiv:2507.08508 · cs.LG · July 14, 2025

TL;DR

SFedKD introduces a discrepancy-aware multi-teacher knowledge distillation framework for sequential federated learning, effectively mitigating catastrophic forgetting and improving model performance in heterogeneous environments.

Contribution

The paper proposes a novel multi-teacher knowledge distillation approach with class-distribution-based weighting and teacher selection to enhance sequential federated learning.

Findings

01. SFedKD significantly reduces catastrophic forgetting.

02. It outperforms existing federated learning methods.

03. The teacher selection mechanism improves efficiency and knowledge coverage.

Abstract

Federated Learning (FL) is a distributed machine learning paradigm that coordinates multiple clients to collaboratively train a global model via a central server. Sequential Federated Learning (SFL) is a newly emerging FL training framework in which the global model is trained sequentially across clients. Since SFL can provide strong convergence guarantees under data heterogeneity, it has attracted significant research attention in recent years. However, experiments show that SFL suffers from severe catastrophic forgetting in heterogeneous environments, meaning that the model tends to forget knowledge learned from previous clients. To address this issue, we propose an SFL framework with discrepancy-aware multi-teacher knowledge distillation, called SFedKD, which selects multiple models from the previous round to guide the current round of training. In SFedKD, we extend the…
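To make the idea of multi-teacher distillation with class-distribution-based weighting concrete, here is a minimal NumPy sketch. It is not the paper's formulation, which is truncated in the abstract above. The function names `discrepancy_weights` and `multi_teacher_kd_loss`, the use of L1 distance between class distributions, the choice to give larger weights to teachers whose data differ more from the current client's, and the temperature `T` are all illustrative assumptions.

```python
import numpy as np


def log_softmax(z, axis=-1):
    """Numerically stable log-softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))


def discrepancy_weights(local_dist, teacher_dists):
    """Assumed weighting rule (hypothetical, not taken from the paper):
    weight each teacher by the L1 distance between its client's class
    distribution and the current client's, normalized to sum to 1."""
    local_dist = np.asarray(local_dist, dtype=float)
    teacher_dists = np.asarray(teacher_dists, dtype=float)
    d = np.abs(teacher_dists - local_dist).sum(axis=1)
    if d.sum() == 0.0:
        # All teachers match the local distribution: fall back to uniform.
        return np.full(len(teacher_dists), 1.0 / len(teacher_dists))
    return d / d.sum()


def multi_teacher_kd_loss(student_logits, teacher_logits, weights, T=2.0):
    """Weighted sum of KL(teacher || student) over K teachers.

    student_logits: shape (B, C)
    teacher_logits: shape (K, B, C)
    weights:        shape (K,)
    """
    log_p_s = log_softmax(np.asarray(student_logits, dtype=float) / T)
    loss = 0.0
    for w, t_logits in zip(weights, teacher_logits):
        log_p_t = log_softmax(np.asarray(t_logits, dtype=float) / T)
        p_t = np.exp(log_p_t)
        # Per-sample KL divergence, averaged over the batch.
        kl = (p_t * (log_p_t - log_p_s)).sum(axis=1).mean()
        loss += w * (T ** 2) * kl
    return loss


# Usage: two teachers, three classes, a batch of four samples.
rng = np.random.default_rng(0)
local = [0.5, 0.5, 0.0]
teachers = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
w = discrepancy_weights(local, teachers)
student = rng.normal(size=(4, 3))
teacher_out = rng.normal(size=(2, 4, 3))
kd = multi_teacher_kd_loss(student, teacher_out, w)
```

In a training loop, a term like `kd` would typically be added to the ordinary cross-entropy loss on the current client's data. The relative weighting of the two terms is another assumption left open here.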

Taxonomy

Methods: Knowledge Distillation