Dual-Distilled Heterogeneous Federated Learning with Adaptive Margins for Trainable Global Prototypes

Fatema Siddika; Md Anwar Hossen; Wensheng Zhang; Anuj Sharma; Juan Pablo Muñoz; Ali Jannesari

arXiv:2508.19009·cs.LG·December 22, 2025

TL;DR

This paper introduces FedProtoKD, a heterogeneous federated learning (HFL) framework that uses adaptive margins and dual knowledge distillation to improve prototype aggregation, addressing prototype margin shrinking under model and data heterogeneity.

Contribution

The paper proposes a dual-knowledge distillation approach with adaptive margins for trainable global prototypes in prototype-based HFL, improving accuracy over existing methods.

Findings

01

FedProtoKD improves test accuracy by up to 34.13%.

02

The framework effectively addresses prototype margin shrinking.

03

It outperforms state-of-the-art HFL methods.

Abstract

Heterogeneous Federated Learning (HFL) has gained significant attention for its capacity to handle both model and data heterogeneity across clients. Prototype-based HFL methods have emerged as a promising way to address statistical and model heterogeneity as well as privacy challenges. These methods share class-representative prototypes among heterogeneous clients. However, aggregating these prototypes via standard weighted averaging often yields sub-optimal global knowledge. Specifically, averaging shrinks the decision margins of the aggregated prototypes, degrading model performance under model heterogeneity and non-IID data distributions. We propose FedProtoKD in a Heterogeneous Federated Learning setting, utilizing an enhanced dual-knowledge distillation mechanism to enhance…
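The margin-shrinking effect of weighted averaging can be illustrated with a small toy sketch. This is not FedProtoKD itself, only the baseline aggregation the abstract criticizes. The function names (`aggregate_prototypes`, `min_margin`), the two-client data, and the use of the minimum pairwise Euclidean distance between class prototypes as a stand-in for the "decision margin" are all assumptions made for illustration.

```python
import math
from itertools import combinations


def aggregate_prototypes(client_protos, client_counts):
    """Sample-count-weighted average of per-class prototypes (baseline aggregation).

    client_protos: list of dicts, class label -> prototype vector (list of floats)
    client_counts: list of dicts, class label -> number of local samples
    """
    classes = sorted({c for protos in client_protos for c in protos})
    global_protos = {}
    for c in classes:
        # Only clients that actually hold class c contribute to its prototype.
        holders = [(p[c], n[c]) for p, n in zip(client_protos, client_counts) if c in p]
        total = sum(n for _, n in holders)
        dim = len(holders[0][0])
        global_protos[c] = [
            sum(vec[d] * n for vec, n in holders) / total for d in range(dim)
        ]
    return global_protos


def min_margin(protos):
    """Assumed margin proxy: smallest Euclidean distance between two class prototypes."""
    return min(
        math.dist(protos[a], protos[b]) for a, b in combinations(sorted(protos), 2)
    )


# Hypothetical toy data: two clients whose heterogeneous models place the
# same two classes along different axes of the feature space.
client_protos = [
    {0: [2.0, 0.0], 1: [-2.0, 0.0]},
    {0: [0.0, 2.0], 1: [0.0, -2.0]},
]
client_counts = [{0: 10, 1: 10}, {0: 10, 1: 10}]

global_protos = aggregate_prototypes(client_protos, client_counts)
local_margins = [min_margin(p) for p in client_protos]
global_margin = min_margin(global_protos)

print("local margins:", local_margins)
print("global margin:", round(global_margin, 3))
```

In this toy setup each client separates the two classes by a distance of 4, while the averaged global prototypes are only about 2.83 apart, so the aggregated prototypes are less separated than those of either client.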
