Domain Discrepancy Aware Distillation for Model Aggregation in Federated Learning
Shangchao Su, Bin Li, Xiangyang Xue

TL;DR
This paper addresses the challenge of domain discrepancy in federated learning by proposing FedD3A, an adaptive knowledge distillation method that selectively aggregates client models based on domain similarity, improving performance across diverse datasets.
Contribution
The paper introduces FedD3A, a novel domain discrepancy aware distillation algorithm that adaptively weights client models for better aggregation in federated learning with domain shifts.
Findings
FedD3A outperforms existing methods on cross-domain datasets.
The approach effectively measures domain discrepancy without raw data.
The method improves model aggregation in both cross-silo and cross-device settings.
Abstract
Knowledge distillation has recently become popular as a method of model aggregation on the server for federated learning. It is generally assumed that there are abundant public unlabeled data on the server. However, in reality, there exists a domain discrepancy between the datasets of the server domain and a client domain, which limits the performance of knowledge distillation. How to improve the aggregation under such a domain discrepancy setting is still an open problem. In this paper, we first analyze the generalization bound of the aggregation model produced from knowledge distillation for the client domains, and then describe two challenges, server-to-client discrepancy and client-to-client discrepancy, brought to the aggregation model by the domain discrepancies. Following our analysis, we propose an adaptive knowledge aggregation algorithm FedD3A based on domain discrepancy aware…
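The idea of weighting client models adaptively during server-side distillation can be illustrated with a minimal sketch. The abstract is truncated here, so this is not FedD3A's actual procedure: the per-client `similarity_scores`, the softmax normalization of those scores, and all function names below are assumptions made for illustration. The sketch only shows the general pattern of blending client predictions on one unlabeled server sample and distilling the blend into the aggregated model.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_teacher(client_logits, similarity_scores, temperature=1.0):
    """Blend client predictions for one server sample.

    similarity_scores is a hypothetical input: one score per client,
    higher meaning the sample looks more like that client's domain.
    How such scores are obtained is not specified by this sketch.
    """
    weights = softmax(similarity_scores)  # normalize so the weights sum to 1
    num_classes = len(client_logits[0])
    blended = [0.0] * num_classes
    for weight, logits in zip(weights, client_logits):
        probs = softmax(logits, temperature)
        for c in range(num_classes):
            blended[c] += weight * probs[c]
    return blended


def distillation_loss(student_logits, teacher_probs, temperature=1.0):
    """KL divergence KL(teacher || student) for one sample."""
    student_probs = softmax(student_logits, temperature)
    return sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0
    )


# Two clients, three classes; the sample is assumed to resemble client 0's domain.
client_logits = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
similarity_scores = [3.0, -3.0]
teacher = weighted_teacher(client_logits, similarity_scores)
loss = distillation_loss([2.0, 0.0, 0.0], teacher)
```

In this toy case the blended teacher follows client 0 almost entirely, because its assumed similarity score dominates. With equal scores the sketch reduces to plain averaging of client predictions, the uniform ensemble that a discrepancy-aware weighting is meant to improve on.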
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning
Methods: Knowledge Distillation · Attentive Walk-Aggregating Graph Neural Network
