Communication-Efficient Federated Distillation with Active Data Sampling
Lumin Liu, Jun Zhang, S. H. Song, Khaled B. Letaief

TL;DR
This paper introduces a unified framework and theoretical analysis for federated distillation, proposing an active data sampling algorithm that reduces communication costs and enhances model performance in federated learning.
Contribution
The paper presents a generic meta-algorithm for federated distillation, studies its key parameters through empirical experiments and theoretical analysis, and introduces an active data sampling method to reduce communication cost and improve model performance.
Findings
Significant reduction in communication overhead compared to traditional methods.
The proposed algorithm achieves comparable or better model performance.
Empirical results on benchmark datasets validate the effectiveness of the approach.
Abstract
Federated learning (FL) is a promising paradigm to enable privacy-preserving deep learning from distributed data. Most previous works are based on federated average (FedAvg), which, however, faces several critical issues, including a high communication overhead and the difficulty in dealing with heterogeneous model architectures. Federated Distillation (FD) is a recently proposed alternative to enable communication-efficient and robust FL, which achieves orders of magnitude reduction of the communication overhead compared with FedAvg and is flexible to handle heterogeneous models at the clients. However, so far there is no unified algorithmic framework or theoretical analysis for FD-based methods. In this paper, we first present a generic meta-algorithm for FD and investigate the influence of key parameters through empirical experiments. Then, we verify the empirical observations…
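To make the communication argument in the abstract concrete, the following is a minimal back-of-the-envelope sketch, not the paper's method. It assumes that a FedAvg client uploads all model parameters each round, while an FD client uploads only its output logits on a shared set of proxy samples. The function names and all numbers (model size, proxy-set size, number of classes) are hypothetical and chosen only for illustration.

```python
def fedavg_bytes_per_round(num_params: int, bytes_per_value: int = 4) -> int:
    """Uplink payload for one client that sends every model parameter."""
    return num_params * bytes_per_value


def fd_bytes_per_round(num_proxy_samples: int, num_classes: int,
                       bytes_per_value: int = 4) -> int:
    """Uplink payload for one client that sends one logit vector per proxy sample."""
    return num_proxy_samples * num_classes * bytes_per_value


# Hypothetical settings (assumptions, not taken from the paper):
num_params = 11_000_000      # roughly the size of a small convolutional network
num_proxy_samples = 1_000    # samples on which clients report their outputs
num_classes = 10             # length of each logit vector

fedavg_cost = fedavg_bytes_per_round(num_params)
fd_cost = fd_bytes_per_round(num_proxy_samples, num_classes)
ratio = fedavg_cost / fd_cost

print(f"FedAvg uplink per client per round: {fedavg_cost:,} bytes")
print(f"FD uplink per client per round:     {fd_cost:,} bytes")
print(f"Reduction factor:                   {ratio:,.0f}x")
```

Under these assumed numbers the FD payload is about three orders of magnitude smaller, and it shrinks further if fewer proxy samples are transmitted, which is the quantity a data sampling strategy would aim to reduce.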
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Traffic Prediction and Management Techniques
