Dataset Distillation-based Hybrid Federated Learning on Non-IID Data
Xiufang Shi, Wei Zhang, Yuheng Li, Mincheng Wu, Zhenyu Wen, Shibo He, Tejal Shah, Rajiv Ranjan

TL;DR
This paper introduces HFLDD, a hybrid federated learning framework that uses dataset distillation to mitigate non-IID data issues, especially label distribution skew, with the aim of improving accuracy and reducing communication costs.
Contribution
The paper proposes a novel hybrid federated learning framework that employs dataset distillation to handle non-IID data, particularly label skew, by clustering clients and training on distilled data.
Findings
HFLDD improves test accuracy on non-IID datasets.
HFLDD reduces communication overhead compared to baseline methods.
HFLDD effectively handles label imbalance in federated learning.
Abstract
In federated learning, the heterogeneity of client data has a great impact on the performance of model training. Many heterogeneity issues in this process arise from non-independent and identically distributed (non-IID) data. To address the issue of label distribution skew, we propose a hybrid federated learning framework called HFLDD, which integrates dataset distillation to generate approximately independent and identically distributed (IID) data, thereby improving the performance of model training. In particular, we partition the clients into heterogeneous clusters, where the data labels among different clients within a cluster are unbalanced while the data labels among different clusters are balanced. The cluster heads collect distilled data from the corresponding cluster members, and conduct model training in collaboration with the server. This training process is like traditional…
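The clustering idea in the abstract (clients with skewed labels grouped so that each cluster's combined labels are roughly balanced) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `greedy_balanced_clusters`, the greedy heuristic, and the max-minus-min imbalance measure are all assumptions chosen for illustration, and the sketch assumes each client's label histogram is known, which a real system may not be able to share directly.

```python
from typing import List


def greedy_balanced_clusters(
    client_hists: List[List[int]], num_clusters: int
) -> List[List[int]]:
    """Hypothetical heuristic: place each client in the cluster whose
    aggregate label histogram is most balanced after adding that client."""
    num_labels = len(client_hists[0])
    totals = [[0] * num_labels for _ in range(num_clusters)]
    members: List[List[int]] = [[] for _ in range(num_clusters)]

    def imbalance(hist: List[int]) -> int:
        # Gap between the most and least frequent label in a histogram.
        return max(hist) - min(hist)

    # Visit clients with more samples first (stable for equal sizes).
    order = sorted(range(len(client_hists)), key=lambda i: -sum(client_hists[i]))
    for i in order:
        best = min(
            range(num_clusters),
            key=lambda c: (
                imbalance([t + h for t, h in zip(totals[c], client_hists[i])]),
                len(members[c]),  # tie-break: prefer the smaller cluster
            ),
        )
        members[best].append(i)
        totals[best] = [t + h for t, h in zip(totals[best], client_hists[i])]
    return members


# Four clients, two labels; every client holds only one label (extreme skew).
hists = [[10, 0], [0, 10], [10, 0], [0, 10]]
print(greedy_balanced_clusters(hists, 2))  # [[0, 1], [2, 3]]
```

In this toy case each cluster pairs one client per label, so every individual client is skewed while each cluster's aggregate histogram is balanced, which is the property the abstract describes.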
Taxonomy
Topics: Privacy-Preserving Technologies in Data
