FedSampling: A Better Sampling Strategy for Federated Learning

Tao Qi, Fangzhao Wu, Lingjuan Lyu, Yongfeng Huang, and Xing Xie

arXiv:2306.14245 · cs.LG · June 27, 2023

TL;DR

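FedSampling replaces uniform client sampling in federated learning with uniform sampling over the data itself, so that clients holding more data contribute proportionally more to training. The strategy accounts for data size imbalance across clients, preserves privacy, and improves model performance.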

Contribution

The paper proposes FedSampling, a data sampling method that enhances federated learning by considering data size imbalance and ensuring differential privacy.

Findings

1. Improved model accuracy on benchmark datasets
2. Effective handling of data size imbalance across clients
3. Privacy-preserving total sample size estimation

Abstract

Federated learning (FL) is an important technique for learning models from decentralized data in a privacy-preserving way. Existing FL methods usually sample clients uniformly for local model learning in each round. However, clients may differ significantly in data size, and under uniform client sampling those with more data get no additional opportunity to contribute to model training, which may lead to inferior performance. In this paper, instead of uniform client sampling, we propose a novel uniform data sampling strategy for federated learning (FedSampling), which can effectively improve the performance of federated learning, especially when data sizes are highly imbalanced across clients. In each federated learning round, local data on each client is randomly sampled for local model learning according to a probability based on the server's desired sample size and the total…
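
The sampling idea in the abstract can be illustrated with a minimal Python sketch. The abstract is truncated here, so the exact probability is not stated; the sketch assumes each local example is kept independently with probability `desired_size / total_size`, and it assumes the total sample count is simply known, whereas the paper estimates it in a privacy-preserving way. The names `sample_local_data` and `simulate_round` are hypothetical and exist only for this illustration.

```python
import random


def sample_local_data(local_data, desired_size, total_size, rng):
    """Keep each local example independently with probability p.

    ASSUMPTION: p = desired_size / total_size. The abstract only says the
    probability is "based on" these quantities, so this ratio is a guess.
    """
    p = min(1.0, desired_size / total_size)
    return [x for x in local_data if rng.random() < p]


def simulate_round(client_sizes, desired_size, seed=0):
    """Return how many examples each client contributes in one round."""
    rng = random.Random(seed)
    # ASSUMPTION: the total is known exactly; the paper instead estimates
    # it with a privacy-preserving method not reproduced here.
    total_size = sum(client_sizes)
    return [
        len(sample_local_data(range(n), desired_size, total_size, rng))
        for n in client_sizes
    ]


# Four clients with highly imbalanced data sizes.
client_sizes = [10, 100, 1000, 10000]
picked = simulate_round(client_sizes, desired_size=1000)
print(picked, sum(picked))
```

Under this assumption every example has the same chance of being used, so the expected contribution of a client is proportional to its data size and the expected number of sampled examples per round equals the desired sample size. Uniform client sampling, by contrast, would select the 10-example client as often as the 10,000-example one.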

Taxonomy

Topics: Privacy-Preserving Technologies in Data