Demystifying the Effects of Non-Independence in Federated Learning
Stefan Arnold, Dilara Yesilbas

TL;DR
This paper investigates how non-independent data sampling, such as block-cyclic sampling, affects the accuracy, fairness, and convergence of federated learning. It finds robustness to some sampling patterns but significant degradation under others.
Contribution
It provides an empirical analysis of the effects of block-cyclic sampling and unbalanced data distributions on federated learning performance.
Findings
Federated learning is robust to two-block cyclic sampling over time zones.
Performance drops by up to 26% under multi-block dependent sampling.
Unbalanced block distributions further impair convergence and accuracy.
Abstract
Federated Learning (FL) enables statistical models to be built on user-generated data without compromising data security and user privacy. For this reason, FL is well suited for on-device learning from mobile devices where data is abundant and highly privatized. Constrained by the temporal availability of mobile devices, only a subset of devices is accessible to participate in the iterative protocol consisting of training and aggregation. In this study, we take a step toward better understanding the effect of non-independent data distributions arising from block-cyclic sampling. By conducting extensive experiments on visual classification, we measure the effects of block-cyclic sampling (both standalone and in combination with non-balanced block distributions). Specifically, we measure the alterations induced by block-cyclic sampling from the perspective of accuracy, fairness, and…
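To make the sampling scheme concrete, the following is a minimal Python sketch of block-cyclic client sampling with optionally unbalanced blocks. It is an illustration under stated assumptions, not the authors' experimental code: the function names `assign_blocks` and `block_cyclic_rounds`, the parameters `block_weights`, `rounds_per_block`, and `clients_per_round`, and the choice of 100 clients are all hypothetical. Clients are partitioned into blocks (for example, one block per time zone), and in each training round only clients from the currently active block can be sampled, with the active block rotating cyclically.

```python
import random


def assign_blocks(client_ids, block_weights, seed=0):
    """Partition clients into blocks with sizes proportional to block_weights.

    Equal weights give balanced blocks; unequal weights give unbalanced ones.
    (Hypothetical helper, not taken from the paper.)
    """
    rng = random.Random(seed)
    ids = list(client_ids)
    rng.shuffle(ids)
    total = sum(block_weights)
    blocks, start = [], 0
    for i, weight in enumerate(block_weights):
        if i == len(block_weights) - 1:
            end = len(ids)  # last block takes the remainder
        else:
            end = start + round(len(ids) * weight / total)
        blocks.append(ids[start:end])
        start = end
    return blocks


def block_cyclic_rounds(blocks, rounds_per_block, clients_per_round,
                        num_rounds, seed=0):
    """Yield (round, active_block, sampled_clients) for each training round.

    The active block stays fixed for rounds_per_block consecutive rounds and
    then advances cyclically, so sampling is not independent across rounds.
    """
    rng = random.Random(seed)
    for t in range(num_rounds):
        active = (t // rounds_per_block) % len(blocks)
        pool = blocks[active]
        k = min(clients_per_round, len(pool))
        yield t, active, rng.sample(pool, k)


# Two unbalanced blocks (3:1), e.g. standing in for two time zones.
blocks = assign_blocks(range(100), block_weights=[3, 1])
schedule = list(block_cyclic_rounds(blocks, rounds_per_block=2,
                                    clients_per_round=5, num_rounds=8))
for t, active, clients in schedule:
    print(f"round {t}: block {active}, clients {sorted(clients)}")
```

In this sketch, passing more than two weights to `assign_blocks` gives a multi-block schedule, and passing equal weights gives balanced blocks, which loosely mirrors the settings the study compares.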
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Wireless Communication Security Techniques · Mobile Crowdsensing and Crowdsourcing
