On the Impact of Data Heterogeneity in Federated Learning Environments with Application to Healthcare Networks
Usevalad Milasheuski, Luca Barbieri, Bernardo Camajori Tedeschini, Monica Nicoli, Stefano Savazzi

TL;DR
This paper investigates how data heterogeneity affects federated learning in healthcare, providing a formal taxonomy, evaluating algorithms, and benchmarking their performance on medical datasets to guide algorithm selection.
Contribution
It offers a comprehensive formalization of heterogeneity in federated learning, benchmarks seven algorithms on medical data, and provides guidelines for healthcare applications.
Findings
Heterogeneity significantly impacts FL performance in healthcare.
Benchmark results identify the most robust algorithms for medical data.
The paper provides guidelines for selecting FL algorithms according to the type of data heterogeneity.
Abstract
Federated Learning (FL) allows multiple privacy-sensitive applications to leverage their datasets to construct a global model without disclosing the underlying information. One such domain is healthcare, where groups of silos collaborate to generate a global predictor with improved accuracy and generalization. However, the inherent challenge lies in the high heterogeneity of medical data, which calls for sophisticated techniques for assessment and compensation. This paper presents a comprehensive exploration of the mathematical formalization and taxonomy of heterogeneity within FL environments, focusing on the intricacies of medical data. In particular, we evaluate and compare the most popular FL algorithms with respect to their ability to cope with quantity-based, feature distribution-based, and label distribution-based heterogeneity. The goal is to provide a quantitative…
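The quantity-based and label distribution-based heterogeneity named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's procedure: it assumes a symmetric Dirichlet prior over clients, a common choice in FL benchmarks, and the names `dirichlet`, `split_by_shares`, `label_skew_partition`, `quantity_skew_partition`, and the parameter `alpha` are hypothetical. Feature distribution-based heterogeneity is not covered here.

```python
import random
from collections import Counter


def dirichlet(alpha, k, rng):
    """Sample a symmetric Dirichlet(alpha) vector of length k via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def split_by_shares(idxs, shares):
    """Cut a list of indices into consecutive chunks proportional to `shares`."""
    chunks, start, cum = [], 0, 0.0
    for c, share in enumerate(shares):
        cum += share
        # The last chunk takes whatever remains, so every index is assigned.
        end = len(idxs) if c == len(shares) - 1 else round(cum * len(idxs))
        chunks.append(idxs[start:end])
        start = end
    return chunks


def label_skew_partition(labels, num_clients, alpha, seed=0):
    """Assumed label-skew model: each class is spread over clients by its own Dirichlet draw.

    Smaller alpha concentrates each class on fewer clients.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        shares = dirichlet(alpha, num_clients, rng)
        for c, chunk in enumerate(split_by_shares(idxs, shares)):
            clients[c].extend(chunk)
    return clients


def quantity_skew_partition(num_samples, num_clients, alpha, seed=0):
    """Assumed quantity-skew model: client dataset sizes follow one Dirichlet draw."""
    rng = random.Random(seed)
    idxs = list(range(num_samples))
    rng.shuffle(idxs)
    return split_by_shares(idxs, dirichlet(alpha, num_clients, rng))


# Toy data: 300 samples, 3 balanced classes, 4 clients (all values are illustrative).
labels = [i % 3 for i in range(300)]
clients = label_skew_partition(labels, num_clients=4, alpha=0.5, seed=0)
for c, idxs in enumerate(clients):
    print(c, dict(Counter(labels[i] for i in idxs)))

sizes = [len(c) for c in quantity_skew_partition(300, num_clients=4, alpha=0.5, seed=0)]
print("quantity skew sizes:", sizes)
```

Lowering `alpha` makes the per-client class counts and dataset sizes more uneven, while a large `alpha` approaches an even split across clients.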
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks
Methods: Sparse Evolutionary Training
