On the Impact of Data Heterogeneity in Federated Learning Environments with Application to Healthcare Networks

Usevalad Milasheuski; Luca Barbieri; Bernardo Camajori Tedeschini; Monica Nicoli; Stefano Savazzi

arXiv:2404.18519·cs.LG·September 6, 2024

Open Access

TL;DR

This paper investigates how data heterogeneity affects federated learning in healthcare. It formalizes a taxonomy of heterogeneity and benchmarks FL algorithms on medical datasets to guide algorithm selection.

Contribution

The paper formalizes heterogeneity in federated learning, benchmarks seven algorithms on medical data, and provides guidelines for healthcare applications.

Findings

01. Heterogeneity significantly impacts FL performance in healthcare.

02. Benchmark results identify the most robust algorithms for medical data.

03. The paper gives guidelines for selecting FL algorithms according to the type of data heterogeneity.

Abstract

Federated Learning (FL) allows multiple privacy-sensitive applications to leverage their dataset for a global model construction without any disclosure of the information. One of those domains is healthcare, where groups of silos collaborate in order to generate a global predictor with improved accuracy and generalization. However, the inherent challenge lies in the high heterogeneity of medical data, necessitating sophisticated techniques for assessment and compensation. This paper presents a comprehensive exploration of the mathematical formalization and taxonomy of heterogeneity within FL environments, focusing on the intricacies of medical data. In particular, we address the evaluation and comparison of the most popular FL algorithms with respect to their ability to cope with quantity-based, feature and label distribution-based heterogeneity. The goal is to provide a quantitative…
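To make the label distribution-based and quantity-based heterogeneity mentioned in the abstract concrete, the following is a minimal sketch of how non-IID client splits are often simulated. It assumes a Dirichlet-based partition, a common convention in the FL literature; the abstract does not state which partitioning protocol the paper uses, and the function names (`dirichlet`, `partition_label_skew`) and the parameter `alpha` are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter


def dirichlet(alpha, k, rng):
    """Sample a k-dimensional symmetric Dirichlet(alpha) vector via gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:
        # Guard against numerical underflow for very small alpha.
        return [1.0 / k] * k
    return [d / total for d in draws]


def partition_label_skew(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients with Dirichlet label skew (assumed protocol).

    For each class, a Dirichlet vector decides which share of that class
    each client receives. Smaller alpha gives stronger label skew and,
    as a side effect, more uneven client dataset sizes (quantity skew).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        shares = dirichlet(alpha, num_clients, rng)
        start, cum = 0, 0.0
        for c in range(num_clients):
            cum += shares[c]
            # The last client takes the remainder so every index is assigned once.
            end = len(idxs) if c == num_clients - 1 else round(cum * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


# Toy example: 300 samples, 3 balanced classes, 4 simulated silos.
labels = [i % 3 for i in range(300)]
clients = partition_label_skew(labels, num_clients=4, alpha=0.3)
for c, idxs in enumerate(clients):
    print(c, len(idxs), dict(Counter(labels[i] for i in idxs)))
```

Printing the per-client class counts shows both effects at once: clients hold different class mixes and different numbers of samples. Raising `alpha` (for example to 100) moves the split toward an approximately IID one.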

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks

Methods: Sparse Evolutionary Training