Non-IID data in Federated Learning: A Survey with Taxonomy, Metrics, Methods, Frameworks and Future Directions

Daniel M. Jimenez G.; David Solans; Mikko Heikkila; Andrea Vitaletti; Nicolas Kourtellis; Aris Anagnostopoulos; Ioannis Chatzigiannakis

arXiv:2411.12377·cs.LG·December 13, 2024·3 cites

Open Access

TL;DR

This survey reviews the challenges that non-IID data poses to federated learning. It covers a taxonomy of non-IID data, metrics for quantifying heterogeneity, existing solutions, and future research directions, and it highlights the need for standardized frameworks and a clearer shared understanding of the problem.

Contribution

The survey provides a detailed taxonomy of non-IID data, metrics for quantifying heterogeneity, and an overview of frameworks for non-IID federated learning, filling a gap in current research and guiding future studies.

Findings

01. Taxonomy for non-IID data and partition protocols

02. Metrics for quantifying data heterogeneity

03. Overview of solutions and frameworks for non-IID FL

Abstract

Recent advances in machine learning have highlighted Federated Learning (FL) as a promising approach that enables multiple distributed users (so-called clients) to collectively train ML models without sharing their private data. While this privacy-preserving method shows potential, it struggles when data across clients is not independent and identically distributed (non-IID). Non-IID data remains an unsolved challenge that can result in poorer model performance and longer training times. Despite the significance of non-IID data in FL, there is a lack of consensus among researchers about its classification and quantification. This technical survey aims to fill that gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics to quantify data heterogeneity. Additionally, we describe popular solutions to address non-IID data and standardized frameworks…
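To make "partition protocol" and "heterogeneity metric" concrete, the following is a minimal sketch and is not taken from the survey itself. It assumes one widely used label-skew protocol (per-class client proportions drawn from a Dirichlet distribution) and one simple metric (mean total variation distance between each client's label distribution and the global one). The function names `dirichlet_label_partition` and `label_skew_tv`, and the parameters `alpha` and `seed`, are hypothetical choices for illustration.

```python
import numpy as np


def dirichlet_label_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label skew (illustrative sketch).

    For each class, client shares are drawn from Dirichlet(alpha);
    a smaller alpha concentrates each class on fewer clients.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Cumulative shares give the cut points for this class's samples.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, chunk in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(chunk.tolist())
    return client_indices


def label_skew_tv(labels, client_indices, num_classes):
    """Mean total variation distance between client and global label distributions.

    Returns a value in [0, 1]; 0 means every non-empty client matches
    the global label distribution exactly.
    """
    labels = np.asarray(labels)
    global_dist = np.bincount(labels, minlength=num_classes) / len(labels)
    distances = []
    for idx in client_indices:
        if not idx:  # skip clients that received no samples
            continue
        local = np.bincount(labels[idx], minlength=num_classes) / len(idx)
        distances.append(0.5 * np.abs(local - global_dist).sum())
    return float(np.mean(distances))


if __name__ == "__main__":
    # Synthetic labels: 10 classes, 100 samples each (assumed toy data).
    toy_labels = np.repeat(np.arange(10), 100)
    for a in (0.1, 100.0):
        parts = dirichlet_label_partition(toy_labels, num_clients=5, alpha=a)
        print(f"alpha={a}: mean TV = {label_skew_tv(toy_labels, parts, 10):.3f}")
```

In this sketch a small `alpha` should yield a noticeably larger mean distance than a large `alpha`, which is the sense in which a single parameter controls the degree of label skew. Other kinds of non-IID data (for example feature or quantity skew) would need different protocols and metrics.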

Taxonomy

Topics: Privacy-Preserving Technologies in Data