Understanding Federated Learning from IID to Non-IID dataset: An Experimental Study
Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong

TL;DR
This paper investigates the impact of non-IID data on federated learning performance, identifying loss landscape inconsistencies as a key issue and categorizing existing solutions into two main strategies.
Contribution
It analyzes the challenges that non-IID data pose for federated learning, identifies inconsistent client loss landscapes as the primary cause of performance degradation, and classifies existing methods into two strategic groups.
Findings
Inconsistencies in client loss landscapes cause performance drops in non-IID FL.
Existing methods mainly adjust parameter paths or modify loss landscapes.
Understanding these strategies guides future FL research.
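The first finding can be made concrete with a toy sketch. This is not the paper's experimental setup: it assumes two clients with hypothetical one-dimensional quadratic losses f_i(w) = 0.5 * a_i * (w - c_i)^2 and plain FedAvg with equal client weights. All function names and numeric values are illustrative assumptions. When both clients share the same loss landscape, the averaged model reaches the minimizer of the average loss. When the landscapes differ and each client takes several local steps, the averaged model settles away from that minimizer.

```python
# Toy illustration (an assumption, not the paper's setup): each client i has a
# 1-D quadratic loss f_i(w) = 0.5 * a_i * (w - c_i)**2, given as a pair (a_i, c_i).

def local_gd(w, a, c, lr, steps):
    """Run `steps` gradient-descent steps on one client's quadratic loss."""
    for _ in range(steps):
        w -= lr * a * (w - c)  # gradient of 0.5*a*(w-c)^2 is a*(w-c)
    return w

def fedavg(clients, lr=0.1, local_steps=20, rounds=200, w0=0.0):
    """Plain FedAvg with equal client weights; returns the final global parameter."""
    w = w0
    for _ in range(rounds):
        local = [local_gd(w, a, c, lr, local_steps) for a, c in clients]
        w = sum(local) / len(local)
    return w

def global_optimum(clients):
    """Minimizer of the average loss: curvature-weighted mean of client minima."""
    return sum(a * c for a, c in clients) / sum(a for a, _ in clients)

iid_clients = [(1.0, 2.0), (1.0, 2.0)]       # identical loss landscapes
non_iid_clients = [(1.0, -1.0), (4.0, 3.0)]  # different minima and curvatures

iid_gap = abs(fedavg(iid_clients) - global_optimum(iid_clients))
non_iid_gap = abs(fedavg(non_iid_clients) - global_optimum(non_iid_clients))
print(f"IID gap: {iid_gap:.4f}, non-IID gap: {non_iid_gap:.4f}")
```

In this sketch the IID gap is numerically zero, while the non-IID gap stays clearly positive. Setting `local_steps=1` removes the gap in the non-IID case as well, because one local step per round makes FedAvg equivalent to gradient descent on the average loss.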
Abstract
As privacy concerns and data regulations grow, federated learning (FL) has emerged as a promising approach for training machine learning models across decentralized data sources without sharing raw data. However, a significant challenge in FL is that client data are often non-IID (non-independent and identically distributed), leading to reduced performance compared to centralized learning. While many methods have been proposed to address this issue, their underlying mechanisms are often viewed from different perspectives. Through a comprehensive investigation from gradient descent to FL, and from IID to non-IID data settings, we find that inconsistencies in client loss landscapes primarily cause performance degradation in non-IID scenarios. From this understanding, we observe that existing methods can be grouped into two main strategies: (i) adjusting parameter update paths and (ii)…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Big Data and Digital Economy
