FedWCM: Unleashing the Potential of Momentum-based Federated Learning in Long-Tailed Scenarios

Tianle Li; Yongzhi Huang; Linshan Jiang; Qipeng Xie; Chang Liu; Wenfeng Du; Lu Wang; and Kaishun Wu

arXiv:2507.14980 · cs.LG · July 22, 2025

TL;DR

FedWCM introduces a dynamic momentum adjustment technique for federated learning, effectively addressing convergence issues caused by long-tailed data distributions and class imbalance, thereby improving model performance and training stability.

Contribution

The paper proposes FedWCM, a novel momentum-based federated learning method that adaptively corrects biases in long-tailed data scenarios, enhancing convergence and accuracy.

Findings

01. FedWCM resolves non-convergence in long-tailed FL scenarios.

02. FedWCM outperforms existing methods in accuracy and efficiency.

03. Layer-wise analysis reveals biases caused by data imbalance.

Abstract

Federated Learning (FL) enables decentralized model training while preserving data privacy. Despite its benefits, FL faces challenges with non-identically distributed (non-IID) data, especially in long-tailed scenarios with imbalanced class samples. Momentum-based FL methods, often used to accelerate FL convergence, struggle with these distributions, resulting in biased models and making FL hard to converge. To understand this challenge, we conduct extensive investigations into this phenomenon, accompanied by a layer-wise analysis of neural network behavior. Based on these insights, we propose FedWCM, a method that dynamically adjusts momentum using global and per-round data to correct directional biases introduced by long-tailed distributions. Extensive experiments show that FedWCM resolves non-convergence issues and outperforms existing methods, enhancing FL's efficiency and…
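The abstract does not give the update rule, so the following is only a minimal sketch of the general idea of a server-side momentum coefficient that reacts to class imbalance. Everything in it is an assumption made for illustration, not the paper's method: the function names `adaptive_momentum` and `server_round`, the use of total variation distance between the global class distribution and the class distribution of the clients sampled in the current round, and the linear shrinkage of a base coefficient `base_beta`.

```python
# Illustrative sketch only: NOT the FedWCM update rule from the paper.
# All names and the weighting rule below are hypothetical.

def normalize(counts):
    """Turn per-class sample counts into a probability distribution."""
    total = float(sum(counts))
    return [c / total for c in counts]


def adaptive_momentum(global_counts, round_counts, base_beta=0.9):
    """Assumed rule: shrink the momentum coefficient as the class mix of
    this round's sampled clients drifts away from the global class mix."""
    p = normalize(global_counts)
    q = normalize(round_counts)
    # Total variation distance, which lies in [0, 1].
    tv = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
    return base_beta * (1.0 - tv)


def server_round(weights, momentum, client_deltas, client_sizes, beta, lr=1.0):
    """One server step: size-weighted average of client updates,
    then an exponential-moving-average momentum update."""
    total = float(sum(client_sizes))
    dim = len(weights)
    avg_delta = [
        sum(n * d[i] for n, d in zip(client_sizes, client_deltas)) / total
        for i in range(dim)
    ]
    new_momentum = [beta * m + (1.0 - beta) * g
                    for m, g in zip(momentum, avg_delta)]
    new_weights = [w + lr * m for w, m in zip(weights, new_momentum)]
    return new_weights, new_momentum


if __name__ == "__main__":
    # Long-tailed global counts; this round over-samples the tail class.
    beta = adaptive_momentum([1000, 100, 10], [50, 50, 50])
    w, m = server_round([0.0, 0.0], [0.0, 0.0],
                        [[1.0, 1.0], [3.0, 3.0]], [1, 1], beta)
    print(beta, w, m)
```

With this assumed rule, a round whose pooled class distribution matches the global one keeps the full `base_beta`, while a round dominated by a different class mix contributes less accumulated momentum, which is one plausible way to limit directional bias from skewed rounds.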
