FedSWA: Improving Generalization in Federated Learning with Highly Heterogeneous Data via Momentum-Based Stochastic Controlled Weight Averaging

Junkang Liu; Yuanyuan Liu; Fanhua Shang; Hongying Liu; Jin Liu; Wei Feng

arXiv:2507.20016·cs.LG·April 21, 2026

TL;DR

This paper introduces FedSWA and FedMoSWA, two federated learning algorithms designed to enhance generalization in highly heterogeneous data settings by finding flatter minima and aligning local and global models.

Contribution

The paper proposes novel federated learning algorithms, FedSWA and FedMoSWA, with theoretical analysis and empirical validation showing improved generalization and convergence in heterogeneous data scenarios.

Findings

01 FedSWA outperforms FedSAM on highly heterogeneous data.

02 FedMoSWA achieves smaller optimization and generalization errors.

03 Experimental results on CIFAR10/100 and Tiny ImageNet validate the effectiveness of the proposed algorithms.

Abstract

Generalization capability is crucial for real-world applications of federated learning (FL) algorithms such as FedSAM. In this paper, we revisit the generalization problem in FL and investigate the impact of data heterogeneity on FL generalization. We find that FedSAM usually performs worse than FedAvg when the data is highly heterogeneous, and thus propose a novel and effective federated learning algorithm with Stochastic Weight Averaging (called FedSWA), which aims to find flatter minima in the setting of highly heterogeneous data. Moreover, we introduce a new momentum-based stochastic controlled weight averaging FL algorithm (FedMoSWA), which is designed to better align local and global models. Theoretically, we provide both convergence analysis and generalization bounds for FedSWA and FedMoSWA. We also prove that the optimization and…
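The abstract names weight averaging without giving the update rules, so the following is only a generic illustration of the idea and not the paper's FedSWA or FedMoSWA algorithm. It is a minimal sketch that assumes a toy quadratic loss per client, with a different optimum per client standing in for data heterogeneity; each client keeps a running average of its iterates, and the server takes a plain FedAvg-style mean. The names `swa_update`, `local_train`, and `server_round` are hypothetical.

```python
from typing import List

Vector = List[float]


def swa_update(avg: Vector, new: Vector, n_averaged: int) -> Vector:
    """Fold `new` into a running mean that already covers `n_averaged` points."""
    return [(a * n_averaged + x) / (n_averaged + 1) for a, x in zip(avg, new)]


def local_train(global_w: Vector, optimum: Vector, lr: float, steps: int) -> Vector:
    """Gradient descent on the toy loss 0.5 * ||w - optimum||^2.

    Returns the average of all visited iterates, starting point included.
    The quadratic loss is an assumption made for illustration only.
    """
    w = list(global_w)
    avg = list(global_w)  # the starting point counts as the first averaged iterate
    n = 1
    for _ in range(steps):
        # The gradient of the toy loss is (w - optimum).
        w = [wi - lr * (wi - ci) for wi, ci in zip(w, optimum)]
        avg = swa_update(avg, w, n)
        n += 1
    return avg


def server_round(global_w: Vector, client_optima: List[Vector],
                 lr: float = 0.5, steps: int = 3) -> Vector:
    """One communication round: each client trains locally, the server averages."""
    client_ws = [local_train(global_w, c, lr, steps) for c in client_optima]
    k = len(client_ws)
    return [sum(ws) / k for ws in zip(*client_ws)]


if __name__ == "__main__":
    # Two clients with opposite optima: a crude stand-in for heterogeneous data.
    w = [0.0]
    for _ in range(5):
        w = server_round(w, [[1.0], [-1.0]])
    print(w)
```

Averaging iterates rather than returning the last one is the basic mechanism behind stochastic weight averaging. How the paper schedules learning rates, and where the averaging and momentum-based control terms are applied, is specified in the paper and the linked repository, not here.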

Code & Models

Repositories

junkangLiu0/FedSWA (GitHub)
