TL;DR
FedNSAM is a federated learning algorithm that uses global Nesterov momentum to align local and global flatness, improving convergence and generalization in heterogeneous data settings.
Contribution
The paper redefines flatness in FL, analyzes the flatness distance between local and global models, and proposes FedNSAM, which uses Nesterov momentum to improve flatness consistency and convergence.
Findings
FedNSAM achieves better generalization than existing methods.
Theoretical analysis shows tighter convergence bounds for FedNSAM.
Empirical results confirm superior performance on CNN and Transformer models.
Abstract
In federated learning (FL), multi-step local updates and data heterogeneity usually lead to sharper global minima, which degrade the performance of the global model. Popular FL algorithms integrate sharpness-aware minimization (SAM) into local training to address this issue. However, under high data heterogeneity, flatness in local training does not imply flatness of the global model. Therefore, minimizing the sharpness of the local loss surfaces on client data does not let SAM improve the generalization ability of the global model in FL. We define the flatness distance to explain this phenomenon. By rethinking SAM in FL and theoretically analyzing the flatness distance, we propose a novel algorithm, FedNSAM, that accelerates the SAM algorithm by introducing global Nesterov momentum into the local update to…
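To make the mechanism concrete, here is a minimal sketch of a local SAM step evaluated at a point extrapolated along a global momentum vector. The SAM part follows the standard formulation (take the gradient at w + rho * g / ||g||). The way the momentum enters the local step is an assumption made for illustration only, since the abstract above is truncated and does not state the update rule. All names (sam_gradient, local_update, quadratic_grad) and all default hyperparameters are hypothetical.

```python
import math

def sam_gradient(grad_fn, w, rho=0.05):
    """Standard SAM gradient: evaluate grad_fn at w + rho * g / ||g||."""
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return g  # already at a stationary point, no perturbation direction
    perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    return grad_fn(perturbed)

def local_update(grad_fn, w_global, global_momentum,
                 steps=5, lr=0.1, rho=0.05, beta=0.9):
    """Hypothetical client loop (an assumption, not the paper's rule).

    Each step computes the SAM gradient at a Nesterov-style lookahead
    point w + beta * global_momentum, then takes a plain gradient step.
    """
    w = list(w_global)
    for _ in range(steps):
        lookahead = [wi + beta * mi for wi, mi in zip(w, global_momentum)]
        g = sam_gradient(grad_fn, lookahead, rho)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

def quadratic_grad(w):
    """Gradient of the toy loss 0.5 * ||w||^2, used only for the demo."""
    return list(w)

if __name__ == "__main__":
    w_new = local_update(quadratic_grad, [3.0, 4.0], [0.0, 0.0])
    print("distance to minimum after local steps:", math.hypot(*w_new))
```

With a zero momentum vector the loop reduces to plain local SAM, which makes it easy to compare the two behaviors on a toy loss before trying a real model.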
