SMoFi: Step-wise Momentum Fusion for Split Federated Learning on Heterogeneous Data

Mingkun Yang; Ran Zhu; Qing Wang; Jie Yang

arXiv:2511.09828 · cs.LG · November 18, 2025

TL;DR

SMoFi is a framework for split federated learning that improves convergence speed and global model accuracy on heterogeneous data by synchronizing momentum buffers across server-side optimizers and constraining gradient divergence.

Contribution

The paper introduces Step-wise Momentum Fusion (SMoFi), a lightweight method that mitigates the gradient divergence caused by data heterogeneity in split federated learning.

Findings

1. Improves global model accuracy by up to 7.1%.
2. Speeds up convergence by up to 10.25 times.
3. Becomes more effective as the number of clients and the model depth increase.

Abstract

Split Federated Learning is a system-efficient federated learning paradigm that leverages the rich computing resources at a central server to train model partitions. Data heterogeneity across silos, however, presents a major challenge undermining the convergence speed and accuracy of the global model. This paper introduces Step-wise Momentum Fusion (SMoFi), an effective and lightweight framework that counteracts gradient divergence arising from data heterogeneity by synchronizing the momentum buffers across server-side optimizers. To control gradient divergence over the training process, we design a staleness-aware alignment mechanism that imposes constraints on gradient updates of the server-side submodel at each optimization step. Extensive validations on multiple real-world datasets show that SMoFi consistently improves global model accuracy (up to 7.1%) and convergence speed (up to…
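The momentum synchronization described in the abstract can be illustrated with a minimal sketch. It assumes that each client has its own server-side replica trained by SGD with momentum, and that fusion is a plain average of the per-replica momentum buffers at every optimization step. That averaging rule, the function names `mean_vectors` and `fused_momentum_step`, and the values of `lr` and `beta` are illustrative assumptions, not the paper's exact formulation. The staleness-aware alignment mechanism is not reproduced here.

```python
from typing import List

Vector = List[float]


def mean_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fused_momentum_step(
    weights: List[Vector],
    momenta: List[Vector],
    grads: List[Vector],
    lr: float = 0.1,
    beta: float = 0.9,
) -> None:
    """One SGD-with-momentum step for K server-side submodel replicas.

    Assumption (not from the paper): fusion is a plain average of the
    per-replica momentum buffers, applied at every optimization step.
    """
    # 1. Each replica updates its own momentum buffer from its local gradient.
    for m, g in zip(momenta, grads):
        for i in range(len(m)):
            m[i] = beta * m[i] + g[i]

    # 2. Fusion: every replica's buffer is overwritten with the shared average.
    fused = mean_vectors(momenta)
    for m in momenta:
        m[:] = fused

    # 3. Each replica steps along the fused momentum direction.
    for w, m in zip(weights, momenta):
        for i in range(len(w)):
            w[i] -= lr * m[i]


# Toy usage: two replicas whose heterogeneous data yields conflicting gradients.
weights = [[1.0, 1.0], [1.0, 1.0]]
momenta = [[0.0, 0.0], [0.0, 0.0]]
grads = [[1.0, 0.0], [0.0, 1.0]]
fused_momentum_step(weights, momenta, grads)
```

In this toy case both replicas end the step with identical momentum buffers and identical weights, even though their local gradients pointed in different directions.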

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques