FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation

Srijith Nair; Michael Lin; Peizhong Ju; Amirreza Talebi; Elizabeth Serena Bentley; Jia Liu

arXiv:2505.23182 · cs.LG · June 18, 2025

TL;DR

FSL-SAGE introduces a federated split learning algorithm that estimates server gradients using auxiliary models, reducing communication and memory costs while maintaining convergence and improving accuracy over existing methods.

Contribution

The paper proposes a gradient estimation technique for federated split learning in which auxiliary models approximate the server's gradient feedback, improving communication efficiency and accuracy over prior approaches.

Findings

01 Achieves a convergence rate of O(1/√T), where T is the number of communication rounds, matching FedAvg.

02 Significantly reduces communication costs and client memory requirements.

03 Outperforms existing state-of-the-art FSL methods in accuracy.
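
To make the O(1/√T) rate concrete, the following is a sketch of what such a bound usually implies. It assumes the rate is stated in the common nonconvex form, as a bound on the average squared gradient norm; this page does not give the exact theorem, so the objective f, the iterates x_t, and the constant C are illustrative symbols rather than the paper's notation.

```latex
\frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\,\bigl\|\nabla f(x_t)\bigr\|^{2} \;\le\; \frac{C}{\sqrt{T}}
\qquad\Longrightarrow\qquad
\frac{C}{\sqrt{T}} \le \epsilon \;\iff\; T \ge \frac{C^{2}}{\epsilon^{2}}
```

Under that assumption, halving the bound requires four times as many communication rounds, which is why a method that matches this rate while sending less per round can reduce total communication.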

Abstract

Collaborative training methods like Federated Learning (FL) and Split Learning (SL) enable distributed machine learning without sharing raw data. However, FL assumes clients can train entire models, which is infeasible for large-scale models. In contrast, while SL alleviates the client memory constraint in FL by offloading most training to the server, it increases network latency due to its sequential nature. Other methods address the conundrum by using local loss functions for parallel client-side training to improve efficiency, but they lack server feedback and potentially suffer poor accuracy. We propose FSL-SAGE (Federated Split Learning via Smashed Activation Gradient Estimation), a new federated split learning algorithm that estimates server-side gradient feedback via auxiliary models. These auxiliary models periodically adapt to emulate server behavior on local datasets. We show…
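
The core idea in the abstract, a client using an auxiliary model to estimate the server's gradient with respect to its smashed activations, can be illustrated with a toy scalar example. This is a minimal sketch under strong assumptions, not the authors' implementation: the server is a fixed one-parameter model with squared loss, the auxiliary estimator is a hypothetical linear map of the activation and label, and alignment is plain gradient descent on the mismatch between estimated and true gradients. All function and variable names are invented for illustration.

```python
# Toy, scalar sketch of gradient estimation with an auxiliary model.
# All names and the linear form of the estimator are illustrative assumptions.

# Fixed alignment batch of (smashed activation z, label y) pairs.
BATCH = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-1.0, 1.0)]


def server_grad(z, y, w_s):
    """Gradient of the server loss 0.5 * (w_s * z - y)**2 with respect to z."""
    return (w_s * z - y) * w_s


def aux_grad(z, y, aux):
    """Auxiliary estimate of the server gradient, linear in z and y."""
    p, q = aux
    return p * z + q * y


def align_aux(aux, batch, w_s, lr=0.5, steps=100):
    """Fit the auxiliary parameters by gradient descent on the squared
    mismatch (averaged over the batch, up to a constant factor) between
    estimated and true server gradients."""
    p, q = aux
    n = len(batch)
    for _ in range(steps):
        grad_p = grad_q = 0.0
        for z, y in batch:
            err = aux_grad(z, y, (p, q)) - server_grad(z, y, w_s)
            grad_p += err * z
            grad_q += err * y
        p -= lr * grad_p / n
        q -= lr * grad_q / n
    return (p, q)


def client_local_step(w_c, x, y, aux, lr=0.1):
    """One client update using the estimated gradient instead of a server reply."""
    z = w_c * x              # smashed activation produced by the client model
    g = aux_grad(z, y, aux)  # estimated dLoss/dz
    return w_c - lr * g * x  # chain rule: dz/dw_c = x


if __name__ == "__main__":
    aux = align_aux((0.0, 0.0), BATCH, w_s=2.0)
    print("aligned auxiliary parameters:", aux)
    print("client weight after one local step:",
          client_local_step(1.0, 1.0, 1.0, aux))
```

In this toy setting the true server gradient is itself linear in z and y, so the aligned estimator reproduces it exactly. With a real, nonlinear server model that is also being trained, the estimate would only be approximate and would drift between alignments, which is the motivation for adapting the auxiliary models periodically.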

Code & Models

Repositories

srijith1996/FSL-SAGE (PyTorch, official)

Videos

FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation · SlidesLive

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning · Mobile Crowdsensing and Crowdsourcing