FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation
Srijith Nair, Michael Lin, Peizhong Ju, Amirreza Talebi, Elizabeth Serena Bentley, Jia Liu

TL;DR
FSL-SAGE introduces a federated split learning algorithm that estimates server gradients using auxiliary models, reducing communication and memory costs while maintaining convergence and improving accuracy over existing methods.
Contribution
The paper proposes a novel gradient estimation technique for federated split learning that improves efficiency and accuracy over prior approaches.
Findings
Achieves a convergence rate of O(1/√T), matching FedAvg.
Reduces communication costs significantly.
Outperforms existing state-of-the-art FSL methods in accuracy.
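For reference, a rate of this order is conventionally stated for smooth nonconvex objectives as a bound on the averaged squared gradient norm. The form below is that generic convention, not a quotation of the paper's theorem: the symbols $f$ (global objective), $x_t$ (model at round $t$), and the reading of $T$ as the number of rounds are assumed notation, and the paper's exact statement and constants may differ.

```latex
\frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}\left\|\nabla f(x_t)\right\|^2
  = O\!\left(\frac{1}{\sqrt{T}}\right)
```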
Abstract
Collaborative training methods like Federated Learning (FL) and Split Learning (SL) enable distributed machine learning without sharing raw data. However, FL assumes clients can train entire models, which is infeasible for large-scale models. In contrast, while SL alleviates the client memory constraint in FL by offloading most training to the server, it increases network latency due to its sequential nature. Other methods address the conundrum by using local loss functions for parallel client-side training to improve efficiency, but they lack server feedback and potentially suffer poor accuracy. We propose FSL-SAGE (Federated Split Learning via Smashed Activation Gradient Estimation), a new federated split learning algorithm that estimates server-side gradient feedback via auxiliary models. These auxiliary models periodically adapt to emulate server behavior on local datasets. We show…
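The idea of estimating server-side gradient feedback with a periodically aligned auxiliary model can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the linear server head, the squared-error loss, the auxiliary model as a linear map that directly predicts the gradient, and all names (`server_gradient`, `align_auxiliary`, `estimate_gradient`). It is not the paper's architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # width of the smashed activations (hypothetical)

# Hypothetical server-side model: a linear head with squared-error loss.
w_server = rng.normal(size=d)

def server_gradient(z, y):
    """True per-sample gradient of 0.5 * (z @ w_server - y)**2 w.r.t. z."""
    residual = z @ w_server - y          # shape (n,)
    return residual[:, None] * w_server  # shape (n, d)

def features(z, y):
    """Inputs to the auxiliary model: activations, labels, and a bias column."""
    return np.column_stack([z, y, np.ones(len(y))])

def align_auxiliary(z, y, g_true):
    """Fit the auxiliary map by least squares so it reproduces server gradients."""
    A, *_ = np.linalg.lstsq(features(z, y), g_true, rcond=None)
    return A  # shape (d + 2, d)

def estimate_gradient(A, z, y):
    """Client-side estimate of the server gradient; needs no server round trip."""
    return features(z, y) @ A

# Alignment step (assumed protocol): the client sends smashed data once,
# and the server returns the true gradients for that batch.
z_align, y_align = rng.normal(size=(64, d)), rng.normal(size=64)
A = align_auxiliary(z_align, y_align, server_gradient(z_align, y_align))

# Between alignments, the client uses the estimate on fresh local batches.
z_new, y_new = rng.normal(size=(8, d)), rng.normal(size=8)
g_est = estimate_gradient(A, z_new, y_new)
max_error = np.abs(g_est - server_gradient(z_new, y_new)).max()
```

In this toy setting the server gradient happens to be exactly linear in the chosen features, so the fitted map recovers it almost perfectly. With a realistic nonlinear server model the estimate would only be approximate, which is why the auxiliary models are described as being re-adapted periodically.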
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning · Mobile Crowdsensing and Crowdsourcing
