Memory-Efficient Split Federated Learning for LLM Fine-Tuning on Heterogeneous Mobile Devices
Xiaopei Chen, Liang Li, Fei Ji, Wen Wu

TL;DR
This paper introduces a memory-efficient split federated learning framework for fine-tuning large language models on diverse mobile devices, reducing memory usage and training time while maintaining performance.
Contribution
It proposes a novel edge-assisted split federated learning approach with low-rank adaptation and server-side scheduling to optimize LLM fine-tuning on heterogeneous devices.
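As background on the low-rank adaptation mentioned above, the following is a minimal sketch of the standard LoRA formulation, in which a frozen weight matrix W is augmented by a trainable low-rank product B·A scaled by alpha/rank. The function names and dimensions are illustrative assumptions and are not taken from the paper.

```python
def full_params(d_out, d_in):
    """Parameter count of a dense d_out x d_in weight matrix."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Trainable parameters of one LoRA adapter: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x); W stays frozen, only A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]
```

For a hypothetical 4096 x 4096 projection with rank 8, the adapter trains 65,536 parameters instead of roughly 16.8 million, which suggests why LoRA is attractive on memory-constrained devices.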
Findings
Reduces memory footprint by 79% compared with the baselines
Decreases training time by 6% compared with the baselines
Achieves performance comparable to the baselines
Abstract
In this paper, we propose an edge-assisted split federated learning framework to facilitate large language model (LLM) fine-tuning on heterogeneous mobile devices while alleviating memory pressure on both the mobile devices and the edge server. Specifically, each mobile device performs low-rank adaptation (LoRA) fine-tuning on only a subset of the lower layers of the pre-trained LLM, with the subset tailored to the device's capacity. The server maintains a full LLM and selectively fine-tunes the corresponding LoRA modules for each device in a sequential manner. To further enhance training efficiency, we propose a server-side training scheduling method that optimizes the order in which devices are processed, thereby accelerating fine-tuning. Extensive experiments demonstrate that, compared to the baselines, our scheme reduces the memory footprint by 79% and training time by 6% while achieving comparable performance.
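The split and the server-side ordering described in the abstract can be illustrated with a small sketch. Everything here is a hypothetical stand-in: the per-layer memory cost, the arrival times, the per-layer server time, and the earliest-arrival-first heuristic are assumptions chosen for illustration, not the paper's actual cost model or scheduling method.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    memory_mb: float   # hypothetical memory budget available for training
    arrival_s: float   # hypothetical time at which its activations reach the server


def choose_cut_layer(memory_mb, per_layer_mb, total_layers):
    """Largest number of lower layers that fits the budget.

    Keeps at least one layer on the device and at least one on the server.
    """
    fit = int(memory_mb // per_layer_mb)
    return max(1, min(fit, total_layers - 1))


def makespan(order, cut, total_layers, server_s_per_layer):
    """Finish time when the server handles devices one at a time in `order`."""
    t = 0.0
    for dev in order:
        start = max(t, dev.arrival_s)  # server idles until activations arrive
        t = start + (total_layers - cut[dev.name]) * server_s_per_layer
    return t


def earliest_arrival_first(devices):
    """Illustrative heuristic only: serve devices in order of arrival time."""
    return sorted(devices, key=lambda d: d.arrival_s)


TOTAL_LAYERS = 12
devices = [
    Device("phone_a", memory_mb=1000, arrival_s=1.0),
    Device("phone_b", memory_mb=3000, arrival_s=3.0),
    Device("tablet", memory_mb=8000, arrival_s=5.0),
]
cut = {d.name: choose_cut_layer(d.memory_mb, 500, TOTAL_LAYERS) for d in devices}
scheduled = makespan(earliest_arrival_first(devices), cut, TOTAL_LAYERS, 0.5)
reversed_order = makespan(list(reversed(devices)), cut, TOTAL_LAYERS, 0.5)
```

In this toy setting the processing order alone changes the round's finish time, which is the kind of effect a server-side scheduler would aim to exploit.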
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cooperative Communication and Network Coding · Stochastic Gradient Optimization Techniques
