HOSL: Hybrid-Order Split Learning for Memory-Constrained Edge Training
Aakriti Lnu, Zhe Li, Dandan Liang, Chao Huang, Rui Li, Haibo Yang

TL;DR
HOSL is a hybrid split learning framework that combines zeroth-order (ZO) and first-order (FO) optimization to enable memory-efficient training of large language models on resource-constrained edge devices, keeping accuracy within 0.20%-4.23% of the FO baseline.
Contribution
This work presents the first hybrid approach that strategically integrates ZO and FO optimization in split learning, reducing memory usage while preserving convergence speed and model performance.
Findings
HOSL reduces client GPU memory by up to 3.7× compared to FO methods.
HOSL achieves accuracy within 0.20%-4.23% of the FO baseline.
HOSL outperforms the ZO baseline by up to 15.55%.
Abstract
Split learning (SL) enables collaborative training of large language models (LLMs) between resource-constrained edge devices and compute-rich servers by partitioning model computation across the network boundary. However, existing SL systems predominantly rely on first-order (FO) optimization, which requires clients to store intermediate quantities such as activations for backpropagation. This results in substantial memory overhead, largely negating the benefits of model partitioning. In contrast, zeroth-order (ZO) optimization eliminates backpropagation and significantly reduces memory usage, but often suffers from slow convergence and degraded performance. In this work, we propose HOSL, a novel Hybrid-Order Split Learning framework that addresses this fundamental trade-off between memory efficiency and optimization effectiveness by strategically integrating ZO optimization on the client…
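
To make the ZO idea in the abstract concrete, here is a minimal sketch of a generic two-point zeroth-order gradient estimator in plain Python. It is not taken from the paper and may differ from the estimator HOSL actually uses. The names `zo_gradient`, `zo_sgd`, and `quadratic`, the toy loss, and all hyperparameters are assumptions chosen for illustration. The point it shows is that each update needs only two forward evaluations of the loss, so no activations have to be kept for backpropagation.

```python
import random


def zo_gradient(loss_fn, params, eps=1e-3, seed=0):
    """Two-point zeroth-order gradient estimate (hypothetical helper).

    Uses only forward evaluations of loss_fn; no backpropagation.
    """
    rng = random.Random(seed)
    # Random Gaussian perturbation direction, one entry per parameter.
    z = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + eps * zi for p, zi in zip(params, z)]
    minus = [p - eps * zi for p, zi in zip(params, z)]
    # Central finite difference along z: two forward passes in total.
    scale = (loss_fn(plus) - loss_fn(minus)) / (2.0 * eps)
    return [scale * zi for zi in z]


def zo_sgd(loss_fn, params, lr=0.05, steps=500, eps=1e-3):
    """Plain SGD driven by the zeroth-order estimate above."""
    for step in range(steps):
        # A fresh seed per step gives a fresh perturbation direction.
        grad = zo_gradient(loss_fn, params, eps=eps, seed=step)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def quadratic(w):
    # Toy loss (an assumption for this sketch) with its minimum at (1, -2).
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2


final = zo_sgd(quadratic, [0.0, 0.0])
print(final, quadratic(final))
```

The estimate is noisy because it only probes one random direction per step, which is consistent with the slower convergence the abstract attributes to ZO methods.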
