Unity is Power: Semi-Asynchronous Collaborative Training of Large-Scale Models with Structured Pruning in Resource-Limited Clients
Yan Li, Xiao Zhang, Mingyi Li, Guangwei Xu, Feng Chen, Yuan Yuan, Yifei Zou, Mengying Zhao, Jianbo Lu, and Dongxiao Yu

TL;DR
This paper introduces Co-S$^2$P, a semi-asynchronous framework for large-scale model training on resource-limited devices, combining structured pruning and knowledge transfer to enhance efficiency and accuracy.
Contribution
It presents the first semi-asynchronous collaborative training method addressing unstructured pruning, varying architectures, and straggler issues with theoretical convergence guarantees.
Findings
Achieves up to 8.8% accuracy improvement
Reduces memory consumption by 22%
Reduces training time by 24%
Abstract
In this work, we study how to unlock the potential of massive heterogeneous weak computing power to collaboratively train large-scale models on dispersed datasets. To improve both efficiency and accuracy in resource-adaptive collaborative learning, we take the first step to consider the unstructured pruning, varying submodel architectures, knowledge loss, and straggler challenges simultaneously. We propose a novel semi-asynchronous collaborative training framework, namely Co-S$^2$P, with data distribution-aware structured pruning and a cross-block knowledge transfer mechanism to address the above concerns. Furthermore, we provide a theoretical proof that Co-S$^2$P can achieve an asymptotically optimal convergence rate. Finally, we conduct extensive experiments on two types of tasks with a real-world hardware…
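The abstract contrasts unstructured pruning with structured pruning. As a rough illustration of that distinction only, the sketch below uses plain magnitude and row-norm criteria on a toy weight matrix. It is not the paper's data distribution-aware method; the function names, the `keep_ratio` parameter, and the matrix `W` are all hypothetical.

```python
import math

def prune_unstructured(weights, keep_ratio):
    """Zero the smallest-magnitude individual weights; the matrix shape is unchanged."""
    flat = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = max(1, math.ceil(keep_ratio * len(flat)))
    threshold = flat[n_keep - 1]
    # Ties at the threshold are all kept, so slightly more than n_keep may survive.
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def prune_structured(weights, keep_ratio):
    """Drop whole rows (output neurons) with the smallest L2 norm; the matrix shrinks."""
    norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
    n_keep = max(1, math.ceil(keep_ratio * len(weights)))
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    keep = sorted(ranked[:n_keep])  # restore original row order
    return [weights[i] for i in keep], keep

# Toy 4x3 layer (hypothetical values): rows are output neurons.
W = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.0, -0.6],
    [0.03, -0.9, 0.02],
]
masked = prune_unstructured(W, keep_ratio=0.5)           # still 4x3, with zeros scattered
smaller, kept_rows = prune_structured(W, keep_ratio=0.5)  # dense 2x3 submodel
```

The unstructured result keeps the original shape, so a weak device gains little without sparse kernels. The structured result is a smaller dense matrix that needs less memory and compute, at the cost of clients holding submodels with different architectures.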
Taxonomy
Topics: Semantic Web and Ontologies · Scientific Computing and Data Management · Business Process Modeling and Analysis
Methods: Pruning
