MSfusion: A Dynamic Model Splitting Approach for Resource-Constrained Machines to Collaboratively Train Larger Models
Jin Xie, Songze Li

TL;DR
MSfusion is a collaborative learning framework that enables resource-constrained devices to train larger models efficiently by splitting models and using adaptive strategies to maintain training effectiveness.
Contribution
The paper introduces MSfusion, a novel model splitting approach with adaptive overlapping and contrastive loss, improving large model training on resource-limited devices.
Findings
Significant reduction in computation and communication costs.
Enhanced training effectiveness through adaptive model overlapping.
Scalability with increasing number of participants.
Abstract
Training large models requires a large amount of data, as well as abundant computation resources. While collaborative learning (e.g., federated learning) provides a promising paradigm to harness collective data from many participants, training large models remains a major challenge for participants with limited resources, such as mobile devices. We introduce MSfusion, an effective and efficient collaborative learning framework, tailored for training larger models on resource-constrained machines through model splitting. Specifically, a double shifting model splitting scheme is designed such that in each training round, each participant is assigned a subset of model parameters to train over local data, and aggregates with sub-models of other peers on common parameters. While model splitting significantly reduces the computation and communication costs of individual participants, additional…
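The aggregation step described in the abstract, where each participant trains a subset of parameters and peers combine their results on the parameters they share, can be illustrated with a small sketch. This is not the paper's double shifting scheme. It is a generic overlap-aware average, and the function names `window_indices` and `aggregate_overlapping`, the flat parameter list, and the wrap-around window are all assumptions made for illustration.

```python
def window_indices(total, start, width):
    """Hypothetical helper: indices of a contiguous window of `width`
    parameters beginning at `start`, wrapping around the end of the model."""
    return [(start + k) % total for k in range(width)]


def aggregate_overlapping(global_params, updates):
    """Average each parameter over the participants that trained it.

    `updates` is a list of dicts mapping parameter index -> trained value.
    Parameters that no participant trained keep their global value.
    This is an illustrative rule, not the paper's exact aggregation.
    """
    sums, counts = {}, {}
    for update in updates:
        for idx, value in update.items():
            sums[idx] = sums.get(idx, 0.0) + value
            counts[idx] = counts.get(idx, 0) + 1
    return [
        sums[i] / counts[i] if i in counts else g
        for i, g in enumerate(global_params)
    ]


# Two participants, each holding 4 of 6 parameters, overlapping on 2 and 3.
global_params = [0.0] * 6
update_a = {i: 1.0 for i in window_indices(6, 0, 4)}
update_b = {i: 3.0 for i in window_indices(6, 2, 4)}
merged = aggregate_overlapping(global_params, [update_a, update_b])
```

In this toy setup each participant stores and communicates only four of the six parameters, and the two shared parameters are averaged across both. Changing `start` from round to round would move each participant's window across the model, which is one simple way to picture a shifting assignment.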
Taxonomy
Topics: Reservoir Engineering and Simulation Methods · Simulation Techniques and Applications · Scientific Computing and Data Management
