Optimization of Model Splitting, Placement, and Chaining for Multi-hop Split Learning and Inference
Takanori Hara, Masahiro Sasabe

TL;DR
This paper introduces an ILP-based optimization framework and a heuristic algorithm that jointly decide model splitting, sub-model placement, and smashed-data routing in multi-hop split learning and inference, with the goal of reducing end-to-end latency.
Contribution
It formulates an ILP model that jointly optimizes model splitting, placement, and chaining (data routing), and proposes a heuristic algorithm for multi-hop split learning and inference.
Findings
The ILP model effectively minimizes end-to-end latency.
The BCD heuristic provides efficient solutions for complex optimization.
Evaluations show significant latency reductions in multi-hop scenarios.
Abstract
Service Function Chaining (SFC) establishes efficient communication paths by ensuring that traffic traverses a predefined sequence of network functions in a specified order to meet particular service requirements. Inspired by this concept, we have proposed an SFC-based architecture for multi-hop split learning (MSL) and split inference (MSI), facilitating distributed AI applications to effectively route smashed data across multi-hop networks. However, the multi-hop environment presents new challenges, including (1) determining optimal cut points, (2) deploying split sub-models on appropriate computing nodes, and (3) routing smashed data through the underlying communication networks while adhering to service requirements. To address these challenges, we formulate an Integer Linear Programming (ILP) model to jointly optimize model splitting, placement, and chaining (data routing) in the…
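The three coupled decisions named in the abstract (cut points, placement, and routing) can be illustrated with a minimal brute-force sketch. This is not the paper's ILP or its heuristic. It assumes a purely sequential model, a forward (inference) pass only, a first sub-model pinned to the data source, and a latency made up of compute time plus transfer delay along shortest paths. All names and numbers (LAYER_WORK, LAYER_OUT, NODE_SPEED, LINK_DELAY, best_plan) are hypothetical.

```python
from itertools import combinations, permutations

# Hypothetical toy instance; every value here is an illustrative assumption.
LAYER_WORK = [4.0, 6.0, 8.0, 2.0]   # compute work of each layer
LAYER_OUT = [8.0, 4.0, 2.0, 1.0]    # output size of each layer (smashed data if cut after it)
NODE_SPEED = {"client": 1.0, "edge": 4.0, "cloud": 10.0}        # work units per time unit
LINK_DELAY = {("client", "edge"): 0.5, ("edge", "cloud"): 1.0}  # delay per data unit


def shortest_delays(nodes, links):
    """All-pairs shortest per-unit delay over undirected links (Floyd-Warshall)."""
    inf = float("inf")
    dist = {(u, v): (0.0 if u == v else inf) for u in nodes for v in nodes}
    for (u, v), w in links.items():
        dist[u, v] = dist[v, u] = min(dist[u, v], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i, k] + dist[k, j] < dist[i, j]:
                    dist[i, j] = dist[i, k] + dist[k, j]
    return dist


def latency(cuts, placement, dist):
    """Forward-pass latency for given cut points and an ordered node placement."""
    bounds = [0, *cuts, len(LAYER_WORK)]
    total = 0.0
    for idx, node in enumerate(placement):
        lo, hi = bounds[idx], bounds[idx + 1]
        total += sum(LAYER_WORK[lo:hi]) / NODE_SPEED[node]  # compute on this node
        if idx + 1 < len(placement):
            # Send the smashed data of the last layer in this sub-model to the next node.
            total += LAYER_OUT[hi - 1] * dist[node, placement[idx + 1]]
    return total


def best_plan(source="client"):
    """Enumerate all splits and placements; return (latency, cuts, placement)."""
    nodes = list(NODE_SPEED)
    dist = shortest_delays(nodes, LINK_DELAY)
    n_layers = len(LAYER_WORK)
    best = None
    for parts in range(1, min(n_layers, len(nodes)) + 1):
        for cuts in combinations(range(1, n_layers), parts - 1):
            for placement in permutations(nodes, parts):
                if placement[0] != source:  # assumption: raw data never leaves the source
                    continue
                cost = latency(cuts, placement, dist)
                if best is None or cost < best[0]:
                    best = (cost, cuts, placement)
    return best


if __name__ == "__main__":
    print(best_plan())
```

Exhaustive enumeration grows combinatorially with the number of layers and nodes, which is one reason a formulation such as an ILP, or a heuristic, is needed beyond toy instances.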
