MergePipe: A Budget-Aware Parameter Management System for Scalable LLM Merging
Yuanyi Wang, Yanggan Gu, Zihao Wang, Kunxi Li, Yifan Yang, Zhaoyi Yan, Congkai Xie, Jianmin Wu, Hongxia Yang

TL;DR
MergePipe is a parameter management system for LLM merging that plans expert-parameter I/O within user-defined budgets, reducing disk operations and shortening merge time.
Contribution
The paper introduces a catalog-driven, cost-aware parameter management system for scalable LLM merging, targeting the disk I/O bottleneck of existing merging implementations.
Findings
Reduces total I/O by up to 10x
Achieves up to 11x speedup in merging tasks
Cuts wall-time by up to 90%
Abstract
Large language model (LLM) merging has become a key technique in modern LLM development pipelines, enabling the integration of multiple task- or domain-specific expert models without retraining. However, as the number of experts grows, existing merging implementations treat model parameters as unstructured files and execute merges in a stateless, one-shot manner, leading to excessive disk I/O, redundant parameter scans, and poor scalability. In this paper, we present MergePipe, a parameter management system for scalable LLM merging. MergePipe is the first system that treats LLM merging as a data management and execution problem, and introduces a catalog-driven abstraction over model parameters, merge plans, and execution lineage. At its core, MergePipe employs a cost-aware planner that explicitly models expert parameter I/O and enforces user-specified I/O budgets, followed by…
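To make the idea of budget-enforced planning concrete, here is a minimal sketch of one way a planner could choose which expert parameter blocks to read under a byte budget. It is not MergePipe's algorithm, which this excerpt does not describe. The `Block` fields, the `score` values, the `plan_reads` function, and the greedy score-per-byte heuristic are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    expert: str      # hypothetical expert model name
    name: str        # hypothetical parameter block name
    size_bytes: int  # bytes read from disk to load this block (assumed > 0)
    score: float     # hypothetical estimate of the block's value to the merge


def plan_reads(blocks, budget_bytes):
    """Greedily select blocks by score per byte without exceeding the I/O budget."""
    ranked = sorted(blocks, key=lambda b: b.score / b.size_bytes, reverse=True)
    chosen, used = [], 0
    for b in ranked:
        # Skip any block that would push total reads past the budget.
        if used + b.size_bytes <= budget_bytes:
            chosen.append(b)
            used += b.size_bytes
    return chosen, used


# Illustrative catalog entries; the sizes and scores are made up.
blocks = [
    Block("expert_a", "layer0.attn", 400, 0.9),
    Block("expert_a", "layer0.mlp", 800, 0.4),
    Block("expert_b", "layer0.attn", 400, 0.7),
    Block("expert_b", "layer0.mlp", 800, 0.8),
]
plan, used = plan_reads(blocks, budget_bytes=1600)
for b in plan:
    print(b.expert, b.name, b.size_bytes)
print("bytes planned:", used)
```

A greedy ratio rule is only a heuristic for this knapsack-style selection. It is used here because it is short and makes the budget constraint explicit, not because it is optimal.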
Taxonomy
Topics: Scientific Computing and Data Management · Natural Language Processing Techniques · Topic Modeling
