LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure
Jaehong Cho, Hyunmin Choi, Guseul Heo, Jongse Park

TL;DR
LLMServingSim 2.0 is a comprehensive, runtime-driven simulator that models the complex interactions in heterogeneous and disaggregated LLM serving systems, aiding system design and optimization.
Contribution
The paper introduces a unified, extensible simulation framework that captures hardware-software interactions in LLM serving and supports diverse accelerators and memory systems.
Findings
Accurately reproduces key performance metrics with less than 1% error.
Maintains practical simulation times of around 10 minutes.
Enables systematic exploration of hardware-software co-design for LLM serving.
Abstract
Large language model (LLM) serving infrastructures are undergoing a shift toward heterogeneity and disaggregation. Modern deployments increasingly integrate diverse accelerators and near-memory processing technologies, introducing significant hardware heterogeneity, while system software increasingly separates computation, memory, and model components across distributed resources to improve scalability and efficiency. As a result, LLM serving performance is no longer determined by hardware or software choices in isolation, but by their runtime interaction through scheduling, data movement, and interconnect behavior. However, understanding these interactions remains challenging, as existing simulators lack the ability to jointly model heterogeneous hardware and disaggregated serving techniques within a unified, runtime-driven framework. This paper presents LLMServingSim 2.0, a unified…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Big Data and Digital Economy
