Charon: A Unified and Fine-Grained Simulator for Large-Scale LLM Training and Inference

Mengtian Yang; Zhekun Zhang; Mingheng Wu; Jianwen Yan; Hanshi Sun; Li-wen Chang

arXiv:2605.17164 · cs.DC · May 21, 2026

TL;DR

Charon is a modular, fine-grained simulator that predicts the performance of LLM training and inference with errors under 5.35%, supporting optimization and system design.

Contribution

Charon provides a unified, modular, and fine-grained simulation framework for accurately predicting the performance of large-scale LLM training and inference.

Findings

01

Achieves an overall prediction error under 5.35% across models and configurations, and under 3.74% for training on a large-scale GPU cluster.

02

In a practical inference deployment case, discovered a configuration that improved system throughput over an engineering-tuned baseline.

03

Demonstrates practical value in real-world deployment.

Abstract

Deploying large-scale LLM training and inference with optimal performance is exceptionally challenging due to a complex design space of parallelism strategies, system optimizations, and hardware configurations. Accurate and rapid performance simulation is critical for guiding optimization efforts and system studies by validating "what-if" hypotheses. To address this, we introduce Charon, a unified, modular, and fine-grained simulator for accurately predicting LLM performance. Experiments show Charon achieves high accuracy across different models and configurations, with an overall prediction error consistently under 5.35%, and even under 3.74% for training with a large-scale GPU cluster. In a practical inference deployment case, Charon discovered a configuration that improved system throughput over an engineering-tuned baseline, demonstrating its significant real-world…
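
The abstract reports accuracy as a prediction error percentage but does not spell out the metric on this page. As a minimal sketch, assuming the figure is an absolute relative error between simulated and measured values, such a number could be computed as follows. The function name and the sample values are hypothetical and are not taken from the paper.

```python
def relative_error_pct(predicted: float, measured: float) -> float:
    """Absolute relative error of a prediction, as a percentage of the measurement.

    Assumption: this is one common way to define "prediction error";
    the paper may use a different definition.
    """
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return abs(predicted - measured) / abs(measured) * 100.0


# Hypothetical (predicted, measured) iteration times in seconds.
# These are illustrative values only, not results from the paper.
samples = [(10.2, 10.0), (4.9, 5.0), (20.5, 20.0)]

errors = [relative_error_pct(p, m) for p, m in samples]
max_error = max(errors)                  # worst case across configurations
mean_error = sum(errors) / len(errors)   # average across configurations

print(f"max error:  {max_error:.2f}%")
print(f"mean error: {mean_error:.2f}%")
```

A claim such as "consistently under 5.35%" would correspond to a bound on the worst-case value (`max_error` here) rather than on the mean, though which aggregate the authors report is again an assumption.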
