LLMSYS-HPOBench: Hyperparameter Optimization Benchmark Suite for Real-World LLM Systems

Siyu Wu, Yulong Ye, Zezhen Xiang, Pengzhou Chen, Gangda Xiong, and Tao Chen

arXiv:2605.08305·cs.LG·May 12, 2026

TL;DR

This paper introduces LLMSYS-HPOBench, a comprehensive benchmark suite for hyperparameter optimization of real-world LLM systems, capturing complex configurations, fidelity factors, and diverse metrics.

Contribution

The paper provides the first live benchmark dataset for HPO of LLM systems, supporting the development and validation of new optimization algorithms.

Findings

01

Contains 364,450 hyperparameter configurations with a dimensionality of 12-23, each with profiled inference metrics.

02

Profiles 3-5 fidelity factors and 3-9 inference objective metrics.

03

Available at https://github.com/ideas-labo/llmsys-hpobench.

Abstract

Large Language Model (LLM) systems are at the frontier of AI in many application domains, creating new challenges and opportunities in hyperparameter optimization (HPO) for the AutoML community. However, such systems exhibit an unprecedented compound space of hyperparameter configurations spanning both AI and non-AI components; rich, nonlinear implications of the fidelity factors; and diverse costs of measuring hyperparameter configurations, none of which has been fully captured by existing benchmarks. This paper presents the first (live) benchmark suite and datasets for HPO of real-world LLM systems, dubbed LLMSYS-HPOBench, covering the inference objective values of hyperparameter configurations profiled by running the LLM systems. Currently, LLMSYS-HPOBench contains 364,450 hyperparameter configurations with a dimensionality of 12-23, 3-5…
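
To make the idea of a tabular HPO benchmark concrete, here is a minimal Python sketch of a lookup table keyed by (hyperparameter configuration, fidelity setting) that returns profiled objective metrics. It is an illustration only, not the LLMSYS-HPOBench API: the class `TabularBenchmark`, the helper `freeze`, the hyperparameter names (`max_num_seqs`, `temperature`), the fidelity factor (`num_requests`), the metric names, and all numeric values are hypothetical and do not come from the paper or its repository.

```python
from typing import Dict, Hashable, Tuple

# Hypothetical sketch: none of these names or values come from LLMSYS-HPOBench.
Frozen = Tuple[Tuple[str, Hashable], ...]


def freeze(mapping: Dict[str, Hashable]) -> Frozen:
    """Turn a dict into a hashable key that ignores insertion order."""
    return tuple(sorted(mapping.items()))


class TabularBenchmark:
    """Maps (configuration, fidelity) pairs to pre-profiled objective metrics."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[Frozen, Frozen], Dict[str, float]] = {}

    def record(self, config, fidelity, metrics) -> None:
        # Store one profiled measurement.
        self._table[(freeze(config), freeze(fidelity))] = dict(metrics)

    def query(self, config, fidelity) -> Dict[str, float]:
        # A table lookup stands in for actually running the system.
        return self._table[(freeze(config), freeze(fidelity))]

    def best(self, fidelity, metric, minimize=True) -> Dict[str, Hashable]:
        # Scan every configuration recorded at the given fidelity.
        target = freeze(fidelity)
        candidates = [
            (metrics[metric], cfg)
            for (cfg, fid), metrics in self._table.items()
            if fid == target
        ]
        if not candidates:
            raise KeyError(f"no measurements at fidelity {fidelity!r}")
        pick = min if minimize else max
        _, cfg = pick(candidates, key=lambda pair: pair[0])
        return dict(cfg)


# Made-up measurements, for illustration only.
bench = TabularBenchmark()
bench.record({"max_num_seqs": 32, "temperature": 0.7}, {"num_requests": 100},
             {"latency_ms": 410.0, "throughput_tok_s": 950.0})
bench.record({"max_num_seqs": 64, "temperature": 0.7}, {"num_requests": 100},
             {"latency_ms": 520.0, "throughput_tok_s": 1400.0})
bench.record({"max_num_seqs": 64, "temperature": 0.7}, {"num_requests": 1000},
             {"latency_ms": 610.0, "throughput_tok_s": 1550.0})

fastest = bench.best({"num_requests": 100}, "latency_ms")
highest_throughput = bench.best({"num_requests": 100}, "throughput_tok_s", minimize=False)
```

In this toy table the two objectives disagree at the same fidelity: the lowest-latency configuration is not the highest-throughput one. That is the kind of multi-metric trade-off an optimizer evaluated on such a benchmark has to handle.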

Code & Models

Repositories

ideas-labo/llmsys-hpobench (GitHub): https://github.com/ideas-labo/llmsys-hpobench
