A First Look at Bugs in LLM Inference Engines

Mugeng Liu; Siqi Zhong; Weichen Bi; Yixuan Zhang; Zhiyang Chen; Zhenpeng Chen; Xuanzhe Liu; Yun Ma

arXiv:2506.09713 · cs.SE · January 12, 2026

TL;DR

This paper presents the first systematic empirical analysis of bugs in LLM inference engines, identifying common bug types and root causes and offering insights for improving the engines' reliability and development.

Contribution

The paper introduces a dataset of 929 real-world bugs mined from the official repositories of 5 widely adopted LLM inference engines, together with a taxonomy of bug symptoms and root causes.

Findings

01

Six bug symptom types identified

02

28 root causes categorized

03

Guidelines for bug detection and fixing proposed

Abstract

Large language model-specific inference engines (LLM inference engines for short) have become a fundamental component of modern AI infrastructure, enabling the deployment of LLM-powered applications (LLM apps) across cloud and local devices. Despite their critical role, LLM inference engines are prone to bugs because of the immense resource demands of LLMs and the complexity of cross-platform compatibility. However, a systematic understanding of these bugs is still lacking. To bridge this gap, we present the first empirical study of bugs in LLM inference engines. We mine the official repositories of 5 widely adopted LLM inference engines and construct a comprehensive dataset of 929 real-world bugs. Through a rigorous open coding process, we analyze these bugs to uncover their symptoms, root causes, commonality, fix effort, fix strategies, and temporal evolution. Our findings reveal…

Code & Models

Repositories

infbug/bugs-in-llm-inference-engines (Official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification