InferLog: Accelerating LLM Inference for Online Log Parsing via ICL-oriented Prefix Caching

Yilun Wang; Pengfei Chen; Haiyu Huang; Zilong He; Gou Tan; Chuanfu Zhang; Jingkai He; Zibin Zheng

arXiv:2507.08523·cs.SE·September 17, 2025

TL;DR

InferLog accelerates LLM inference for online log parsing by optimizing prefix caching for in-context learning (ICL) prompts and by tuning the inference configuration, so that LLM-based parsers can better keep up with high-volume log streams.
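To make the prefix-caching idea concrete, here is a minimal sketch of why the ordering of ICL examples matters for cache reuse. It is an illustration under stated assumptions, not InferLog's actual algorithm: prompts are modeled as lists of text segments rather than tokens, and the names `shared_prefix_len`, `build_prompt`, and the sample log templates are hypothetical.

```python
from typing import List, Sequence


def shared_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Count how many leading segments two prompts have in common.

    In a prefix-caching inference server, only this shared leading part
    of a new prompt can reuse previously computed KV-cache entries.
    """
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_prompt(instruction: str, examples: List[str], query: str) -> List[str]:
    """Assemble an ICL prompt as [instruction, example_1, ..., example_k, query]."""
    return [instruction, *examples, query]


# Hypothetical instruction and demonstrations (assumptions for illustration).
INSTRUCTION = "Extract the template of the following log message."
EX_A = "Log: Connected to 10.0.0.1 -> Template: Connected to <*>"
EX_B = "Log: Job 42 finished -> Template: Job <*> finished"

# A prompt that is assumed to be already cached by the server.
cached = build_prompt(INSTRUCTION, [EX_A, EX_B], "Log: Connected to 10.0.0.7")

# Same examples in the same order: instruction and both examples are reusable.
same_order = build_prompt(INSTRUCTION, [EX_A, EX_B], "Log: Job 7 finished")

# Same examples in a different order: reuse stops right after the instruction.
reordered = build_prompt(INSTRUCTION, [EX_B, EX_A], "Log: Job 7 finished")

reuse_same = shared_prefix_len(cached, same_order)
reuse_reordered = shared_prefix_len(cached, reordered)

print(reuse_same, reuse_reordered)
```

In this toy setup, keeping the demonstrations in a stable order lets three of the four segments hit the cache, while swapping two examples leaves only the instruction reusable. Real servers match at the token or block level, so the segment granularity here is a simplification.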

Contribution

InferLog introduces the first LLM inference optimization approach for online log parsing, focusing on accelerating inference without sacrificing accuracy.

Findings

01. InferLog achieves significant speedup over existing methods.

02. It maintains high parsing accuracy while improving inference efficiency.

03. Experimental results validate its effectiveness on real log datasets.

Abstract

Modern software systems generate massive volumes of runtime logs, necessitating efficient and accurate log parsing to enable critical downstream tasks such as anomaly detection and root cause analysis. Recently, large language models (LLMs) have achieved advanced accuracy on log parsing, but their deployment in production environments faces two major limitations: (1) the privacy risks associated with commercial LLMs, driving the adoption of local deployment, and (2) the stringent latency and throughput requirements imposed by high-volume log streams, which existing LLM-based parsers fail to meet. Although recent efforts have reduced the number of LLM queries, they overlook the high latency of LLM invocations, where concurrent log parsing requests can cause severe performance degradation of the LLM inference system. In this study, we present InferLog, the first LLM inference…

Code & Models

Repositories

wiluen/inferlog
PyTorch · Official
