Plug it and Play on Logs: A Configuration-Free Statistic-Based Log Parser
Qiaolin Qin, Xingfang Wu, Heng Li, Ettore Merlo

TL;DR
This paper introduces PIPLUP, a novel statistic-based log parser that is configuration-free, highly accurate, and generalizable, challenging the belief that statistic-based methods are inferior to semantic-based approaches.
Contribution
PIPLUP drops the pre-assumption about where constant tokens appear when grouping logs and relies on data-insensitive parameters, so it can be used plug-and-play, without per-dataset configuration, while remaining accurate and efficient.
Findings
PIPLUP outperforms state-of-the-art statistic-based parsers like Drain.
PIPLUP achieves competitive accuracy compared to semantic-based parsers like LUNAR.
PIPLUP demonstrates low time consumption without GPU or external APIs.
Abstract
Log parsing is an essential task in log analysis, and many tools have been designed to accomplish it. Existing log parsers can be categorized into statistic-based and semantic-based approaches. Compared with semantic-based parsers, existing statistic-based parsers tend to be more efficient, incur lower computational costs, and be more privacy-preserving thanks to on-premise deployment, but they often fall short in accuracy (e.g., grouping or parsing accuracy) and generalizability. It has therefore become a common belief that statistic-based parsers cannot be as effective as semantic-based parsers, since the latter can take advantage of external knowledge supported by pretrained language models. Our work, however, challenges this belief with a novel statistic-based parser, PIPLUP. PIPLUP eliminates the pre-assumption of the position of constant tokens for log grouping and relies on…
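To make the "grouping accuracy" mentioned in the abstract concrete, here is a minimal Python sketch of how that metric is commonly defined in log parsing benchmarks: a log line counts as correct only if the group the parser puts it in contains exactly the same lines as its ground-truth group. The function name `grouping_accuracy` and the labels in the example are illustrative assumptions, not taken from the paper, and this is not PIPLUP's evaluation code.

```python
from collections import Counter


def grouping_accuracy(predicted, truth):
    """Fraction of log lines placed in a group identical to their true group.

    `predicted` and `truth` are equal-length sequences of group labels,
    one per log line. Label names do not need to match between the two;
    only the resulting partition of lines matters.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not predicted:
        return 0.0

    pred_sizes = Counter(predicted)              # size of each predicted group
    true_sizes = Counter(truth)                  # size of each ground-truth group
    pair_sizes = Counter(zip(predicted, truth))  # overlap of each (pred, true) pair

    correct = 0
    for (p, t), n in pair_sizes.items():
        # The two groups coincide only if the overlap covers both entirely.
        if n == pred_sizes[p] == true_sizes[t]:
            correct += n
    return correct / len(predicted)


# Hypothetical labels: the parser splits true group "B" into two groups,
# so both "B" lines count as wrong while "A" and "C" lines count as right.
truth = ["A", "A", "B", "B", "C"]
predicted = ["g1", "g1", "g2", "g3", "g4"]
print(grouping_accuracy(predicted, truth))  # 0.6
```

Because the metric is all-or-nothing per group, a single over-split or merged template penalizes every line in the affected groups, which is one reason grouping accuracy is usually reported alongside parsing accuracy.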
