Optimized Log Parsing with Syntactic Modifications
Nafid Enan, Gias Uddin

TL;DR
This paper empirically compares various log parsing techniques, revealing trade-offs in accuracy and efficiency, and introduces SynLog+ to significantly enhance parsing accuracy with minimal runtime impact.
Contribution
It provides a comprehensive evaluation of syntax- and semantic-based log parsers and proposes SynLog+ to improve template identification accuracy in two-phase architectures.
Findings
Semantic-based parsers outperform syntax-based parsers at template identification.
Syntax-based parsers are 10 to 1,000 times more efficient than semantic-based parsers.
A two-phase architecture improves accuracy over a single-phase one.
Abstract
Logs provide valuable insights into system runtime and assist in software development and maintenance. Log parsing, which converts semi-structured log data into structured log data, is often the first step in automated log analysis. Given the wide range of log parsers utilizing diverse techniques, it is essential to evaluate them to understand their characteristics and performance. In this paper, we conduct a comprehensive empirical study comparing syntax- and semantic-based log parsers, as well as single-phase and two-phase parsing architectures. Our experiments reveal that semantic-based methods perform better at identifying the correct templates, whereas syntax-based log parsers are 10 to 1,000 times more efficient and provide better grouping accuracy, although they fall short in accurate template identification. Moreover, the two-phase architecture consistently improves accuracy compared to…
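To make concrete what the abstract means by converting semi-structured logs into structured ones, here is a minimal sketch of a syntax-based approach: variable-looking tokens are masked with a placeholder and messages are grouped by the resulting template. This is not SynLog+ or any parser evaluated in the paper; the masking rules, the `<*>` placeholder, and the function names `to_template` and `group_by_template` are assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical masking rules (an assumption, not taken from the paper).
# Order matters: more specific patterns run before the generic integer rule.
MASKS = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"),  # IPv4, optional port
    re.compile(r"\b0x[0-9a-fA-F]+\b"),                    # hexadecimal values
    re.compile(r"\b\d+\b"),                               # plain integers
]

PLACEHOLDER = "<*>"  # assumed wildcard marker for a variable field


def to_template(message: str) -> str:
    """Replace variable-looking tokens with a placeholder."""
    for pattern in MASKS:
        message = pattern.sub(PLACEHOLDER, message)
    return message


def group_by_template(messages):
    """Group raw log messages under the template they reduce to."""
    groups = defaultdict(list)
    for message in messages:
        groups[to_template(message)].append(message)
    return dict(groups)


if __name__ == "__main__":
    logs = [
        "Connected to 10.0.0.1:8080 in 35 ms",
        "Connected to 192.168.1.7:443 in 120 ms",
        "Worker 3 stopped",
    ]
    for template, members in group_by_template(logs).items():
        print(template, len(members))
```

A sketch like this can group messages correctly while still producing an imperfect template, for example when a variable field is a word rather than a number. That is one way grouping accuracy and template identification accuracy can diverge, as the findings above describe.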
