UniParser: A Unified Log Parser for Heterogeneous Log Data
Yudong Liu, Xu Zhang, Shilin He, Hongyu Zhang, Liqun Li, Yu Kang, Yong Xu, Minghua Ma, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang

TL;DR
UniParser is a novel log parser that captures common behaviors across diverse log data by learning patterns through token and context encoding, significantly improving parsing accuracy over existing methods.
Contribution
UniParser introduces a unified approach with specialized modules to effectively parse heterogeneous logs, addressing diversity and semantic understanding challenges.
Findings
Outperforms state-of-the-art log parsers on 16 datasets
Effectively captures semantic meanings in log messages
Demonstrates robustness across diverse log sources
Abstract
Logs provide first-hand information for engineers to diagnose failures in large-scale online service systems. Log parsing, which transforms semi-structured raw log messages into structured data, is a prerequisite of automated log analysis such as log-based anomaly detection and diagnosis. Almost all existing log parsers follow the general idea of extracting the common part as templates and the dynamic part as parameters. However, these log parsing methods often neglect the semantic meaning of log messages. Furthermore, high diversity among various log sources also poses an obstacle in the generalization of log parsing across different systems. In this paper, we propose UniParser to capture the common logging behaviours from heterogeneous log data. UniParser utilizes a Token Encoder module and a Context Encoder module to learn the patterns from the log token and its neighbouring…
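To make the template/parameter split described in the abstract concrete, here is a minimal sketch in Python. It is not UniParser's method: the function name `parse_log`, the `<*>` placeholder, and the rule "a token containing a digit is a parameter" are all assumptions chosen for illustration, standing in for the kind of syntax-only heuristic the abstract attributes to existing parsers.

```python
import re
from typing import List, Tuple

# Hypothetical heuristic (an assumption for this sketch): any token that
# contains a digit is treated as a dynamic parameter. This is a toy stand-in,
# not UniParser's learned Token Encoder / Context Encoder.
_HAS_DIGIT = re.compile(r"\d")


def parse_log(message: str) -> Tuple[str, List[str]]:
    """Split a raw log message into a template and its parameters."""
    template_tokens = []
    parameters = []
    for token in message.split():
        if _HAS_DIGIT.search(token):
            # Dynamic part: replace with a wildcard and record the value.
            template_tokens.append("<*>")
            parameters.append(token)
        else:
            # Common part: keep the token verbatim in the template.
            template_tokens.append(token)
    return " ".join(template_tokens), parameters


template, params = parse_log("Connection from 10.0.0.5 closed after 320 ms")
print(template)  # Connection from <*> closed after <*> ms
print(params)    # ['10.0.0.5', '320']
```

A rule like this ignores what tokens mean: in a message such as "Login failed for user alice", the username contains no digit and would be kept as part of the template. That kind of miss is the semantic gap the abstract says UniParser aims to close by learning from each token and its neighbouring context.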
