A Word is Worth 4-bit: Efficient Log Parsing with Binary Coded Decimal Recognition

Prerak Srivastava; Giulio Corallo; Sergey Rybalko

arXiv:2506.01147·cs.CL·June 3, 2025

Open Access

TL;DR

This paper introduces a character-level neural log parser that uses binary-coded decimal recognition to extract highly detailed log templates efficiently, matching LLM-based parsers in accuracy while using fewer resources.

Contribution

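The paper presents a new neural architecture for log parsing that aggregates character embeddings and estimates a sequence of binary-coded decimals to capture fine-grained template details, matching LLM-based parsers in accuracy and outperforming semantic parsers in efficiency.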

Findings

01 Matches LLM-based parsers in accuracy

02 Outperforms semantic parsers in efficiency

03 Effective on industrial and benchmark datasets

Abstract

System-generated logs are typically converted into categorical log templates through parsing. These templates are crucial for generating actionable insights in various downstream tasks. However, existing parsers often fail to capture fine-grained template details, leading to suboptimal accuracy and reduced utility in downstream tasks that require precise pattern identification. We propose a character-level log parser utilizing a novel neural architecture that aggregates character embeddings. Our approach estimates a sequence of binary-coded decimals to achieve highly granular log template extraction. Our low-resource character-level parser, tested on revised Loghub-2k and a manually annotated industrial dataset, matches LLM-based parsers in accuracy while outperforming semantic parsers in efficiency.
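For readers unfamiliar with the two terms the abstract relies on, the sketch below illustrates them in isolation. A log template is a log line with its variable parts masked out, and binary-coded decimal (BCD) writes each decimal digit as its own 4-bit group. The function names, the `<*>` placeholder, and the digit-masking heuristic are assumptions chosen for illustration. The abstract does not specify how the parser maps characters or words to BCD values, so this is not the authors' method.

```python
def to_bcd(number: int) -> list[str]:
    """Encode a non-negative integer as BCD: one 4-bit group per decimal digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return [format(int(digit), "04b") for digit in str(number)]


def from_bcd(nibbles: list[str]) -> int:
    """Decode a list of 4-bit groups back into the integer they spell out."""
    return int("".join(str(int(nibble, 2)) for nibble in nibbles))


def naive_template(line: str) -> str:
    """Toy template extraction (assumption): mask any token containing a digit.

    A hand-written heuristic standing in for a parser; the paper's model is
    neural and operates at the character level.
    """
    tokens = line.split()
    masked = ["<*>" if any(ch.isdigit() for ch in tok) else tok for tok in tokens]
    return " ".join(masked)


if __name__ == "__main__":
    print(to_bcd(2025))
    print(from_bcd(to_bcd(2025)))
    print(naive_template("Connection from 10.0.0.1 port 22"))
```

A heuristic like `naive_template` masks or keeps each token as a whole, which is the kind of coarse behavior the abstract contrasts with fine-grained, character-level template extraction.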

Taxonomy

Topics: Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques