LogPrism: Unifying Structure and Variable Encoding for Effective Log Compression
Yang Liu, Kaiming Zhang, Zhuangbin Chen, Zibin Zheng

TL;DR
LogPrism introduces a unified framework that combines structural and variable encoding for log compression, significantly improving compression ratios and processing speed by capturing deep redundancies without relying on rigid pre-parsing.
Contribution
The paper proposes a hierarchical approach that builds a Unified Redundancy Tree (URT) to encode structure and variables jointly and dynamically; the authors report that it outperforms existing methods in both compression ratio and throughput.
Findings
LogPrism achieves the highest compression ratio on 14 of 16 datasets.
It surpasses baselines by 6.12% to 83.34% in compression ratio.
It delivers a throughput of 29.87 MB/s, which is 1.68× to 43.04× faster than competing methods.
Abstract
In the field of log compression, the prevailing "parse-then-compress" paradigm fundamentally limits effectiveness by treating log parsing and compression as isolated objectives. While parsers prioritize semantic accuracy (i.e., event identification), they often obscure deep correlations between static templates and dynamic variables that are critical for storage efficiency. In this paper, we investigate this misalignment through a comprehensive empirical study and propose LogPrism, a framework that bridges the gap via unified redundancy encoding. Rather than relying on a rigid pre-parsing step, LogPrism dynamically integrates structural extraction with variable encoding by constructing a Unified Redundancy Tree (URT). This hierarchical approach effectively mines "structure+variable" co-occurrence patterns, capturing deep contextual redundancies while accelerating processing through…
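To make the "parse-then-compress" paradigm criticized in the abstract concrete, here is a minimal Python sketch of that baseline. It is not LogPrism's URT, whose construction is not described in the text above. It assumes, purely for illustration, that every run of digits is a variable and that no log line contains the literal placeholder `<*>`. The names `VAR`, `parse`, `encode`, `decode`, `pack`, `unpack`, and `cooccurrence` are hypothetical. The final helper counts how often a variable value appears at a given position of a given template, which is the kind of "structure+variable" co-occurrence the abstract says is lost when templates and variables are handled in isolation.

```python
import json
import re
import zlib
from collections import Counter

# Assumption for this sketch: any run of digits is a dynamic variable.
VAR = re.compile(r"\d+")


def parse(line):
    """Split one log line into a static template and its variable values."""
    variables = VAR.findall(line)
    template = VAR.sub("<*>", line)
    return template, variables


def encode(lines):
    """Parse step: template ids and variable values go to separate streams."""
    templates, index = [], {}
    ids, var_rows = [], []
    for line in lines:
        template, variables = parse(line)
        if template not in index:
            index[template] = len(templates)
            templates.append(template)
        ids.append(index[template])
        var_rows.append(variables)
    return templates, ids, var_rows


def decode(templates, ids, var_rows):
    """Rebuild the original lines by refilling each placeholder in order."""
    lines = []
    for tid, variables in zip(ids, var_rows):
        values = iter(variables)
        lines.append(re.sub(r"<\*>", lambda _: next(values), templates[tid]))
    return lines


def pack(templates, ids, var_rows):
    """Compress step: serialize the streams and hand them to zlib."""
    return zlib.compress(json.dumps([templates, ids, var_rows]).encode("utf-8"))


def unpack(blob):
    """Inverse of pack: returns [templates, ids, var_rows]."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def cooccurrence(ids, var_rows):
    """Count (template id, variable position, value) triples."""
    counts = Counter()
    for tid, variables in zip(ids, var_rows):
        for pos, value in enumerate(variables):
            counts[(tid, pos, value)] += 1
    return counts


# Made-up log lines, used only to exercise the sketch.
logs = [
    "worker 7 finished job 1001 in 35 ms",
    "worker 7 finished job 1002 in 35 ms",
    "worker 9 failed job 1003 code 500",
    "worker 7 finished job 1004 in 35 ms",
]

templates, ids, var_rows = encode(logs)
restored = decode(*unpack(pack(templates, ids, var_rows)))
counts = cooccurrence(ids, var_rows)
```

In this toy input, the "finished" template always appears with the same worker id and the same duration, yet the baseline stores the template ids and the variable values as unrelated streams. A joint encoding could, in principle, exploit that correlation; how LogPrism does so is specific to the paper and not reproduced here.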
Taxonomy
Topics: Software System Performance and Reliability · Advanced Data Storage Technologies · Advanced Database Systems and Queries
