LogFold: Compressing Logs with Structured Tokens and Hybrid Encoding
Shiwen Shan, Yintong Huo, Hongzhan Zhong, Zhining Wang, Yuxin Su, Zibin Zheng

TL;DR
LogFold introduces a novel log compression method that leverages structured token analysis and hybrid encoding to significantly outperform existing algorithms in compression ratio and speed.
Contribution
This paper presents LogFold, a new log compression approach that exploits redundancies in structured tokens and employs a fine-grained, type-aware encoding strategy.
Findings
Achieves 11.11% better compression ratios than state-of-the-art methods.
Operates at a speed of 9.842 MB/s on average.
Component ablation confirms the effectiveness of each part.
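The two headline metrics above can be made concrete with a small measurement sketch. It is not LogFold's evaluation harness: it only times the general-purpose compressors named in the abstract (Gzip, Bzip2) on synthetic log lines, assuming the usual definitions of compression ratio (original size divided by compressed size) and speed (megabytes of input per second). The `measure` helper and the sample log format are hypothetical.

```python
import bz2
import gzip
import time


def measure(name, compress, data):
    """Return compression ratio and throughput for one compressor (hypothetical helper)."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start
    # Assumed definition: ratio = original bytes / compressed bytes.
    ratio = len(data) / len(compressed)
    # Assumed definition: speed = input megabytes processed per second.
    mb_per_s = (len(data) / 1e6) / elapsed if elapsed > 0 else float("inf")
    return {"name": name, "ratio": ratio, "mb_per_s": mb_per_s}


# Synthetic log lines in a made-up format, used only to have something to compress.
sample = b"".join(
    b"2024-01-01 12:00:%02d INFO worker-%d finished block blk_%d\n"
    % (i % 60, i % 4, 1000 + i)
    for i in range(5000)
)

results = [
    measure("gzip", gzip.compress, sample),
    measure("bzip2", bz2.compress, sample),
]

for r in results:
    print(f"{r['name']}: ratio={r['ratio']:.2f}, speed={r['mb_per_s']:.2f} MB/s")
```

Absolute numbers from a sketch like this depend on the machine and the data, so they are not comparable to the figures reported for LogFold.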
Abstract
Logs are essential for diagnosing failures and conducting retrospective studies, leading many software organizations to retain log messages for a long time. Nevertheless, the volume of generated log data grows rapidly as software systems grow, necessitating an effective compression method. Apart from general-purpose compressors (e.g., Gzip, Bzip2), many recent studies developed log-specific compression algorithms, but they offer suboptimal performance because of (1) overlooking redundancies within certain complex tokens, and (2) lacking a fine-grained encoding strategy for diverse token types. This work uncovers a new redundancy pattern in structured tokens and proposes a new type-aware encoding strategy to improve log compression. Building on this insight, we introduce LogFold, a novel log compression method consisting of four components: a token analyzer to classify tokens as…
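To illustrate what redundancy inside a complex token can look like, here is a minimal sketch. It is not LogFold's token analyzer: the abstract is cut off before the token categories are named, so the three categories (`numeric`, `structured`, `plain`), the delimiter set, and the helper names below are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed delimiter set; the paper's actual rules are not given in this summary.
DELIMS = re.compile(r"([._:/\-])")


def classify(token):
    """Assign a token to one of three hypothetical categories."""
    if token.isdigit():
        return "numeric"
    if DELIMS.search(token) and any(c.isdigit() for c in token):
        return "structured"
    return "plain"


def split_structured(token):
    """Split a structured token into sub-tokens, keeping the delimiters."""
    return [part for part in DELIMS.split(token) if part]


tokens = ["INFO", "42", "blk_1001", "blk_1002", "blk_1003", "10.0.0.1:8080"]
types = {t: classify(t) for t in tokens}

# Count how often each sub-token recurs across the structured tokens.
sub_counts = Counter(
    part
    for t in tokens
    if types[t] == "structured"
    for part in split_structured(t)
)

print(types)
print(sub_counts.most_common(3))
```

In this toy input, the whole tokens `blk_1001`, `blk_1002`, and `blk_1003` are all distinct, but the sub-token `blk` and the delimiter `_` recur in each of them. That is the kind of within-token repetition a compressor operating only on whole tokens would miss.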
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Advanced Database Systems and Queries
