Tracezip: Efficient Distributed Tracing via Trace Compression
Zhuangbin Chen, Junsong Pu, and Zibin Zheng

TL;DR
Tracezip compresses traces using a Span Retrieval Tree, reducing the storage and transmission overhead of distributed tracing while keeping traces complete and adding minimal performance cost.
Contribution
The paper presents Tracezip, a new trace compression technique that leverages trace redundancy to improve efficiency in distributed tracing systems.
Findings
Achieves significant performance gains in trace collection.
Maintains trace completeness with negligible overhead.
Compatible with existing tracing APIs through OpenTelemetry integration.
Abstract
Distributed tracing serves as a fundamental building block in the monitoring and testing of cloud service systems. To reduce computational and storage overheads, the de facto practice is to capture fewer traces via sampling. However, existing work faces a trade-off between the completeness of tracing and system overhead. On one hand, head-based sampling indiscriminately selects requests to trace when they enter the system, which may miss critical events. On the other hand, tail-based sampling first captures all requests and then selectively persists the edge-case traces, which entails the overheads related to trace collection and ingestion. Taking a different path, we propose Tracezip in this paper to enhance the efficiency of distributed tracing via trace compression. Our key insight is that there exists significant redundancy among traces, which results in repetitive transmission of…
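To make the redundancy insight concrete, here is a minimal Python sketch. It is not the paper's Span Retrieval Tree: it uses a plain lookup table as a much simpler stand-in, and the span fields (`service`, `name`, `attrs`, `trace_id`, `start_ms`, `duration_ms`) and the function names are hypothetical, chosen only for illustration. The idea it shows is that the fields repeated across many spans can be sent once, while each span carries only a small reference plus its varying values.

```python
def compress(spans):
    """Split spans into a shared table of repeated fields and compact per-span records.

    Assumption: 'service', 'name', and 'attrs' repeat across spans, while
    'trace_id', 'start_ms', and 'duration_ms' vary per span.
    """
    pattern_ids = {}  # repeated-field tuple -> integer id
    table = []        # id -> repeated fields (transmitted once)
    records = []      # per-span records holding only the varying fields
    for span in spans:
        pattern = (span["service"], span["name"], tuple(sorted(span["attrs"].items())))
        if pattern not in pattern_ids:
            pattern_ids[pattern] = len(table)
            table.append(pattern)
        records.append(
            (pattern_ids[pattern], span["trace_id"], span["start_ms"], span["duration_ms"])
        )
    return table, records


def decompress(table, records):
    """Rebuild the full spans from the shared table and the compact records."""
    spans = []
    for pattern_id, trace_id, start_ms, duration_ms in records:
        service, name, attrs = table[pattern_id]
        spans.append({
            "service": service,
            "name": name,
            "attrs": dict(attrs),
            "trace_id": trace_id,
            "start_ms": start_ms,
            "duration_ms": duration_ms,
        })
    return spans


# Hypothetical spans: the first two share every repeated field.
example_spans = [
    {"service": "cart", "name": "GET /items", "attrs": {"http.method": "GET"},
     "trace_id": "t1", "start_ms": 0, "duration_ms": 12},
    {"service": "cart", "name": "GET /items", "attrs": {"http.method": "GET"},
     "trace_id": "t2", "start_ms": 5, "duration_ms": 9},
    {"service": "pay", "name": "POST /charge", "attrs": {"http.method": "POST"},
     "trace_id": "t1", "start_ms": 3, "duration_ms": 40},
]
```

Calling `compress(example_spans)` yields a shared table plus one compact record per span, and `decompress` restores the original spans, so nothing is lost in the round trip. A table like this only captures exact repeats of whole field groups; the paper's tree-based structure is described as the mechanism Tracezip actually uses.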
Taxonomy
Topics: Context-Aware Activity Recognition Systems · Privacy-Preserving Technologies in Data
