Gleaner: A Semantically-Rich and Efficient Online Sampler for Microservice Diagnostics
Yifan Yang (1), Aoyang FANG (1), Songhan Zhang (1), Pinjia He (1) ((1) The Chinese University of Hong Kong, Shenzhen)

TL;DR
Gleaner is an efficient online trace sampler for microservice diagnostics. It represents each trace as a set-based bag of edges, which enables fast processing, higher sampling fidelity, more accurate root cause analysis (RCA), and better anomaly detection.
Contribution
Gleaner introduces a novel set-based trace representation and sampling strategy that replaces graph analysis, enabling high-fidelity, real-time trace sampling for microservice diagnostics.
Findings
Gleaner processes each trace in 0.74 ms, enabling real-time sampling.
It improves trace pattern coverage by up to 128.7%.
At a 1% sampling rate, it boosts RCA accuracy by 42% to 107%.
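To put the per-trace latency in context, the short sketch below converts it into an approximate throughput. It assumes the reported 0.74 ms is the full per-trace cost and that traces are processed one after another on a single worker; the function name and the `workers` parameter are illustrative and do not come from the paper.

```python
PER_TRACE_MS = 0.74  # per-trace processing time reported in the findings


def traces_per_second(per_trace_ms: float, workers: int = 1) -> float:
    """Upper-bound throughput, assuming sequential processing per worker
    and perfect scaling across workers (both are simplifying assumptions)."""
    return workers * 1000.0 / per_trace_ms


single = traces_per_second(PER_TRACE_MS)
print(f"{single:.0f} traces/s per worker")  # prints "1351 traces/s per worker"
```

Under these assumptions a single worker handles on the order of 1,350 traces per second; real deployments would see lower figures once I/O and queuing are included.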
Abstract
Distributed tracing in microservices is critical for diagnostics but generates overwhelming data volumes, necessitating intelligent sampling. To maximize fidelity, state-of-the-art (SOTA) tail-based samplers analyze complete (or even log-enriched) traces by modeling them as graphs. However, this reliance on computationally expensive graph analysis creates a performance bottleneck that prohibits their use in online settings. To this end, we propose Gleaner, an online tail-sampling framework that breaks this trade-off. It is founded on the key insight that explicit graph structures are unnecessary for high-fidelity trace grouping. Instead, Gleaner represents each trace as a "bag-of-edges" augmented with log semantics, replacing slow graph algorithms with highly efficient set-based operations. It also employs an alarm-driven quota and a diversity-preserving strategy to prioritize…
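The bag-of-edges idea can be illustrated with a minimal sketch. This is not Gleaner's implementation: the `Span` fields, the `edge_bag` and `jaccard` helpers, the use of a plain set rather than a multiset, and the choice of Jaccard similarity for comparing traces are all assumptions made for illustration. The sketch only shows how a trace can be flattened into caller-to-callee edges plus log tokens and then compared with set operations instead of graph algorithms.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Span:
    """Hypothetical minimal span record; field names are assumptions."""
    span_id: str
    parent_id: Optional[str]
    service: str
    operation: str
    log_templates: Tuple[str, ...] = ()


def edge_bag(spans: Iterable[Span]) -> FrozenSet[tuple]:
    """Flatten a trace into a set of caller->callee edges plus log tokens."""
    spans = list(spans)
    by_id = {s.span_id: s for s in spans}
    bag = set()
    for s in spans:
        parent = by_id.get(s.parent_id) if s.parent_id else None
        caller = f"{parent.service}:{parent.operation}" if parent else "ROOT"
        callee = f"{s.service}:{s.operation}"
        bag.add(("edge", caller, callee))
        # Attach log semantics as extra set elements keyed by the callee.
        for template in s.log_templates:
            bag.add(("log", callee, template))
    return frozenset(bag)


def jaccard(a: FrozenSet[tuple], b: FrozenSet[tuple]) -> float:
    """Set-based similarity between two traces (assumed metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


trace_a = [
    Span("1", None, "frontend", "GET /cart"),
    Span("2", "1", "cart", "GetCart"),
    Span("3", "2", "redis", "GET"),
]
# Same call structure, but the last span carries an error log template.
trace_b = [
    Span("1", None, "frontend", "GET /cart"),
    Span("2", "1", "cart", "GetCart"),
    Span("3", "2", "redis", "GET", ("timeout after <*> ms",)),
]

print(jaccard(edge_bag(trace_a), edge_bag(trace_b)))  # 0.75
```

The two traces share all three edges and differ only in one log token, so the similarity is 3/4. Building and comparing the sets takes time linear in the number of spans and log tokens, which is the kind of cost profile that makes set-based grouping attractive for online use.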
