SQUAD: Combining Sketching and Sampling Is Better than Either for Per-item Quantile Estimation
Rana Shahout, Roy Friedman, Ran Ben Basat

TL;DR
This paper introduces SQUAD, a method that combines sketching and sampling to estimate per-item quantiles in data streams, with better space complexity and accuracy than existing methods.
Contribution
The paper proposes a new algorithm, SQUAD, that integrates sampling with sketching for per-item quantile estimation, providing deterministic error guarantees and improved space efficiency.
Findings
SQUAD outperforms existing methods in space complexity.
The combined approach improves accuracy in quantile estimation.
Extensive simulations validate the effectiveness of SQUAD.
Abstract
Stream monitoring is fundamental in many data stream applications, such as financial data trackers, security, anomaly detection, and load balancing. In that respect, quantiles are of particular interest, as they often capture the user's utility. For example, if a video connection has high tail latency, the perceived quality will suffer, even if the average and median latencies are low. In this work, we consider the problem of approximating the per-item quantiles. Elements in our stream are (ID, latency) tuples, and we wish to track the latency quantiles for each ID. Existing quantile sketches are designed for a single number stream (e.g., containing just the latency). While one could allocate a separate sketch instance for each ID, this may require an infeasible amount of memory. Instead, we consider tracking the quantiles for the heavy hitters (most frequent items), which are often…
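To make the problem setting concrete, here is a minimal Python sketch of the naive exact baseline: it stores every latency for every ID and answers quantile queries by sorting. This is not SQUAD and not any of the paper's algorithms. The class name `PerItemQuantiles`, its methods, and the nearest-rank quantile definition are assumptions chosen for illustration. The point is only to show what a per-item quantile query looks like and why memory grows with the stream when nothing is summarized.

```python
import math
from collections import defaultdict


class PerItemQuantiles:
    """Hypothetical exact baseline: keeps every latency seen for every ID."""

    def __init__(self):
        # Maps each ID to the full list of latencies observed for it.
        self._latencies = defaultdict(list)

    def update(self, item_id, latency):
        """Process one (ID, latency) tuple from the stream."""
        self._latencies[item_id].append(latency)

    def quantile(self, item_id, q):
        """Return the q-quantile (0 < q <= 1) for item_id, nearest-rank definition."""
        values = self._latencies.get(item_id)
        if not values:
            raise KeyError(item_id)
        ordered = sorted(values)
        rank = max(1, math.ceil(q * len(ordered)))
        return ordered[rank - 1]

    def stored_values(self):
        """Number of latencies held in memory; equals the stream length."""
        return sum(len(v) for v in self._latencies.values())


# Usage: one connection with a low median but a high tail latency.
tracker = PerItemQuantiles()
for latency in [10, 11, 12, 10, 11, 12, 10, 11, 12, 300]:
    tracker.update("video-a", latency)
tracker.update("video-b", 5)

median = tracker.quantile("video-a", 0.5)   # 11
tail = tracker.quantile("video-a", 0.99)    # 300
```

The median for `video-a` is low while its 0.99-quantile is high, which is the tail-latency situation the abstract describes. Because this baseline keeps one value per stream element, its memory footprint is the cost that per-ID sketches reduce, and that restricting attention to heavy hitters reduces further.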
Taxonomy
Topics: Data Stream Mining Techniques · Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting
