SQUID: Faster Analytics via Sampled Quantile Estimation
Ran Ben-Basat, Gil Einziger, Wenchen Han, Bilal Tayh

TL;DR
SQUID introduces a sampling-based quantile estimation method that accelerates large-scale streaming analytics, improves heavy hitter detection accuracy, and enhances network caching policies with practical hardware implementation.
Contribution
The paper presents SQUID, a novel sampling approach for quantile estimation that improves speed and accuracy in streaming analytics and enables advanced caching policies in hardware.
Findings
Up to 6.6x faster software performance compared to state-of-the-art methods.
Enhanced accuracy in weighted heavy hitter detection.
Higher cache hit ratios than existing policies in hardware experiments.
Abstract
Streaming algorithms are fundamental in the analysis of large and online datasets. A key component of many such analytic tasks is q-MAX, which finds the largest q values in a number stream. Modern approaches attain a constant runtime by removing small items in bulk and retaining the largest items at all times. Yet, these approaches are bottlenecked by an expensive quantile calculation. This work introduces a quantile-sampling approach called SQUID and shows its benefits in multiple analytic tasks. Using this approach, we design a novel weighted heavy hitters data structure that is faster and more accurate than the existing alternatives. We also show SQUID's practicality for improving network-assisted caching systems with a hardware-based cache prototype that uses SQUID to implement the cache policy. The challenge here is that the switch's dataplane does not allow the general…
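The idea of bulk eviction driven by a sampled quantile can be illustrated with a minimal Python sketch. This is not the paper's algorithm or code: the class name `SampledQMax`, the parameters `slack` and `sample_size`, the conservative pivot rule, and the exact-sort fallback are all assumptions made for illustration. The sketch buffers more than q items, and when the buffer fills it picks a pivot from a small random sample instead of computing an exact quantile. It then checks that at least q items are at or above the pivot before discarding the rest, so the q largest values are never lost.

```python
import random


class SampledQMax:
    """Illustrative q-MAX: keep the q largest values of a stream.

    Hypothetical sketch; names and parameters are not from the paper.
    """

    def __init__(self, q, slack=1.0, sample_size=32, seed=None):
        self.q = q
        # Buffer holds more than q items so evictions can happen in bulk.
        self.capacity = max(q + 1, int(q * (1 + slack)))
        self.sample_size = sample_size
        self.rng = random.Random(seed)
        self.buf = []
        self.threshold = float("-inf")

    def add(self, x):
        # Anything below the current threshold cannot be among the top q.
        if x < self.threshold:
            return
        self.buf.append(x)
        if len(self.buf) >= self.capacity:
            self._evict()

    def _evict(self):
        n = len(self.buf)
        sample = sorted(self.rng.sample(self.buf, min(self.sample_size, n)))
        # Up to (1 - q/n) of the buffer may be discarded; aim for half of
        # that fraction so the sampled pivot rarely cuts into the top q.
        idx = int((1 - self.q / n) * len(sample) * 0.5)
        pivot = sample[idx]
        kept = [v for v in self.buf if v >= pivot]
        if len(kept) < self.q or len(kept) == n:
            # Pivot was too aggressive or made no progress: fall back to
            # an exact selection of the q largest values.
            kept = sorted(self.buf, reverse=True)[: self.q]
            pivot = kept[-1]
        self.buf = kept
        self.threshold = pivot

    def top(self):
        """Return the q largest values seen so far, largest first."""
        return sorted(self.buf, reverse=True)[: self.q]


if __name__ == "__main__":
    rng = random.Random(0)
    qmax = SampledQMax(q=5, seed=0)
    for _ in range(1000):
        qmax.add(rng.randint(0, 10_000))
    print(qmax.top())
```

The verification step trades a linear pass over the buffer for a guarantee of exact results. A design that tolerates a small failure probability could skip the check, but the sketch above does not attempt to reproduce the paper's analysis of that trade-off.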
Taxonomy
Topics: Software System Performance and Reliability · Network Packet Processing and Optimization · Cloud Computing and Resource Management
