Count-Min Sketch with Conservative Updates: Worst-Case Analysis
Younes Ben Mazziane, Othmane Marfoq

TL;DR
This paper gives a worst-case analysis of Count-Min Sketch with Conservative Updates (CMS-CU). It argues that the estimation error is largest when each item appears at most once in the stream, and derives novel lower and upper bounds on the average estimation error in that setting, with bounds that involve Markov processes.
Contribution
The paper introduces new lower and upper bounds on the average estimation error of CMS-CU, analyzes its worst-case behavior, and compares it with the vanilla Count-Min Sketch, including almost-sure convergence results for the case d = m-1.
Findings
The average estimation error converges almost surely to 0.5 when d = m-1.
The bounds on the average estimation error are tight for small values of g.
For large m, the lower and upper bounds coincide when d = m-1 and g = 1.
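The d = m-1 case can be explored numerically with a small simulation. The sketch below is illustrative and not the authors' code. It assumes idealised hashing (each never-seen item hits d distinct counters chosen uniformly at random) and takes the counter rate to be the mean counter value divided by the stream length; both are assumptions made here. The function name `counter_rates` is hypothetical.

```python
import random


def counter_rates(m, steps, seed=0):
    """Feed `steps` distinct items to CMS-CU and vanilla CMS with d = m - 1.

    Returns (cu_rate, vanilla_rate), where a rate is the mean counter
    value divided by the stream length (normalisation assumed here).
    """
    rng = random.Random(seed)
    d = m - 1
    cu = [0] * m
    vanilla = [0] * m
    for _ in range(steps):
        # Idealised hashing (assumption): a new item maps to d distinct counters.
        slots = rng.sample(range(m), d)
        # Conservative update: only raise counters that are below min + 1.
        target = min(cu[i] for i in slots) + 1
        for i in slots:
            vanilla[i] += 1  # vanilla CMS increments every mapped counter
            if cu[i] < target:
                cu[i] = target
    return sum(cu) / (m * steps), sum(vanilla) / (m * steps)


cu_rate, vanilla_rate = counter_rates(m=5, steps=20000)
print(f"CMS-CU rate: {cu_rate:.3f}  vanilla rate: {vanilla_rate:.3f}")
```

Under these assumptions the vanilla rate is exactly d/m, since every item increments d counters, while the CMS-CU rate is an empirical value to compare against 0.5.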
Abstract
Count-Min Sketch with Conservative Updates (CMS-CU) is a memory-efficient hash-based data structure used to estimate the occurrences of items within a data stream. CMS-CU stores m counters and employs d hash functions to map items to these counters. We first argue that the estimation error in CMS-CU is maximal when each item appears at most once in the stream. Next, we study CMS-CU in this setting. In the case where d = m-1, we prove that the average estimation error and the average counter rate converge almost surely to 1/2, contrasting with the vanilla Count-Min Sketch, where the average counter rate is equal to (m-1)/m. For any given m and d, we prove novel lower and upper bounds on the average estimation error, incorporating a positive integer parameter g. Larger values of this parameter improve the accuracy of the bounds. Moreover, the computation of…
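As a rough illustration of the data structure the abstract describes, here is a minimal Python sketch of a conservative update. It is not the authors' implementation. It assumes a single array of m counters and simulates the d hash functions with a per-item seeded random generator that picks d distinct counters; the class name `ConservativeCountMin` and its methods are hypothetical.

```python
import random


class ConservativeCountMin:
    """Toy CMS-CU over one array of m counters (layout assumed here)."""

    def __init__(self, m, d, seed=0):
        if not 1 <= d <= m:
            raise ValueError("need 1 <= d <= m")
        self.m, self.d, self.seed = m, d, seed
        self.counters = [0] * m

    def _slots(self, item):
        # Idealised hashing (assumption): a generator seeded by the item
        # deterministically selects d distinct counters.
        rng = random.Random(f"{self.seed}:{item!r}")
        return rng.sample(range(self.m), self.d)

    def update(self, item):
        slots = self._slots(item)
        # Conservative update: raise only the counters below (current min + 1).
        target = min(self.counters[i] for i in slots) + 1
        for i in slots:
            if self.counters[i] < target:
                self.counters[i] = target

    def estimate(self, item):
        # The estimate is the smallest counter the item maps to.
        return min(self.counters[i] for i in self._slots(item))


sketch = ConservativeCountMin(m=16, d=3)
for word in ["a", "b", "a", "c", "a"]:
    sketch.update(word)
print(sketch.estimate("a"))  # at least 3: the estimate never undercounts
```

A vanilla Count-Min Sketch would instead increment every mapped counter by one on each update, which is why its counters grow at least as fast as those of CMS-CU.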
Taxonomy
Topics: Data Quality and Management · Advanced Database Systems and Queries · Algorithms and Data Compression
