An Iconic Heavy Hitter Algorithm Made Private
Rayne Holland

TL;DR
This paper introduces a differentially private version of the SpaceSaving algorithm for heavy hitter detection in data streams, achieving strong privacy guarantees while maintaining high utility and efficiency.
Contribution
It presents the first private adaptation of SpaceSaving, introduces a novel generic method for privately extracting heavy hitters from frequency oracles, and reports better empirical utility than private Misra-Gries.
Findings
Private SpaceSaving outperforms private Misra-Gries in utility.
The generic method efficiently extracts heavy hitters with minimal additional memory.
Experimental results confirm practical effectiveness under various privacy settings.
Abstract
Identifying heavy hitters in data streams is a fundamental problem with widespread applications in modern analytics systems. These streams are often derived from sensitive user activity, making update-level privacy guarantees necessary. While recent work has adapted the classical heavy hitter algorithm Misra-Gries to satisfy differential privacy in the streaming model, the privatization of other heavy hitter algorithms with better empirical utility is absent. Under this observation, we present the first differentially private variant of the SpaceSaving algorithm, which, in the non-private setting, is regarded as the state-of-the-art in practice. Our construction post-processes a non-private SpaceSaving summary by injecting asymptotically optimal noise and applying a carefully calibrated selection rule that suppresses unstable labels. This yields strong privacy guarantees while…
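To make the pipeline in the abstract concrete, the following is a minimal sketch of its overall shape: a classical, non-private SpaceSaving summary, followed by a post-processing step that adds noise to each counter and releases only labels whose noisy count clears a threshold. The function names (`space_saving`, `noisy_release`), the use of Laplace noise, and the parameters `noise_scale` and `threshold` are assumptions made for illustration. The paper's actual noise distribution, calibration, and selection rule are not given in this summary, so this sketch carries no privacy guarantee of its own.

```python
import random
from typing import Dict, Hashable, Iterable


def space_saving(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Classical (non-private) SpaceSaving summary with at most k counters."""
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Summary is full: evict a minimum-count label and let the
            # newcomer inherit that count plus one (an overestimate).
            victim = min(counters, key=counters.get)
            min_count = counters.pop(victim)
            counters[item] = min_count + 1
    return counters


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_release(
    counters: Dict[Hashable, int],
    noise_scale: float,   # hypothetical placeholder, not the paper's calibration
    threshold: float,     # hypothetical placeholder, not the paper's selection rule
    rng: random.Random,
) -> Dict[Hashable, float]:
    """Add noise to each counter and keep only labels above the threshold."""
    released: Dict[Hashable, float] = {}
    for label, count in counters.items():
        noisy = count + laplace(noise_scale, rng)
        if noisy >= threshold:
            released[label] = noisy
    return released


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a", "b", "a", "d", "a", "e"]
    summary = space_saving(stream, k=3)
    print(noisy_release(summary, noise_scale=1.0, threshold=4.0, rng=random.Random(0)))
```

The linear scan for the minimum counter keeps the sketch short; practical SpaceSaving implementations usually keep the counters in a structure that supports constant-time minimum lookup. The thresholding step is where low-count labels, which can change when a single update changes, would be suppressed.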
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Stream Mining Techniques · Imbalanced Data Classification Techniques
