Improved Frequency Estimation Algorithms with and without Predictions
Anders Aamand, Justin Y. Chen, Huy Lê Nguyen, Sandeep Silwal, Ali Vakilian

TL;DR
This paper introduces new frequency estimation algorithms for data streams that outperform existing methods both theoretically and empirically, with or without machine learning predictions, enhancing accuracy in large-scale data analysis.
Contribution
The paper presents a novel algorithm that, in some parameter regimes, theoretically outperforms the earlier learning-augmented method of Hsu et al. (2019) even without using predictions, and whose accuracy improves further when it is combined with heavy-hitter predictions.
Findings
The proposed algorithms outperform prior approaches in all experiments.
Theoretical analysis shows improved error bounds in some regimes.
Augmentation with predictions further reduces estimation error.
Abstract
Estimating frequencies of elements appearing in a data stream is a key task in large-scale data analysis. Popular sketching approaches to this problem (e.g., CountMin and CountSketch) come with worst-case guarantees that probabilistically bound the error of the estimated frequencies for any possible input. The work of Hsu et al. (2019) introduced the idea of using machine learning to tailor sketching algorithms to the specific data distribution they are being run on. In particular, their learning-augmented frequency estimation algorithm uses a learned heavy-hitter oracle that predicts which elements will appear many times in the stream. We give a novel algorithm which, in some parameter regimes, already theoretically outperforms the learning-based algorithm of Hsu et al. without the use of any predictions. Augmenting our algorithm with heavy-hitter predictions further reduces the error…
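For readers unfamiliar with the baselines named in the abstract, the following is a minimal Python sketch of a standard CountMin sketch, plus a simplified version of the learning-augmented idea attributed to Hsu et al. (2019): items flagged by a heavy-hitter oracle get exact counters, and everything else goes into the sketch. It does not implement the new algorithm of this paper. The class names, the `is_heavy` callable standing in for a learned oracle, the salted use of Python's built-in `hash`, and the toy stream are all assumptions made for illustration.

```python
import random
from collections import Counter


class CountMin:
    """Standard CountMin sketch: `depth` rows of `width` counters."""

    def __init__(self, width, depth, seed=0):
        self.width = width
        self.depth = depth
        rng = random.Random(seed)
        # One random salt per row; hashing (salt, item) stands in for
        # independent hash functions (an illustrative shortcut).
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, item):
        return hash((self.salts[row], item)) % self.width

    def update(self, item, count=1):
        for r in range(self.depth):
            self.table[r][self._bucket(r, item)] += count

    def estimate(self, item):
        # Collisions can only add to a counter, so every row overestimates;
        # the minimum over rows is the tightest of these overestimates.
        return min(self.table[r][self._bucket(r, item)]
                   for r in range(self.depth))


class LearnedCountMin:
    """Simplified learning-augmented variant: predicted heavy hitters
    are counted exactly, the remaining items share a CountMin sketch."""

    def __init__(self, width, depth, is_heavy, seed=0):
        self.is_heavy = is_heavy      # hypothetical oracle: item -> bool
        self.exact = Counter()        # dedicated counters for heavy items
        self.sketch = CountMin(width, depth, seed)

    def update(self, item, count=1):
        if self.is_heavy(item):
            self.exact[item] += count
        else:
            self.sketch.update(item, count)

    def estimate(self, item):
        if self.is_heavy(item):
            return self.exact[item]
        return self.sketch.estimate(item)


# Toy stream: two frequent items and a tail of items seen once each.
stream = [1] * 50 + [2] * 30 + list(range(3, 40))
truth = Counter(stream)

lcm = LearnedCountMin(width=16, depth=3, is_heavy=lambda x: x in {1, 2})
for x in stream:
    lcm.update(x)
```

With a perfect oracle, the estimates for the two heavy items are exact, while estimates for tail items can only overestimate the true count. In a real space-constrained setting, the exact counters would also be charged against the memory budget, which this sketch ignores.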
Taxonomy
Topics: Data Stream Mining Techniques · Music and Audio Processing · Time Series Analysis and Forecasting
