
TL;DR
Compressed Counting (CC) is a method for estimating the pth frequency moment of a data stream. It is most efficient near p=1, where skewed stable random projections reduce the sample complexity.
Contribution
It proposes skewed stable random projections, a technique for computing frequency moments of data streams with low sample complexity.
Findings
As p approaches 1, the sample complexity (number of projections) is O(1/ε).
Applies to Turnstile data streams, provided that A_t[i] >= 0 at the evaluation time t; the strict Turnstile model is a special case.
Provides algorithms for estimating logarithmic norms and distances.
Abstract
Counting is among the most fundamental operations in computing. For example, counting the pth frequency moment has been a very active area of research in theoretical computer science, databases, and data mining. When p=1, the task (i.e., counting the sum) can be accomplished using a simple counter. Compressed Counting (CC) is proposed for efficiently computing the pth frequency moment of a data stream signal A_t, where 0<p<=2. CC is applicable if the streaming data follow the Turnstile model, with the restriction that at the evaluation time t, A_t[i]>=0, which includes the strict Turnstile model as a special case. For natural data streams encountered in practice, this restriction is minor. The underlying technique for CC is what we call skewed stable random projections, which captures the intuition that, when p=1, a simple counter suffices, and when p = 1±Δ with small…
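To make the setting concrete, the following is a minimal Python sketch of the Turnstile model and of the exact pth frequency moment, F_(p) = sum_i A_t[i]^p. It is not the CC algorithm and uses no random projections: it stores the full signal A_t, which is the cost a streaming method aims to avoid. The function names (`apply_turnstile`, `frequency_moment`) and the sample updates are hypothetical, chosen only for illustration. It shows why a single counter suffices at p=1: when every A_t[i] >= 0, the sum of all increments equals the first moment.

```python
from collections import defaultdict


def apply_turnstile(updates):
    """Apply Turnstile updates (i, delta), meaning A[i] += delta.

    Returns the final signal A (illustration only; a real stream would
    not be stored) and a single running counter of all increments.
    """
    A = defaultdict(int)
    counter = 0
    for i, delta in updates:
        A[i] += delta      # Turnstile model: increments may be negative
        counter += delta   # the "simple counter" that suffices at p = 1
    return dict(A), counter


def frequency_moment(A, p):
    """Exact pth frequency moment, assuming A[i] >= 0 at evaluation time."""
    if any(v < 0 for v in A.values()):
        raise ValueError("requires A[i] >= 0 at the evaluation time")
    return sum(v ** p for v in A.values() if v > 0)


# Hypothetical stream: insertions and deletions on three items.
updates = [(0, 3), (1, 5), (0, -2), (2, 4), (1, -1)]
A, counter = apply_turnstile(updates)

first_moment = frequency_moment(A, 1)   # equals the counter
second_moment = frequency_moment(A, 2)
```

For p other than 1 the exact computation needs every entry of A_t, which is the gap that projection-based estimators are meant to close.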
Taxonomy
Topics: Data Management and Algorithms · Data Stream Mining Techniques · Gaussian Processes and Bayesian Inference
