On Approximating Frequency Moments of Data Streams with Skewed Projections
Ping Li

TL;DR
This paper introduces skewed stable random projections for more accurate approximation of frequency moments in data streams, especially when p is close to 1, improving upon previous symmetric methods.
Contribution
First proposal of skewed stable random projections with new estimators and theoretical bounds for frequency moment approximation in data streams.
Findings
Significant improvement over previous methods for p near 1
Applicable to various data stream models including insertion-only and non-negative streams
Provides statistical estimators with variance and error bounds
Abstract
We propose skewed stable random projections for approximating the pth frequency moments of dynamic data streams (0 < p <= 2), a problem that has been frequently studied in the theoretical computer science and database communities. Our method significantly (or even infinitely when p -> 1) improves on previous methods based on (symmetric) stable random projections. The proposed method is applicable to data streams that are (a) insertion-only (the cash-register model), (b) always non-negative (the strict Turnstile model), or (c) eventually non-negative at check points. This is only a minor restriction for practical applications. Our method works particularly well when p = 1 +/- \Delta and \Delta is small, which is a practically important scenario. For example, \Delta may be a decay rate or an interest rate, which are usually small. Of course, when \Delta = 0, one can compute the first frequency moment (i.e.,…
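To make the quantity being approximated concrete, here is a minimal sketch of the exact (non-sketched) baseline, not the paper's skewed-projection estimator. It assumes the usual stream model in which each update is an (index, increment) pair and the pth frequency moment is the sum of the pth powers of the accumulated entries. The function names `frequency_moment` and `first_moment_counter` and the toy stream are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def frequency_moment(stream, p):
    """Exact F_p = sum_i A[i]**p after applying every (index, increment) update.

    Assumes the entries are non-negative at the time of evaluation, matching
    the restriction described in the abstract. Storage grows with the number
    of distinct indices, which is what sketching methods aim to avoid.
    """
    counts = defaultdict(float)
    for index, increment in stream:
        counts[index] += increment
    if any(value < 0 for value in counts.values()):
        raise ValueError("entries must be non-negative at evaluation time")
    # Zero entries contribute nothing for p > 0, so they are skipped.
    return sum(value ** p for value in counts.values() if value > 0)


def first_moment_counter(stream):
    """For p = 1 and non-negative entries, one running counter is enough:
    the sum of the entries equals the sum of all increments."""
    total = 0.0
    for _, increment in stream:
        total += increment
    return total


# Hypothetical toy stream: deletions are allowed as long as entries stay >= 0.
toy_stream = [(0, 3), (1, 5), (0, -1), (2, 4)]
exact_f1 = frequency_moment(toy_stream, 1.0)
counter_f1 = first_moment_counter(toy_stream)
nearby_fp = frequency_moment(toy_stream, 1.05)  # p = 1 + Delta, Delta = 0.05
```

For p slightly different from 1 the single-counter shortcut no longer applies, because the pth power is taken per entry; this is the regime the abstract highlights for the proposed method.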
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Peer-to-Peer Network Technologies
