LogLog-Beta and More: A New Algorithm for Cardinality Estimation Based on LogLog Counting
Jason Qin, Denys Kim, Yumei Tung

TL;DR
LogLog-Beta is a simplified, efficient cardinality estimation algorithm based on LogLog counting that requires only one formula and achieves accuracy comparable to or better than HyperLogLog variants.
Contribution
Introduces LogLog-Beta, a single-formula algorithm for cardinality estimation that needs no separate bias corrections and is as accurate as or more accurate than HyperLogLog and HyperLogLog++ in the authors' simulations.
Findings
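In simulations, LogLog-Beta matches or exceeds the accuracy of HyperLogLog and HyperLogLog++.
The algorithm uses one formula across the entire range of cardinalities, so it is simpler and more efficient to implement.
The paper also provides a second one-formula estimator based on order statistics, a modification of an algorithm by Lumbroso.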
Abstract
This paper presents LogLog-Beta, a new algorithm for estimating cardinalities based on LogLog counting. The new algorithm uses only one formula and needs no additional bias corrections for the entire range of cardinalities; it is therefore more efficient and simpler to implement. Our simulations show that its accuracy is as good as or better than that of either HyperLogLog or HyperLogLog++. In addition to LogLog-Beta, we also provide another one-formula estimator for cardinalities based on order statistics, a modification of an algorithm developed by Lumbroso.
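To make the "one formula" idea concrete, here is a minimal Python sketch. It assumes the estimator has the form E = alpha_m * m * (m - z) / (beta(m, z) + sum_i 2^(-M[i])), where M holds the m registers and z is the number of registers still at zero. The abstract does not give the bias term beta(m, z); as far as we understand it, the paper fits it numerically for each precision, so the sketch takes it as a caller-supplied function. The names `build_registers`, `loglog_beta_estimate`, and `zero_beta` are hypothetical, and `zero_beta` is only a placeholder, not the fitted correction.

```python
import hashlib
from typing import Callable, Iterable, List


def alpha(m: int) -> float:
    """Standard HyperLogLog constant, valid for m >= 128 registers."""
    return 0.7213 / (1.0 + 1.079 / m)


def build_registers(items: Iterable[str], p: int) -> List[int]:
    """Fill m = 2**p registers from 64-bit hashes (SHA-256 is an arbitrary choice)."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - p)                       # top p bits select the register
        rest = h & ((1 << (64 - p)) - 1)          # remaining 64 - p bits
        rank = (64 - p) - rest.bit_length() + 1   # position of the leftmost 1 bit
        registers[idx] = max(registers[idx], rank)
    return registers


def loglog_beta_estimate(
    registers: List[int], beta: Callable[[int, int], float]
) -> float:
    """Single-formula estimate; beta(m, z) is the bias term supplied by the caller."""
    m = len(registers)
    z = registers.count(0)                        # number of empty registers
    harmonic = sum(2.0 ** -r for r in registers)
    return alpha(m) * m * (m - z) / (beta(m, z) + harmonic)


def zero_beta(m: int, z: int) -> float:
    """Placeholder only: NOT the paper's fitted correction."""
    return 0.0


if __name__ == "__main__":
    regs = build_registers((f"user-{i}" for i in range(100_000)), p=14)
    # With the placeholder beta this is biased whenever some registers are empty.
    print(round(loglog_beta_estimate(regs, zero_beta)))
```

When no register is empty (z = 0) and beta is zero, the expression reduces to the raw HyperLogLog estimate alpha_m * m^2 / sum_i 2^(-M[i]). The beta term is what lets the same formula cover small cardinalities without switching to a separate correction.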
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Time Series Analysis and Forecasting
