Approximate counting with a floating-point counter

Miklos Csuros

arXiv:0904.3062 · cs.DS · August 24, 2009

TL;DR

This paper introduces a floating-point probabilistic counter that efficiently estimates large counts using minimal memory, improving upon Morris's original approximate counting method with a new unbiased estimator and detailed performance analysis.

Contribution

It presents a novel floating-point counter design with a simple unbiased estimator, extending Morris's approximate counting technique with enhanced accuracy and practical formulas for performance assessment.

Findings

01 Uses $d + \lg \lg n$ bits for counting

02 Achieves an unbiased estimate with a standard deviation of about $0.6 \cdot n 2^{-d/2}$

03 Provides a general performance analysis framework
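To make the two formulas above concrete, the following Python sketch evaluates them for sample parameters. The function names `counter_bits` and `relative_std` are hypothetical, and rounding $\lg \lg n$ up to a whole number of bits is an assumption made here for illustration; the source only gives the approximate size $d + \lg \lg n$.

```python
import math


def counter_bits(d, n):
    """Approximate counter size d + lg lg n, rounded up to whole bits (assumption)."""
    return d + math.ceil(math.log2(math.log2(n)))


def relative_std(d):
    """Standard deviation divided by n, using the quoted 0.6 * 2^(-d/2) figure."""
    return 0.6 * 2 ** (-d / 2)


if __name__ == "__main__":
    # Example: an 8-bit significand and counts up to 2^32.
    print(counter_bits(8, 2**32))
    print(relative_std(8))
```

With `d = 8` and counts up to $2^{32}$, this gives 13 bits and a relative standard deviation of about 3.75%, compared with 32 bits for an exact counter.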

Abstract

Memory becomes a limiting factor in contemporary applications, such as analyses of the Webgraph and molecular sequences, when many objects need to be counted simultaneously. Robert Morris [Communications of the ACM, 21:840--842, 1978] proposed a probabilistic technique for approximate counting that is extremely space-efficient. The basic idea is to increment a counter containing the value $X$ with probability $2^{-X}$. As a result, the counter contains an approximation of $\lg n$ after $n$ probabilistic updates, stored in $\lg \lg n$ bits. Here we revisit the original idea of Morris, and introduce a binary floating-point counter that uses a $d$-bit significand in conjunction with a binary exponent. The counter yields a simple formula for an unbiased estimation of $n$ with a standard deviation of about $0.6 \cdot n 2^{-d/2}$, and uses $d + \lg \lg n$ bits. We analyze the floating-point…
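The update rule described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the paper's reference implementation. The class name `FloatingPointCounter`, the packing of exponent `t` and significand `u` into a single integer `x`, and the estimator `(2^d + u) * 2^t - 2^d` are assumptions, chosen so that each call to `increment` raises the estimate by exactly 1 in expectation. With `d = 0` the sketch reduces to Morris's counter with the well-known unbiased estimate `2^X - 1`.

```python
import random


class FloatingPointCounter:
    """Approximate counter with a d-bit significand (illustrative sketch)."""

    def __init__(self, d, rng=None):
        self.d = d
        # Packed state (assumed layout): upper bits hold the exponent t,
        # the lower d bits hold the significand u.
        self.x = 0
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        t = self.x >> self.d
        # Advance the state with probability 2^-t.
        # t random bits are all zero with exactly that probability.
        if t == 0 or self.rng.getrandbits(t) == 0:
            self.x += 1

    def estimate(self):
        m = 1 << self.d
        t = self.x >> self.d
        u = self.x & (m - 1)
        # Each state change at exponent t adds 2^t to this value, which
        # cancels the 2^-t update probability, so the estimate is unbiased.
        return (m + u) * (1 << t) - m


if __name__ == "__main__":
    counter = FloatingPointCounter(d=8, rng=random.Random(1))
    for _ in range(100_000):
        counter.increment()
    print(counter.x, counter.estimate())
```

In this sketch the first $2^d$ increments are counted exactly, because the exponent is still 0 and every update succeeds; the probabilistic behavior only begins once the significand overflows into the exponent.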

Taxonomy

Topics: Algorithms and Data Compression · Computability, Logic, AI Algorithms · Machine Learning and Algorithms