Efficient Algorithm for Deterministic Search of Hot Elements

Dariusz R. Kowalski; Dominik Pajak

arXiv:2203.15043·cs.DS·March 30, 2022

TL;DR

This paper introduces a deterministic online algorithm for identifying frequent elements in large data streams with insertions and deletions. It uses near-optimal memory and polylogarithmic time per element, without randomness or multiple passes.

Contribution

It presents the first fully online deterministic algorithm for frequent-element detection with near-optimal memory usage and polylogarithmic time per element, improving over prior randomized or multi-pass methods.

Findings

01

Uses $O(\min(n, \frac{\text{polylog}(n)}{\epsilon}))$ memory

02

Operates in $O(\text{polylog}(n))$ time per element

03

Establishes a lower bound of $\Omega(\min(n, \frac{1}{\epsilon}))$ on memory requirements
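
The memory upper bound and the lower bound listed above differ by at most a polylogarithmic factor. A short derivation, writing $p = \text{polylog}(n)$ and assuming $p \ge 1$ (the symbol $p$ is introduced here only for illustration and does not come from the paper):

$$
\min\left(n, \frac{p}{\epsilon}\right) \le p \cdot \min\left(n, \frac{1}{\epsilon}\right)
$$

If $1/\epsilon \le n$, the right-hand side equals $p/\epsilon$, which is at least the left-hand side. Otherwise the right-hand side equals $p \cdot n \ge n$, which is again at least the left-hand side.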

Abstract

When facing a very large stream of data, it is often desirable to extract the most important statistics online, in a short time and using small memory. For example, one may want to quickly find the most influential users generating posts online or check if the stream contains many identical elements. In this paper, we study streams containing insertions and deletions of elements from a possibly large set $N$ of size $|N| = n$, processed by online deterministic algorithms. At any point in the stream the algorithm may be queried to output elements of a certain frequency in the already processed stream, more precisely, the most frequent elements in the stream so far. The output is considered correct if it contains all elements with frequency greater than a given parameter $\varphi$ and no element with frequency smaller than $\varphi - \epsilon$. We present an…
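
To make the correctness condition above concrete, here is a minimal Python sketch of a checker for a query answer. It is not the paper's algorithm: it computes exact frequencies by brute-force counting, which uses memory linear in the number of distinct elements. The names `exact_frequencies` and `is_correct_output`, the `("+", x)` / `("-", x)` encoding of operations, and the reading of "frequency" as an element's net count divided by the total net count are all assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, Set, Tuple

# Assumed encoding: ("+", x) is an insertion of x, ("-", x) is a deletion of x.
Op = Tuple[str, Hashable]


def exact_frequencies(stream: Iterable[Op]) -> dict:
    """Exact relative frequencies by full counting (brute-force baseline).

    Assumption: frequency of x = net count of x / total net count,
    and the stream never deletes an element that is not present.
    """
    counts = Counter()
    for sign, x in stream:
        counts[x] += 1 if sign == "+" else -1
    total = sum(counts.values())
    if total <= 0:
        return {}
    return {x: c / total for x, c in counts.items() if c > 0}


def is_correct_output(output: Set[Hashable], freqs: dict,
                      phi: float, eps: float) -> bool:
    """Check the condition from the abstract for one query answer."""
    # Every element with frequency greater than phi must be reported.
    has_all_hot = all(x in output for x, f in freqs.items() if f > phi)
    # No reported element may have frequency smaller than phi - eps.
    # Elements absent from freqs are treated as having frequency 0.
    has_no_cold = all(freqs.get(x, 0.0) >= phi - eps for x in output)
    return has_all_hot and has_no_cold


# Example stream: net counts a=6, b=3, c=1 (c inserted twice, deleted once).
stream = ([("+", "a")] * 6 + [("+", "b")] * 3
          + [("+", "c")] * 2 + [("-", "c")])
freqs = exact_frequencies(stream)
print(is_correct_output({"a"}, freqs, phi=0.4, eps=0.2))
print(is_correct_output({"a", "c"}, freqs, phi=0.4, eps=0.2))
```

Elements whose frequency lies between $\varphi - \epsilon$ and $\varphi$ (here `"b"`) may be reported or omitted, so several different answers can be correct for the same stream.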

Taxonomy

Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Complexity and Algorithms in Graphs