Straggler Identification in Round-Trip Data Streams via Newton's Identities and Invertible Bloom Filters
David Eppstein, Michael T. Goodrich

TL;DR
This paper presents deterministic and randomized algorithms for identifying remaining set members after many insertions and deletions in data streams, using Newton's identities and a novel invertible Bloom filter, with applications in network data management.
Contribution
It introduces a deterministic solution based on Newton's identities and a randomized invertible Bloom filter for efficient straggler identification in data streams.
Findings
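The deterministic solution uses O(d log n) bits, assuming there are no false deletions (deletions of identities not currently in the set).
The randomized solution tolerates false deletions and uses O(d log n log(1/ε)) bits, where ε > 0 bounds the probability of an incorrect response.
A lower bound shows that no small-space deterministic solution can be guaranteed to handle false deletions.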
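To make the randomized finding more concrete, here is a minimal Python sketch of an invertible-Bloom-filter-style table. It is an illustration under stated assumptions, not the authors' construction: the class name `InvertibleBloomFilter`, the helper `_h`, the use of SHA-256, the three-field cell layout, and the split into `k` sub-tables are all choices made for this example. Each identifier is added to (or subtracted from) `k` cells, and stragglers are recovered by repeatedly "peeling" cells that hold exactly one identifier. Recovery succeeds only with high probability, which is why the method also reports whether the table was fully emptied.

```python
import hashlib

def _h(x, salt):
    # Hypothetical helper: a deterministic 64-bit hash of (salt, x).
    digest = hashlib.sha256(f"{salt}:{x}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

class InvertibleBloomFilter:
    """Illustrative sketch only; the parameters are assumptions."""

    def __init__(self, cells_per_table=64, k=3):
        self.m = cells_per_table
        self.k = k
        # Each cell stores [count, id_sum, hash_sum].
        self.cells = [[0, 0, 0] for _ in range(k * cells_per_table)]

    def _indices(self, x):
        # One cell in each of the k sub-tables, so the k cells are distinct.
        return [j * self.m + _h(x, j) % self.m for j in range(self.k)]

    def _update(self, x, sign):
        check = _h(x, "check")
        for i in self._indices(x):
            cell = self.cells[i]
            cell[0] += sign
            cell[1] += sign * x
            cell[2] += sign * check

    def insert(self, x):
        self._update(x, +1)

    def delete(self, x):
        self._update(x, -1)

    def list_stragglers(self):
        """Return (sorted identifiers found, whether the table emptied)."""
        saved = [c[:] for c in self.cells]  # peeling is destructive; restore later
        found = []
        progress = True
        while progress:
            progress = False
            for cell in self.cells:
                count, id_sum, hash_sum = cell
                # A cell is "pure" if it appears to hold exactly one identifier;
                # the hash check guards against coincidental count == 1.
                if count == 1 and _h(id_sum, "check") == hash_sum:
                    found.append(id_sum)
                    self._update(id_sum, -1)
                    progress = True
        complete = all(c == [0, 0, 0] for c in self.cells)
        self.cells = saved
        return sorted(found), complete
```

The `hash_sum` field is what lets a cell be tested for purity even when false deletions have disturbed its `count`; this sketch only peels cells with `count == 1` and does not attempt to recover the falsely deleted identities themselves.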
Abstract
We introduce the straggler identification problem, in which an algorithm must determine the identities of the remaining members of a set after it has had a large number of insertion and deletion operations performed on it, and now has relatively few remaining members. The goal is to do this in o(n) space, where n is the total number of identities. The straggler identification problem has applications, for example, in determining the set of unacknowledged packets in a high-bandwidth multicast data stream. We provide a deterministic solution to the straggler identification problem that uses only O(d log n) bits and is based on a novel application of Newton's identities for symmetric polynomials. This solution can identify any subset of d stragglers from a set of n O(log n)-bit identifiers, assuming that there are no false deletions of identities not already in the set. Indeed, we give a…
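The deterministic idea in the abstract can be illustrated with a short Python sketch. It assumes a summary made of the power sums of the current set over a prime field, which takes d counters of O(log n) bits each. Newton's identities then convert those power sums into the coefficients of the polynomial whose roots are the stragglers. The class name `PowerSumSketch`, the modulus `P`, and the brute-force root search over the universe are assumptions made for this example and are not necessarily the authors' procedure.

```python
P = 2_147_483_647  # prime modulus (2^31 - 1); assumed larger than every identifier and than d

class PowerSumSketch:
    """Illustrative sketch: tracks sum of x^k mod P for k = 1..d."""

    def __init__(self, d):
        self.d = d
        self.count = 0
        self.sums = [0] * (d + 1)  # sums[k] = sum of x^k mod P; index 0 unused

    def _update(self, x, sign):
        self.count += sign
        power = 1
        for k in range(1, self.d + 1):
            power = power * x % P
            self.sums[k] = (self.sums[k] + sign * power) % P

    def insert(self, x):
        self._update(x, +1)

    def delete(self, x):
        # Assumes x is currently in the set (no false deletions).
        self._update(x, -1)

    def stragglers(self, universe):
        c = self.count
        if not 0 <= c <= self.d:
            raise ValueError("number of stragglers must be between 0 and d")
        # Newton's identities: k * e_k = sum_{i=1..k} (-1)^(i-1) * e_{k-i} * p_i
        e = [1] + [0] * c
        for k in range(1, c + 1):
            acc = 0
            for i in range(1, k + 1):
                term = e[k - i] * self.sums[i]
                acc += term if i % 2 == 1 else -term
            e[k] = acc % P * pow(k, -1, P) % P  # divide by k modulo P
        # prod (z - x_i) = sum_k (-1)^k * e_k * z^(c-k), highest degree first
        coeffs = [(-1) ** k * e[k] % P for k in range(c + 1)]

        def evaluate(z):
            acc = 0
            for a in coeffs:  # Horner's rule
                acc = (acc * z + a) % P
            return acc

        # Brute-force root search over the candidate identifiers.
        return [x for x in universe if evaluate(x) == 0]


sketch = PowerSumSketch(d=4)
for x in range(1, 1001):
    sketch.insert(x)
for x in range(1, 1001):
    if x not in (7, 42, 999):
        sketch.delete(x)
remaining = sketch.stragglers(range(1, 1001))  # [7, 42, 999]
```

Because every deleted identifier was previously inserted, its contributions to the power sums cancel exactly, so the summary depends only on the stragglers. A false deletion would break that cancellation, which is consistent with the abstract's restriction on the deterministic solution.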
Taxonomy
Topics: Caching and Content Delivery · Internet Traffic Analysis and Secure E-voting · Covalent Organic Framework Applications
