Perfect $L_p$ Sampling in a Data Stream
Rajesh Jayaram, David P. Woodruff

TL;DR
This paper presents the first perfect $L_p$ sampling algorithm in a data stream for $p \in (0, 2)$, achieving optimal space complexity along with a low-overhead derandomization, and resolving longstanding open questions in streaming algorithms.
Contribution
It introduces a perfect $L_p$ sampler with optimal space complexity for $p \in (0, 2)$, and provides a general derandomization method for linear sketches.
Findings
Achieves $O(\log^2 n \log \frac{1}{\delta})$ bits of space for $p \in (0, 2)$
Matches prior bounds for $p = 2$ without dependence on $\nu$
Can be derandomized with only a $(\log \log n)^2$ space blow-up
Abstract
In this paper, we resolve the one-pass space complexity of $L_p$ sampling for $p \in (0, 2)$. Given a stream of updates (insertions and deletions) to the coordinates of an underlying vector $f \in \mathbb{R}^n$, a perfect $L_p$ sampler must output an index $i$ with probability $|f_i|^p / \|f\|_p^p$, and is allowed to fail with some probability $\delta$. So far, for $p > 0$ no algorithm has been shown to solve the problem exactly using $\text{poly}(\log n)$-bits of space. In 2010, Monemizadeh and Woodruff introduced an approximate $L_p$ sampler, which outputs $i$ with probability $(1 \pm \nu) |f_i|^p / \|f\|_p^p$, using space polynomial in $\nu^{-1}$ and $\log(n)$. The space complexity was later reduced by Jowhari, Sağlam, and Tardos to roughly $O(\nu^{-p} \log^2 n \log \delta^{-1})$ for $p \in (0, 2)$, which tightly matches the $\Omega(\log^2 n \log \delta^{-1})$ lower bound in terms of…
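To make the sampling target concrete, here is a minimal offline sketch in Python of the distribution a perfect $L_p$ sampler must reproduce. It is not the paper's algorithm: it stores the whole vector in $O(n)$ space, whereas the paper's contribution is achieving the same output distribution in polylogarithmic space. The function names (`apply_stream`, `lp_distribution`, `sample_lp_offline`) and the example updates are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def apply_stream(updates):
    """Accumulate (index, delta) updates into the underlying vector f.

    Deltas may be negative (deletions). Coordinates that cancel to zero
    are dropped from the result.
    """
    f = defaultdict(float)
    for i, delta in updates:
        f[i] += delta
    return {i: v for i, v in f.items() if v != 0}


def lp_distribution(f, p):
    """Exact target distribution: Pr[i] = |f_i|^p / ||f||_p^p."""
    weights = {i: abs(v) ** p for i, v in f.items()}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


def sample_lp_offline(f, p, rng):
    """Draw one index from the exact L_p distribution.

    Assumption: the full vector f is available, so this uses O(n) space
    and only illustrates the output distribution, not a streaming sketch.
    """
    dist = lp_distribution(f, p)
    indices = list(dist)
    return rng.choices(indices, weights=[dist[i] for i in indices], k=1)[0]


# Hypothetical stream: coordinate 1 is inserted and then fully deleted.
updates = [(0, 3), (1, 5), (0, -1), (2, 4), (1, -5)]
f = apply_stream(updates)
dist = lp_distribution(f, p=1)
sample = sample_lp_offline(f, p=1, rng=random.Random(0))
```

In this example the final vector has $f_0 = 2$ and $f_2 = 4$, so for $p = 1$ a perfect sampler returns index 0 with probability $1/3$ and index 2 with probability $2/3$. An approximate sampler would only be required to hit these probabilities up to a $(1 \pm \nu)$ factor.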
