Better Sum Estimation via Weighted Sampling
Lorenzo Beretta, Jakub Tětek

TL;DR
This paper gives simpler algorithms with tighter bounds for estimating the total weight of a large set via weighted sampling, in both the proportional and hybrid sampling settings, and extends them to unknown set sizes and to graph edge counting.
Contribution
It provides tighter bounds and simpler algorithms for sum estimation in proportional and hybrid sampling settings, including unknown set sizes and applications to graph problems.
Findings
Improved sum estimation algorithms with bounds matching in both n and ε.
Extended techniques to unknown set size scenarios.
Applied methods to graph edge counting problems.
Abstract
Given a large set U where each item a ∈ U has weight w(a), we want to estimate the total weight W = Σ_{a ∈ U} w(a) to within a factor of 1 ± ε with some constant probability > 1/2. Since n = |U| is large, we want to do this without looking at the entire set U. In the traditional setting, in which we are allowed to sample elements from U uniformly, sampling Ω(n) items is necessary to provide any non-trivial guarantee on the estimate. Therefore, we investigate this problem in different settings: in the proportional setting we can sample items with probabilities proportional to their weights, and in the hybrid setting we can sample both proportionally and uniformly. These settings have applications, for example, in sublinear-time algorithms and distribution testing. Sum estimation in the proportional and hybrid setting has been considered…
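To make the proportional setting concrete, here is a minimal Python sketch of one simple collision-based estimator. It is an illustration under stated assumptions, not necessarily the algorithm analysed in the paper. The idea: two independent proportional samples both land on item a with probability (w(a)/W)², so scoring each colliding pair by 1/w(a) gives an expected score of 1/W per pair, and dividing the number of pairs by the total score estimates W. The function name `estimate_sum_proportional`, its parameters, and the simulated sampling oracle are all hypothetical; a real sublinear algorithm would only see the samples, not the full weight list.

```python
import math
import random
from collections import Counter


def estimate_sum_proportional(weights, num_samples, seed=0):
    """Estimate sum(weights) from proportional samples via collisions.

    Assumption: `weights` (all positive) is used here only to simulate the
    proportional sampling oracle and to look up the weight of sampled items.
    """
    rng = random.Random(seed)
    items = range(len(weights))
    # Simulated oracle: item a is drawn with probability w(a) / W.
    samples = rng.choices(items, weights=weights, k=num_samples)

    counts = Counter(samples)
    # An item seen c times yields C(c, 2) colliding pairs, each scored 1 / w(a).
    score = sum(math.comb(c, 2) / weights[a] for a, c in counts.items())
    if score == 0:
        return None  # no collisions observed: too few samples for an estimate
    # E[score] = C(num_samples, 2) / W, so invert to estimate W.
    return math.comb(num_samples, 2) / score


if __name__ == "__main__":
    weights = [float(i) for i in range(1, 201)]
    print("true sum:", sum(weights))
    print("estimate:", estimate_sum_proportional(weights, num_samples=2000))
```

Note that this estimator never uses n, and it needs enough samples for collisions to appear at all, which is why it returns `None` when none are observed.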
Taxonomy
Topics: Machine Learning and Algorithms · Complexity and Algorithms in Graphs · Advanced Bandit Algorithms Research
