Bias Reduction for Sum Estimation

Talya Eden, Jakob Bæk Tejs Houen, Shyam Narayanan, Will Rosenbaum, Jakub Tětek

arXiv:2208.01197·cs.DS·August 3, 2022

TL;DR

This paper introduces a bias-reduced estimator for sum estimation under noisy sampling distributions, analyzing its theoretical properties and optimal sample complexity in the presence of distribution perturbations.

Contribution

The paper proposes a family of bias-controlled estimators for sum estimation when samples are drawn from a distribution close to a known one, extending the classical Hansen-Hurwitz estimator.

Findings

01

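The estimator's bias is proportional to $\gamma^{k}$, where $\gamma$ measures how far the sampling distribution $Q$ is from the known distribution $P$ and $k$ is a positive integer indexing the estimator, so a larger $k$ gives a smaller bias.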

02

Sample complexity depends on the closeness parameter $\gamma$ and the error parameter $\varepsilon$, with essentially optimal bounds established.

03

Sample complexity need not vary uniformly with the error parameter $\varepsilon$.

Abstract

In classical statistics and distribution testing, it is often assumed that elements can be sampled from some distribution $P$, and that when an element $x$ is sampled, the probability $P(x)$ of sampling $x$ is also known. Recent work in distribution testing has shown that many algorithms are robust in the sense that they still produce correct output if the elements are drawn from any distribution $Q$ that is sufficiently close to $P$. This phenomenon raises interesting questions: under what conditions is a "noisy" distribution $Q$ sufficient, and what is the algorithmic cost of coping with this noise? We investigate these questions for the problem of estimating the sum of a multiset of $N$ real values $x_{1}, \dots, x_{N}$. This problem is well-studied in the statistical literature in the case $P = Q$, where the Hansen-Hurwitz estimator is frequently used. We assume that for some known…
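To make the setting concrete, here is a minimal Python sketch of the classical Hansen-Hurwitz estimator mentioned in the abstract. It is an illustration only, not the paper's bias-reduced estimator. The function names, the toy values, and the particular perturbed distribution `q` are all assumptions chosen for this example. The sketch draws indices from `q` but reweights by the known `p`, which shows why the estimate is unbiased when $Q = P$ and biased otherwise.

```python
import random


def hansen_hurwitz(values, p, q, m, rng):
    """Classical Hansen-Hurwitz estimate of sum(values).

    Indices are drawn with replacement from q (the actual sampling
    distribution), but each draw is reweighted by 1 / p[i], where p is
    the distribution the analyst believes they are sampling from.
    """
    idx = rng.choices(range(len(values)), weights=q, k=m)
    return sum(values[i] / p[i] for i in idx) / m


def expected_estimate(values, p, q):
    """Exact expectation of the estimator above: sum_i q[i] * x_i / p[i]."""
    return sum(q[i] * values[i] / p[i] for i in range(len(values)))


# Hypothetical toy instance: four 0/1 values, p uniform, q slightly perturbed.
values = [1, 0, 1, 0]
p = [0.25, 0.25, 0.25, 0.25]
q = [0.30, 0.20, 0.30, 0.20]

true_sum = sum(values)
bias_clean = expected_estimate(values, p, p) - true_sum   # sampling from p itself
bias_noisy = expected_estimate(values, p, q) - true_sum   # sampling from perturbed q

rng = random.Random(0)
estimate_clean = hansen_hurwitz(values, p, p, 20_000, rng)
estimate_noisy = hansen_hurwitz(values, p, q, 20_000, rng)

print("true sum:", true_sum)
print("bias with Q = P:", bias_clean)
print("bias with perturbed Q:", bias_noisy)
print("empirical estimates:", estimate_clean, estimate_noisy)
```

In this toy instance the bias under the perturbed `q` equals $\sum_i x_i \,(Q(i)/P(i) - 1)$, so it does not shrink as the number of samples grows. That residual bias is the quantity the paper's family of estimators aims to reduce.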

Taxonomy

Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Advanced Statistical Process Monitoring