Biased bootstrap sampling for efficient two-sample testing

Thomas P. S. Gillam; Christopher G. Lester

arXiv:1810.00335·physics.data-an·March 12, 2019

TL;DR

This paper introduces a biased bootstrap sampling method that estimates the extreme tails of the test-statistic distribution in the two-sample energy test more efficiently than present methods, allowing quick evaluation of, for example, 5-sigma confidence intervals.

Contribution

It presents a novel biased bootstrap technique that improves tail estimation efficiency in two-sample testing, with potential applications in other extreme value computations.

Findings

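01 Enables quick evaluation of (for example) 5-sigma confidence intervals.

02 Reduces the computational cost of estimating extreme-tail probabilities of the test statistic T under the null hypothesis.

03 May benefit any application where extreme values are currently obtained with bootstrap sampling.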

Abstract

The so-called 'energy test' is a frequentist technique used in experimental particle physics to decide whether two samples are drawn from the same distribution. Its usage requires a good understanding of the distribution of the test statistic, T, under the null hypothesis. We propose a technique which allows the extreme tails of the T-distribution to be determined more efficiently than possible with present methods. This allows quick evaluation of (for example) 5-sigma confidence intervals that otherwise would have required prohibitively costly computation times or approximations to have been made. Furthermore, we comment on other ways that T computations could be sped up using established results from the statistics community. Beyond two-sample testing, the proposed biased bootstrap method may provide benefit anywhere extreme values are currently obtained with bootstrap sampling.
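
For orientation, the baseline that the abstract describes as costly can be sketched as follows. This is a minimal illustration, not the authors' biased bootstrap method, which the abstract does not specify. It assumes one common form of the energy test statistic with a Gaussian distance weighting of width `sigma`, and it estimates the null distribution of T by plain, unbiased permutation of the pooled sample. The function names (`energy_statistic`, `permutation_null`, `p_value`) and all parameter defaults are hypothetical choices made for this sketch.

```python
import numpy as np


def _kernel_sum(a, b, sigma):
    """Sum of Gaussian weights exp(-d^2 / (2 sigma^2)) over all pairs (a_i, b_j)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2)).sum()


def energy_statistic(x, y, sigma=1.0):
    """Energy test statistic T for samples x (n, d) and y (m, d).

    Assumed form: within-sample pair sums (each pair counted once,
    self-pairs excluded) minus the cross-sample pair sum.
    """
    n, m = len(x), len(y)
    # Each self-pair contributes exp(0) = 1, so subtract n (or m) and halve.
    xx = (_kernel_sum(x, x, sigma) - n) / 2.0
    yy = (_kernel_sum(y, y, sigma) - m) / 2.0
    xy = _kernel_sum(x, y, sigma)
    return xx / n**2 + yy / m**2 - xy / (n * m)


def permutation_null(x, y, n_perm=200, sigma=1.0, seed=0):
    """Plain (unbiased) permutation estimate of the T distribution under the null."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    n = len(x)
    t_null = np.empty(n_perm)
    for k in range(n_perm):
        perm = rng.permutation(len(pooled))
        t_null[k] = energy_statistic(pooled[perm[:n]], pooled[perm[n:]], sigma)
    return t_null


def p_value(t_obs, t_null):
    """Permutation p-value with the usual +1 correction (never exactly zero)."""
    return (1 + np.sum(t_null >= t_obs)) / (1 + len(t_null))
```

With this plain approach, the smallest p-value that can be resolved is roughly 1 / (n_perm + 1). A one-sided 5-sigma tail probability is about 2.9e-7, so resolving it this way would need well over ten million permutations, each costing a full pairwise evaluation of T. That cost is the motivation for sampling the tails more efficiently.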
