Biased bootstrap sampling for efficient two-sample testing
Thomas P. S. Gillam, Christopher G. Lester

TL;DR
This paper introduces a biased bootstrap sampling method that estimates the extreme tails of the test statistic distribution in the two-sample energy test more efficiently than existing methods, making high-significance (for example, 5-sigma) confidence intervals quick to evaluate.
Contribution
It presents a novel biased bootstrap technique that improves tail estimation efficiency in two-sample testing, with potential applications in other extreme value computations.
Findings
Enables quick evaluation of 5-sigma confidence intervals.
Reduces computational cost for tail probability estimation.
May benefit other problems in which extreme values are currently obtained by bootstrap sampling.
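To give a sense of why tail estimation at this level is costly, the arithmetic below is a rough sketch and is not taken from the paper. It assumes the one-sided Gaussian convention for converting "5 sigma" into a tail probability, and it assumes an ordinary, unbiased resampling scheme in which each resample lands in the tail independently with that probability. The function names and the target of 10 expected tail events are illustrative choices.

```python
import math


def one_sided_tail_probability(n_sigma):
    """Upper-tail probability of a standard normal beyond n_sigma (assumed convention)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def resamples_for_expected_hits(p, hits=10):
    """Number of unbiased resamples needed to expect `hits` values in a tail of probability p."""
    return math.ceil(hits / p)


p5 = one_sided_tail_probability(5.0)
n_needed = resamples_for_expected_hits(p5, hits=10)

print(f"one-sided 5-sigma tail probability: {p5:.3e}")
print(f"unbiased resamples for ~10 tail events: {n_needed:,}")
```

Under these assumptions the tail probability is about 2.9e-7, so tens of millions of resamples would be needed just to see a handful of tail events, each requiring a full evaluation of the test statistic.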
Abstract
The so-called 'energy test' is a frequentist technique used in experimental particle physics to decide whether two samples are drawn from the same distribution. Its usage requires a good understanding of the distribution of the test statistic, T, under the null hypothesis. We propose a technique which allows the extreme tails of the T-distribution to be determined more efficiently than possible with present methods. This allows quick evaluation of (for example) 5-sigma confidence intervals that otherwise would have required prohibitively costly computation times or approximations to have been made. Furthermore, we comment on other ways that T computations could be sped up using established results from the statistics community. Beyond two-sample testing, the proposed biased bootstrap method may provide benefit anywhere extreme values are currently obtained with bootstrap sampling.
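As a point of reference for the abstract, the following is a minimal sketch of the conventional, unbiased approach that the proposed method aims to improve on. It is not the authors' biased bootstrap. It assumes one common form of the energy test statistic with a Gaussian weighting function, one-dimensional samples, and a null distribution built by permuting the pooled sample (resampling without replacement). The weighting function, its width `sigma`, and all function names are illustrative assumptions.

```python
import math
import random


def psi(r, sigma=1.0):
    """Gaussian weighting function of the distance r (an assumed choice)."""
    return math.exp(-r * r / (2.0 * sigma * sigma))


def pair_sum_within(a, sigma):
    """Sum of psi over unordered pairs i < j within one sample."""
    total = 0.0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            total += psi(abs(a[i] - a[j]), sigma)
    return total


def pair_sum_between(a, b, sigma):
    """Sum of psi over all cross pairs between the two samples."""
    return sum(psi(abs(x - y), sigma) for x in a for y in b)


def energy_statistic(a, b, sigma=1.0):
    """T = within(a)/n^2 + within(b)/m^2 - between(a, b)/(n*m)."""
    n, m = len(a), len(b)
    return (pair_sum_within(a, sigma) / (n * n)
            + pair_sum_within(b, sigma) / (m * m)
            - pair_sum_between(a, b, sigma) / (n * m))


def permutation_null(a, b, n_resamples, rng, sigma=1.0):
    """Values of T under the null, from random relabellings of the pooled sample."""
    pooled = list(a) + list(b)
    n = len(a)
    values = []
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        values.append(energy_statistic(pooled[:n], pooled[n:], sigma))
    return values


def p_value(t_obs, null_values):
    """Add-one estimate of P(T >= t_obs); never returns exactly zero."""
    exceed = sum(1 for t in null_values if t >= t_obs)
    return (exceed + 1) / (len(null_values) + 1)


rng = random.Random(0)
sample_a = [rng.gauss(0.0, 1.0) for _ in range(30)]
sample_b = [rng.gauss(1.5, 1.0) for _ in range(30)]

t_obs = energy_statistic(sample_a, sample_b)
null_values = permutation_null(sample_a, sample_b, 200, rng)
print("T observed:", t_obs)
print("estimated p-value:", p_value(t_obs, null_values))
```

With only 200 resamples the smallest p-value this estimator can report is 1/201, which illustrates why resolving the extreme tails of the T-distribution by unbiased resampling becomes expensive.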
