Implementing Monte Carlo Tests with P-value Buckets

Axel Gandy; Georg Hahn; Dong Ding

arXiv:1703.09305 · stat.ME · November 5, 2019 · 1 cites

TL;DR

This paper introduces a method for Monte Carlo-based p-value testing that uses overlapping p-value buckets to reliably determine the bucket containing the true p-value, reducing resampling risk and improving efficiency.

Contribution

It proposes algorithms for p-value bucket identification with bounded resampling risk using overlapping buckets, ensuring finite runtime and better interpretability in statistical testing.
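The idea of sampling until a single bucket can be reported may be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a simple Hoeffding confidence interval with a union bound over batches, and the names `find_bucket`, `draw_exceedance`, `eps`, `batch` and `max_samples` are hypothetical. Each Monte Carlo draw is assumed to be an indicator that the resampled test statistic is at least as extreme as the observed one, so it is Bernoulli with success probability p.

```python
import math
import random


def find_bucket(draw_exceedance, buckets, eps=0.01, batch=100, max_samples=1_000_000):
    """Sample in batches until a confidence interval for p lies inside one bucket.

    draw_exceedance: callable returning True/False, one Monte Carlo draw (assumed i.i.d.).
    buckets: list of (low, high) intervals covering [0, 1]; they may overlap.
    eps: bound on the probability of reporting a bucket that does not contain p.
    Returns (bucket, samples_used); bucket is None if max_samples is reached first.
    """
    exceed = 0  # number of draws at least as extreme as the observed statistic
    n = 0       # total number of draws so far
    k = 0       # number of interval checks so far
    while n < max_samples:
        k += 1
        exceed += sum(draw_exceedance() for _ in range(batch))
        n += batch
        # Error budget for check k; eps / (k (k + 1)) sums to eps over all k.
        alpha_k = eps / (k * (k + 1))
        # Two-sided Hoeffding half-width at level alpha_k.
        half = math.sqrt(math.log(2.0 / alpha_k) / (2.0 * n))
        p_hat = exceed / n
        lo, hi = max(0.0, p_hat - half), min(1.0, p_hat + half)
        for a, b in buckets:
            if a <= lo and hi <= b:
                return (a, b), n
    return None, n


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical overlapping buckets; the true p here is 0.5.
    print(find_bucket(lambda: rng.random() < 0.5, [(0.0, 0.05), (0.03, 1.0)]))
```

Because the interval width shrinks to zero as sampling continues, overlapping buckets let the loop stop even when p sits exactly on a classical threshold: the interval eventually fits inside the overlap region. With non-overlapping buckets and p on a boundary, this sketch would only stop at `max_samples`.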

Findings

01. Algorithms bound resampling risk with overlapping buckets

02. Methods are suitable for standard software and multiple testing

03. Can be more computationally efficient than traditional methods

Abstract

Software packages usually report the results of statistical tests using p-values. Users often interpret these by comparing them to standard thresholds, e.g. 0.1%, 1% and 5%, which is sometimes reinforced by a star rating (***, **, *). We consider an arbitrary statistical test whose p-value p is not available explicitly, but can be approximated by Monte Carlo samples, e.g. by bootstrap or permutation tests. The standard implementation of such tests usually draws a fixed number of samples to approximate p. However, the probability that the exact and the approximated p-value lie on different sides of a threshold (the resampling risk) can be high, particularly for p-values close to a threshold. We present a method to overcome this. We consider a finite set of user-specified intervals which cover [0,1] and which can be overlapping. We call these p-value buckets. We present algorithms that,…
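To make the resampling risk of a fixed-sample implementation concrete, the following sketch computes it exactly from the binomial distribution. It assumes the plain estimate X/n, where X is the number of exceedances among n draws (software often uses (X+1)/(n+1) instead), and it counts a p-value equal to the threshold as lying on the lower side. The function names are hypothetical and p must lie strictly between 0 and 1.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed in log space to avoid overflow."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def resampling_risk(p, n, threshold):
    """Probability that the estimate X/n and the true p fall on different
    sides of `threshold`, where X ~ Binomial(n, p)."""
    # Probability that the estimate lands strictly above the threshold.
    above = sum(binom_pmf(k, n, p) for k in range(n + 1) if k / n > threshold)
    return above if p <= threshold else 1.0 - above


if __name__ == "__main__":
    for p in (0.01, 0.04, 0.049, 0.06):
        print(p, resampling_risk(p, n=1000, threshold=0.05))
```

Running the loop shows the pattern described in the abstract: with the number of samples held fixed, the risk is negligible for p far from the threshold and grows as p approaches it.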

Taxonomy

Topics: Statistical Methods in Clinical Trials · Software Engineering Research · Software Reliability and Analysis Research