Active multiple testing with proxy p-values and e-values
Ziyu Xu, Catherine Wang, Larry Wasserman, Kathryn Roeder, Aaditya Ramdas

TL;DR
This paper introduces a flexible framework for hypothesis testing that uses proxy statistics to decide when to compute true test statistics, enabling resource-efficient multiple testing with FDR control.
Contribution
The paper proposes a novel active testing framework that accepts arbitrary proxy statistics, allowing resource-efficient procedures that control the false discovery rate (FDR).
Findings
Simulations demonstrate high power and low resource usage.
The framework produces valid p-values and e-values when proxies are used.
The approach is effective in real scCRISPR screen experiments.
Abstract
Researchers often lack the resources to test every hypothesis of interest directly or to compute test statistics comprehensively, but they often possess auxiliary data from which an estimate of the experimental outcome can be computed. We introduce a novel approach for selecting the hypotheses for which to query a statistic (e.g., run an experiment or perform an expensive computation) in a hypothesis testing setup by leveraging estimates to compute proxy statistics. Our framework allows a scientist to propose a proxy statistic and then query the true statistic with some probability based on the value of the proxy. We make no assumptions about how the proxy is derived, and it can be arbitrarily dependent on the true statistic. If the true statistic is not queried, the proxy is used in its place. We characterize "active" methods that produce valid p-values and e-values in this setting and utilize this…
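The query rule described in the abstract can be illustrated for e-values with a minimal Python sketch. The specific choices here are assumptions made for illustration and are not taken from the text above: the function name `active_e_value`, the tuning parameter `gamma`, the query probability max(0, 1 - gamma / proxy), and the final scaling by (1 - gamma). Under these choices, for a true e-value E and any nonnegative proxy F, the conditional mean of the output is at most (1 - gamma)(gamma + E), so its expectation under the null is at most 1 - gamma^2, however F depends on E.

```python
import random


def active_e_value(proxy_e, query_true_e, gamma=0.5, rng=random):
    """Return (e_value, queried) for a single hypothesis.

    proxy_e      -- nonnegative proxy e-value (cheap; may be arbitrarily wrong)
    query_true_e -- zero-argument callable that performs the expensive query
    gamma        -- hypothetical tuning parameter in (0, 1); an assumption of
                    this sketch, not a quantity named in the abstract
    rng          -- any object with a .random() method returning a float in [0, 1)
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if proxy_e < 0.0:
        raise ValueError("proxy_e must be nonnegative")

    # Assumed rule: a larger proxy makes a query of the true statistic more likely.
    query_prob = max(0.0, 1.0 - gamma / proxy_e) if proxy_e > 0.0 else 0.0
    queried = rng.random() < query_prob

    # If the true statistic is not queried, the proxy is used in its place.
    value = query_true_e() if queried else proxy_e

    # Scaling by (1 - gamma) keeps the expectation at most 1 under the null.
    return (1.0 - gamma) * value, queried


if __name__ == "__main__":
    rng = random.Random(0)

    def null_e():
        # A valid null e-value: 2 or 0 with equal probability (mean 1).
        return 2.0 if rng.random() < 0.5 else 0.0

    # An overly optimistic proxy that always reports strong evidence.
    draws = [active_e_value(10.0, null_e, gamma=0.5, rng=rng) for _ in range(20000)]
    mean_e = sum(e for e, _ in draws) / len(draws)
    query_rate = sum(q for _, q in draws) / len(draws)
    print(f"mean active e-value under the null: {mean_e:.3f}")
    print(f"fraction of hypotheses queried: {query_rate:.3f}")
```

In this sketch a small proxy (at most `gamma`) never triggers the expensive query, while a large proxy triggers it with probability close to one, so a misleadingly large proxy is usually checked against the true statistic before it can count as evidence.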
Taxonomy
Topics: VLSI and Analog Circuit Testing · Statistical Methods in Clinical Trials · Software Testing and Debugging Techniques
