Active Hypothesis Testing under Computational Budgets with Applications to GWAS and LLM
Qi Kuang, Bowen Gang, Yin Xia

TL;DR
This paper introduces a resource-aware hypothesis testing framework that adaptively balances the computation of exact and proxy statistics, optimizing statistical efficiency within fixed budgets, with applications to GWAS and LLMs.
Contribution
It presents a novel adaptive procedure for hypothesis testing that guarantees valid p- or e-values under computational constraints, with proven optimality and empirical validation.
Findings
Achieves optimality for e-values and p-values under independence.
Guarantees valid p- or e-values within a fixed computational budget.
Demonstrates improved efficiency in GWAS and LLM applications.
Abstract
In large-scale hypothesis testing, computing exact p-values or e-values is often resource-intensive, creating a need for budget-aware inferential methods. We propose a general framework for active hypothesis testing that leverages inexpensive auxiliary statistics to allocate a global computational budget. For each hypothesis, our data-adaptive procedure probabilistically decides whether to compute the exact test statistic or a transformed proxy, guaranteeing a valid p-value or e-value while satisfying the exact budget constraint. Theoretical guarantees are established for our constructions, showing that the procedure achieves optimality for e-values and for p-values under independence, and admissibility for p-values under general dependence. Empirical results from simulations and two real-world applications, including a large-scale genome-wide association study (GWAS) and…
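To make the idea of probabilistically choosing between an exact statistic and a cheap fallback concrete, here is a minimal Python sketch. It is not the paper's construction. It assumes a toy Gaussian null, a hypothetical likelihood-ratio e-value `exact_e_value`, a fixed compute probability `pi` chosen independently of the data, and the trivial e-value 1 as the fallback in place of a transformed proxy. Under those assumptions the reported value has null expectation pi * E[E] + (1 - pi) <= 1, so it remains a valid e-value while only a fraction pi of exact computations is spent on average.

```python
import math
import random


def exact_e_value(x, mu=1.0):
    """Hypothetical 'expensive' statistic: likelihood-ratio e-value for
    H0: X ~ N(0, 1) against the alternative X ~ N(mu, 1).
    Its expectation under H0 equals 1."""
    return math.exp(mu * x - 0.5 * mu * mu)


def active_e_value(x, pi, rng, mu=1.0):
    """With probability pi compute the exact e-value; otherwise return the
    trivial e-value 1.0 (an assumed fallback, not the paper's proxy).
    Returns (e_value, computed_exactly)."""
    if rng.random() < pi:
        return exact_e_value(x, mu), True
    return 1.0, False


def simulate_null_mean(pi, n=200_000, seed=0):
    """Monte Carlo estimate of the null expectation of the reported e-value."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # data drawn under the null
        e, _ = active_e_value(x, pi, rng)
        total += e
    return total / n


if __name__ == "__main__":
    print(simulate_null_mean(pi=0.3))
```

In this sketch `pi` is a constant. Letting it depend on an auxiliary statistic preserves validity only if the exact e-value still has conditional null expectation at most 1 given that auxiliary statistic, which is an assumption that would need to be checked in any real application.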
