Two-Stage Testing in a high dimensional setting
Marianne A Jonker, Luc van Schijndel, Eric Cator

TL;DR
This paper introduces a two-stage testing method for high-dimensional regression that efficiently detects two-way interactions by reducing computational load and improving statistical power, especially when the number of variables is extremely large.
Contribution
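It proposes a two-stage testing procedure, consisting of a screening stage and an evaluation stage, whose test statistics are proven to be asymptotically independent under some assumptions. This makes interaction detection more powerful and computationally feasible in ultra-high dimensional settings.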
Findings
Controls type I error effectively
Increases statistical power over one-by-one testing
Reduces computational burden significantly
Abstract
In a high dimensional regression setting in which the number of variables (p) is much larger than the sample size (n), the number of possible two-way interactions between the variables is immense. If the number of variables is of the order of one million, which is usually the case in, for example, genetics, the number of two-way interactions is of the order of one million squared. In the pursuit of detecting two-way interactions, testing all pairs for interactions one-by-one is computationally infeasible, and the multiple testing correction will be severe. In this paper we describe a two-stage testing procedure consisting of a screening and an evaluation stage. It is proven that, under some assumptions, the test statistics in the two stages are asymptotically independent. As a result, multiplicity correction in the second stage is only needed for the number of statistical tests that are actually…
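To make the scale in the abstract concrete, the following is a minimal arithmetic sketch, not the authors' procedure. It counts the distinct pairs among one million variables and compares the per-test significance level when correcting for all pairs with the level when correcting only for the tests carried out in a second stage. The Bonferroni correction, the level alpha = 0.05, and the number of screened pairs (10,000) are assumptions chosen for illustration; the abstract only speaks of a "multiplicity correction".

```python
from math import comb


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction (assumed here)."""
    return alpha / n_tests


p = 1_000_000          # number of variables, as in the abstract's genetics example
n_pairs = comb(p, 2)   # distinct unordered pairs: p * (p - 1) / 2
alpha = 0.05           # assumed family-wise error level

# Hypothetical: suppose the screening stage passes 10,000 pairs to stage two.
n_screened = 10_000

threshold_all = bonferroni_threshold(alpha, n_pairs)
threshold_screened = bonferroni_threshold(alpha, n_screened)

print(f"pairs to test one-by-one: {n_pairs:,}")
print(f"per-test level, all pairs:      {threshold_all:.3e}")
print(f"per-test level, screened pairs: {threshold_screened:.3e}")
```

The count of pairs is about half of one million squared, which matches the order of magnitude stated in the abstract. The gap between the two per-test levels indicates why correcting only for the tests that are actually performed can raise power.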
Taxonomy
Topics: VLSI and Analog Circuit Testing
