More Powerful Selective Kernel Tests for Feature Selection
Jen Ning Lim, Makoto Yamada, Wittawat Jitkrittum, Yoshikazu Terada, Shigeyuki Matsui, Hidetoshi Shimodaira

TL;DR
This paper introduces enhanced selective kernel tests for feature selection that condition on minimal selection events, improving statistical power while maintaining control over false positive rates, demonstrated through synthetic and real data experiments.
Contribution
It extends recent feature selection methods by incorporating minimal conditioning events using multiscale bootstrap, leading to more powerful tests.
Findings
Proposed tests outperform existing methods in power across various scenarios.
The approach maintains controlled false positive rates.
Experimental validation on synthetic and real datasets confirms effectiveness.
Abstract
Refining one's hypotheses in the light of data is a common scientific practice; however, the dependency on the data introduces selection bias and can lead to specious statistical analysis. One approach to addressing this is to condition on the selection procedure, which accounts for how the data were used to generate the hypotheses and prevents that information from being used again after selection. Many selective inference (a.k.a. post-selection inference) algorithms take this approach but "over-condition" for the sake of tractability. While this practice yields well-calibrated statistical tests with controlled false positive rates (FPR), it can incur a major loss in power. In our work, we extend two recent proposals for selecting features using the Maximum Mean Discrepancy and the Hilbert-Schmidt Independence Criterion to condition on the minimal conditioning event. We show how recent…
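The selection bias described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's kernel tests and does not use the multiscale bootstrap; it assumes independent standard normal statistics for each feature, all under the null. It picks the feature with the largest statistic and compares a naive p-value with a selective p-value that conditions on the selected feature and on the values of the other statistics (a truncated-normal argument). The function names `normal_sf` and `simulate` are hypothetical, introduced only for this example.

```python
import math
import random


def normal_sf(x):
    """Upper-tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def simulate(num_trials=2000, num_features=10, alpha=0.05, seed=0):
    """Estimate false positive rates when testing the feature with the largest statistic.

    Every feature is null (statistic ~ N(0, 1)), so any rejection is a false positive.
    """
    rng = random.Random(seed)
    naive_rejections = 0
    selective_rejections = 0
    for _ in range(num_trials):
        z = sorted(rng.gauss(0.0, 1.0) for _ in range(num_features))
        z_max, z_second = z[-1], z[-2]
        # Naive: treat the selected statistic as if it had been chosen in advance.
        naive_p = normal_sf(z_max)
        # Selective: given the other statistics, the winner is a standard normal
        # truncated to (z_second, infinity), so rescale its tail probability.
        selective_p = normal_sf(z_max) / normal_sf(z_second)
        naive_rejections += naive_p < alpha
        selective_rejections += selective_p < alpha
    return naive_rejections / num_trials, selective_rejections / num_trials


naive_fpr, selective_fpr = simulate()
print(f"naive FPR:     {naive_fpr:.3f}")
print(f"selective FPR: {selective_fpr:.3f}")
```

In this toy setting the naive test rejects far more often than the nominal 5% level, while the conditional test stays close to it. Conditioning on the other statistics as well as on the selected index is more than the selection event strictly requires, which is the kind of over-conditioning the abstract says can cost power.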
Taxonomy
Topics: Statistical Methods and Inference · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference
Methods: Test
