Robust inference with knockoffs
Rina Foygel Barber, Emmanuel J. Candès, Richard J. Samworth

TL;DR
This paper analyzes how robust the model-X knockoffs framework for variable selection is when the feature distribution is estimated rather than known exactly, and shows that the resulting inflation of the false discovery rate is proportional to the estimation error.
Contribution
The paper provides theoretical guarantees that knockoffs remain robust when the feature distribution is estimated rather than known exactly, which broadens the method's practical applicability.
Findings
False discovery rate inflation is proportional to distribution estimation errors.
The method remains applicable in high-dimensional settings, where the number of features may exceed the sample size.
The results apply to genome-wide association studies, where the feature distribution is estimated rather than known exactly.
Abstract
We consider the variable selection problem, which seeks to identify important variables influencing a response $Y$ out of many candidate features $X_1, \dots, X_p$. We wish to do so while offering finite-sample guarantees about the fraction of false positives: selected variables $X_j$ that in fact have no effect on $Y$ after the other features are known. When the number of features $p$ is large (perhaps even larger than the sample size $n$), and we have no prior knowledge regarding the type of dependence between $Y$ and $X$, the model-X knockoffs framework nonetheless allows us to select a model with a guaranteed bound on the false discovery rate, as long as the distribution of the feature vector $X = (X_1, \dots, X_p)$ is exactly known. This model selection procedure operates by constructing "knockoff copies" of each of the $p$ features, which are then used as a control group to ensure…
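The role of knockoff copies as a control group can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that the features are Gaussian with an estimated mean and covariance, uses a simple marginal-correlation statistic, and applies the standard knockoff+ threshold. The function names (`gaussian_knockoffs`, `knockoff_statistics`, `knockoff_plus_threshold`, `knockoff_plus_select`) and the simulated data are hypothetical.

```python
import numpy as np


def gaussian_knockoffs(X, mu, Sigma, rng):
    """Sample knockoffs for the rows of X, assuming each row is N(mu, Sigma).

    mu and Sigma may be estimates; the knockoffs are then only
    approximately valid, which is the setting the paper studies.
    """
    p = Sigma.shape[0]
    # Choose D = s * I, shrunk slightly so 2 * Sigma - D stays positive definite.
    s = 0.999 * min(2.0 * np.linalg.eigvalsh(Sigma)[0], np.min(np.diag(Sigma)))
    D = s * np.eye(p)
    Sigma_inv_D = np.linalg.solve(Sigma, D)       # Sigma^{-1} D
    cond_mean = X - (X - mu) @ Sigma_inv_D        # E[X_tilde | X]
    cond_cov = 2.0 * D - D @ Sigma_inv_D          # Cov[X_tilde | X]
    L = np.linalg.cholesky(cond_cov + 1e-10 * np.eye(p))
    return cond_mean + rng.standard_normal(X.shape) @ L.T


def knockoff_statistics(X, X_tilde, y):
    """W_j > 0 suggests feature j explains y better than its knockoff."""
    yc = y - y.mean()
    return np.abs(X.T @ yc) - np.abs(X_tilde.T @ yc)


def knockoff_plus_threshold(W, q):
    """Smallest t with (1 + #{W_j <= -t}) / max(#{W_j >= t}, 1) <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold meets the target; select nothing


def knockoff_plus_select(W, q):
    """Indices of features whose statistic clears the knockoff+ threshold."""
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, k = 500, 20, 5  # illustrative sizes, not taken from the paper
    idx = np.arange(p)
    Sigma_true = 0.3 ** np.abs(np.subtract.outer(idx, idx))
    X = rng.multivariate_normal(np.zeros(p), Sigma_true, size=n)
    beta = np.zeros(p)
    beta[:k] = 1.0
    y = X @ beta + rng.standard_normal(n)

    # The feature distribution is estimated rather than known. It is
    # estimated from the same sample here only for brevity; a separate
    # unlabeled sample would be the cleaner choice.
    mu_hat, Sigma_hat = X.mean(axis=0), np.cov(X, rowvar=False)
    X_tilde = gaussian_knockoffs(X, mu_hat, Sigma_hat, rng)
    W = knockoff_statistics(X, X_tilde, y)
    print("selected features:", knockoff_plus_select(W, q=0.2))
```

Because the knockoffs are built from `mu_hat` and `Sigma_hat` rather than the true parameters, they are constructed somewhat incorrectly. In this sketch, any mismatch between the estimated and true Gaussian model plays the role of the estimation error that the summary above ties to false discovery rate inflation.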
