Robust inference with knockoffs

Rina Foygel Barber; Emmanuel J. Candès; Richard J. Samworth

arXiv:1801.03896·stat.ME·February 12, 2019

TL;DR

This paper extends the model-X knockoffs framework for variable selection by analyzing its robustness when the feature distribution is estimated rather than known exactly, showing that false discovery rate inflation is proportional to estimation errors.

Contribution

It provides theoretical guarantees on the robustness of knockoffs when the feature distribution is estimated, not known exactly, broadening practical applicability.

Findings

01

The inflation of the false discovery rate is proportional to the error in estimating the feature distribution.

02

The framework applies in high-dimensional settings, including when the number of features $p$ exceeds the sample size $n$.

03

Applicable to genome-wide association studies with estimated feature distributions.

Abstract

We consider the variable selection problem, which seeks to identify important variables influencing a response $Y$ out of many candidate features $X_{1}, \dots, X_{p}$. We wish to do so while offering finite-sample guarantees about the fraction of false positives, that is, selected variables $X_{j}$ that in fact have no effect on $Y$ after the other features are known. When the number of features $p$ is large (perhaps even larger than the sample size $n$), and we have no prior knowledge regarding the type of dependence between $Y$ and $X$, the model-X knockoffs framework nonetheless allows us to select a model with a guaranteed bound on the false discovery rate, as long as the distribution of the feature vector $X = (X_{1}, \dots, X_{p})$ is exactly known. This model selection procedure operates by constructing "knockoff copies" of each of the $p$ features, which are then used as a control group to ensure…
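
To make the "control group" idea concrete, the sketch below shows the selection step of the standard knockoff+ filter. It is an illustration under stated assumptions, not code from the paper. It assumes that a statistic W_j has already been computed for each feature, where a large positive W_j means the real feature looks more important than its knockoff copy. The function names `knockoff_plus_threshold` and `knockoff_select` and the example values of `W` are hypothetical. Constructing the knockoff copies themselves, which is where the feature distribution (exact or estimated) enters, is not shown.

```python
import numpy as np


def knockoff_plus_threshold(W, q):
    """Return the smallest t > 0 whose estimated false discovery proportion
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) is at most q.

    W is an array of feature statistics (assumed precomputed); q is the
    target false discovery rate. Returns np.inf if no threshold qualifies.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.unique(np.abs(W[W != 0]))  # np.unique returns sorted values
    for t in candidates:
        # Negative statistics act as the control group: they estimate how
        # many null features would exceed t on the positive side by chance.
        n_negative = np.sum(W <= -t)
        n_positive = np.sum(W >= t)
        if (1 + n_negative) / max(1, n_positive) <= q:
            return t
    return np.inf


def knockoff_select(W, q):
    """Return the indices of features whose statistic clears the threshold."""
    W = np.asarray(W, dtype=float)
    t = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= t)


# Hypothetical statistics for ten features, for illustration only.
W = [5.0, 4.0, 3.0, 2.5, -1.0, 2.0, 1.5, 0.5, -0.2, 3.5]
selected = knockoff_select(W, q=0.25)
```

With an exactly known feature distribution, this rule controls the false discovery rate at level q. The paper's robustness analysis concerns how much that guarantee can degrade when the knockoffs are built from an estimated distribution instead.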
