TL;DR
This paper presents the knockoff filter, a novel variable selection method that controls the false discovery rate in linear models, offering exact finite-sample guarantees without requiring knowledge of the noise level.
Contribution
The paper introduces the knockoff filter, a new flexible procedure that guarantees finite-sample FDR control regardless of design or coefficients, using manufactured knockoff variables.
Findings
Achieves exact FDR control in finite samples.
Demonstrates higher power than existing methods when the proportion of null variables is high.
Works with a broad class of test statistics, including Lasso-based methods.
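The selection step behind these findings can be sketched in a few lines. This is a minimal illustration, not the authors' reference implementation. It assumes a vector of statistics W has already been computed, one entry per variable, where a large positive value is evidence that the original variable matters more than its knockoff copy. Constructing the knockoff variables and computing W (for example from Lasso fits) is not shown. The function names knockoff_threshold and knockoff_select, and the plus flag that switches between the more conservative "knockoff+" rule and the plain rule, are hypothetical names chosen for this sketch.

```python
import numpy as np


def knockoff_threshold(W, q=0.1, plus=True):
    """Return the data-dependent threshold for statistics W at target level q.

    Hypothetical helper for illustration. With plus=True a 1 is added to the
    numerator of the estimated false discovery proportion (the conservative
    "knockoff+" variant); with plus=False the numerator is left as is.
    """
    W = np.asarray(W, dtype=float)
    offset = 1.0 if plus else 0.0
    # Candidate thresholds are the distinct nonzero magnitudes of W, ascending.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        negatives = np.sum(W <= -t)  # stand-in count for false discoveries
        positives = np.sum(W >= t)   # variables that would be selected at t
        if (offset + negatives) / max(positives, 1) <= q:
            return float(t)
    # No threshold meets the target level, so nothing is selected.
    return float("inf")


def knockoff_select(W, q=0.1, plus=True):
    """Return the indices of the variables whose statistic reaches the threshold."""
    W = np.asarray(W, dtype=float)
    t = knockoff_threshold(W, q=q, plus=plus)
    return np.flatnonzero(W >= t)


if __name__ == "__main__":
    # Made-up statistics, purely to exercise the functions.
    W_demo = [5, 4, 3.5, 3, 2.5, 2, -1.5, 1, 0.5, -0.2]
    print(knockoff_threshold(W_demo, q=0.2))
    print(knockoff_select(W_demo, q=0.2))
```

The negative entries of W act as a built-in estimate of how many of the positive entries are false discoveries, which is why no knowledge of the noise level enters the rule.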
Abstract
In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are truly associated with the response. At the same time, we need to know that the false discovery rate (FDR) - the expected fraction of false discoveries among all discoveries - is not too high, in order to assure the scientist that most of the discoveries are indeed true and replicable. This paper introduces the knockoff filter, a new variable selection procedure controlling the FDR in the statistical linear model whenever there are at least as many observations as variables. This method achieves exact FDR control in finite sample settings no matter the design or covariates, the number of variables in the model, or the amplitudes of the unknown regression coefficients, and does not require any knowledge of the…
