FANOK: Knockoffs in Linear Time
Armin Askari, Quentin Rebjock, Alexandre d'Aspremont, Laurent El Ghaoui

TL;DR
This paper introduces efficient algorithms for Gaussian model-X knockoffs that make feature selection feasible at large scale. It lowers the cost of constructing the knockoff distribution and gives procedures for factor-model covariance estimation and knockoff sampling that run in time linear in the dimension.
Contribution
It presents novel algorithms for constructing Gaussian knockoffs with significantly improved computational efficiency, suitable for very high-dimensional data.
Findings
Algorithms scale to $p=500,000$ features.
Complexity reduced from $O(p^3)$ to $O(pk^2)$ with factor models.
Factor-model estimation and knockoff sampling run in time linear in $p$.
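To see where an $O(pk^2)$ cost can come from under a factor model, the following is a minimal sketch, not the paper's algorithm. It assumes a covariance of the form $\Sigma = \mathrm{diag}(d) + UU^\top$ with $U$ of size $p \times k$, and applies the Woodbury identity so that a linear solve never forms a $p \times p$ matrix. The function name `factor_solve` and the test sizes are illustrative choices.

```python
import numpy as np

def factor_solve(d, U, b):
    """Solve (diag(d) + U U^T) x = b via the Woodbury identity.

    Cost is O(p k^2): only a k x k system is factorized.
    """
    k = U.shape[1]
    Dinv_b = b / d                         # D^{-1} b, O(p)
    Dinv_U = U / d[:, None]                # D^{-1} U, O(p k)
    small = np.eye(k) + U.T @ Dinv_U       # I + U^T D^{-1} U, O(p k^2)
    return Dinv_b - Dinv_U @ np.linalg.solve(small, U.T @ Dinv_b)

# Illustrative sizes (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
p, k = 2000, 5
d = rng.uniform(0.5, 1.5, size=p)
U = rng.standard_normal((p, k)) / np.sqrt(p)
b = rng.standard_normal(p)

x = factor_solve(d, U, b)
# Check the solution without building the dense p x p matrix.
residual = np.linalg.norm(d * x + U @ (U.T @ x) - b)
```

The same structure lets matrix-vector products with $\Sigma$ run in $O(pk)$, which is why low-rank-plus-diagonal models are attractive when $p$ is in the hundreds of thousands.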
Abstract
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems. Identifying the knockoff distribution requires solving a large scale semidefinite program for which we derive several efficient methods. One handles generic covariance matrices, has a complexity scaling as $O(p^3)$ where $p$ is the ambient dimension, while another assumes a rank $k$ factor model on the covariance matrix to reduce this complexity bound to $O(pk^2)$. We also derive efficient procedures to both estimate factor models and sample knockoff covariates with complexity linear in the dimension. We test our methods on problems with $p$ as large as $500,000$.
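For readers unfamiliar with Gaussian model-X knockoffs, here is a small dense sketch of the standard construction, written under stated assumptions and not as the paper's method. For rows $X \sim N(0, \Sigma)$ and a diagonal matrix $D = \mathrm{diag}(s)$ with $0 \preceq D \preceq 2\Sigma$, knockoffs are drawn from $N(X - X\Sigma^{-1}D,\; 2D - D\Sigma^{-1}D)$. Instead of solving the semidefinite program, the sketch swaps in the closed-form equicorrelated choice $s_j = \min(1, 2\lambda_{\min}(\Sigma))$, shrunk slightly for numerical safety. The function name, the AR(1) test covariance, and the 0.99 shrink factor are all illustrative assumptions. This dense version costs $O(p^3)$ and is only meant for small $p$.

```python
import numpy as np

def sample_gaussian_knockoffs(X, Sigma, s, rng):
    """Draw knockoffs for rows of X ~ N(0, Sigma), given s with diag(s) <= 2 Sigma."""
    D = np.diag(s)
    Sinv_D = np.linalg.solve(Sigma, D)      # Sigma^{-1} D
    mean = X - X @ Sinv_D                   # conditional mean of the knockoffs
    cov = 2 * D - D @ Sinv_D                # conditional covariance
    cov = (cov + cov.T) / 2                 # symmetrize against round-off
    w, V = np.linalg.eigh(cov)
    L = V * np.sqrt(np.clip(w, 0.0, None))  # factor with L @ L.T == cov
    Z = rng.standard_normal(X.shape)
    return mean + Z @ L.T

# Illustrative setup: AR(1) correlation matrix (an assumption for the demo).
rng = np.random.default_rng(0)
n, p, rho = 20000, 5, 0.5
idx = np.arange(p)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])

# Equicorrelated choice of s, used here in place of the SDP solution.
lam_min = np.linalg.eigvalsh(Sigma)[0]
s = np.full(p, 0.99 * min(1.0, 2.0 * lam_min))

X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
Xk = sample_gaussian_knockoffs(X, Sigma, s, rng)

# Empirical joint covariance of [X, Xk]; should approximate
# [[Sigma, Sigma - D], [Sigma - D, Sigma]].
C = np.cov(np.hstack([X, Xk]), rowvar=False)
```

The joint covariance check reflects the defining property of knockoffs: each knockoff block matches $\Sigma$, while the cross-covariance with the original features differs from $\Sigma$ only on the diagonal.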
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Machine Learning and Algorithms · Machine Learning and Data Classification
Methods: Feature Selection
