Discovering Conditionally Salient Features with Statistical Guarantees
Jaime Roquero Gimenez, James Zou

TL;DR
This paper introduces a method for conditional feature selection that generalizes the knockoff procedure, providing statistical guarantees for identifying features relevant in specific contexts while controlling false discoveries.
Contribution
It extends the knockoff framework to conditional feature selection, enabling region-specific relevance detection with FDR control and without assumptions on the model relating the features to the outcome.
Findings
The method successfully controls FDR in conditional feature selection scenarios.
Experimental results validate the theoretical FDR guarantees.
The algorithm effectively partitions feature space to improve relevance detection.
Abstract
The goal of feature selection is to identify important features that are relevant to explain an outcome variable. Most of the work in this domain has focused on identifying globally relevant features, which are features that are related to the outcome using evidence across the entire dataset. We study a more fine-grained statistical problem: conditional feature selection, where a feature may be relevant depending on the values of the other features. For example, in genetic association studies, variant A could be associated with the phenotype in the entire dataset, but conditioned on variant B being present it might be independent of the phenotype. In this sense, variant A is globally relevant, but conditioned on B it is no longer locally relevant in that region of the feature space. We present a generalization of the knockoff procedure that performs conditional feature selection…
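The distinction between global and local relevance in the genetic example can be illustrated with a small simulation. This is only a sketch under assumed conditions, not the paper's procedure: the two binary variants (called `a` and `b` here), the effect size of 2.0, the Gaussian noise, and the use of correlation as a rough proxy for dependence are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Draw a toy dataset (hypothetical data-generating process)."""
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, size=n)  # variant a: present (1) or absent (0)
    b = rng.integers(0, 2, size=n)  # variant b: present (1) or absent (0)
    noise = rng.normal(0.0, 1.0, size=n)
    # Assumption: variant a shifts the phenotype only when variant b is absent.
    y = 2.0 * a * (1 - b) + noise
    return a, b, y


def corr(x, y):
    """Pearson correlation, used here as a crude stand-in for dependence."""
    return float(np.corrcoef(x, y)[0, 1])


a, b, y = simulate()

# Evidence pooled over the entire dataset: a looks associated with y.
global_corr = corr(a, y)

# Restricted to the region where b is present: a carries no signal about y.
local_corr_b1 = corr(a[b == 1], y[b == 1])

# Restricted to the region where b is absent: the association is strongest.
local_corr_b0 = corr(a[b == 0], y[b == 0])

print("global:", round(global_corr, 3))
print("b present:", round(local_corr_b1, 3))
print("b absent:", round(local_corr_b0, 3))
```

In this toy setup the pooled correlation is clearly positive, while the correlation within the region where `b` is present is close to zero. A global selection method would flag `a` as relevant everywhere, whereas a conditional method aims to report it only in the region where it matters.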
Taxonomy
Topics: Gene expression and cancer classification · Genetic Associations and Epidemiology · Bioinformatics and Genomic Networks
Methods: Feature Selection
