Nonparametric IPSS: Fast, flexible feature selection with false discovery control
Omar Melikechi, David B. Dunson, and Jeffrey W. Miller

TL;DR
This paper introduces a nonparametric feature selection method with finite-sample false discovery control for high-dimensional data and demonstrates its effectiveness in biological data analysis.
Contribution
The paper proposes a flexible feature selection approach that applies integrated path stability selection (IPSS) to arbitrary feature importance scores while retaining false discovery control.
Findings
Accurately controls the false discovery rate in simulations
Detects more true positives than existing methods
Runs efficiently on high-dimensional data
Abstract
Feature selection is a critical task in machine learning and statistics. However, existing feature selection methods either (i) rely on parametric methods such as linear or generalized linear models, (ii) lack theoretical false discovery control, or (iii) identify few true positives. Here, we introduce a general feature selection method with finite-sample false discovery control based on applying integrated path stability selection (IPSS) to arbitrary feature importance scores. The method is nonparametric whenever the importance scores are nonparametric, and it estimates q-values, which are better suited to high-dimensional data than p-values. We focus on two special cases using importance scores from gradient boosting (IPSSGB) and random forests (IPSSRF). Extensive nonlinear simulations with RNA sequencing data show that both methods accurately control the false discovery rate and…
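The idea of applying stability selection to an arbitrary importance score can be illustrated with a small sketch. The code below is not the authors' IPSS procedure and carries no false discovery guarantee. It is plain subsampling-based stability selection in which selection frequencies are averaged over a small grid of top-k cutoffs, a crude stand-in for integrating over a path. The function names (`abs_correlation`, `selection_frequencies`), the cutoff grid, the 0.6 threshold, and the synthetic data are all assumptions made for illustration. The absolute-correlation score is a placeholder for any importance score, such as one from gradient boosting or random forests.

```python
import random
from math import sqrt

def abs_correlation(xs, ys):
    """Placeholder importance score: absolute Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return abs(sxy) / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def selection_frequencies(X, y, score, cutoffs=(1, 2, 3), n_subsamples=50, seed=0):
    """Average, over subsamples and cutoffs, of how often each feature is in the top k."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0.0] * p
    for _ in range(n_subsamples):
        rows = rng.sample(range(n), n // 2)          # random half of the data
        y_sub = [y[i] for i in rows]
        scores = [score([X[i][j] for i in rows], y_sub) for j in range(p)]
        ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
        for k in cutoffs:                            # hypothetical grid of cutoffs
            for j in ranked[:k]:
                counts[j] += 1.0
    total = n_subsamples * len(cutoffs)
    return [c / total for c in counts]

# Synthetic data (assumption): only features 0 and 1 influence the response.
data_rng = random.Random(1)
n, p = 200, 10
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2 * row[0] + 2 * row[1] + data_rng.gauss(0, 1) for row in X]

freqs = selection_frequencies(X, y, abs_correlation)
selected = [j for j, f in enumerate(freqs) if f >= 0.6]  # illustrative threshold
print("averaged selection frequencies:", [round(f, 2) for f in freqs])
print("selected features:", selected)
```

Because the score function is passed in as an argument, a nonparametric score can be swapped in without changing the subsampling loop, which mirrors the abstract's point that the method is nonparametric whenever the importance scores are.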
Taxonomy
Topics: Control Systems and Identification · Fault Detection and Control Systems · Advanced Control Systems Optimization
Methods: Focus · Feature Selection
