Forward-Backward Selection with Early Dropping
Giorgos Borboudakis, Ioannis Tsamardinos

TL;DR
This paper introduces a heuristic that accelerates forward-backward feature selection by temporarily discarding variables that are conditionally independent of the outcome given the currently selected set. It yields large speedups while preserving predictive accuracy, and comes with theoretical guarantees for certain classes of distributions.
Contribution
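The paper proposes a heuristic for forward-backward feature selection that improves efficiency and gives rise to a family of algorithms whose members can correctly identify the Markov blanket, in the sample limit, for distributions that can be faithfully represented by Bayesian networks or maximal ancestral graphs.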
Findings
Increases computational efficiency by about two orders of magnitude.
Selects fewer variables while maintaining predictive accuracy.
Performs similarly to LASSO when restricted to the same number of variables.
Abstract
Forward-backward selection is one of the most basic and commonly used feature selection algorithms available. It is also general and conceptually applicable to many different types of data. In this paper, we propose a heuristic that significantly improves its running time, while preserving predictive accuracy. The idea is to temporarily discard the variables that are conditionally independent of the outcome given the selected variable set. Depending on how those variables are reconsidered and reintroduced, this heuristic gives rise to a family of algorithms with increasingly stronger theoretical guarantees. In distributions that can be faithfully represented by Bayesian networks or maximal ancestral graphs, members of this algorithmic family are able to correctly identify the Markov blanket in the sample limit. In experiments we show that the proposed heuristic increases computational…
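The early-dropping idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the conditional independence test (a Fisher z-test on partial correlations, suitable only for roughly linear-Gaussian data), the threshold `alpha`, the reintroduction scheme (`max_runs` extra forward passes over all unselected variables), and all function names are hypothetical choices made for this example.

```python
import math
import numpy as np

def partial_corr_pvalue(X, y, j, cond):
    """Two-sided Fisher z-test p-value for corr(X[:, j], y | X[:, cond]).
    Assumed test for this sketch; any conditional independence test could be used."""
    n = X.shape[0]
    # Regress the conditioning set (plus an intercept) out of both variables.
    Z = np.column_stack([np.ones(n)] + [X[:, c] for c in cond])
    rx = X[:, j] - Z @ np.linalg.lstsq(Z, X[:, j], rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    denom = math.sqrt(float(rx @ rx) * float(ry @ ry))
    dof = n - len(cond) - 3
    if denom == 0.0 or dof <= 0:
        return 1.0  # not enough information to reject independence
    r = min(max(float(rx @ ry) / denom, -0.999999), 0.999999)
    z = math.atanh(r) * math.sqrt(dof)
    return math.erfc(abs(z) / math.sqrt(2))

def forward_with_early_dropping(X, y, selected, alpha):
    """One forward run: add the most significant candidate, then drop every
    candidate that appears independent of y given the selected set."""
    selected = list(selected)
    remaining = [j for j in range(X.shape[1]) if j not in selected]
    while remaining:
        pvals = {j: partial_corr_pvalue(X, y, j, selected) for j in remaining}
        # Early dropping: these candidates are not re-tested during this run.
        remaining = [j for j in remaining if pvals[j] <= alpha]
        if not remaining:
            break
        best = min(remaining, key=pvals.get)
        selected.append(best)
        remaining.remove(best)
    return selected

def backward(X, y, selected, alpha):
    """Standard backward phase: remove the least significant variable until
    every selected variable is significant given the others."""
    selected = list(selected)
    while selected:
        pvals = {j: partial_corr_pvalue(X, y, j, [k for k in selected if k != j])
                 for j in selected}
        worst = max(selected, key=pvals.get)
        if pvals[worst] <= alpha:
            break
        selected.remove(worst)
    return selected

def fbed(X, y, alpha=0.05, max_runs=1):
    """Forward runs with early dropping, followed by one backward phase.
    Each extra run reconsiders all variables that are not yet selected."""
    selected = []
    for _ in range(max_runs + 1):
        updated = forward_with_early_dropping(X, y, selected, alpha)
        if len(updated) == len(selected):
            break  # nothing new was added, so further runs cannot help
        selected = updated
    return sorted(backward(X, y, selected, alpha))

# Usage on synthetic data in which only columns 0 and 3 influence the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + rng.normal(size=500)
result = fbed(X, y, alpha=0.01, max_runs=1)
print(result)
```

Because dropped candidates are skipped for the rest of a run, each iteration tests a shrinking pool rather than every unselected variable, which is where the speedup over plain forward selection would come from in this sketch. Increasing `max_runs` trades some of that speed for more chances to recover variables that were dropped too early.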
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Fault Detection and Control Systems · Statistical Methods and Inference
