EntryPrune: Neural Network Feature Selection using First Impressions
Felix Zimmer, Patrik Okanovic, Torsten Hoefler

TL;DR
EntryPrune is a new neural network feature selection method that uses a dynamic sparse input layer and entry-based pruning, outperforming existing techniques in accuracy and runtime across multiple datasets.
Contribution
We introduce EntryPrune, a novel feature selection algorithm utilizing a dynamic sparse input layer and entry-based pruning, advancing neural network interpretability and efficiency.
Findings
Generally outperforms state-of-the-art feature selection methods across 13 datasets.
Improves average accuracy on low-dimensional datasets.
Achieves lower runtime than competing approaches.
Abstract
There is an ongoing effort to develop feature selection algorithms to improve interpretability, reduce computational resources, and minimize overfitting in predictive models. Neural networks stand out as architectures on which to build feature selection methods, and recently, neuron pruning and regrowth have emerged from the sparse neural network literature as promising new tools. We introduce EntryPrune, a novel supervised feature selection algorithm using a dense neural network with a dynamic sparse input layer. It employs entry-based pruning, a novel approach that compares neurons based on their relative change induced when they have entered the network. Extensive experiments on 13 different datasets show that our approach generally outperforms the current state-of-the-art methods, and in particular improves the average accuracy on low-dimensional datasets. Furthermore, we show that…
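To make the idea in the abstract concrete, here is a minimal Python sketch of one prune-and-regrow step on a sparse input layer. It is not the authors' implementation. The function names `entry_score` and `prune_and_regrow`, the use of relative weight change over a fixed window after entry as the score, the `eps` term, and the gradient-magnitude regrowth rule (which appears only in the reviews, not the abstract) are all assumptions made for illustration.

```python
import math

def entry_score(w_entry, w_later, eps=1e-12):
    """Hypothetical entry score for one input neuron.

    w_entry: outgoing weights recorded when the neuron entered the network.
    w_later: the same weights after a fixed window of training steps.
    Returns the weight change relative to the entry-time weight norm, so
    neurons are compared on their first steps rather than on total age.
    """
    change = math.sqrt(sum((b - a) ** 2 for a, b in zip(w_entry, w_later)))
    scale = math.sqrt(sum(a * a for a in w_entry))
    return change / (scale + eps)  # eps (assumed) avoids division by zero

def prune_and_regrow(active, scores, grad_norms, n_swap):
    """Swap the weakest active input features for promising inactive ones.

    active: indices of features currently connected to the network.
    scores: dict mapping each active index to its entry score.
    grad_norms: gradient magnitude per feature (assumed regrowth signal).
    """
    inactive = [j for j in range(len(grad_norms)) if j not in active]
    n_swap = min(n_swap, len(active), len(inactive))
    # Drop the active features with the lowest entry scores.
    drop = sorted(active, key=lambda j: scores[j])[:n_swap]
    # Regrow the inactive features with the largest gradient magnitude.
    grow = sorted(inactive, key=lambda j: -grad_norms[j])[:n_swap]
    new_active = sorted((set(active) - set(drop)) | set(grow))
    return new_active, drop, grow

# Toy example: 4 features, 2 of them active, 2 hidden units.
w_entry = {0: [1.0, 0.0], 1: [1.0, 0.0]}
w_later = {0: [1.1, 0.0], 1: [1.5, 0.0]}
scores = {j: entry_score(w_entry[j], w_later[j]) for j in w_entry}
grad_norms = [0.0, 0.0, 0.2, 0.9]
new_active, dropped, grown = prune_and_regrow([0, 1], scores, grad_norms, n_swap=1)
print(new_active, dropped, grown)
```

In this toy run, feature 0 changed least after entry and is dropped, while feature 3 has the largest gradient among inactive features and is regrown. Scoring each neuron over the same window after entry is one way to avoid favoring neurons that have simply been trained longer, which is the concern the reviews attribute to the method.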
Peer Reviews
Decision · Submitted to ICLR 2026
Entry-based pruning, which measures a neuron's initial impact, is a reasonable idea, and the normalization used in the method ensures a fair comparison. Results show some runtime advantage on long datasets and marginal improvement over existing techniques.
1. The core contribution of entry-based pruning is incremental at best. The main parts of the technique, gradient-based regrowth and pruning, largely borrow from prior work such as NeuroFS and RigL. Entry-based pruning is a minor adaptation rather than a truly novel contribution to the field.
2. The experimental setup, an MLP with one hidden layer of 100 neurons and a larger network with two layers, is too basic and fails to offer a convincing benchmark for current applicabi
1. The pruning approach proposed in this paper is an interesting heuristic that attempts to address the unfair difference in evaluation time between new and old neurons in dynamic sparse training.
2. Compared to NeuroFS and LassoNet, the proposed method may have lower computation time while maintaining comparable performance.
1. Dynamic sparse training is a widely researched and widely used approach. The method proposed in this manuscript reads as an incremental improvement on the existing NeuroFS framework. The most creative part is the new pruning metric. In addition, the manuscript offers no theoretical analysis of its effectiveness.
2. The manuscript seems to have ignored GBDT baselines (e.g., XGBoost and CatBoost) in the main text, and appendix B also seems to show GBDT's powerful abili
- Novel method
- Good experimental coverage (13 datasets)
- The paper is well written and easy to follow
- Although the paper motivates feature selection as a path to interpretability, it does not connect its contribution to established explainability methods such as SHAP, LIME, Integrated Gradients, or Grad-CAM. Given that the method relies on gradients and is applied to image data, this omission weakens the interpretability claim
- On wide datasets, the method offers little to no improvement over existing baselines
- The experimental setup relies mainly on SVMs, which are somewhat outdated, inc
Taxonomy
Topics: Neural Networks and Applications
Methods: Pruning · Feature Selection
