EntryPrune: Neural Network Feature Selection using First Impressions

Felix Zimmer; Patrik Okanovic; Torsten Hoefler

arXiv:2410.02344 · cs.LG · October 8, 2025

TL;DR

EntryPrune is a supervised neural network feature selection method that combines a dynamic sparse input layer with entry-based pruning. Across 13 datasets it generally outperforms existing techniques in accuracy, with lower runtime.

Contribution

We introduce EntryPrune, a novel feature selection algorithm utilizing a dynamic sparse input layer and entry-based pruning, advancing neural network interpretability and efficiency.

Findings

01. Generally outperforms current state-of-the-art feature selection methods in experiments on 13 datasets.

02. Improves average accuracy on low-dimensional datasets.

03. Achieves lower runtime than competing approaches.

Abstract

There is an ongoing effort to develop feature selection algorithms to improve interpretability, reduce computational resources, and minimize overfitting in predictive models. Neural networks stand out as architectures on which to build feature selection methods, and recently, neuron pruning and regrowth have emerged from the sparse neural network literature as promising new tools. We introduce EntryPrune, a novel supervised feature selection algorithm using a dense neural network with a dynamic sparse input layer. It employs entry-based pruning, a novel approach that compares neurons based on their relative change induced when they have entered the network. Extensive experiments on 13 different datasets show that our approach generally outperforms the current state-of-the-art methods, and in particular improves the average accuracy on low-dimensional datasets. Furthermore, we show that…
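The abstract does not spell out how the "relative change" at entry is computed, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes each input neuron is scored by how much its outgoing weights moved during a fixed window of training steps after it entered the network, normalized by the total change across all scored neurons. The function names `entry_scores` and `prune_lowest`, the window-based snapshots, and the normalization are all hypothetical choices made for illustration.

```python
import numpy as np


def entry_scores(w_at_entry, w_after_window, eps=1e-12):
    """Hypothetical entry-based score (assumption, not the paper's formula).

    Both arguments have shape (n_inputs, n_hidden): the outgoing weights of
    each input neuron when it entered the network, and the same weights after
    a fixed number of training steps. Returns each neuron's share of the
    total weight change, so scores sum to (approximately) 1.
    """
    delta = np.linalg.norm(w_after_window - w_at_entry, axis=1)
    return delta / (delta.sum() + eps)


def prune_lowest(active, scores, n_prune):
    """Drop the n_prune lowest-scoring active input neurons.

    `active` holds indices into `scores`; the surviving indices are
    returned in ascending order.
    """
    active = np.asarray(active)
    order = np.argsort(scores[active], kind="stable")
    return np.sort(active[order[n_prune:]])


# Toy usage: four input neurons that all start from zero weights.
w_entry = np.zeros((4, 3))
w_later = np.array([
    [0.1, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [1.0, 0.0, 0.0],
])
scores = entry_scores(w_entry, w_later)
kept = prune_lowest([0, 1, 2, 3], scores, n_prune=2)
print(kept)
```

Because every neuron is measured over the same window after its own entry, a neuron that has been in the network for a long time gets no advantage over one that was regrown recently, which is the fairness concern the pruning criterion is meant to address.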

Peer Reviews

Decision · Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 3

Strengths

The introduction of entry-based pruning: measuring the initial impact is reasonable, and the normalization method used in the technique ensures a fair comparison. Results show some runtime advantage on long datasets and a marginal improvement over existing techniques.

Weaknesses

1. The core contribution of entry-based pruning is incremental at best. The main parts of the technique, gradient-based regrowth and pruning, largely borrow from prior works like NeuroFS and RigL. The entry-based pruning technique is simply a minor adaptation rather than a truly novel contribution to the field.
2. The experimental setup, an MLP with one hidden layer of 100 neurons and a large network containing two layers, is too basic and fails to offer a convincing benchmark for current applicabi

Reviewer 02 · Rating 4 · Confidence 4

Strengths

1. The pruning approach proposed in this paper is an interesting heuristic that attempts to address the unfair difference in evaluation time between new and old neurons in dynamic sparse training.
2. Compared to NeuroFS and LassoNet, the proposed method may have lower computation time while maintaining comparable performance.

Weaknesses

1. Dynamic sparse training is a widely researched and used approach. The method proposed in this manuscript reads more like an incremental improvement on the existing NeuroFS framework. The most creative part is the introduction of a new pruning metric. In addition, this manuscript avoids any theoretical analysis of its effectiveness.
2. The manuscript seems to have ignored GBDT baselines, e.g., XGBoost and CatBoost, in the main text, and Appendix B also seems to show GBDT's powerful abili

Reviewer 03 · Rating 2 · Confidence 3

Strengths

- Novel method
- Good experimental coverage (13 datasets)
- The paper is well written and easy to follow

Weaknesses

- Although the paper motivates feature selection as a path to interpretability, it does not connect its contribution to established explainability methods such as SHAP, LIME, Integrated Gradients, or Grad-CAM. Given that the method relies on gradients and is applied to image data, this omission weakens the interpretability claim.
- On wide datasets, the method offers little to no improvement over existing baselines.
- The experimental setup relies mainly on SVMs, which are somewhat outdated, inc

Taxonomy

Topics: Neural Networks and Applications

Methods: Pruning · Feature Selection