# LAFS: A Fast, Differentiable Approach to Feature Selection Using Learnable Attention

**Authors:** Hıncal Topçuoğlu, Atıf Evren, Elif Tuna, Erhan Ustaoğlu

PMC · DOI: 10.3390/e28010020 · Entropy · 2025-12-24

## TL;DR

LAFS is a differentiable feature selection method that uses learnable neural attention to score all features in a single forward pass, aiming for wrapper-level accuracy at a much lower computational cost.

## Contribution

LAFS introduces a differentiable, end-to-end framework for feature selection using learnable attention and a novel hybrid loss function.

## Key findings

- LAFS identifies complex feature interactions and handles multicollinearity effectively.
- LAFS achieves performance comparable to state-of-the-art methods like RFE-LGBM and FSA.
- The hybrid loss function successfully encourages sparse and non-redundant feature selection.

## Abstract

Feature selection is a critical preprocessing step for mitigating the curse of dimensionality in machine learning. Existing methods present a difficult trade-off: filter methods are fast but often suboptimal because they evaluate features in isolation, while wrapper methods are powerful but computationally prohibitive due to their iterative nature. In this paper, we propose LAFS (Learnable Attention for Feature Selection), a novel, end-to-end differentiable framework that achieves the performance of wrapper methods at the speed of simpler models. LAFS employs a neural attention mechanism to learn a context-aware importance score for all features simultaneously in a single forward pass. To encourage the selection of a sparse and non-redundant feature subset, we introduce a novel hybrid loss function that combines the standard classification objective with an information-theoretic entropic regularizer on the attention weights. We validate our approach on real-world, high-dimensional benchmark datasets. Our experiments demonstrate that LAFS successfully identifies complex feature interactions and handles multicollinearity. Overall, LAFS achieves accuracy close to that of the state-of-the-art RFE-LGBM and embedded FSA methods. Our work establishes a new point on the accuracy-efficiency frontier, demonstrating that attention-based architectures provide a competitive solution to the feature selection problem.
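
The hybrid loss described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a binary task, a plain linear classifier, a softmax over one free logit per feature (the paper's attention is described as context-aware and may be computed differently), and a Shannon-entropy penalty weighted by a hypothetical coefficient `lam`. All function and parameter names are illustrative.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector (natural log)."""
    return float(-np.sum(p * np.log(p + eps)))

def hybrid_loss(X, y, attn_logits, w, b, lam=0.1):
    """Classification loss plus an entropic penalty on the attention weights.

    Assumed form (not taken from the paper): binary cross-entropy on
    attention-gated inputs + lam * H(attention weights).
    """
    a = softmax(attn_logits)            # one weight per feature, sums to 1
    z = (X * a) @ w + b                 # gate the features, then a linear classifier
    p = 1.0 / (1.0 + np.exp(-z))        # predicted probability of class 1
    eps = 1e-12
    bce = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return bce + lam * entropy(a), a

def top_k_features(a, k):
    """Indices of the k features with the largest attention weights."""
    return np.argsort(a)[::-1][:k]
```

Under these assumptions, minimizing the entropy term pushes the attention distribution away from uniform and toward a few dominant features, which is one way an entropic regularizer can encourage sparsity. After training, `top_k_features` would read off the selected subset from the learned weights.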

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12839874/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12839874/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/PMC12839874/full.md

---
Source: https://tomesphere.com/paper/PMC12839874