Advances in Feature Selection with Mutual Information
Michel Verleysen (DICE - MLG), Fabrice Rossi (LTCI), Damien François (CESAME)

TL;DR
This chapter reviews recent advances in feature selection with mutual information. It addresses key limitations of the criterion by using resampling techniques and by redefining the criterion to measure feature complementarity, with the aim of improving the analysis of high-dimensional data.
Contribution
It introduces methods to overcome mutual information estimation challenges and proposes a new criterion based on feature complementarity for better feature selection.
Findings
Resampling techniques help select the smoothing parameters of the mutual information estimator and determine when to stop the greedy selection procedure.
Modified mutual information measures feature complementarity, not just relevance.
Enhanced feature selection methods improve performance in high-dimensional data.
Abstract
The selection of features that are relevant for a prediction or classification problem is an important problem in many domains involving high-dimensional data. Selecting features helps fight the curse of dimensionality, improve the performance of prediction or classification methods, and interpret the application. In a nonlinear context, the mutual information is widely used as a relevance criterion for features and sets of features. Nevertheless, it suffers from at least three major limitations: mutual information estimators depend on smoothing parameters, there is no theoretically justified stopping criterion in the greedy feature selection procedure, and the estimation itself suffers from the curse of dimensionality. This chapter shows how to deal with these problems. The first two are addressed by using resampling techniques that provide a statistical basis to select the…
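
As a rough illustration of how resampling can give a statistical basis for stopping, the sketch below ranks features by a histogram-based mutual information estimate and stops at the first feature whose estimate is not significant under a permutation test. Several things here are assumptions and not taken from the chapter: the permutation test as the resampling scheme, the histogram estimator (whose `bins` argument plays the role of the smoothing parameter), the threshold `alpha`, and the function names `hist_mi`, `permutation_pvalue`, and `rank_and_stop`. It also scores each feature individually, which is a simplification of a greedy procedure that evaluates the mutual information of growing feature subsets.

```python
import numpy as np

def hist_mi(x, y, bins=8):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def permutation_pvalue(x, y, n_perm=200, bins=8, rng=None):
    """Compare the observed MI with its distribution when y is shuffled."""
    rng = np.random.default_rng() if rng is None else rng
    observed = hist_mi(x, y, bins)
    null = np.array([hist_mi(x, rng.permutation(y), bins)
                     for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p

def rank_and_stop(X, y, alpha=0.05, n_perm=200, bins=8, seed=0):
    """Rank features by univariate MI; stop at the first non-significant one."""
    rng = np.random.default_rng(seed)
    scores = [hist_mi(X[:, j], y, bins) for j in range(X.shape[1])]
    selected = []
    for j in np.argsort(scores)[::-1]:    # highest MI first
        _, p = permutation_pvalue(X[:, j], y, n_perm, bins, rng)
        if p > alpha:
            break
        selected.append(int(j))
    return selected

# Toy data (hypothetical): only feature 0 carries information about y,
# and the dependence is nonlinear, so linear correlation would miss it.
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(500, 3))
y = X[:, 0] ** 2 + 0.05 * rng.normal(size=500)
selected = rank_and_stop(X, y)
print(selected)
```

Because the permutation null is computed with the same estimator and the same `bins`, the estimator's bias affects the observed value and the null values alike, which is what makes the comparison meaningful even when the raw estimate is not accurate.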
