Mutual information for the selection of relevant variables in spectrometric nonlinear modelling
Fabrice Rossi (INRIA Rocquencourt / INRIA Sophia Antipolis), Amaury Lendasse (CIS), Damien François (CESAME), Vincent Wertz (CESAME), Michel Verleysen (DICE - MLG)

TL;DR
This paper proposes using mutual information to select relevant variables from spectrometric data for nonlinear modeling, improving interpretability without sacrificing model performance.
Contribution
It introduces a mutual information-based variable selection method that makes no assumption about the downstream model, so it suits nonlinear models, and that stays interpretable because it selects original variables rather than linear or nonlinear combinations of them.
Findings
Effective variable selection with mutual information
Maintains model performance while increasing interpretability
Applicable to nonlinear spectrometric modeling
Abstract
Data from spectrophotometers form vectors of a large number of exploitable variables. Building quantitative models using these variables most often requires using a smaller set of variables than the initial one. Indeed, too many input variables to a model result in too many parameters, leading to overfitting and poor generalization. In this paper, we suggest the use of the mutual information measure to select variables from the initial set. The mutual information measures the information content in input variables with respect to the model output, without making any assumption about the model that will be used; it is thus suitable for nonlinear modelling. In addition, it leads to the selection of variables among the initial set, and not to linear or nonlinear combinations of them. Without decreasing model performance compared to other variable…
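To make the idea in the abstract concrete, here is a minimal sketch of ranking input variables by their estimated mutual information with the output. It is an illustration under stated assumptions, not the authors' procedure: this summary does not say which estimator the paper uses, so a simple equal-width histogram (plug-in) estimator is swapped in, and the function names (`binned`, `mutual_information`, `rank_variables`), the bin count, and the synthetic data are all hypothetical.

```python
import math
import random
from collections import Counter


def binned(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant column
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=8):
    """Histogram (plug-in) estimate of I(X; Y) in nats.

    Assumption: a simple stand-in estimator, not the one from the paper.
    """
    bx, by = binned(x, n_bins), binned(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum over occupied cells of p(i, j) * log(p(i, j) / (p(i) * p(j))).
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def rank_variables(columns, y, n_bins=8):
    """Return variable indices sorted by decreasing estimated MI with y."""
    scores = [mutual_information(col, y, n_bins) for col in columns]
    order = sorted(range(len(columns)), key=lambda k: scores[k], reverse=True)
    return order, scores


# Hypothetical synthetic data: y depends nonlinearly on variable 1 only.
rng = random.Random(0)
n_samples = 2000
columns = [[rng.uniform(-1, 1) for _ in range(n_samples)] for _ in range(3)]
y = [v * v + 0.05 * rng.gauss(0, 1) for v in columns[1]]

order, scores = rank_variables(columns, y)
print("ranking:", order)
print("scores:", [round(s, 3) for s in scores])
```

In this toy setup the output is a quadratic function of one input, so its linear correlation with that input is close to zero, yet the mutual information score still singles it out. The selected items are the original variables themselves, which is what keeps the result interpretable. Histogram estimators are biased and sensitive to the bin count, so the scores are better read as a ranking than as exact information values.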
