Enhancing Variable Importance in Random Forests: A Novel Application of Global Sensitivity Analysis
Giulia Vannucci, Roberta Siciliano, Andrea Saltelli

TL;DR
This paper introduces a novel approach using Global Sensitivity Analysis to improve feature importance ranking in Random Forests, enhancing interpretability and understanding of the data generating process.
Contribution
It applies Global Sensitivity Analysis to supervised learning, providing a new method for feature importance assessment in Random Forests.
Findings
The method is effective at ranking features by their influence.
It improves the interpretability of Random Forest models.
Simulation studies validate the approach.
Abstract
The present work provides an application of Global Sensitivity Analysis to supervised machine learning methods such as Random Forests. These methods act as black boxes, selecting features in high-dimensional data sets so as to provide accurate classifiers in terms of prediction when new data are fed into the system. In supervised machine learning, predictors are generally ranked by importance based on their contribution to the final prediction. Global Sensitivity Analysis is primarily used in mathematical modelling to investigate the effect of the uncertainties of the input variables on the output. We apply it here as a novel way to rank the input features by their importance to the explainability of the data generating process, shedding light on how the response is determined by the dependence structure of its predictors. A simulation study shows that our proposal can be used to explore…
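To make the idea of ranking inputs by sensitivity concrete, the following is a minimal sketch of a variance-based Global Sensitivity Analysis: it estimates first-order Sobol indices with a standard pick-freeze Monte Carlo estimator. It is not the authors' procedure. The names `first_order_sobol` and `toy_model` are hypothetical, the inputs are assumed independent and uniform on (0, 1), and `toy_model` merely stands in for the prediction function of a fitted model such as a Random Forest.

```python
import random
import statistics


def first_order_sobol(predict, n_features, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices of `predict`.

    Assumption: inputs are independent and uniform on (0, 1).
    """
    rng = random.Random(seed)
    # Two independent sample matrices A and B (each row is one input point).
    A = [[rng.random() for _ in range(n_features)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_features)] for _ in range(n_samples)]
    f_A = [predict(row) for row in A]
    f_B = [predict(row) for row in B]
    total_var = statistics.pvariance(f_A + f_B)

    indices = []
    for i in range(n_features):
        # A_B^i: a copy of A whose i-th column is taken from B.
        f_ABi = [predict(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Pick-freeze estimate of Var(E[Y | X_i]).
        v_i = sum(fb * (fab - fa)
                  for fb, fab, fa in zip(f_B, f_ABi, f_A)) / n_samples
        indices.append(v_i / total_var)
    return indices


def toy_model(x):
    # Hypothetical stand-in for a fitted model's prediction function.
    # Analytic first-order indices for this function are 0.8, 0.2 and 0.
    return 4.0 * x[0] + 2.0 * x[1] + 0.0 * x[2]


sobol = first_order_sobol(toy_model, n_features=3)
# Rank features from most to least influential.
ranking = sorted(range(3), key=lambda i: sobol[i], reverse=True)
```

Each index estimates the share of output variance explained by one input alone, so sorting the indices gives an importance ranking. With dependent predictors, which is the setting the abstract emphasizes, the independence assumption in this sketch no longer holds and a different sampling scheme would be needed.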
Taxonomy
Topics: Probabilistic and Robust Engineering Design
