A Computational Exploration of Emerging Methods of Variable Importance Estimation
Louis Mozart Kamdem, Ernest Fokoue

TL;DR
This paper provides a computational and theoretical comparison of emerging variable importance estimation methods like LASSO, SVM, PERF, RF, and XGBOOST across various datasets, highlighting their strengths and limitations.
Contribution
It offers a comprehensive analysis of recent variable importance techniques, evaluating their performance and suitability for different data scenarios in machine learning.
Findings
PERF performs best with highly correlated data.
RF and PERF are fastest but require large datasets.
SVM is effective with redundant features.
Abstract
Estimating the importance of variables is an essential task in modern machine learning. It helps to evaluate the usefulness of a feature in a given model. Several techniques for estimating the importance of variables have been developed during the last decade. In this paper, we propose a computational and theoretical exploration of the emerging methods of variable importance estimation, namely: Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine (SVM), the Predictive Error Function (PERF), Random Forest (RF), and Extreme Gradient Boosting (XGBOOST), which were tested on different kinds of real-life and simulated data. All of these methods handle both regression and classification tasks seamlessly, but all fail on data containing missing values. The implementation has shown that PERF has the best performance in the case of highly…
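To make the idea of variable importance concrete, here is a minimal sketch for one of the methods named above, LASSO. It is not the authors' implementation: it assumes that importance is read off as the absolute value of each coefficient fitted on standardized features, and it uses a plain coordinate-descent solver written with the Python standard library. The function names (`soft_threshold`, `standardize`, `lasso_importance`), the penalty value `alpha=0.1`, and the simulated data are all hypothetical choices made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardize(columns):
    """Center each feature column and scale it to unit (population) variance."""
    result = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        result.append([(v - mean) / std for v in col])
    return result


def lasso_importance(columns, y, alpha=0.1, n_iter=200):
    """Fit LASSO by coordinate descent and return |coefficient| per feature.

    Objective: (1 / (2n)) * ||y - X b||^2 + alpha * ||b||_1,
    with X standardized and y centered (so no intercept is needed).
    """
    X = standardize(columns)
    n = len(y)
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # residual for the all-zero start
    coef = [0.0] * len(X)
    for _ in range(n_iter):
        for j, col in enumerate(X):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * coef[j]) for c, r in zip(col, residual)) / n
            # Unit-variance columns make the update a plain soft-threshold.
            new_coef = soft_threshold(rho, alpha)
            delta = new_coef - coef[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                coef[j] = new_coef
    return [abs(b) for b in coef]


# Simulated data (an assumption for illustration): the response depends
# strongly on x0, moderately on x1, and not at all on x2.
random.seed(0)
n_samples = 200
x0 = [random.gauss(0, 1) for _ in range(n_samples)]
x1 = [random.gauss(0, 1) for _ in range(n_samples)]
x2 = [random.gauss(0, 1) for _ in range(n_samples)]
y = [3 * a - 2 * b + random.gauss(0, 0.5) for a, b in zip(x0, x1)]

importance = lasso_importance([x0, x1, x2], y, alpha=0.1)
for name, score in zip(["x0", "x1", "x2"], importance):
    print(name, round(score, 3))
```

On data simulated this way, the scores should rank x0 above x1 and leave the irrelevant x2 at or near zero, which is the shrinkage behaviour that makes LASSO usable for variable selection. A larger `alpha` drives more coefficients to exactly zero. In line with the abstract's remark about missing values, this sketch also assumes complete data.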
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Face and Expression Recognition
Methods: Support Vector Machine
