The effect of different feature selection methods on models created with XGBoost

Jorge Neyra, Vishal B. Siramshetty, and Huthaifa I. Ashqar

arXiv:2411.05937·cs.LG·November 12, 2024

Open Access

TL;DR

This paper investigates how various feature selection techniques impact XGBoost models, finding that feature reduction methods do not significantly affect accuracy but may reduce computational costs.

Contribution

The paper reports that feature selection methods do not significantly change XGBoost's prediction accuracy, which challenges the traditional assumption that noisy inputs must be removed to prevent overfitting.

Findings

01

Feature selection methods do not significantly alter prediction accuracy.

02

Dimensionality reduction can lower computational complexity.

03

Traditional noise removal may not be necessary for XGBoost.

Abstract

This study examines the effect that different feature selection methods have on models created with XGBoost, a popular machine learning algorithm with superb regularization methods. It shows that three different ways of reducing the dimensionality of the features produce no statistically significant change in the prediction accuracy of the model. This suggests that the traditional idea of removing noisy training data so that models do not overfit may not apply to XGBoost. Feature selection may still be worthwhile, however, as a way to reduce computational complexity.
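The abstract does not say which statistical test was used. As a minimal sketch of what "no statistically significant change" could mean in practice, the block below runs a paired t-test on per-fold accuracies from two models, one trained on all features and one trained on a reduced set. The function name `paired_t_statistic`, the choice of test, and the accuracy values are all assumptions made for illustration. The numbers are placeholders, not results from the paper.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two paired scores")
    # Work on per-fold differences, since both models saw the same folds.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0.0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical placeholder accuracies for 5 cross-validation folds.
# These are NOT results from the paper.
all_features = [0.90, 0.92, 0.91, 0.89, 0.93]
selected_features = [0.91, 0.91, 0.92, 0.88, 0.93]

t_stat, dof = paired_t_statistic(all_features, selected_features)

# Two-sided critical value of Student's t for alpha = 0.05 and 4 degrees
# of freedom, taken from a standard t-table.
T_CRITICAL_DOF4 = 2.776
significant = abs(t_stat) > T_CRITICAL_DOF4
print(f"t = {t_stat:.3f}, dof = {dof}, significant = {significant}")
```

In a real comparison, the per-fold accuracies would come from cross-validating an XGBoost model with and without each feature selection method, and a library routine such as `scipy.stats.ttest_rel` would give a p-value directly instead of a table lookup.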

Taxonomy

Topics: Iterative Learning Control Systems · Fuzzy Logic and Control Systems · Real-time simulation and control systems

Methods: Feature Selection