Feature Selection via Regularized Trees

Houtao Deng; George Runger

arXiv:1201.1587·cs.LG·March 22, 2012·55 cites

Open Access

TL;DR

This paper introduces a tree regularization framework for feature selection in tree-based models. It penalizes splitting on a new feature when its gain is similar to that of features already used in previous splits, and the resulting feature subsets are of high quality with regard to both strong and weak classifiers.

Contribution

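The paper presents a regularization framework that lets tree models perform feature selection efficiently by penalizing a new feature whose gain is similar to that of features used in previous splits. It is applied to random forests and boosted trees and can be easily applied to other tree models.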

Findings

01

Regularized trees select high-quality feature subsets.

02

The selected subsets hold up with both strong and weak classifiers.

03

Applied to random forests and boosted trees, and easily applied to other tree models.

Abstract

We propose a tree regularization framework, which enables many tree models to perform feature selection efficiently. The key idea of the regularization framework is to penalize selecting a new feature for splitting when its gain (e.g. information gain) is similar to the features used in previous splits. The regularization framework is applied on random forest and boosted trees here, and can be easily applied to other tree models. Experimental studies show that the regularized trees can select high-quality feature subsets with regard to both strong and weak classifiers. Because tree models can naturally deal with categorical and numerical variables, missing values, different scales between variables, interactions and nonlinearities etc., the tree regularization framework provides an effective and efficient feature selection solution for many practical problems.
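The key idea in the abstract can be illustrated with a small sketch. It assumes one simple form of penalty: the gain of a feature that has not yet been used in any split is multiplied by a coefficient `lam` in [0, 1], while features already used keep their full gain. The function names, the `lam` parameter, and the restriction to categorical columns are illustrative choices made here, not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain from splitting on each distinct value of a categorical column."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def regularized_gain(gain, feature, used_features, lam=0.5):
    """Assumed penalty: discount the gain of a feature not used in earlier splits."""
    return gain if feature in used_features else lam * gain


def choose_split(columns, labels, used_features, lam=0.5):
    """Pick the feature with the highest regularized gain and record it as used."""
    best_feature, best_score = None, 0.0
    for name, column in columns.items():
        gain = information_gain(column, labels)
        score = regularized_gain(gain, name, used_features, lam)
        if score > best_score:  # strict comparison: ties keep the earlier feature
            best_feature, best_score = name, score
    if best_feature is not None:
        used_features.add(best_feature)
    return best_feature, best_score


# Toy data: "a" and "b" are equally informative (redundant), "c" is uninformative.
labels = [0, 0, 1, 1]
columns = {"a": [0, 0, 1, 1], "b": [1, 1, 0, 0], "c": [0, 1, 0, 1]}

used = set()
first = choose_split(columns, labels, used)   # both "a" and "b" are new and penalized
second = choose_split(columns, labels, used)  # "a" is now unpenalized and beats "b"
print(first, second, used)
```

In this toy case the redundant feature `b` is never added to the used set, because once `a` has been chosen its unpenalized gain exceeds the discounted gain of `b`. Values of `lam` closer to 1 weaken the penalty and admit more features.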

Taxonomy

Topics: Neural Networks and Applications · Face and Expression Recognition · Data Mining Algorithms and Applications