Feature selection when there are many influential features

Peter Hall; Jiashun Jin; Hugh Miller

arXiv:0911.4076·math.ST·July 10, 2014

TL;DR

This paper proposes a general approach to feature selection for settings where the number of influential features is in the thousands or tens of thousands, rather than the five to fifty commonly assumed. It also proposes ways of measuring performance and studies the theoretical and numerical properties of the methodology.

Contribution

The paper introduces a general feature selection approach for cases where the number of relevant features is very large, examines particular versions of it in detail, and proposes ways of measuring the success of feature selection in that regime.

Findings

01. The methodology performs well in high-dimensional settings.

02. Theoretical analysis supports the approach's effectiveness.

03. Numerical experiments demonstrate practical applicability.

Abstract

Recent discussion of the success of feature selection methods has argued that focusing on a relatively small number of features has been counterproductive. Instead, it is suggested, the number of significant features can be in the thousands or tens of thousands, rather than (as is commonly supposed at present) approximately in the range from five to fifty. This change, in orders of magnitude, in the number of influential features, necessitates alterations to the way in which we choose features and to the manner in which the success of feature selection is assessed. In this paper, we suggest a general approach that is suited to cases where the number of relevant features is very large, and we consider particular versions of the approach in detail. We propose ways of measuring performance, and we study both theoretical and numerical properties of the proposed methodology.
