Feature Selection for Regression Problems Based on the Morisita Estimator of Intrinsic Dimension
Jean Golay, Michael Leuenberger, Mikhail Kanevski

TL;DR
This paper presents a feature selection method for regression that uses the Morisita estimator of intrinsic dimension to identify relevant features and discard redundant ones. The method is evaluated on simulated and real-world datasets.
Contribution
It introduces a supervised filter based on the Morisita estimator of intrinsic dimension that gives a clear graphical representation of feature relevance and is easy to implement.
Findings
Effective in identifying relevant features across various datasets
Outperforms RReliefF in real-world applications
Provides a new measure of feature relevance
Abstract
Data acquisition, storage and management have been improved, while the key factors of many phenomena are not well known. Consequently, irrelevant and redundant features artificially increase the size of datasets, which complicates learning tasks, such as regression. To address this problem, feature selection methods have been proposed. This paper introduces a new supervised filter based on the Morisita estimator of intrinsic dimension. It can identify relevant features and distinguish between redundant and irrelevant information. Besides, it offers a clear graphical representation of the results, and it can be easily implemented in different programming languages. Comprehensive numerical experiments are conducted using simulated datasets characterized by different levels of complexity, sample size and noise. The suggested algorithm is also successfully tested on a selection of real…
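The abstract names the Morisita estimator of intrinsic dimension without describing it. As a rough illustration only, the sketch below assumes the two-point Morisita index computed on a regular grid over data rescaled to the unit hypercube, and estimates the intrinsic dimension as the embedding dimension minus the slope of log(index) against log(1/cell size). The function names `morisita_index` and `morisita_dimension` and the default `grid_sizes` are hypothetical choices for this example, not taken from the paper, and the paper's filter itself is not reproduced here.

```python
import numpy as np


def morisita_index(points, cells_per_axis):
    """Two-point Morisita index on a regular grid; points must lie in [0, 1]^E."""
    n_points, n_dims = points.shape
    # Map each point to its grid cell; clip so the value 1.0 lands in the last cell.
    cell_ids = np.minimum((points * cells_per_axis).astype(int), cells_per_axis - 1)
    # Count points per occupied cell (empty cells contribute nothing to the sum).
    _, counts = np.unique(cell_ids, axis=0, return_counts=True)
    n_cells = float(cells_per_axis) ** n_dims
    return n_cells * np.sum(counts * (counts - 1)) / (n_points * (n_points - 1))


def morisita_dimension(points, grid_sizes=(2, 4, 8, 16)):
    """Estimate intrinsic dimension as E minus the log-log slope of the index.

    Assumption: the chosen grid sizes fall in a range where the log-log
    relation is close to linear and at least one cell holds two points.
    """
    points = np.asarray(points, dtype=float)
    lower = points.min(axis=0)
    span = points.max(axis=0) - lower
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    scaled = (points - lower) / span
    # With cell size delta = 1 / cells_per_axis, log(1 / delta) = log(cells_per_axis).
    log_inv_delta = np.log(np.array(grid_sizes, dtype=float))
    log_index = np.log([morisita_index(scaled, g) for g in grid_sizes])
    slope = np.polyfit(log_inv_delta, log_index, 1)[0]
    return scaled.shape[1] - slope
```

Under these assumptions, points spread uniformly over a square should give an estimate near 2, while points lying on a line embedded in the plane should give an estimate near 1. This is the kind of gap a dimension-based filter could use to tell redundant features from informative ones.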
