Benchmarking Relief-Based Feature Selection Methods for Bioinformatics Data Mining
Ryan J. Urbanowicz, Randal S. Olson, Peter Schmitt, Melissa Meeker, Jason H. Moore

TL;DR
This paper benchmarks Relief-Based feature selection methods for bioinformatics, demonstrating their flexibility, efficiency, and effectiveness in identifying complex feature associations across diverse biomedical data types.
Contribution
It introduces ReBATE, an open source framework for Relief-Based algorithms (RBAs), and provides a comprehensive comparison of RBAs, including a new method, MultiSURF, across various simulated biomedical data scenarios.
Findings
RBAs are flexible and powerful for detecting various feature associations
MultiSURF performs well across diverse problem types
The benchmark identifies limitations of specific RBAs
Abstract
Modern biomedical data mining requires feature selection methods that can (1) be applied to large-scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data), and (5) remain computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the 'Relief' algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Training Environment). We apply a comprehensive genetic simulation study comparing existing RBAs, a proposed RBA called MultiSURF, and other established feature selection methods, over a variety of problems. The results of this…
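For readers unfamiliar with the 'Relief' idea the abstract refers to, the sketch below may help. It illustrates the basic two-class Relief weight update (reward a feature for differing at the nearest miss, penalize it for differing at the nearest hit) on a toy problem with a pure two-feature interaction. It is not ReBATE code and not MultiSURF: the function name `relief_scores`, the toy data, and the choice to use every instance as a target once (rather than a random sample) are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence


def relief_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Basic two-class Relief, using every instance as the target once.

    Hypothetical helper for illustration only; not the ReBATE API.
    """
    n, p = len(X), len(X[0])

    # Value range of each feature, used to scale differences into [0, 1].
    spans = []
    for a in range(p):
        col = [row[a] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a: int, i: int, j: int) -> float:
        return abs(X[i][a] - X[j][a]) / spans[a]

    def dist(i: int, j: int) -> float:
        # Manhattan distance over the scaled feature differences.
        return sum(diff(a, i, j) for a in range(p))

    weights = [0.0] * p
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Nearest neighbor with the same class (hit) and the other class (miss).
        hit = min((j for j in others if y[j] == y[i]), key=lambda j: dist(i, j))
        miss = min((j for j in others if y[j] != y[i]), key=lambda j: dist(i, j))
        for a in range(p):
            weights[a] += (diff(a, i, miss) - diff(a, i, hit)) / n
    return weights


# Toy data (assumed): the class is the XOR of features 0 and 1, so neither
# feature is informative on its own; feature 2 is irrelevant to the class.
X = [list(row) for row in product([0, 1], repeat=3)]
y = [row[0] ^ row[1] for row in X]

weights = relief_scores(X, y)
ranking = sorted(range(len(weights)), key=lambda a: weights[a], reverse=True)
print("weights:", weights)
print("features ranked best to worst:", ranking)
```

On this toy problem the two interacting features score above the irrelevant one, even though neither is associated with the class individually. The RBAs compared in the paper differ from this sketch mainly in how neighbors are chosen and weighted, which the sketch does not attempt to reproduce.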
Taxonomy
Topics: Evolutionary Algorithms and Applications · Data Mining Algorithms and Applications · Gene expression and cancer classification
