KGroups: A Versatile Univariate Max-Relevance Min-Redundancy Feature Selection Algorithm for High-dimensional Biological Data
Malick Ebiele, Malika Bendechache, Rob Brennan

TL;DR
KGroups is a new univariate feature selection algorithm that uses clustering. On high-dimensional biological data it matches the predictive performance of multivariate mRMR while running up to 821 times faster.
Contribution
Introduces KGroups, a novel univariate mRMR feature selection method employing clustering, offering a faster alternative with competitive accuracy.
Findings
KGroups achieves similar predictive performance to multivariate mRMR.
KGroups is up to 821 times faster than multivariate mRMR methods.
KGroups outperforms KBest in predictive accuracy.
Abstract
This paper proposes a new univariate filter feature selection (FFS) algorithm called KGroups. The majority of work in the literature focuses on investigating the relevance or redundancy estimations of feature selection (FS) methods. This has shown promising results and a real improvement of FFS methods' predictive performance. However, limited efforts have been made to investigate alternative FFS algorithms. This raises the following question: how much of the FFS methods' predictive performance depends on the selection algorithm rather than the relevance or the redundancy estimations? The majority of FFS methods fall into two categories: relevance maximisation (Max-Rel, also known as KBest) or simultaneous relevance maximisation and redundancy minimisation (mRMR). KBest is a univariate FFS algorithm that employs sorting (descending) for selection. mRMR is a multivariate FFS algorithm…
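To make the KBest baseline described in the abstract concrete, here is a minimal sketch of univariate selection by descending sort. It is an illustration under stated assumptions, not the authors' code: the function names `relevance_scores` and `kbest` are hypothetical, and absolute Pearson correlation is assumed as the relevance measure only because it is simple to compute.

```python
import numpy as np

def relevance_scores(X, y):
    """Absolute Pearson correlation of each column of X with y.

    Assumption: the relevance measure is chosen for illustration only.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; give them zero relevance.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, np.abs(Xc.T @ yc) / safe, 0.0)

def kbest(scores, k):
    """Return the indices of the k highest-scoring features, best first."""
    order = np.argsort(-scores, kind="stable")  # descending sort
    return order[:k]

# Synthetic data: features 2 and 4 are both noisy copies of the target,
# so they are highly relevant and also redundant with each other.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
X = rng.normal(size=(200, 6))
X[:, 2] = y + 0.1 * rng.normal(size=200)
X[:, 4] = y + 0.1 * rng.normal(size=200)

selected = kbest(relevance_scores(X, y), k=2)
print(sorted(selected.tolist()))
```

In this toy setup the two selected features are the two near-duplicates of the target. Sorting on relevance alone never checks whether the chosen features overlap, which is the gap that redundancy minimisation in mRMR is meant to address.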
