Principled Non-Linear Feature Selection
Dimitrios Athanasakis, John Shawe-Taylor, Delmiro Fernandez-Reyes

TL;DR
This paper introduces randSel, a scalable randomized feature selection method with probabilistic guarantees for identifying relevant features. It reports encouraging empirical results, including 3rd place in the ICML black box learning challenge and competitive results for signal peptide prediction in bioinformatics.
Contribution
The paper presents randSel, a novel randomized feature selection algorithm with proven probabilistic guarantees and improved scalability over existing greedy methods.
Findings
Achieved 3rd place in the ICML black box learning challenge
Demonstrated competitive results in bioinformatics signal peptide prediction
Provided theoretical analysis with probabilistic guarantees for feature relevance
Abstract
Recent non-linear feature selection approaches employing greedy optimisation of Centred Kernel Target Alignment (KTA) exhibit strong results in terms of generalisation accuracy and sparsity. However, they are computationally prohibitive for large datasets. We propose randSel, a randomised feature selection algorithm with attractive scaling properties. Our theoretical analysis of randSel provides strong probabilistic guarantees for correct identification of relevant features. RandSel's characteristics make it an ideal candidate for identifying informative learned representations. We conducted experiments to establish the performance of this approach and present encouraging results, including a 3rd-place result in the recent ICML black box learning challenge as well as competitive results for signal peptide prediction, an important problem in bioinformatics.
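For readers unfamiliar with the criterion named in the abstract, the following is a minimal sketch of centred kernel target alignment in its standard form: the Frobenius inner product of the centred kernel matrix and the centred label matrix yy^T, divided by the product of their Frobenius norms. It is not the paper's randSel algorithm, whose details are not given here. The function names, the RBF kernel, the gamma value, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def centre_kernel(K):
    """Centre a kernel matrix in feature space: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def centred_alignment(K, y):
    """Centred alignment between kernel matrix K and the label kernel y y^T."""
    y = np.asarray(y, dtype=float)
    Kc = centre_kernel(K)
    Yc = centre_kernel(np.outer(y, y))
    num = np.sum(Kc * Yc)                           # Frobenius inner product
    den = np.linalg.norm(Kc) * np.linalg.norm(Yc)   # Frobenius norms
    return num / den if den > 0 else 0.0


def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix; gamma is an illustrative choice."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


# Synthetic data (assumption): one feature that tracks the label, one pure noise.
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 50)
informative = y[:, None] + 0.3 * rng.standard_normal((100, 1))
noise = rng.standard_normal((100, 1))

a_inf = centred_alignment(rbf_kernel(informative), y)
a_noise = centred_alignment(rbf_kernel(noise), y)
print(f"alignment (informative feature): {a_inf:.3f}")
print(f"alignment (noise feature):       {a_noise:.3f}")
```

A kernel built on the informative feature should align far better with the labels than one built on noise. A greedy selector would re-evaluate this score for many candidate feature subsets, and each evaluation costs at least quadratic time in the number of samples, which is the scaling concern the abstract raises.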
Taxonomy
Topics: Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms · Gene expression and cancer classification
