Regularized Greedy Column Subset Selection
Bruno Ordozgoiti, Alberto Mozo, Jesús García López de Lacalle

TL;DR
This paper introduces a regularized version of the Column Subset Selection Problem, along with a greedy algorithm that enhances robustness and stability in feature selection, especially with noisy or scarce data.
Contribution
It proposes a novel regularized formulation and an efficient greedy algorithm for feature selection, improving robustness and stability over existing methods.
Findings
Enhanced robustness to noise and scarce data
Improved conditioning of selected features
Maintains efficiency comparable to existing greedy algorithms
Abstract
The Column Subset Selection Problem provides a natural framework for unsupervised feature selection. Despite being a hard combinatorial optimization problem, there exist efficient algorithms that provide good approximations. The drawback of the problem formulation is that it incorporates no form of regularization, and is therefore very sensitive to noise when presented with scarce data. In this paper we propose a regularized formulation of this problem, and derive a correct greedy algorithm that is similar in efficiency to existing greedy methods for the unregularized problem. We study its adequacy for feature selection and propose suitable formulations. Additionally, we derive a lower bound for the error of the proposed problems. Through various numerical experiments on real and synthetic data, we demonstrate the significantly increased robustness and stability of our method, as well…
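Example (illustrative sketch, not the authors' algorithm)
The abstract does not spell out the form of the regularizer, so the sketch below assumes a ridge (Tikhonov) penalty: pick k columns C of A to minimize min_X ||A - C X||_F^2 + lam * ||X||_F^2. The function names `ridge_objective` and `greedy_regularized_css` and the parameter `lam` are hypothetical, chosen only for this example. The loop re-solves the ridge problem for every candidate column, which is far slower than the efficient greedy updates the paper describes; it is meant only to show what greedy selection under a regularized objective could look like.

```python
import numpy as np


def ridge_objective(A, cols, lam):
    """Value of min_X ||A - C X||_F^2 + lam * ||X||_F^2 with C = A[:, cols].

    Assumes lam > 0 so that the regularized Gram matrix is invertible.
    """
    C = A[:, cols]
    gram = C.T @ C + lam * np.eye(len(cols))   # regularized Gram matrix
    X = np.linalg.solve(gram, C.T @ A)         # ridge regression coefficients
    residual = A - C @ X
    return np.sum(residual ** 2) + lam * np.sum(X ** 2)


def greedy_regularized_css(A, k, lam):
    """Naive greedy selection: at each step add the column that lowers
    the assumed ridge objective the most."""
    selected = []
    remaining = list(range(A.shape[1]))
    for _ in range(k):
        best = min(
            remaining,
            key=lambda j: ridge_objective(A, selected + [j], lam),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))          # synthetic data, 10 features
    cols = greedy_regularized_css(A, k=3, lam=0.1)
    print("selected columns:", cols)
    print("objective:", ridge_objective(A, cols, 0.1))
```

Under this assumed objective, adding a column can never increase the optimum, because setting the new coefficient row to zero reproduces the previous solution. The penalty also keeps the linear solve well-posed when the chosen columns are nearly collinear, which is one way to read the "improved conditioning" finding above.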
