High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
Makoto Yamada, Wittawat Jitkrittum, Leonid Sigal, Eric P. Xing, Masashi Sugiyama

TL;DR
This paper introduces a feature-wise kernelized Lasso, a scalable feature selection method that finds features with non-linear statistical dependence on the output in high-dimensional data.
Contribution
It proposes a novel feature-wise kernelized Lasso approach that captures non-linear dependencies and can be computed efficiently for high-dimensional datasets.
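The objective itself is not given on this page. As a sketch of how a feature-wise kernelized Lasso is usually written, assume n samples, d features, a Gram matrix K^{(k)} computed from the k-th feature alone, a Gram matrix L computed from the outputs, and a regularization weight lambda. All of these symbols are assumptions introduced here for illustration.

```latex
\min_{\alpha \in \mathbb{R}^{d}} \;
  \frac{1}{2} \Bigl\| \bar{L} - \sum_{k=1}^{d} \alpha_k \bar{K}^{(k)} \Bigr\|_{\mathrm{F}}^{2}
  + \lambda \|\alpha\|_{1}
\quad \text{s.t.} \quad \alpha_1, \dots, \alpha_d \ge 0,
\qquad
\bar{K}^{(k)} = \Gamma K^{(k)} \Gamma, \;\;
\bar{L} = \Gamma L \Gamma, \;\;
\Gamma = I_n - \tfrac{1}{n} \mathbf{1}_n \mathbf{1}_n^{\top}.
```

Under these assumptions, expanding the squared Frobenius norm yields inner products between centered Gram matrices. The cross terms between \bar{K}^{(k)} and \bar{L} reward features that depend on the output, and the terms between \bar{K}^{(k)} and \bar{K}^{(l)} penalize selecting mutually dependent (redundant) features. The problem is convex in \alpha, which is consistent with the claim that a globally optimal solution can be found.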
Findings
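With particular kernel choices, the method finds non-redundant features that have strong statistical dependence on the output.
The globally optimal solution can be computed efficiently, which makes the method scalable to high-dimensional problems.
Feature selection experiments with thousands of features demonstrate the method's effectiveness.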
Abstract
The goal of supervised feature selection is to find a subset of input features that are responsible for predicting output values. The least absolute shrinkage and selection operator (Lasso) allows computationally efficient feature selection based on linear dependency between input features and output values. In this paper, we consider a feature-wise kernelized Lasso for capturing non-linear input-output dependency. We first show that, with particular choices of kernel functions, non-redundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures. We then show that the globally optimal solution can be efficiently computed; this makes the approach scalable to high-dimensional problems. The effectiveness of the proposed method is demonstrated through feature selection experiments with thousands of features.
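The idea in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation. It assumes Gaussian kernels of width 1 on standardized inputs and outputs, Frobenius-normalized centered Gram matrices, and a plain non-negative coordinate descent solver. The function names (`gaussian_gram`, `center`, `featurewise_kernel_lasso`) and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def gaussian_gram(v, sigma=1.0):
    """Gaussian-kernel Gram matrix of a 1-D sample vector (assumed kernel choice)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def center(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def featurewise_kernel_lasso(X, y, lam=0.05, n_iter=100):
    """Non-negative Lasso on vectorized, centered, per-feature Gram matrices.

    Minimizes 0.5 * ||b - A @ alpha||^2 + lam * sum(alpha) subject to alpha >= 0,
    where column k of A is the flattened Gram matrix of feature k and b is the
    flattened Gram matrix of the output.
    """
    n, d = X.shape
    eps = 1e-12
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + eps)   # standardize inputs
    ys = (y - y.mean()) / (y.std() + eps)               # standardize output

    # One flattened, centered Gram matrix per feature: shape (n*n, d).
    A = np.stack([center(gaussian_gram(Xs[:, k])).ravel() for k in range(d)], axis=1)
    b = center(gaussian_gram(ys)).ravel()
    A = A / (np.linalg.norm(A, axis=0, keepdims=True) + eps)
    b = b / (np.linalg.norm(b) + eps)

    G = A.T @ A   # feature-feature dependence (redundancy) terms
    c = A.T @ b   # feature-output dependence (relevance) terms

    alpha = np.zeros(d)
    for _ in range(n_iter):
        for k in range(d):
            # Relevance of feature k minus what the other features already explain.
            r = c[k] - G[k] @ alpha + G[k, k] * alpha[k]
            alpha[k] = max(0.0, (r - lam) / (G[k, k] + eps))
    return alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    # The output depends on feature 0 only, and purely non-linearly.
    y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(200)
    alpha = featurewise_kernel_lasso(X, y)
    print("features ranked by weight:", np.argsort(-alpha))
```

Features with a non-zero entry in `alpha` are the selected ones. Because the sketch stores one n-by-n Gram matrix per feature, memory grows as n²·d, so it is only suitable for small illustrative problems.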
