Large-scale Nonlinear Variable Selection via Kernel Random Features

Magda Gregorová; Jason Ramapuram; Alexandros Kalousis; Stéphane Marchand-Maillet

arXiv:1804.07169·cs.LG·September 5, 2018

TL;DR

This paper introduces a scalable kernel-based variable selection method for nonlinear regression that efficiently handles large datasets by using random features, improving variable relevance discovery and prediction accuracy.

Contribution

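The paper presents what the authors describe as the first kernel-based variable selection method applicable to large datasets. It uses random features to avoid the poor scaling of kernel methods, and it learns which input variables are relevant jointly with the prediction model.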

Findings

01 Outperforms existing methods on synthetic datasets

02 Effective in real-world large-scale applications

03 Accurately identifies relevant variables in nonlinear models

Abstract

We propose a new method for input variable selection in nonlinear regression. The method is embedded into a kernel regression machine that can model general nonlinear functions, not being a priori limited to additive models. This is the first kernel-based variable selection method applicable to large datasets. It sidesteps the typical poor scaling properties of kernel methods by mapping the inputs into a relatively low-dimensional space of random features. The algorithm discovers the variables relevant for the regression task together with learning the prediction model through learning the appropriate nonlinear random feature maps. We demonstrate the outstanding performance of our method on a set of large-scale synthetic and real datasets.
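The idea of selecting variables through a random feature map can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the parameterisation or the optimisation procedure, so the per-variable scaling vector `gamma`, the Gaussian-kernel random Fourier features, the ridge regression fit, and the toy data are all assumptions made for illustration. Here `gamma` is fixed by hand to mask variables, whereas the paper learns the feature maps during training.

```python
import numpy as np

def random_feature_map(X, gamma, W, b):
    """Random Fourier features approximating a Gaussian kernel.

    gamma holds one non-negative scale per input variable (an assumed
    parameterisation); a zero entry removes that variable from the map.
    """
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos((X * gamma) @ W + b)

def ridge_fit(Z, y, lam=1e-3):
    """Closed-form ridge regression in the D-dimensional feature space."""
    D = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)

rng = np.random.default_rng(0)
n, d, D = 2000, 5, 300                    # samples, input variables, random features
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.sin(3.0 * X[:, 0]) * X[:, 1]       # non-additive; only variables 0 and 1 matter

W = rng.normal(size=(d, D))               # random frequencies (Gaussian kernel)
b = rng.uniform(0.0, 2.0 * np.pi, size=D) # random phases

def train_mse(gamma):
    """Training error of ridge regression on features built with a given gamma."""
    Z = random_feature_map(X, gamma, W, b)
    coef = ridge_fit(Z, y)
    return float(np.mean((Z @ coef - y) ** 2))

# Compare a mask that keeps only the relevant variables with one that drops them.
mse_relevant_only = train_mse(np.array([1.0, 1.0, 0.0, 0.0, 0.0]))
mse_irrelevant_only = train_mse(np.array([0.0, 0.0, 1.0, 1.0, 1.0]))
print(mse_relevant_only, mse_irrelevant_only)
```

The linear solve costs on the order of n·D² + D³ operations rather than the n³ of an exact kernel machine, which is the scaling benefit the abstract refers to. In this toy setting, a sparse `gamma` that keeps only variables 0 and 1 still fits the target, which is the behaviour a learned selection would aim for.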
