Iteratively reweighted kernel machines efficiently learn sparse functions

Libin Zhu; Damek Davis; Dmitriy Drusvyatskiy; Maryam Fazel

arXiv:2505.08277·stat.ML·October 6, 2025

TL;DR

This paper argues that classical kernel methods can efficiently learn sparse, hierarchical functions by using derivatives of the kernel predictor to reweight the data, challenging the notion that only neural networks can learn such structure.

Contribution

The paper introduces an iteratively reweighted kernel machine: derivatives of the fitted kernel predictor are used to reweight the data before retraining, which lets the method learn sparse and hierarchical functions with low sample complexity.

Findings

01

The derivative of the kernel predictor detects the influential coordinates with low sample complexity.

02

Iteratively reweighting the data with these derivatives and retraining efficiently learns hierarchical polynomials with finite leap complexity.

03

Numerical experiments validate the theoretical results.

Abstract

The impressive practical performance of neural networks is often attributed to their ability to learn low-dimensional data representations and hierarchical structure directly from data. In this work, we argue that these two phenomena are not unique to neural networks, and can be elicited from classical kernel methods. Namely, we show that the derivative of the kernel predictor can detect the influential coordinates with low sample complexity. Moreover, by iteratively using the derivatives to reweight the data and retrain kernel machines, one is able to efficiently learn hierarchical polynomials with finite leap complexity. Numerical experiments illustrate the developed theory.
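
The reweight-and-retrain loop described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm as stated in the paper; it is one plausible instance under stated assumptions: a Gaussian kernel with per-coordinate weights, kernel ridge regression as the kernel machine, and the mean squared partial derivative of the fitted predictor as the coordinate importance. The function names, the bandwidth `s2 = d`, the ridge value, and the toy target `x0 + x0*x1` are all hypothetical choices made for illustration.

```python
import numpy as np

def weighted_gaussian_kernel(A, B, w, s2):
    """Gaussian kernel evaluated on coordinate-rescaled inputs w * x."""
    Za, Zb = A * w, B * w
    d2 = (Za**2).sum(1)[:, None] + (Zb**2).sum(1)[None, :] - 2.0 * Za @ Zb.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * s2))

def reweighted_krr(X, y, n_iters=3, ridge=1e-2):
    """Alternate between fitting kernel ridge regression and reweighting
    coordinates by the size of the predictor's partial derivatives.
    (Illustrative assumption, not the paper's exact procedure.)"""
    n, d = X.shape
    s2 = float(d)          # assumed bandwidth: squared distances scale with d
    w = np.ones(d)         # start with all coordinates weighted equally
    for _ in range(n_iters):
        K = weighted_gaussian_kernel(X, X, w, s2)
        alpha = np.linalg.solve(K + ridge * np.eye(n), y)
        # Predictor: f(x) = sum_i alpha_i * k(x, x_i).
        # Its gradient at training point x_k, coordinate j, is
        #   -(w_j^2 / s2) * sum_i alpha_i * K[k, i] * (X[k, j] - X[i, j]).
        grad = -(w**2 / s2) * (X * (K @ alpha)[:, None] - K @ (alpha[:, None] * X))
        importance = (grad**2).mean(axis=0)   # mean squared partial derivative
        # New weights, normalized so sum(w^2) = d keeps the kernel scale fixed.
        w = np.sqrt(d * importance / importance.sum())
    return w

# Toy data: only coordinates 0 and 1 out of 10 influence the target.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 10))
y = X[:, 0] + X[:, 0] * X[:, 1]
w = reweighted_krr(X, y)
print(np.round(w, 2))
```

In this sketch, coordinates with large learned weights are the ones the fitted predictor is most sensitive to. Each retraining step then measures distances mostly along those coordinates, which is the sense in which reweighting exposes sparse structure to a classical kernel method.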
