Minimum complexity interpolation in random features models
Michael Celentano, Theodor Misiakiewicz, Andrea Montanari

TL;DR
This paper investigates the complexity of learning with generalized kernel norms, showing that for p>1, random features provide a tractable approximation, while for p=1, the problem is NP-hard.
Contribution
It introduces a novel analysis of random features approximation for generalized kernel norms, establishing tractability for p>1 and hardness for p=1.
Findings
Random features approximate the $\mathcal{F}_p$ norm efficiently for $p>1$.
Learning with the $\mathcal{F}_1$ norm is NP-hard.
A new proof technique based on uniform concentration in the dual is developed.
Abstract
Despite their many appealing properties, kernel methods are heavily affected by the curse of dimensionality. For instance, in the case of inner product kernels in $\mathbb{R}^d$, the Reproducing Kernel Hilbert Space (RKHS) norm is often very large for functions that depend strongly on a small subset of directions (ridge functions). Correspondingly, such functions are difficult to learn using kernel methods. This observation has motivated the study of generalizations of kernel methods, whereby the RKHS norm -- which is equivalent to a weighted $\ell_2$ norm -- is replaced by a weighted functional $\ell_p$ norm, which we refer to as the $\mathcal{F}_p$ norm. Unfortunately, the tractability of these approaches is unclear: the kernel trick is not available, and minimizing these norms requires solving an infinite-dimensional convex problem. We study random features approximations to these norms…
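As a rough illustration of what a random features approximation to minimum-norm interpolation can look like, the sketch below handles only the $p=2$ case, where the minimum-norm interpolant has a closed form. It is not the paper's construction: the ReLU activation, the Gaussian feature weights, the $1/N$ scaling, the ridge-function target, and the helper names `random_features` and `min_l2_interpolant` are all assumptions made for this example. For other $p>1$ the finite-dimensional problem remains convex but would in general need an iterative solver, which is not shown here.

```python
import numpy as np


def random_features(X, W):
    """Assumed feature map: Z[i, j] = relu(<w_j, x_i>) / N."""
    n_features = W.shape[0]
    return np.maximum(X @ W.T, 0.0) / n_features


def min_l2_interpolant(X, y, W):
    """Coefficients a with the smallest Euclidean norm such that Z a = y.

    With more features than samples the system is underdetermined, and
    np.linalg.lstsq returns its minimum-norm solution.
    """
    Z = random_features(X, W)
    a, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return a


rng = np.random.default_rng(0)
n, d, N = 20, 5, 400                       # samples, input dimension, features

X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.maximum(X[:, 0], 0.0)               # hypothetical ridge-function target
W = rng.standard_normal((N, d))            # random first-layer weights

a = min_l2_interpolant(X, y, W)
Z = random_features(X, W)
residual = np.max(np.abs(Z @ a - y))       # largest interpolation error

print("max interpolation error:", residual)
print("coefficient norm:", np.linalg.norm(a))
```

Increasing `N` while keeping the samples fixed gives a simple way to watch the finite-feature solution stabilize in this toy setting.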
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Machine Learning and Algorithms
