Random feature approximation for general spectral methods
Mike Nguyen, Nicole Mücke

TL;DR
This paper analyzes the generalization properties of random feature methods under a broad class of spectral regularization techniques, proves optimal learning rates, and applies the framework to neural networks through the Neural Tangent Kernel (NTK).
Contribution
It extends the analysis of random feature approximation to a wide range of spectral regularization techniques, including implicit and accelerated methods, and applies this to neural networks via NTK.
Findings
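Obtains optimal learning rates over regularity classes defined through source conditions, including classes not contained in the reproducing kernel Hilbert space.
Extends the analysis from Tikhonov regularization to implicit schemes such as gradient descent and to accelerated algorithms such as the Heavy-Ball and Nesterov methods.
Enables a theoretical analysis of neural networks and neural operators trained via gradient descent, through the NTK approach.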
Abstract
Random feature approximation is arguably one of the most widely used techniques for kernel methods in large-scale learning algorithms. In this work, we analyze the generalization properties of random feature methods, extending previous results for Tikhonov regularization to a broad class of spectral regularization techniques. This includes not only explicit methods but also implicit schemes such as gradient descent and accelerated algorithms like the Heavy-Ball and Nesterov method. Through this framework, we enable a theoretical analysis of neural networks and neural operators through the lens of the Neural Tangent Kernel (NTK) approach trained via gradient descent. For our estimators we obtain optimal learning rates over regularity classes (even for classes that are not included in the reproducing kernel Hilbert space), which are defined through appropriate source conditions. This…
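To make the setting concrete, here is a minimal sketch of random feature regression trained by plain gradient descent, where the number of iterations plays the role of the regularization parameter. It is an illustration under stated assumptions, not the estimator analyzed in the paper: the random Fourier feature map for a Gaussian kernel, the synthetic data, the bandwidth `sigma`, the feature count `M`, and the function names `random_fourier_features` and `gradient_descent_rf` are all hypothetical choices made for this example.

```python
import numpy as np

def random_fourier_features(X, W, b):
    """Random Fourier feature map; Phi @ Phi.T approximates a Gaussian kernel matrix."""
    M = W.shape[1]
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)

def gradient_descent_rf(Phi, y, n_steps):
    """Gradient descent on the least-squares loss (1/2n)||Phi theta - y||^2.

    Stopping after n_steps acts as implicit regularization (early stopping).
    """
    n, M = Phi.shape
    # Step size 1/L, where L is the largest eigenvalue of Phi^T Phi / n.
    step = n / np.linalg.norm(Phi, 2) ** 2
    theta = np.zeros(M)
    for _ in range(n_steps):
        theta -= step * Phi.T @ (Phi @ theta - y) / n
    return theta

# Synthetic 1-D regression problem (assumed for illustration only).
rng = np.random.default_rng(0)
n, d, M = 200, 1, 100
sigma = 0.5  # assumed kernel bandwidth
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)

# Frequencies drawn from N(0, sigma^-2 I), phases uniform on [0, 2*pi).
W = rng.standard_normal((d, M)) / sigma
b = rng.uniform(0.0, 2.0 * np.pi, size=M)

Phi = random_fourier_features(X, W, b)
theta = gradient_descent_rf(Phi, y, n_steps=200)
train_mse = np.mean((Phi @ theta - y) ** 2)
print(f"training MSE after 200 steps: {train_mse:.4f}")
```

Swapping the update rule inside `gradient_descent_rf` for a ridge solve or a momentum scheme would give other members of the spectral regularization family the abstract mentions. In each case the cost scales with the number of random features `M` rather than with the full `n × n` kernel matrix.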
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Gaussian Processes and Bayesian Inference
