On Sampling Random Features From Empirical Leverage Scores: Implementation and Theoretical Guarantees

Shahin Shahrampour; Soheil Kolouri

arXiv:1903.08329 · cs.LG · March 21, 2019 · 6 cites

TL;DR

This paper studies sampling random features from empirical leverage scores for kernel approximation. It proves an out-of-sample performance bound and shows experimentally that the approach outperforms vanilla Monte Carlo sampling.

Contribution

It introduces a practical approach for data-dependent sampling of random features using empirical leverage scores, with theoretical performance bounds and empirical validation.

Findings

1. Empirical leverage score sampling outperforms Monte Carlo sampling in experiments.
2. The method is competitive with supervised kernel learning without using label information.
3. Theoretical bounds reveal a trade-off between kernel approximation and eigenvalue decay.

Abstract

Random features provide a practical framework for large-scale kernel approximation and supervised learning. It has been shown that data-dependent sampling of random features using leverage scores can significantly reduce the number of features required to achieve optimal learning bounds. Leverage scores introduce an optimized distribution for features based on an infinite-dimensional integral operator (depending on input distribution), which is impractical to sample from. Focusing on empirical leverage scores in this paper, we establish an out-of-sample performance bound, revealing an interesting trade-off between the approximated kernel and the eigenvalue decay of another kernel in the domain of random features defined based on data distribution. Our experiments verify that the empirical algorithm consistently outperforms vanilla Monte Carlo sampling, and with a minor modification the…
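The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian kernel with random Fourier features, and the function name `empirical_leverage_sample`, the parameters `gamma` and `lam`, and the exact normalization of the scores are all illustrative choices that may differ from the paper. The sketch draws candidate features from the base distribution, computes ridge leverage scores from the empirical feature Gram matrix, and resamples features in proportion to those scores.

```python
import numpy as np


def empirical_leverage_sample(X, num_candidates, num_features, lam, gamma=1.0, seed=0):
    """Resample random Fourier features by empirical ridge leverage scores.

    Hypothetical helper for illustration; normalization is an assumption.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Candidate features from the base distribution of the Gaussian kernel
    # exp(-gamma * ||x - y||^2): frequencies w ~ N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(num_candidates, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_candidates)

    # Feature matrix Z (n x num_candidates), one column per candidate feature.
    Z = np.sqrt(2.0 / num_candidates) * np.cos(X @ W.T + b)

    # Empirical Gram matrix in the feature domain (num_candidates x num_candidates).
    G = Z.T @ Z / n

    # Ridge leverage scores: diagonal of (G + lam * I)^{-1} G.
    scores = np.diag(np.linalg.solve(G + lam * np.eye(num_candidates), G))
    scores = np.clip(scores, 0.0, None)  # guard against tiny negative round-off

    # Normalize scores into a sampling distribution and resample features.
    probs = scores / scores.sum()
    idx = rng.choice(num_candidates, size=num_features, replace=True, p=probs)
    return W[idx], b[idx], probs
```

Calling `empirical_leverage_sample(X, num_candidates=200, num_features=50, lam=1e-2)` on an `(n, d)` data array returns the selected frequencies, their phase offsets, and the sampling distribution over candidates. No labels are used, which matches the unsupervised setting the abstract describes. Any importance reweighting of the selected features is omitted here.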

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques