Doubly stochastic large scale kernel learning with the empirical kernel map

Nikolaas Steenbergen; Sebastian Schelter; Felix Bießmann

arXiv:1609.00585·cs.LG·September 15, 2016

TL;DR

This paper introduces a scalable kernel learning method using doubly stochastic optimization of the empirical kernel map, enabling effective use of full kernel functions on large datasets without approximations.

Contribution

It presents a simple, parallelizable algorithm that scales kernel methods to large datasets by optimizing the empirical kernel map directly, avoiding kernel matrix approximations.

Findings

1. Works efficiently on large datasets
2. Leverages full kernel functions without approximations
3. Easily implementable in parallel computing environments

Abstract

With the rise of big data sets, the popularity of kernel methods declined and neural networks took over again. The main problem with kernel methods is that the kernel matrix grows quadratically with the number of data points. Most attempts to scale up kernel methods solve this problem by discarding data points or basis functions of some approximation of the kernel map. Here we present a simple yet effective alternative for scaling up kernel methods that takes into account the entire data set via doubly stochastic optimization of the empirical kernel map. The algorithm is straightforward to implement, in particular in parallel execution settings; it leverages the full power and versatility of classical kernel functions without the need to explicitly formulate a kernel map approximation. We provide empirical evidence that the algorithm works on large data sets.
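
To make the idea in the abstract concrete, the following is a minimal sketch of what doubly stochastic optimization over the empirical kernel map could look like. It is not the authors' implementation. The abstract does not specify the loss, kernel, step size, or regularizer, so the hinge loss, the RBF kernel, the 1/sqrt(t) step size, the L2 penalty, and all function and parameter names (`fit_doubly_stochastic`, `batch_grad`, `batch_expand`, `reg`, `lr`) are assumptions chosen for illustration. Each step draws one random subset of points to estimate the gradient and a second, independent subset of points to serve as kernel expansion terms, so only a small block of the kernel matrix is ever computed.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel block between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_doubly_stochastic(X, y, n_iter=500, batch_grad=32, batch_expand=32,
                          gamma=1.0, reg=1e-4, lr=1.0, seed=0):
    """Illustrative sketch (assumed hinge loss, labels in {-1, +1}).

    One coefficient per training point is kept; each step touches only
    the coefficients of the sampled expansion subset J.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    alpha = np.zeros(n)
    for t in range(1, n_iter + 1):
        # First source of randomness: points used to estimate the gradient.
        I = rng.choice(n, size=min(batch_grad, n), replace=False)
        # Second source of randomness: points used as expansion terms.
        J = rng.choice(n, size=min(batch_expand, n), replace=False)
        K_IJ = rbf_kernel(X[I], X[J], gamma)      # small |I| x |J| block
        margins = y[I] * (K_IJ @ alpha[J])
        active = margins < 1.0                    # hinge loss is active here
        grad = -(K_IJ[active].T @ y[I][active]) / len(I) + reg * alpha[J]
        alpha[J] -= (lr / np.sqrt(t)) * grad      # assumed step-size schedule
    return alpha


def predict(X_train, alpha, X_new, gamma=1.0):
    """Sign of the kernel expansion over all training points."""
    return np.sign(rbf_kernel(X_new, X_train, gamma) @ alpha)
```

In this sketch the full n x n kernel matrix is never formed during training; memory per step scales with the product of the two batch sizes. Because each step updates only the coefficients indexed by J, steps that sample disjoint subsets touch disjoint coefficients, which is one plausible reading of why the abstract describes the method as convenient for parallel execution.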

Taxonomy

Topics: Face and Expression Recognition · Neural Networks and Applications · Stochastic Gradient Optimization Techniques