Learning Scalable Deep Kernels with Recurrent Structure

Maruan Al-Shedivat; Andrew Gordon Wilson; Yunus Saatchi; Zhiting Hu; Eric P. Xing

arXiv:1610.08936 · cs.LG · October 6, 2017 · 22 cites

TL;DR

This paper introduces GP-LSTM, a Gaussian process model whose kernel incorporates the recurrent structure of LSTM networks, enabling scalable Bayesian recurrent models with state-of-the-art performance and useful predictive uncertainty estimates.

Contribution

We propose a new kernel for Gaussian processes that models recurrent structures, bridging the gap between LSTMs and Bayesian non-parametric methods.

Findings

1. Achieved state-of-the-art results on multiple benchmarks.

2. Demonstrated effective modeling of sequential data with uncertainty quantification.

3. Showcased application in autonomous driving with valuable predictive uncertainties.

Abstract

Many applications in speech, robotics, finance, and biology deal with sequential data, where ordering matters and recurrent structures are common. However, this structure cannot be easily captured by standard kernel functions. To model such structure, we propose expressive closed-form kernel functions for Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the inductive biases of long short-term memory (LSTM) recurrent networks, while retaining the non-parametric probabilistic advantages of Gaussian processes. We learn the properties of the proposed kernels by optimizing the Gaussian process marginal likelihood using a new provably convergent semi-stochastic gradient procedure and exploit the structure of these kernels for scalable training and prediction. This approach provides a practical representation for Bayesian LSTMs. We demonstrate state-of-the-art performance…
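The idea of a kernel with recurrent structure, trained through the Gaussian process marginal likelihood, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's code: a plain tanh recurrence stands in for the LSTM, the base kernel is an RBF applied to the final hidden state, the function names (`rnn_embed`, `rbf_kernel`, `neg_log_marginal_likelihood`) and the random data are hypothetical, and the likelihood is evaluated exactly via a Cholesky factorization rather than with the semi-stochastic procedure or the scalable kernel structure the abstract mentions.

```python
import numpy as np

def rnn_embed(seqs, W_in, W_rec, b):
    """Map sequences of shape (n, T, d) to final hidden states of shape (n, h).

    Uses a plain tanh recurrence as a simplified stand-in for an LSTM.
    """
    n, T, _ = seqs.shape
    h = np.zeros((n, W_rec.shape[0]))
    for t in range(T):
        h = np.tanh(seqs[:, t, :] @ W_in + h @ W_rec + b)
    return h

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """RBF kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def neg_log_marginal_likelihood(K, y, noise=0.1):
    """Exact GP negative log marginal likelihood with Gaussian noise."""
    n = y.shape[0]
    L = np.linalg.cholesky(K + noise**2 * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 * log|K + noise^2 I| equals the sum of log-diagonal entries of L.
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2 * np.pi)

# Hypothetical toy data: 20 sequences of length 5 with 3 features each.
rng = np.random.default_rng(0)
n, T, d, hdim = 20, 5, 3, 4
seqs = rng.normal(size=(n, T, d))
y = rng.normal(size=n)
W_in = rng.normal(scale=0.5, size=(d, hdim))
W_rec = rng.normal(scale=0.5, size=(hdim, hdim))
b = np.zeros(hdim)

H = rnn_embed(seqs, W_in, W_rec, b)      # recurrent embedding of each sequence
K = rbf_kernel(H, H)                     # kernel on embeddings: k(x, x') = k_rbf(h(x), h(x'))
nll = neg_log_marginal_likelihood(K, y)  # objective one would minimize over all parameters
```

In this sketch the recurrent weights and the kernel hyperparameters would be learned jointly by minimizing `nll`; the exact Cholesky step costs O(n^3) in the number of sequences, which is the bottleneck that scalable training and prediction schemes are meant to avoid.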

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis