Learning Scalable Deep Kernels with Recurrent Structure
Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P. Xing

TL;DR
This paper introduces GP-LSTM, a Gaussian process model whose kernel captures recurrent structure in the way an LSTM does, enabling scalable Bayesian recurrent models with state-of-the-art performance and useful uncertainty estimates.
Contribution
We propose a new kernel for Gaussian processes that models recurrent structures, bridging the gap between LSTMs and Bayesian non-parametric methods.
Findings
Achieved state-of-the-art results on multiple benchmarks.
Demonstrated effective modeling of sequential data with uncertainty quantification.
Showcased application in autonomous driving with valuable predictive uncertainties.
Abstract
Many applications in speech, robotics, finance, and biology deal with sequential data, where ordering matters and recurrent structures are common. However, this structure cannot be easily captured by standard kernel functions. To model such structure, we propose expressive closed-form kernel functions for Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the inductive biases of long short-term memory (LSTM) recurrent networks, while retaining the non-parametric probabilistic advantages of Gaussian processes. We learn the properties of the proposed kernels by optimizing the Gaussian process marginal likelihood using a new provably convergent semi-stochastic gradient procedure and exploit the structure of these kernels for scalable training and prediction. This approach provides a practical representation for Bayesian LSTMs. We demonstrate state-of-the-art performance…
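The core idea in the abstract, a kernel that passes sequences through a recurrent network before comparing them, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes an RBF base kernel applied to the final LSTM hidden state, uses randomly initialized weights, and all function names (`lstm_embed`, `recurrent_kernel`, `neg_log_marginal_likelihood`) are hypothetical. It only evaluates the Gaussian process marginal likelihood that the paper optimizes; the semi-stochastic training procedure and the scalable structure-exploiting inference are not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm_params(input_dim, hidden_dim, seed=0):
    # One stacked weight matrix for the four gates: input, forget, output, candidate.
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.3, size=(4 * hidden_dim, input_dim + hidden_dim))
    b = np.zeros(4 * hidden_dim)
    return W, b

def lstm_embed(seqs, params):
    """Map sequences of shape (n, T, input_dim) to final hidden states (n, hidden_dim)."""
    W, b = params
    n, T, _ = seqs.shape
    H = b.shape[0] // 4
    h = np.zeros((n, H))
    c = np.zeros((n, H))
    for t in range(T):
        z = np.concatenate([seqs[:, t, :], h], axis=1) @ W.T + b
        i = sigmoid(z[:, :H])           # input gate
        f = sigmoid(z[:, H:2 * H])      # forget gate
        o = sigmoid(z[:, 2 * H:3 * H])  # output gate
        g = np.tanh(z[:, 3 * H:])       # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def rbf(A, B, lengthscale=1.0):
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def recurrent_kernel(X1, X2, params, lengthscale=1.0):
    # Assumed form: base kernel evaluated on the LSTM embeddings of the inputs.
    return rbf(lstm_embed(X1, params), lstm_embed(X2, params), lengthscale)

def neg_log_marginal_likelihood(X, y, params, lengthscale=1.0, noise=0.1):
    # Standard GP regression objective, computed via a Cholesky factorization.
    n = len(y)
    K = recurrent_kernel(X, X, params, lengthscale) + noise**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2 * np.pi)

# Toy data: 8 sequences of length 5 with 3 features each (synthetic, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 5, 3))
y = rng.normal(size=8)
params = init_lstm_params(input_dim=3, hidden_dim=4)
K = recurrent_kernel(X, X, params)
nll = neg_log_marginal_likelihood(X, y, params)
```

In a trained model, the LSTM weights, the lengthscale, and the noise level would all be learned jointly by minimizing this objective. Here they are fixed, so the sketch only shows how the pieces fit together.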
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
