Stochastic Variational Deep Kernel Learning
Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric P. Xing

TL;DR
This paper introduces a novel deep kernel learning model that combines deep neural networks with Gaussian processes, enabling scalable classification and multi-task learning with improved performance on large datasets.
Contribution
It proposes a new stochastic variational inference method for deep kernel learning that handles additive covariance structures and large-scale data efficiently.
Findings
Outperforms standalone deep networks, SVMs, and Gaussian processes on benchmarks.
Achieves scalable training on datasets with millions of points.
Demonstrates effectiveness on airline delay, CIFAR, and ImageNet datasets.
Abstract
Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which generalizes deep kernel learning approaches to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training. Specifically, we apply additive base kernels to subsets of output features from deep neural architectures, and jointly learn the parameters of the base kernels and deep network through a Gaussian process marginal likelihood objective. Within this framework, we derive an efficient form of stochastic variational inference which leverages local kernel interpolation, inducing points, and structure exploiting algebra. We show improved performance over stand alone deep networks, SVMs, and state of the art…
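The kernel construction described in the abstract, additive base kernels applied to subsets of a network's output features, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the tiny tanh network `mlp_features`, the RBF base kernel, the feature subsets, and the lengthscales are all hypothetical, and the sketch covers only the kernel itself, not the stochastic variational inference, local kernel interpolation, or inducing points.

```python
import numpy as np

def mlp_features(x, weights):
    """Hypothetical stand-in for a deep network: tanh hidden layers, linear output."""
    h = x
    for W, b in weights[:-1]:
        h = np.tanh(h @ W + b)
    W, b = weights[-1]
    return h @ W + b

def rbf_kernel(a, b, lengthscale=1.0):
    """RBF base kernel between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def additive_deep_kernel(x1, x2, weights, subsets, lengthscales):
    """Sum of base kernels, each applied to one subset of the network's output features."""
    f1 = mlp_features(x1, weights)
    f2 = mlp_features(x2, weights)
    K = np.zeros((x1.shape[0], x2.shape[0]))
    for idx, ell in zip(subsets, lengthscales):
        K += rbf_kernel(f1[:, idx], f2[:, idx], ell)
    return K

# Assumed toy setup: 5 inputs -> 8 hidden units -> 4 output features,
# split into two subsets of two features each.
rng = np.random.default_rng(0)
weights = [
    (rng.normal(size=(5, 8)), np.zeros(8)),
    (rng.normal(size=(8, 4)), np.zeros(4)),
]
subsets = [[0, 1], [2, 3]]
lengthscales = [1.0, 0.5]

X = rng.normal(size=(6, 5))
K = additive_deep_kernel(X, X, weights, subsets, lengthscales)
```

In the approach the abstract describes, the network weights and the base-kernel hyperparameters would be learned jointly through a Gaussian process marginal likelihood objective; this sketch fixes them at arbitrary values so that only the additive structure is visible.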
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Air Quality Monitoring and Forecasting · Domain Adaptation and Few-Shot Learning
Methods: Deep Kernel Learning · Gaussian Process
