Rates of Convergence for Sparse Variational Gaussian Process Regression
David R. Burt, Carl E. Rasmussen, Mark van der Wilk

TL;DR
This paper analyzes the convergence rates of sparse variational Gaussian process regression, demonstrating that the number of inducing variables needed for accurate approximation grows slowly with dataset size, enabling scalable GP inference.
Contribution
It provides theoretical bounds on the KL divergence between the variational approximation and the true posterior, showing how to choose the number of inducing points for efficient, accurate GP regression.
Findings
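With high probability, the KL divergence can be made arbitrarily small while the number of inducing variables M grows more slowly than the dataset size N
For normally distributed D-dimensional inputs with the Squared Exponential kernel, M = O(log^D N) suffices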
Gaussian process posteriors can be approximated cheaply as datasets grow
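To give a feel for how slowly M = O(log^D N) grows, here is a minimal sketch in Python. The big-O statement hides a constant, so the function name `inducing_count` and the constant `c` are hypothetical choices made for illustration only, not values taken from the paper.

```python
import math

def inducing_count(n, d, c=1.0):
    """Illustrative rule M = ceil(c * log(n)**d).

    The constant c is an assumption: the O(log^D N) result only fixes
    the growth rate, not the constant in front of it.
    """
    return max(1, math.ceil(c * math.log(n) ** d))

# Compare the growth of M with the growth of N for two-dimensional inputs.
for n in (10**3, 10**4, 10**5, 10**6):
    m = inducing_count(n, d=2)
    print(f"N={n:>8}  M={m:>4}  M/N={m / n:.6f}")
```

Under this illustrative rule the ratio M/N shrinks as N grows, which is the sense in which M increases more slowly than N. The exponent D means the rule becomes much less favourable as the input dimension increases.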
Abstract
Excellent variational approximations to Gaussian process posteriors have been developed which avoid the O(N^3) scaling with dataset size N. They reduce the computational cost to O(NM^2), with M being the number of inducing variables, which summarise the process. While the computational cost seems to be linear in N, the true complexity of the algorithm depends on how M must increase to ensure a certain quality of approximation. We address this by characterising the behavior of an upper bound on the KL divergence to the posterior. We show that with high probability the KL divergence can be made arbitrarily small by growing M more slowly than N. A particular case of interest is that for regression with normally distributed inputs in D-dimensions with the popular Squared Exponential kernel, M is…
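The quality of a sparse approximation is closely tied to how well the inducing variables reconstruct the kernel matrix. In the standard collapsed variational bound for sparse GP regression, the gap term is tr(Kff - Qff) / (2 * noise variance), where Qff = Kfu Kuu^{-1} Kuf is the Nyström approximation. The sketch below, written with NumPy, computes that trace for a Squared Exponential kernel with normally distributed one-dimensional inputs. It is only an illustration under stated assumptions: inducing inputs are a random subset of the training inputs, the lengthscale, jitter, and sample sizes are arbitrary, and the helper names are hypothetical. It is not the paper's bound or its initialisation scheme.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared Exponential kernel matrix between the rows of a and b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def nystrom_residual_trace(x, z, variance=1.0, jitter=1e-6):
    """tr(Kff - Qff) with Qff = Kfu Kuu^{-1} Kuf, using only diagonals (O(N M^2))."""
    kuu = se_kernel(z, z, variance=variance) + jitter * np.eye(len(z))
    kuf = se_kernel(z, x, variance=variance)
    chol = np.linalg.cholesky(kuu)
    a = np.linalg.solve(chol, kuf)          # a = L^{-1} Kuf, so diag(Qff) = column norms of a
    q_diag = np.sum(a**2, axis=0)
    k_diag = np.full(len(x), variance)      # SE kernel has constant diagonal
    return float(np.sum(k_diag - q_diag))

rng = np.random.default_rng(0)
x = rng.standard_normal((200, 1))           # N = 200 normally distributed inputs (assumed)
order = rng.permutation(len(x))             # nested random subsets as inducing inputs (assumed)

m_values = [2, 5, 10, 20]
traces = [nystrom_residual_trace(x, x[order[:m]]) for m in m_values]
for m, t in zip(m_values, traces):
    print(f"M={m:>3}  tr(Kff - Qff)={t:.6f}")
```

Because the inducing sets are nested, the residual trace should shrink as M increases. How quickly it shrinks depends on the kernel and the input distribution, which is the kind of dependence the abstract refers to.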
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms · Control Systems and Identification
Methods: Gaussian Process
