Fast methods for training Gaussian processes on large data sets

Christopher J. Moore; Alvin J. K. Chua; Christopher P. L. Berry; Jonathan R. Gair

arXiv:1604.01250·stat.ML·May 16, 2016

TL;DR

This paper introduces efficient methods to accelerate Gaussian process regression on large datasets, focusing on reducing computational costs during learning and model comparison.

Contribution

The authors derive simple, effective techniques to speed up Gaussian process training and Bayesian model comparison for large data sets.

Findings

01. Significant speed-up in model training and comparison.

02. Validated methods on synthetic and real datasets.

03. Quantified computational improvements over nested sampling.

Abstract

Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large data sets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.
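To make the cost barrier mentioned in the abstract concrete, the following is a minimal sketch of the standard GPR log marginal likelihood (the quantity maximised during learning), evaluated through a Cholesky factorisation. It is a textbook baseline, not the accelerated method derived in the paper. The squared-exponential covariance, the function names, the hyperparameter names (`sigma_f`, `length`, `sigma_n`), and the synthetic sine data are all assumptions chosen for illustration.

```python
import numpy as np


def squared_exponential(x1, x2, sigma_f=1.0, length=1.0):
    """Squared-exponential covariance between two 1-D input arrays (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return sigma_f**2 * np.exp(-0.5 * (diff / length) ** 2)


def log_marginal_likelihood(x, y, sigma_f, length, sigma_n):
    """Standard zero-mean GP log marginal likelihood for fixed hyperparameters."""
    n = x.size
    # Covariance of the noisy observations: kernel matrix plus white noise.
    K = squared_exponential(x, x, sigma_f, length) + sigma_n**2 * np.eye(n)
    # The Cholesky factorisation costs O(n^3) operations; this is the step
    # that becomes expensive for large data sets.
    L = np.linalg.cholesky(K)
    # Solve K alpha = y using the two triangular factors.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log det K = 2 * sum(log(diag(L))), so half of it is sum(log(diag(L))).
    return (
        -0.5 * y @ alpha
        - np.sum(np.log(np.diag(L)))
        - 0.5 * n * np.log(2.0 * np.pi)
    )


# Hypothetical synthetic data: a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

# Compare two hyperparameter settings; a larger value means a better fit.
lml_good = log_marginal_likelihood(x, y, sigma_f=1.0, length=1.0, sigma_n=0.1)
lml_bad = log_marginal_likelihood(x, y, sigma_f=1.0, length=1.0, sigma_n=3.0)
print(lml_good, lml_bad)
```

Each evaluation refactorises an n-by-n matrix, so any procedure that calls this function many times (an optimiser, or a sampler used to estimate a model evidence) inherits the cubic cost per call.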
