Exact Gaussian Processes on a Million Data Points
Ke Alexander Wang, Geoff Pleiss, Jacob R. Gardner, Stephen Tyree, Kilian Q. Weinberger, Andrew Gordon Wilson

TL;DR
This paper introduces a scalable method for exact Gaussian process inference on over a million data points using multi-GPU parallelization, enabling precise modeling on large datasets previously considered infeasible.
Contribution
The authors develop a multi-GPU parallel approach for exact GPs that overcomes previous computational limitations, allowing training on over a million data points without approximations.
Findings
Exact GPs trained on over a million points in less than 2 hours.
Demonstrated significant performance improvements over approximate methods.
Generally applicable: not restricted to grid-structured data or to specific kernel classes.
Abstract
Gaussian processes (GPs) are flexible non-parametric models, with a capacity that grows with the available data. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication. By partitioning and distributing kernel matrix multiplies, we demonstrate that an exact GP can be trained on over a million points, a task previously thought to be impossible with current computing hardware, in less than 2 hours. Moreover, our approach is generally applicable, without constraints to grid data or specific kernel classes. Enabled by this scalability,…
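The abstract's central idea, solving the GP linear system with conjugate gradients while touching the kernel matrix only through partitioned matrix multiplies, can be sketched in a few lines of NumPy. This is an illustrative single-machine sketch, not the authors' implementation: the function names, the RBF kernel, the noise level, and the block size are all assumptions chosen for the example, and the row blocks are processed in a sequential loop rather than distributed across GPUs.

```python
import numpy as np


def rbf_kernel_block(X_rows, X_all, lengthscale=1.0):
    """RBF kernel between a block of rows and the full training set (assumed kernel)."""
    sq_dists = (
        np.sum(X_rows**2, axis=1)[:, None]
        - 2.0 * X_rows @ X_all.T
        + np.sum(X_all**2, axis=1)[None, :]
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def partitioned_kernel_matvec(X, v, noise=0.1, block_size=64):
    """Compute (K + noise * I) @ v one row block at a time.

    Only a (block_size x n) slice of K exists in memory at any moment.
    In a multi-device setting each block could be handled by a different
    device; here the blocks are simply processed in a loop.
    """
    out = np.empty_like(v)
    for start in range(0, X.shape[0], block_size):
        stop = min(start + block_size, X.shape[0])
        K_block = rbf_kernel_block(X[start:stop], X)
        out[start:stop] = K_block @ v
    return out + noise * v


def conjugate_gradients(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only v -> A @ v."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A @ x, with x = 0 initially
    p = r.copy()          # search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = matvec(p)
        step = rs_old / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


# Toy data (hypothetical): 200 points in 2-D with a noisy sinusoidal target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# alpha = (K + noise * I)^{-1} y, computed without ever forming the full K.
alpha = conjugate_gradients(lambda v: partitioned_kernel_matvec(X, v), y)
```

The design point is that memory scales with the block size rather than with the square of the number of training points, since each kernel slice is discarded after its product is accumulated. The sketch omits preconditioning, hyperparameter training, and any actual device placement.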
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications · Control Systems and Identification
