Large-Scale Gaussian Processes via Alternating Projection
Kaiwen Wu, Jonathan Wenger, Haydn Jones, Geoff Pleiss, Jacob R. Gardner

TL;DR
This paper introduces an efficient alternating projection method for large-scale Gaussian processes that reduces computational complexity, enabling faster training and inference on datasets with millions of points.
Contribution
The paper presents a novel iterative algorithm based on alternating projection that accesses only subblocks of the kernel matrix, achieving linear convergence and scalability for large datasets.
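To make the idea of accessing only subblocks of the kernel matrix concrete, here is a minimal NumPy sketch of a block-wise solver for (K + noise·I) w = y. It is an illustration under stated assumptions, not the authors' implementation: the function names, the RBF kernel, the cyclic block order, and all default parameters are hypothetical choices made for this example, and the paper's actual block-selection rule may differ. Each update forms only a block of rows of the kernel matrix, solves a small system on that block, and corrects the residual, so one update in this sketch costs time and memory proportional to block size times n.

```python
import numpy as np


def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel block between two sets of points."""
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def alternating_projection_solve(
    X, y, noise=0.1, block_size=64, n_epochs=50, lengthscale=1.0
):
    """Approximately solve (K + noise * I) w = y one row block at a time.

    Hypothetical helper for illustration; the full n x n kernel matrix
    is never materialized.
    """
    n = X.shape[0]
    w = np.zeros(n)
    r = np.asarray(y, dtype=float).copy()  # residual y - (K + noise*I) w at w = 0
    blocks = [np.arange(s, min(s + block_size, n)) for s in range(0, n, block_size)]

    for _ in range(n_epochs):
        for idx in blocks:  # cyclic block order (an assumption of this sketch)
            # Rows of (K + noise * I) belonging to this block: shape (|idx|, n).
            A_rows = rbf_kernel(X[idx], X, lengthscale)
            A_rows[np.arange(len(idx)), idx] += noise
            # Small dense system on the block's own diagonal subblock.
            delta = np.linalg.solve(A_rows[:, idx], r[idx])
            w[idx] += delta
            # The matrix is symmetric, so its columns idx equal A_rows transposed.
            r -= A_rows.T @ delta
    return w, r
```

A call such as `w, r = alternating_projection_solve(X, y, noise=2.0, block_size=32)` returns the approximate solution together with the running residual, whose norm can serve as a stopping criterion. Keeping the residual up to date is what lets every step work from a single row block instead of the whole matrix.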
Findings
Achieves up to 27× speed-up in training relative to conjugate gradients (CG)
Achieves up to 72× speed-up in inference relative to CG
Effective on benchmark datasets with up to four million data points
Abstract
Training and inference in Gaussian processes (GPs) require solving linear systems with $n \times n$ kernel matrices. To address the prohibitive time complexity, recent work has employed fast iterative methods, like conjugate gradients (CG). However, as datasets increase in magnitude, the kernel matrices become increasingly ill-conditioned and still require $\mathcal{O}(n^2)$ space without partitioning. Thus, while CG increases the size of datasets GPs can be trained on, modern datasets reach scales beyond its applicability. In this work, we propose an iterative method which only accesses subblocks of the kernel matrix, effectively enabling mini-batching. Our algorithm, based on alternating projection, has $\mathcal{O}(n)$ per-iteration time and space complexity, solving many of the practical challenges of scaling GPs to very large datasets. Theoretically, we prove the…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and ELM · Machine Learning and Data Classification
Methods: Greedy Policy Search
