When are Iterative Gaussian Processes Reliably Accurate?
Wesley J. Maddox, Sanyam Kapoor, Andrew Gordon Wilson

TL;DR
This paper investigates the reliability of iterative Gaussian process methods, identifying key parameters like CG tolerance and decomposition rank that improve accuracy and stability in hyperparameter learning.
Contribution
It provides practical guidelines for parameter settings in iterative Gaussian processes and shows that the L-BFGS-B optimizer converges with fewer gradient updates.
Findings
A small CG tolerance ($\epsilon \leq 0.01$) improves numerical stability.
A large root decomposition size ($r \geq 5000$) enhances accuracy.
The L-BFGS-B optimizer converges in fewer gradient updates.
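To make the role of the CG tolerance concrete, the following is a minimal NumPy sketch of a conjugate gradient solve whose stopping rule is a relative residual threshold. It is an illustration only, not the implementation studied in the paper: the kernel, the toy data, the function names (`rbf_kernel`, `conjugate_gradients`, `relative_residual`), and the choice of a relative residual stopping rule are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(x, lengthscale=0.2):
    """Squared-exponential kernel matrix on 1-D inputs (illustrative choice)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def conjugate_gradients(A, b, tol, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Assumed stopping rule: ||r|| / ||b|| <= tol, where r is the
    recursively updated residual.
    """
    x = np.zeros_like(b)
    r = b.copy()                 # residual b - A x for the start x = 0
    p = r.copy()                 # first search direction
    rs = r @ r
    b_norm = np.linalg.norm(b)
    iters = 0
    while np.sqrt(rs) > tol * b_norm and iters < max_iter:
        Ap = A @ p
        alpha = rs / (p @ Ap)    # step length along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p  # next A-conjugate direction
        rs = rs_new
        iters += 1
    return x, iters

def relative_residual(A, x, b):
    """True relative residual, recomputed from scratch."""
    return np.linalg.norm(b - A @ x) / np.linalg.norm(b)

# Toy regression problem (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 200)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(200)
K_hat = rbf_kernel(x_train) + 0.01 * np.eye(200)  # kernel plus noise variance

sol_loose, iters_loose = conjugate_gradients(K_hat, y_train, tol=0.5)
sol_tight, iters_tight = conjugate_gradients(K_hat, y_train, tol=0.01)

res_loose = relative_residual(K_hat, sol_loose, y_train)
res_tight = relative_residual(K_hat, sol_tight, y_train)
print(f"tol=0.5 : {iters_loose} iterations, relative residual {res_loose:.2e}")
print(f"tol=0.01: {iters_tight} iterations, relative residual {res_tight:.2e}")
```

The tighter tolerance costs extra iterations but returns a solve whose residual is bounded by the smaller threshold, which is the quantity that feeds into the marginal likelihood and its gradients.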
Abstract
While recent work on conjugate gradient (CG) methods and Lanczos decompositions has achieved scalable Gaussian process inference with highly accurate point predictions, in several implementations these iterative methods appear to struggle with numerical instabilities in learning kernel hyperparameters and with poor test likelihoods. By investigating CG tolerance, preconditioner rank, and Lanczos decomposition rank, we provide a particularly simple prescription to correct these issues: we recommend a small CG tolerance ($\epsilon \leq 0.01$) and a large root decomposition size ($r \geq 5000$). Moreover, we show that L-BFGS-B is a compelling optimizer for iterative GPs, achieving convergence with fewer gradient updates.
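As a rough illustration of using L-BFGS-B for kernel hyperparameter learning, the sketch below fits three log-hyperparameters of a small GP with `scipy.optimize.minimize`. Note the substitution: it evaluates the marginal likelihood exactly with a Cholesky factorization rather than with the CG and Lanczos estimates the abstract discusses, and it lets SciPy approximate gradients by finite differences. The data, kernel, bounds, jitter, and the function name `negative_log_marginal_likelihood` are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1-D regression data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 1.0, 40))
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(40)

def negative_log_marginal_likelihood(theta, x, y):
    """Exact GP negative log marginal likelihood with an RBF kernel.

    theta holds log(lengthscale), log(output scale), log(noise variance).
    """
    lengthscale, outputscale, noise = np.exp(theta)
    diff = x[:, None] - x[None, :]
    K = outputscale * np.exp(-0.5 * (diff / lengthscale) ** 2)
    K = K + (noise + 1e-8) * np.eye(len(x))   # small jitter for stability
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y via two solves against the Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    data_fit = 0.5 * (y @ alpha)
    log_det = np.sum(np.log(np.diag(L)))      # equals 0.5 * log|K|
    return data_fit + log_det + 0.5 * len(x) * np.log(2 * np.pi)

theta0 = np.zeros(3)                          # all hyperparameters start at 1.0
nll_init = negative_log_marginal_likelihood(theta0, x_train, y_train)

# Box constraints on the log-hyperparameters are the "B" in L-BFGS-B.
result = minimize(
    negative_log_marginal_likelihood,
    theta0,
    args=(x_train, y_train),
    method="L-BFGS-B",
    bounds=[(-5.0, 5.0)] * 3,
)

print("initial NLL:", nll_init)
print("final NLL:  ", result.fun)
print("iterations: ", result.nit, "function evaluations:", result.nfev)
print("lengthscale, output scale, noise:", np.exp(result.x))
```

Optimizing in log space keeps every hyperparameter positive without extra constraints, and the bounds keep the line search away from extreme values where the kernel matrix becomes badly conditioned.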
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Machine Learning and Algorithms
Methods: Gaussian Process
