Conjugate Gradients for Kernel Machines
Simon Bartels, Philipp Hennig

TL;DR
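This paper treats the least-squares computation itself as a probabilistic inference problem and derives an enhanced conjugate gradient method for kernel ridge regression. The method builds a low-rank approximation of the kernel matrix from projections, which makes Gaussian process inference scalable and yields the posterior variance and marginal likelihood alongside the posterior mean.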
Contribution
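It presents a structured Gaussian regression model on the kernel function that enhances conjugate gradients for kernel methods, improving the approximation of the kernel ridge regressor over vanilla conjugate gradients and supporting computation of the posterior variance.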
Findings
Improved approximation of kernel ridge regressor
Efficient computation of posterior variance and marginal likelihood
Scalable method for large datasets
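For reference, the quantities named in these findings can be written down for the exact, cubic-cost case. The sketch below is not the paper's method: it is the standard Cholesky-based Gaussian process computation that approximate methods aim to match at lower cost. The RBF kernel, the function names `rbf_kernel` and `exact_gp`, and the default `noise_var` are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel (an illustrative choice, not taken from the paper).
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def exact_gp(X, y, X_test, noise_var=0.1):
    """Exact GP regression: posterior mean, latent posterior variance, log evidence."""
    n = X.shape[0]
    K = rbf_kernel(X, X) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)                       # the O(n^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    K_s = rbf_kernel(X_test, X)
    mean = K_s @ alpha                              # kernel ridge / posterior mean
    V = np.linalg.solve(L, K_s.T)
    var = np.diag(rbf_kernel(X_test, X_test)) - np.sum(V**2, axis=0)
    log_ml = (-0.5 * y @ alpha
              - np.sum(np.log(np.diag(L)))          # equals -0.5 * log det K
              - 0.5 * n * np.log(2.0 * np.pi))
    return mean, var, log_ml
```

The Cholesky factorization dominates the cost, which is why large datasets call for approximations.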
Abstract
Regularized least-squares (kernel-ridge / Gaussian process) regression is a fundamental algorithm of statistics and machine learning. Because generic algorithms for the exact solution have cubic complexity in the number of datapoints, large datasets require resorting to approximations. In this work, the computation of the least-squares prediction is itself treated as a probabilistic inference problem. We propose a structured Gaussian regression model on the kernel function that uses projections of the kernel matrix to obtain a low-rank approximation of the kernel and the matrix. A central result is an enhanced way to use the method of conjugate gradients for the specific setting of least-squares regression as encountered in machine learning. Our method improves the approximation of the kernel ridge regressor / Gaussian process posterior mean over vanilla conjugate gradients and allows…
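To make the baseline concrete, here is a minimal sketch of vanilla conjugate gradients applied to the kernel ridge system (K + σ²I)α = y. This is the textbook algorithm the abstract compares against, not the enhanced method the paper proposes. The RBF kernel, the function names, and the parameter defaults are assumptions for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel (an illustrative choice, not taken from the paper).
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def conjugate_gradients(A, b, max_iters=None, tol=1e-10):
    """Solve A x = b for symmetric positive definite A with vanilla CG."""
    n = b.shape[0]
    max_iters = n if max_iters is None else max_iters
    x = np.zeros(n)
    r = b - A @ x              # residual
    p = r.copy()               # first search direction
    rs_old = r @ r
    for _ in range(max_iters):
        if np.sqrt(rs_old) < tol:
            break
        Ap = A @ p             # one matrix-vector product per iteration
        step = rs_old / (p @ Ap)
        x = x + step * p
        r = r - step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Usage: approximate kernel ridge weights with a small iteration budget.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 2))
y_train = np.sin(X_train[:, 0])
K_reg = rbf_kernel(X_train, X_train) + 0.1 * np.eye(50)
alpha_approx = conjugate_gradients(K_reg, y_train, max_iters=10)
```

Each iteration costs one matrix-vector product with the kernel matrix, so stopping after far fewer iterations than datapoints gives an approximate regressor at roughly quadratic rather than cubic cost.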
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Control Systems and Identification · Scientific Research and Discoveries
Methods: Gaussian Process
