Data Sketching and Stacking: A Confluence of Two Strategies for Predictive Inference in Gaussian Process Regressions with High-Dimensional Features
Samuel Gailliot, Rajarshi Guhaniyogi, Roger D. Peng

TL;DR
This paper introduces a computationally efficient method for predictive inference in high-dimensional Gaussian process regressions by combining data sketching with Bayesian stacking, enabling fast and accurate predictions.
Contribution
The paper proposes a strategy that combines feature sketching with Bayesian stacking to improve predictive inference in high-dimensional GP regressions, avoiding the computational cost and variable-selection inaccuracies of MCMC-based estimation.
Findings
Outperforms existing methods in predictive accuracy.
Achieves faster computation with large feature sets.
Demonstrates effectiveness in air pollution prediction from satellite data.
Abstract
This article focuses on drawing computationally efficient predictive inference from Gaussian process (GP) regressions with a large number of features when the response is conditionally independent of the features given the projection to a noisy low-dimensional manifold. Bayesian estimation of the regression relationship using Markov chain Monte Carlo (MCMC) and subsequent predictive inference is computationally prohibitive and may lead to inferential inaccuracies, since accurate variable selection is essentially impossible in such high-dimensional GP regressions. As an alternative, this article proposes a strategy to sketch the high-dimensional feature vector with a carefully constructed sketching matrix before fitting a GP with the scalar outcome and the sketched feature vector to draw predictive inference. The analysis is performed in parallel with many different sketching matrices and…
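The sketch-then-fit idea in the abstract, combined with the stacking step named in the summary, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the sketching matrix is i.i.d. Gaussian (the abstract says only that it is "carefully constructed"), the GP uses an RBF kernel with fixed hyperparameters, and the stacking weights are fit by maximizing the log predictive density on a held-out validation set with a simple EM iteration. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_predict(X_tr, y_tr, X_new, lengthscale, noise_var=0.1):
    # Standard GP regression predictive mean and variance (unit prior variance,
    # fixed hyperparameters: an assumption, not the paper's specification).
    mu = y_tr.mean()
    K = rbf_kernel(X_tr, X_tr, lengthscale) + noise_var * np.eye(len(X_tr))
    Ks = rbf_kernel(X_new, X_tr, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_tr - mu))
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 + noise_var - (v**2).sum(0)
    return mu + Ks @ alpha, np.maximum(var, 1e-12)

def gaussian_pdf(y, mean, var):
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def stacking_weights(dens, n_iter=200):
    # dens[i, k]: predictive density of validation point i under model k.
    # EM for mixture weights maximizes sum_i log sum_k w_k * dens[i, k]
    # over the simplex (a stand-in for the paper's stacking procedure).
    dens = np.maximum(dens, 1e-300)
    w = np.full(dens.shape[1], 1.0 / dens.shape[1])
    for _ in range(n_iter):
        resp = dens * w
        resp /= resp.sum(1, keepdims=True)
        w = resp.mean(0)
    return w

def sketch_and_stack(X_tr, y_tr, X_val, y_val, X_te, m=5, n_sketches=10, seed=0):
    rng = np.random.default_rng(seed)
    p = X_tr.shape[1]
    n_val = len(X_val)
    X_new = np.vstack([X_val, X_te])
    dens, means, variances = [], [], []
    for _ in range(n_sketches):  # each fit is independent, so this loop could run in parallel
        # Assumed sketching matrix: i.i.d. N(0, 1/p) entries, mapping p features to m.
        S = rng.normal(scale=1.0 / np.sqrt(p), size=(m, p))
        mean, var = gp_predict(X_tr @ S.T, y_tr, X_new @ S.T, lengthscale=np.sqrt(m))
        dens.append(gaussian_pdf(y_val, mean[:n_val], var[:n_val]))
        means.append(mean[n_val:])
        variances.append(var[n_val:])
    w = stacking_weights(np.column_stack(dens))
    means, variances = np.array(means), np.array(variances)
    # Mean and variance of the weighted mixture of Gaussian predictive distributions.
    mix_mean = w @ means
    mix_var = w @ (variances + means**2) - mix_mean**2
    return w, mix_mean, mix_var
```

A call such as `sketch_and_stack(X_tr, y_tr, X_val, y_val, X_te, m=5, n_sketches=10)` returns the stacking weights along with the stacked predictive mean and variance for each row of `X_te`. Because every GP is fit on only `m` sketched features, the cost of each fit does not grow with the original feature dimension beyond the matrix product with `S`.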
Taxonomy
Topics: Gaussian Processes and Bayesian Inference
