Bayesian cross-validation by parallel Markov Chain Monte Carlo
Alex Cooper, Aki Vehtari, Catherine Forbes, Lauren Kennedy, Dan Simpson

TL;DR
This paper introduces a parallel MCMC-based cross-validation method for Bayesian models that leverages GPU hardware to significantly reduce computation time, enabling scalable and flexible model assessment.
Contribution
It presents a novel parallel MCMC approach for Bayesian cross-validation that is fast, flexible, and scalable, with new diagnostics for MCMC convergence and stationarity.
Findings
On accelerator hardware such as GPUs, parallel MCMC cross-validation can take about as long, in wall clock time, as a single full-data model fit.
The method supports various data partitioning schemes and scoring rules.
Online algorithms enable scaling to large datasets on memory-limited hardware.
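One way to picture the online idea in the last finding: a cross-validation score such as the log predictive density of a held-out observation can be accumulated one MCMC draw at a time, so the draws themselves never need to be stored. The sketch below is an illustration under stated assumptions, not the paper's implementation; the class name `OnlineLogMeanExp` and the example values are hypothetical.

```python
import math


class OnlineLogMeanExp:
    """Streaming estimate of log(mean(exp(x_s))) over draws x_1, x_2, ...

    Only three scalars are kept in memory, regardless of the number of draws.
    Assumes every input is a finite float.
    """

    def __init__(self):
        self.count = 0
        self.running_max = -math.inf  # largest log value seen so far
        self.scaled_sum = 0.0         # sum of exp(x_s - running_max)

    def update(self, log_value):
        self.count += 1
        if log_value > self.running_max:
            # Rescale the accumulated sum to the new maximum for stability.
            self.scaled_sum = (
                self.scaled_sum * math.exp(self.running_max - log_value) + 1.0
            )
            self.running_max = log_value
        else:
            self.scaled_sum += math.exp(log_value - self.running_max)

    def result(self):
        # Requires at least one call to update().
        return self.running_max + math.log(self.scaled_sum) - math.log(self.count)


# Hypothetical per-draw values of log p(y_i | theta_s) for one held-out point.
log_densities = [-1.2, -0.8, -1.0, -0.9]

acc = OnlineLogMeanExp()
for value in log_densities:
    acc.update(value)

elpd_i = acc.result()  # log of the average predictive density over draws
```

Because each accumulator has a fixed size, one can be kept per fold and per chain, and memory use does not grow with the number of MCMC iterations.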
Abstract
Brute force cross-validation (CV) is a method for predictive assessment and model selection that is general and applicable to a wide range of Bayesian models. Naive or 'brute force' CV approaches are often too computationally costly for interactive modeling workflows, especially when inference relies on Markov chain Monte Carlo (MCMC). We propose overcoming this limitation using massively parallel MCMC. Using accelerator hardware such as graphics processing units (GPUs), our approach can be about as fast (in wall clock time) as a single full-data model fit. Parallel CV is flexible because it can easily exploit a wide range of data partitioning schemes, such as those designed for non-exchangeable data. It can also accommodate a range of scoring rules. We propose MCMC diagnostics, including a summary of MCMC mixing based on the popular potential scale reduction factor ($\hat{R}$) and…
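For readers unfamiliar with the potential scale reduction factor mentioned in the abstract, the sketch below computes the classic (non-split) Gelman-Rubin form for a single scalar quantity from several chains. This is an assumption made for illustration: the abstract does not say which variant the authors use or how they summarize it across folds, and the function name and example chains are hypothetical.

```python
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Classic (non-split) R-hat for one scalar quantity.

    chains: list of m >= 2 equal-length lists, each holding n >= 2 draws.
    """
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]

    # W: average of the within-chain sample variances.
    within = mean(variance(chain) for chain in chains)
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n

    return (pooled / within) ** 0.5


# Hypothetical draws: two chains exploring the same region ...
mixed = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.2, -0.1, 0.1]]
# ... versus two chains stuck in different regions.
stuck = [[0.1, -0.2, 0.3, 0.0], [5.0, 5.2, 4.9, 5.1]]

rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
```

Values near 1 suggest the chains agree with each other, while values well above 1 indicate they have not mixed. In a parallel CV setting one such value could be computed for each fold, since every fold has its own set of chains.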
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
