Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data

Måns Magnusson, Michael Riis Andersen, Johan Jonasson, and Aki Vehtari

arXiv:2001.00980 · stat.ME · August 12, 2020 · AISTATS · 30 cites

Open Access

TL;DR

This paper introduces a method for Bayesian model comparison on large datasets that combines fast approximate leave-one-out (LOO) surrogates with exact LOO on a subsample, using the difference estimator to estimate differences in predictive performance efficiently.

Contribution

It proposes a novel combination of approximate and exact LOO methods for scalable and more effective Bayesian model comparison in large data settings.

Findings

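01

The method can be orders of magnitude more efficient than previous approaches.

02

It is better suited to model comparison than earlier subsampling and posterior-approximation methods, which work well for individual models but are less powerful for comparing them.

03

The authors supply proofs regarding the scaling characteristics of the approach.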

Abstract

Recently, new methods for model assessment, based on subsampling and posterior approximations, have been proposed for scaling leave-one-out cross-validation (LOO) to large datasets. Although these methods work well for estimating predictive performance for individual models, they are less powerful in model comparison. We propose an efficient method for estimating differences in predictive performance by combining fast approximate LOO surrogates with exact LOO subsampling using the difference estimator and supply proofs with regards to scaling characteristics. The resulting approach can be orders of magnitude more efficient than previous approaches, as well as being better suited to model comparison.
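
The difference estimator mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only the standard survey-sampling difference estimator under simple random sampling without replacement, applied to a sum of pointwise quantities. The function name `difference_estimator`, the callback `exact_fn`, and the synthetic arrays are hypothetical and stand in for cheap LOO surrogates and exact LOO computations.

```python
import numpy as np

def difference_estimator(surrogate, exact_fn, m, rng):
    """Estimate sum_i exact_fn(i) from cheap surrogates plus an exact subsample.

    surrogate: array of cheap approximations, one per observation (assumed given)
    exact_fn:  hypothetical callback returning the expensive exact value for index i
    m:         subsample size (simple random sampling without replacement)
    """
    n = len(surrogate)
    idx = rng.choice(n, size=m, replace=False)
    exact = np.array([exact_fn(i) for i in idx])
    err = exact - surrogate[idx]          # surrogate errors on the subsample
    # Surrogate total, corrected by the scaled-up subsample error.
    total = surrogate.sum() + n / m * err.sum()
    # Variance estimate with finite population correction (1 - m/n).
    var = n**2 * (1 - m / n) * err.var(ddof=1) / m
    return total, var

# Synthetic stand-ins (assumptions, not data from the paper):
# exact_vals plays the role of pointwise differences in predictive performance,
# surrogate_vals a fast approximation with small error.
rng = np.random.default_rng(0)
n = 10_000
exact_vals = rng.normal(-1.0, 0.5, size=n)
surrogate_vals = exact_vals + rng.normal(0.0, 0.05, size=n)

est, var = difference_estimator(surrogate_vals, lambda i: exact_vals[i], m=100, rng=rng)
print(est, exact_vals.sum(), np.sqrt(var))
```

The variance of this estimator depends on the spread of the surrogate errors rather than on the spread of the pointwise values themselves, so an accurate surrogate allows a small exact subsample.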

Taxonomy

Topics: Statistical Methods and Bayesian Inference · Statistical Methods and Inference · Gaussian Processes and Bayesian Inference