Distributed Computation for Marginal Likelihood based Model Choice
Alexander Buchholz, Daniel Ahfock, Sylvia Richardson

TL;DR
This paper introduces a distributed Bayesian model choice method that efficiently approximates model evidence using local computations and minimal communication, enabling scalable analysis of large datasets with theoretical error bounds.
Contribution
It presents a novel divide-and-conquer approach for Bayesian model selection that combines local Monte Carlo estimates with correction techniques, extending to reversible jump scenarios.
Findings
Enables Bayesian model choice on large datasets, with speed-ups that come from the embarrassingly parallel structure of the computation.
Provides theoretical error bounds for the approximation.
Demonstrates effectiveness through real-world experiments.
Abstract
We propose a general method for distributed Bayesian model choice, using the marginal likelihood, where a data set is split into non-overlapping subsets. These subsets are only accessed locally by individual workers, and no data is shared between the workers. We approximate the model evidence for the full data set through Monte Carlo sampling from the posterior on every subset, generating a model evidence per subset. The results are combined using a novel approach which corrects for the splitting using summary statistics of the generated samples. Our divide-and-conquer approach enables Bayesian model choice in the large data setting, exploiting all available information but limiting communication between workers. We derive theoretical error bounds that quantify the resulting trade-off between computational gain and loss in precision. The embarrassingly parallel nature yields important…
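To make the split-and-correct idea concrete, here is a minimal sketch on a toy conjugate model where the full evidence is known in closed form. It is an illustration under stated assumptions, not the authors' implementation. It assumes each worker uses the fractionated prior p(θ)^(1/S) (normalized), so that the full evidence factorizes exactly as α^S · ∏ₛ p̃(yₛ) · ∫ ∏ₛ p̃(θ | yₛ) dθ with α = ∫ p(θ)^(1/S) dθ, and it assumes the last integral is approximated by fitting a Gaussian to each worker's samples using only their mean and variance. The model (Gaussian data with known variance, Gaussian prior on the mean), all function names, and the use of closed-form sub-evidences in place of Monte Carlo estimates are choices made for this example.

```python
import numpy as np

def gaussian_model(y, sigma2, prior_var):
    """Exact log evidence and posterior moments for
    y_i ~ N(theta, sigma2), theta ~ N(0, prior_var)."""
    n = y.size
    post_var = 1.0 / (n / sigma2 + 1.0 / prior_var)
    post_mean = post_var * y.sum() / sigma2
    log_ev = (-0.5 * n * np.log(2 * np.pi * sigma2)
              - 0.5 * np.sum(y ** 2) / sigma2
              + 0.5 * np.log(post_var / prior_var)
              + 0.5 * post_mean ** 2 / post_var)
    return log_ev, post_mean, post_var

def log_alpha(prior_var, S):
    """log of alpha = integral of N(theta; 0, prior_var)^(1/S) d theta."""
    return (-0.5 / S * np.log(2 * np.pi * prior_var)
            + 0.5 * np.log(2 * np.pi * S * prior_var))

def log_product_integral(means, variances):
    """log of the integral of a product of 1-D Gaussian densities,
    computed from their means and variances only."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    lam = np.sum(1.0 / variances)        # total precision
    eta = np.sum(means / variances)      # precision-weighted sum of means
    quad = np.sum(means ** 2 / variances) - eta ** 2 / lam
    return (-0.5 * np.sum(np.log(2 * np.pi * variances))
            + 0.5 * np.log(2 * np.pi / lam)
            - 0.5 * quad)

rng = np.random.default_rng(0)
sigma2, prior_var, S = 1.0, 4.0, 4       # hypothetical toy settings
y = rng.normal(1.0, np.sqrt(sigma2), size=1000)
subsets = np.array_split(y, S)           # non-overlapping subsets, one per worker

# Reference value: evidence computed on the full data set.
full_log_ev, _, _ = gaussian_model(y, sigma2, prior_var)

# Local step: each worker sees only its subset and uses the fractionated
# prior, which for a N(0, prior_var) prior is N(0, S * prior_var).
local = [gaussian_model(y_s, sigma2, S * prior_var) for y_s in subsets]
sub_log_ev = sum(r[0] for r in local)

# Combination with exact subposterior moments (exact in this Gaussian model).
combined_exact = (S * log_alpha(prior_var, S) + sub_log_ev
                  + log_product_integral([r[1] for r in local],
                                         [r[2] for r in local]))

# Combination with moments estimated from samples, mimicking workers that
# communicate only summary statistics of their posterior draws.
draws = [rng.normal(m, np.sqrt(v), size=20000) for _, m, v in local]
combined_mc = (S * log_alpha(prior_var, S) + sub_log_ev
               + log_product_integral([d.mean() for d in draws],
                                      [d.var(ddof=1) for d in draws]))

print(full_log_ev, combined_exact, combined_mc)
```

In this toy case the subposteriors are exactly Gaussian, so the correction recovers the full-data evidence up to floating-point error when exact moments are used, and up to Monte Carlo error when the moments come from samples. For non-Gaussian subposteriors the Gaussian fit is only an approximation, which is the kind of precision loss the error bounds mentioned in the abstract are meant to quantify.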
