Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages
Zheng Wei, Erin M. Conlon

TL;DR
This paper introduces a two-stage parallel MCMC method tailored for Bayesian hierarchical models with big data, partitioning data by groups to improve efficiency and reduce computation time while maintaining accuracy.
Contribution
It extends existing parallel MCMC approaches by partitioning data by groups in hierarchical models, enhancing efficiency and scalability for big data applications.
Findings
Results closely match those from the full data analysis.
Significant increase in MCMC efficiency.
Substantial reduction in computation time.
Abstract
Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group-specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target…
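The two-stage idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract is truncated before it specifies the stage 2 target, so the example assumes a toy normal-normal model (known within-group standard deviation `SIGMA`, known between-group standard deviation `TAU`, flat prior on the common mean `mu`, and a flat stage 1 prior on each group mean). The names `stage1_draws`, `log_prior`, and `two_stage` are hypothetical. Under these assumptions, when a stage 1 draw is proposed in stage 2 the group likelihood cancels, and the acceptance ratio reduces to a ratio of hierarchical prior densities.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

SIGMA = 1.0  # assumed known within-group standard deviation
TAU = 1.0    # assumed known between-group standard deviation


def stage1_draws(job):
    """Stage 1: posterior draws of one group's mean under a flat prior."""
    y, seed, n_draws = job
    rng = np.random.default_rng(seed)
    # With a flat prior and known SIGMA the posterior is N(ybar, SIGMA^2 / n),
    # so exact draws stand in here for a per-group MCMC run.
    return rng.normal(y.mean(), SIGMA / np.sqrt(len(y)), size=n_draws)


def log_prior(theta, mu):
    """Log density, up to a constant, of the hierarchical prior N(mu, TAU^2)."""
    return -0.5 * ((theta - mu) / TAU) ** 2


def two_stage(groups, n_draws=2000, n_iter=4000, seed=0):
    rng = np.random.default_rng(seed)

    # Stage 1: each group is handled independently, so the jobs can run in
    # parallel. Threads keep the sketch simple; separate processes or
    # machines would be the realistic choice for large data.
    jobs = [(y, seed + 1 + i, n_draws) for i, y in enumerate(groups)]
    with ThreadPoolExecutor() as pool:
        pools = list(pool.map(stage1_draws, jobs))

    # Stage 2: Metropolis-within-Gibbs on the full hierarchical model,
    # using the stored stage 1 draws as the proposal for each group mean.
    theta = np.array([p[0] for p in pools])
    mu = theta.mean()
    mu_chain = np.empty(n_iter)
    theta_chain = np.empty((n_iter, len(groups)))
    for t in range(n_iter):
        for i, p in enumerate(pools):
            proposal = p[rng.integers(n_draws)]
            # The likelihood cancels because the proposal is the stage 1
            # posterior (flat prior), leaving only the prior ratio.
            log_ratio = log_prior(proposal, mu) - log_prior(theta[i], mu)
            if np.log(rng.uniform()) < log_ratio:
                theta[i] = proposal
        # Gibbs update for mu given the group means (flat prior on mu).
        mu = rng.normal(theta.mean(), TAU / np.sqrt(len(groups)))
        mu_chain[t] = mu
        theta_chain[t] = theta
    return mu_chain, theta_chain


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    true_theta = rng.normal(0.0, TAU, size=8)
    groups = [rng.normal(m, SIGMA, size=50) for m in true_theta]
    mu_chain, theta_chain = two_stage(groups)
    print("posterior mean of mu:", mu_chain.mean())
```

Because stage 2 only evaluates prior densities at stored stage 1 draws, it never revisits the raw group data, which is where the computational saving in this toy setting comes from. A real application would replace the exact stage 1 draws with per-group MCMC output and would also sample the variance parameters.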
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Statistical Methods and Inference · Bayesian Methods and Mixture Models
