Reuse, recycle, reweigh: Combating influenza through efficient sequential Bayesian computation for massive data
Jennifer A. Tom, Janet S. Sinsheimer, Marc A. Suchard

TL;DR
This paper introduces an efficient Bayesian computation method that reuses intermediate results from stratified analyses to enable joint hierarchical modeling of massive datasets, demonstrated on influenza genome data.
Contribution
It extends the dynamic iterative reweighting MCMC algorithm to reuse stratified analysis realizations, improving Bayesian inference for large datasets.
Findings
Demonstrated on stratified analyses of 687 influenza genomes spanning 13 years.
Enabled a joint hierarchical analysis built from the intermediate results of stratified analyses.
Improved the computational feasibility of Bayesian analysis on massive datasets.
Abstract
Massive datasets in the gigabyte and terabyte range combined with the availability of increasingly sophisticated statistical tools yield analyses at the boundary of what is computationally feasible. Compromising in the face of this computational burden by partitioning the dataset into more tractable sizes results in stratified analyses, removed from the context that justified the initial data collection. In a Bayesian framework, these stratified analyses generate intermediate realizations, often compared using point estimates that fail to account for the variability within and correlation between the distributions these realizations approximate. However, although the initial concession to stratify generally precludes the more sensible analysis using a single joint hierarchical model, we can circumvent this outcome and capitalize on the intermediate realizations by extending the dynamic…
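To make the idea of reusing intermediate realizations concrete, the following is a minimal sketch of plain self-normalized importance reweighting. It is not the paper's dynamic iterative reweighting algorithm. It assumes a toy setup that does not come from the paper: one stratum with a single scalar parameter, stored draws obtained under a vague normal prior, and a tighter normal prior standing in for the prior a joint hierarchical model would imply. All function names and numbers are hypothetical.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def reweight(draws, old_prior, new_prior):
    """Normalized importance weights that move draws from the posterior
    under old_prior to the posterior under new_prior. The likelihood is
    the same under both, so it cancels and only the prior ratio remains."""
    log_w = [
        normal_logpdf(x, *new_prior) - normal_logpdf(x, *old_prior)
        for x in draws
    ]
    shift = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def weighted_mean(draws, weights):
    return sum(x * w for x, w in zip(draws, weights))


def effective_sample_size(weights):
    """Kish effective sample size for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


# Hypothetical stratified analysis: one observation y = 2.0 with unit
# variance and prior N(0, 10^2). The exact posterior is normal with
# precision 1 + 1/100, so we sample it directly as a stand-in for
# stored MCMC output.
rng = random.Random(0)
old_prior = (0.0, 10.0)   # (mean, sd) used in the stratified run
new_prior = (0.0, 1.0)    # (mean, sd) assumed for the joint model
post_precision = 1.0 + 1.0 / old_prior[1] ** 2
draws = [
    rng.gauss(2.0 / post_precision, math.sqrt(1.0 / post_precision))
    for _ in range(20000)
]

# Reuse the stored draws instead of rerunning the sampler.
weights = reweight(draws, old_prior, new_prior)
post_mean = weighted_mean(draws, weights)
ess = effective_sample_size(weights)
print(f"reweighted posterior mean: {post_mean:.3f}, ESS: {ess:.0f}")
```

In this toy case the posterior under the new prior is known in closed form (mean 1.0), so the reweighted mean can be checked directly. The effective sample size indicates how much information the stored draws retain after reweighting; a small value would suggest the old and new priors differ too much for simple reuse.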
