Two-Stage Metropolis-Hastings for Tall Data
Richard D. Payne, Bani K. Mallick

TL;DR
This paper introduces a two-stage Metropolis-Hastings algorithm for tall data in Bayesian binary classification. An approximate likelihood screens each proposal, and the full likelihood is computed only for proposals that pass the screen, which reduces the cost of exact likelihood evaluation.
Contribution
The paper proposes a two-stage Metropolis-Hastings method that can be adopted into the consensus Monte Carlo framework to reduce computation in tall data Bayesian analysis.
Findings
Reduces likelihood computation costs in tall data scenarios
Effective in logistic and hierarchical logistic regression
Compatible with existing parallelization frameworks
Abstract
This paper discusses the challenges presented by tall data problems associated with Bayesian classification (specifically binary classification) and the existing methods to handle them. Current methods include parallelizing the likelihood, subsampling, and consensus Monte Carlo. A new method based on the two-stage Metropolis-Hastings algorithm is also proposed. The purpose of this algorithm is to reduce the exact likelihood computational cost in the tall data situation. In the first stage, a new proposal is tested by the approximate likelihood based model. The full likelihood based posterior computation will be conducted only if the proposal passes the first stage screening. Furthermore, this method can be adopted into the consensus Monte Carlo framework. The two-stage method is applied to logistic regression, hierarchical logistic regression, and Bayesian multivariate adaptive…
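The two-stage screening described in the abstract can be sketched as follows for logistic regression. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a symmetric Gaussian random-walk proposal, an independent Gaussian prior, and an approximate likelihood built from a fixed random subsample rescaled by n/m. The function names (`log_lik`, `two_stage_mh`) and all tuning parameters are hypothetical. The second-stage ratio divides out the first-stage ratio, which is the standard correction that keeps the exact posterior as the target.

```python
import numpy as np


def log_lik(theta, X, y):
    """Bernoulli-logit log-likelihood summed over the rows of X."""
    z = X @ theta
    return float(np.sum(y * z - np.logaddexp(0.0, z)))


def two_stage_mh(X, y, n_iter=2000, step=0.05, m=200, prior_sd=10.0, seed=0):
    """Two-stage random-walk Metropolis-Hastings (illustrative sketch).

    Returns the chain and the number of full-likelihood evaluations
    performed inside the loop.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Assumption: the approximation is a fixed subsample, rescaled by n/m.
    # It is drawn once so the approximate posterior is a fixed function.
    idx = rng.choice(n, size=min(m, n), replace=False)
    Xs, ys = X[idx], y[idx]
    scale = n / len(idx)

    def log_prior(t):
        return -0.5 * float(t @ t) / prior_sd**2

    def approx_post(t):  # cheap: touches only the subsample
        return scale * log_lik(t, Xs, ys) + log_prior(t)

    def full_post(t):  # expensive: touches all n rows
        return log_lik(t, X, y) + log_prior(t)

    theta = np.zeros(d)
    a_cur, f_cur = approx_post(theta), full_post(theta)
    draws = np.empty((n_iter, d))
    n_full = 0

    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(d)
        a_prop = approx_post(prop)

        # Stage 1: screen the proposal with the approximate posterior.
        if np.log(rng.uniform()) < a_prop - a_cur:
            f_prop = full_post(prop)
            n_full += 1
            # Stage 2: full posterior ratio, corrected by the stage-1 ratio.
            if np.log(rng.uniform()) < (f_prop - f_cur) - (a_prop - a_cur):
                theta, a_cur, f_cur = prop, a_prop, f_prop

        draws[i] = theta

    return draws, n_full
```

Calling `two_stage_mh(X, y)` with a design matrix `X` of shape (n, d) and a 0/1 response vector `y` returns the draws and a count of full-likelihood evaluations. Comparing that count with `n_iter` gives a rough sense of how much full-data work the first-stage screen avoided.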
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Neural Networks and Applications
