posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms
Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, Aki Vehtari

TL;DR
Posteriordb is a comprehensive database of models and data sets designed to evaluate and benchmark Bayesian inference algorithms across diverse, realistic target densities, aiding in the development of more robust probabilistic programming tools.
Contribution
The paper introduces posteriordb, a new database of models and data sets with reference solutions, facilitating standardized evaluation of Bayesian inference algorithms.
Findings
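posteriordb currently comprises 120 representative models covering a wide range of realistic target densities.
It has been instrumental in developing several general inference algorithms.
The paper provides a guide to best practices in using posteriordb for model evaluation and comparison.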
Abstract
The generality and robustness of inference algorithms are critical to the success of widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing.jl. When designing a new general-purpose inference algorithm, whether it involves Monte Carlo sampling or variational approximation, the fundamental problem arises in evaluating its accuracy and efficiency across a range of representative target models. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities along with reference Monte Carlo draws. We further provide a guide to the best practices in using posteriordb for model evaluation and comparison. To provide a wide range of realistic target densities, posteriordb currently comprises 120 representative models and has been instrumental in developing several general inference algorithms.
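As a rough illustration of how reference Monte Carlo draws can serve as a yardstick, the sketch below scores a candidate algorithm by how far its posterior mean estimate lies from the reference mean, measured in units of the reference posterior standard deviation. This is one plausible accuracy check, not necessarily the procedure the paper recommends. The function name `mean_error_in_sd_units` is hypothetical, and the draws are synthetic stand-ins generated on the spot rather than values taken from posteriordb.

```python
import random
import statistics


def mean_error_in_sd_units(candidate_draws, reference_draws):
    """Return (candidate mean - reference mean) / reference standard deviation."""
    ref_mean = statistics.fmean(reference_draws)
    ref_sd = statistics.stdev(reference_draws)
    return (statistics.fmean(candidate_draws) - ref_mean) / ref_sd


# Synthetic stand-ins, NOT posteriordb data (assumption for illustration):
# in practice the reference draws would be the stored reference draws for a
# posterior, and the candidate draws would come from the algorithm under test.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
good_candidate = [rng.gauss(0.0, 1.0) for _ in range(1_000)]
biased_candidate = [rng.gauss(0.5, 1.0) for _ in range(1_000)]

good_err = mean_error_in_sd_units(good_candidate, reference)
biased_err = mean_error_in_sd_units(biased_candidate, reference)

print(f"unbiased sampler: {good_err:+.3f} posterior sd")
print(f"biased sampler:   {biased_err:+.3f} posterior sd")
```

Scaling the error by the reference standard deviation makes the score comparable across parameters and models with very different scales, which matters when an algorithm is evaluated over many posteriors at once.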
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
