PPL Bench: Evaluation Framework For Probabilistic Programming Languages
Sourabh Kulkarni, Kinjal Divesh Shah, Nimar Arora, Xiaoyan Wang, Yucen Lily Li, Nazanin Khosravani Tehrani, Michael Tingley, David Noursi, Narjes Torabi, Sepehr Akhavan Masouleh, Eric Lippert, and Erik Meijer

TL;DR
PPL Bench is a comprehensive evaluation framework for probabilistic programming languages, enabling standardized assessment of accuracy and convergence speed across models and implementations.
Contribution
It introduces a publicly available benchmark with evaluation tools and encourages community contributions to improve PPL assessment and selection.
Findings
Provides a standardized way to evaluate PPLs
Includes metrics like effective sample size and $\hat{r}$
Facilitates comparison of accuracy and speed of convergence
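To make the $\hat{r}$ metric listed above concrete, here is a minimal sketch of the classic (non-split) Gelman-Rubin statistic for a single scalar parameter. It is an illustration only, not PPL Bench's own implementation: the function name `gelman_rubin_rhat` and the list-of-chains input layout are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic Gelman-Rubin statistic for one scalar parameter.

    `chains` is a list of equally long lists of posterior draws,
    one inner list per chain (hypothetical layout for this sketch).
    """
    num_chains = len(chains)
    if num_chains < 2:
        raise ValueError("need at least 2 chains")
    num_draws = len(chains[0])
    if num_draws < 2 or any(len(c) != num_draws for c in chains):
        raise ValueError("chains must share a common length of at least 2")

    chain_means = [mean(c) for c in chains]
    # B: between-chain variance, scaled by the number of draws per chain.
    between = num_draws * variance(chain_means)
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    # Pooled estimate of the posterior variance.
    pooled = (num_draws - 1) / num_draws * within + between / num_draws
    return sqrt(pooled / within)


# Two chains stuck in different regions give a value far above 1.
print(gelman_rubin_rhat([[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]))
```

Values close to 1 suggest the chains agree with one another, while values well above 1 suggest they have not yet converged to the same distribution.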
Abstract
We introduce PPL Bench, a new benchmark for evaluating Probabilistic Programming Languages (PPLs) on a variety of statistical models. The benchmark includes data generation and evaluation code for a number of models as well as implementations in some common PPLs. All of the benchmark code and PPL implementations are available on GitHub. We welcome contributions of new models and PPLs, as well as improvements to existing PPL implementations. The purpose of the benchmark is two-fold. First, we want researchers as well as conference reviewers to be able to evaluate improvements in PPLs in a standardized setting. Second, we want end users to be able to pick the PPL that is most suited for their modeling application. In particular, we are interested in evaluating the accuracy and speed of convergence of the inferred posterior. Each PPL only needs to provide posterior samples given a model…
Taxonomy
Topics: Formal Methods in Verification · Software Testing and Debugging Techniques · Parallel Computing and Optimization Techniques
