PPL Bench: Evaluation Framework For Probabilistic Programming Languages

Sourabh Kulkarni, Kinjal Divesh Shah, Nimar Arora, Xiaoyan Wang, Yucen Lily Li, Nazanin Khosravani Tehrani, Michael Tingley, David Noursi, Narjes Torabi, Sepehr Akhavan Masouleh, Eric Lippert, and Erik Meijer

arXiv:2010.08886 · cs.PL · October 20, 2020 · 1 cites

TL;DR

PPL Bench is a comprehensive evaluation framework for probabilistic programming languages, enabling standardized assessment of accuracy and convergence speed across models and implementations.

Contribution

It introduces a publicly available benchmark with evaluation tools and encourages community contributions to improve PPL assessment and selection.

Findings

01 Provides a standardized way to evaluate PPLs

02 Includes metrics such as effective sample size and $\hat{R}$

03 Facilitates comparison of accuracy and speed of convergence
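As an illustration of the second finding, the sketch below computes the classic Gelman-Rubin potential scale reduction factor (commonly written $\hat{R}$) from several chains of posterior samples for one scalar parameter. It is a minimal pure-Python sketch and not PPL Bench's own implementation; the function name `gelman_rubin_rhat` and the synthetic chains are assumptions made for illustration only.

```python
import random
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def gelman_rubin_rhat(chains: Sequence[Sequence[float]]) -> float:
    """Classic (non-split) Gelman-Rubin statistic for one scalar parameter.

    `chains` holds m chains of n samples each. This name and signature are
    hypothetical; they are not taken from the PPL Bench code base.
    """
    m = len(chains)
    if m < 2:
        raise ValueError("need at least two chains")
    n = len(chains[0])
    if n < 2 or any(len(c) != n for c in chains):
        raise ValueError("chains must have equal length of at least 2")

    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    if within == 0.0:
        raise ValueError("within-chain variance is zero")
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return sqrt(var_plus / within)


if __name__ == "__main__":
    rng = random.Random(0)
    # Four synthetic chains drawn from the same distribution (well mixed).
    mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
    # The same chains with a different offset each (chains disagree).
    shifted = [[x + k for x in chain] for k, chain in enumerate(mixed)]
    print("well-mixed chains:", gelman_rubin_rhat(mixed))
    print("shifted chains:   ", gelman_rubin_rhat(shifted))
```

Values close to 1 suggest the chains agree with each other, while values well above 1 suggest they have not converged to the same distribution.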

Abstract

We introduce PPL Bench, a new benchmark for evaluating Probabilistic Programming Languages (PPLs) on a variety of statistical models. The benchmark includes data generation and evaluation code for a number of models as well as implementations in some common PPLs. All of the benchmark code and PPL implementations are available on GitHub. We welcome contributions of new models and PPLs as well as improvements to existing PPL implementations. The purpose of the benchmark is two-fold. First, we want researchers as well as conference reviewers to be able to evaluate improvements in PPLs in a standardized setting. Second, we want end users to be able to pick the PPL that is most suited for their modeling application. In particular, we are interested in evaluating the accuracy and speed of convergence of the inferred posterior. Each PPL only needs to provide posterior samples given a model…

Code & Models

Repositories

facebookresearch/pplbench (Official)

Taxonomy

Topics: Formal Methods in Verification · Software Testing and Debugging Techniques · Parallel Computing and Optimization Techniques