Foundation Posteriors for Approximate Probabilistic Inference
Mike Wu, Noah Goodman

TL;DR
This paper introduces a neural network-based foundation posterior for probabilistic inference, trained via masked language modeling to perform zero-shot and fine-tuned inference across diverse probabilistic programs, reducing hyper-parameter tuning and computational costs.
Contribution
It formulates probabilistic inference as masked language modeling, creating a foundation model that generalizes across programs and enables efficient zero-shot and fine-tuned inference.
Findings
Effective zero-shot inference on Stan programs
Outperforms traditional inference methods in flexibility and efficiency
Can be fine-tuned for specific programs and datasets
Abstract
Probabilistic programs provide an expressive representation language for generative models. Given a probabilistic program, we are interested in the task of posterior inference: estimating a latent variable given a set of observed variables. Existing techniques for inference in probabilistic programs often require choosing many hyper-parameters, are computationally expensive, and/or only work for restricted classes of programs. Here we formulate inference as masked language modeling: given a program, we generate a supervised dataset of variables and assignments, and randomly mask a subset of the assignments. We then train a neural network to unmask the random values, defining an approximate posterior distribution. By optimizing a single neural network across a range of programs we amortize the cost of training, yielding a "foundation" posterior able to do zero-shot inference for new…
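The data-generation step described in the abstract (sample variable assignments from a program, then randomly mask a subset) might look roughly like the following minimal sketch. The toy program, the function names, the `[MASK]` token, and the masking probability are all assumptions made for illustration; they are not taken from the paper, and no network is trained here.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, not the paper's actual vocabulary


def sample_program(rng):
    """Toy generative program (assumed): mu ~ Normal(0, 1); y_i ~ Normal(mu, 0.5)."""
    mu = rng.gauss(0.0, 1.0)
    ys = [rng.gauss(mu, 0.5) for _ in range(3)]
    return {"mu": mu, "y0": ys[0], "y1": ys[1], "y2": ys[2]}


def mask_assignments(assignments, mask_prob, rng):
    """Hide a random subset of assignments.

    Returns (inputs, targets): inputs holds MASK in place of hidden values,
    targets holds the true values of the hidden variables only.
    """
    inputs, targets = {}, {}
    for name, value in assignments.items():
        if rng.random() < mask_prob:
            inputs[name] = MASK
            targets[name] = value
        else:
            inputs[name] = value
    return inputs, targets


def make_dataset(n, mask_prob=0.3, seed=0):
    """Build n (inputs, targets) pairs by sampling the program and masking."""
    rng = random.Random(seed)
    return [mask_assignments(sample_program(rng), mask_prob, rng) for _ in range(n)]


dataset = make_dataset(1000)
```

A model trained to predict `targets` from `inputs` would then define an approximate conditional distribution over the masked variables. In this sketch, the case where only `mu` is masked corresponds to inferring the latent variable from the observed `y` values.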
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Topic Modeling
Methods: Variational Inference
