Scalable Probabilistic Databases with Factor Graphs and MCMC
Michael Wick, Andrew McCallum, Gerome Miklau

TL;DR
This paper introduces a scalable probabilistic database system that uses factor graphs and MCMC sampling to efficiently manage uncertainty and complex dependencies, enabling fast query evaluation over large, uncertain datasets.
Contribution
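It keeps a single possible world in the underlying relational database, encodes the distribution over possible worlds in an external factor graph, and uses MCMC inference to recover the uncertainty, which allows arbitrary queries over probabilistic databases with arbitrary dependencies.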
Findings
View maintenance techniques significantly speed up query evaluation over MCMC samples.
The system effectively handles relational queries with aggregation.
Parallelization enhances scalability and performance.
Abstract
Probabilistic databases play a crucial role in the management and understanding of uncertain data. However, incorporating probabilities into the semantics of incomplete databases has posed many challenges, forcing systems to sacrifice modeling power or scalability, or to restrict the class of relational algebra formulae under which they are closed. We propose an alternative approach where the underlying relational database always represents a single world, and an external factor graph encodes a distribution over possible worlds; Markov chain Monte Carlo (MCMC) inference is then used to recover this uncertainty to a desired level of fidelity. Our approach allows the efficient evaluation of arbitrary queries over probabilistic databases with arbitrary dependencies expressed by graphical models with structure that changes during inference. MCMC sampling provides efficiency by hypothesizing …
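The idea in the abstract (one stored world, an external factor graph, and MCMC to recover uncertainty) can be illustrated with a minimal sketch. This is not the authors' system: the four-row table, the weights `UNARY` and `PAIR`, and the function names are all hypothetical, and the sampler is a plain Metropolis-Hastings chain with single-row flips. The query is a COUNT aggregate that is kept up to date incrementally after each accepted change instead of being recomputed over the whole world.

```python
import itertools
import math
import random

# Hypothetical toy model: one binary attribute per row of a 4-row table.
UNARY = [1.0, -0.5, 0.3, -1.2]  # log-weight for a row taking value 1
PAIR = 0.8                      # log-weight when neighbouring rows agree


def local_log_score(world, i):
    """Sum of the log-factors that touch row i (the only ones a flip changes)."""
    s = UNARY[i] * world[i]
    if i > 0:
        s += PAIR * (world[i] == world[i - 1])
    if i < len(world) - 1:
        s += PAIR * (world[i] == world[i + 1])
    return s


def log_score(world):
    """Unnormalized log-probability of a whole world (used only for checking)."""
    s = sum(u * x for u, x in zip(UNARY, world))
    s += sum(PAIR * (a == b) for a, b in zip(world, world[1:]))
    return s


def run_mcmc(num_samples, seed=0):
    """Estimate E[COUNT(*) WHERE value = 1] by Metropolis-Hastings."""
    rng = random.Random(seed)
    world = [0] * len(UNARY)  # the single world held in the "database"
    count = sum(world)        # materialized result of the COUNT query
    total = 0
    for _ in range(num_samples):
        i = rng.randrange(len(world))
        before = local_log_score(world, i)
        world[i] ^= 1         # hypothesize a modification: flip row i
        after = local_log_score(world, i)
        # Symmetric proposal, so accept with probability min(1, exp(after - before)).
        if rng.random() < math.exp(min(0.0, after - before)):
            count += 1 if world[i] == 1 else -1  # update the view incrementally
        else:
            world[i] ^= 1     # rejected: roll the world back
        total += count
    assert count == sum(world)  # incremental view agrees with a full recount
    return total / num_samples


def exact_expected_count():
    """Brute-force expectation over all 2^4 worlds, feasible only for toy sizes."""
    worlds = list(itertools.product([0, 1], repeat=len(UNARY)))
    weights = [math.exp(log_score(w)) for w in worlds]
    z = sum(weights)
    return sum(wt * sum(w) for wt, w in zip(weights, worlds)) / z


if __name__ == "__main__":
    print("MCMC estimate:", run_mcmc(50000))
    print("Exact value:  ", exact_expected_count())
```

In this sketch each step touches only the factors adjacent to the flipped row and adjusts the aggregate by plus or minus one, so the per-sample cost does not depend on the table size. The brute-force check is there only to show that the chain targets the intended distribution on a model small enough to enumerate.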
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Bayesian Modeling and Causal Inference
