Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS
Feng Niu (University of Wisconsin-Madison), Christopher Ré (University of Wisconsin-Madison), AnHai Doan (University of Wisconsin-Madison), Jude Shavlik (University of Wisconsin-Madison)

TL;DR
Tuffy introduces a scalable approach for Markov Logic Networks by integrating relational database optimizations, hybrid local search, and novel partitioning techniques, enabling efficient inference on large datasets.
Contribution
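The paper makes three contributions: a bottom-up approach to grounding that exploits the relational optimizer, a hybrid architecture that runs AI-style local search efficiently on top of an RDBMS, and a theoretical result showing when stochastic local search can be made exponentially more efficient.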
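To make the first contribution concrete, here is a minimal sketch of bottom-up grounding, in which the groundings of one rule are produced by a single SQL join so that the database's optimizer picks the join order. The rule (friends(x, y) AND smokes(x) => smokes(y)), the table layout, and the function name ground_clause are hypothetical illustrations, not taken from the paper, and SQLite stands in for a full RDBMS.

```python
import sqlite3


def ground_clause(friend_pairs, people):
    """Ground the hypothetical rule friends(x, y) AND smokes(x) => smokes(y).

    Returns one ground clause per friends(x, y) evidence tuple, encoded as a
    pair of signed atom ids: (-id of smokes(x), +id of smokes(y)).
    """
    conn = sqlite3.connect(":memory:")
    # Evidence predicate: one row per known friends(x, y) fact.
    conn.execute("CREATE TABLE friends (x TEXT, y TEXT)")
    # Query predicate: one row per ground atom smokes(p), with an integer id.
    conn.execute("CREATE TABLE smokes (aid INTEGER PRIMARY KEY, p TEXT UNIQUE)")
    conn.executemany("INSERT INTO friends VALUES (?, ?)", friend_pairs)
    conn.executemany("INSERT INTO smokes (p) VALUES (?)", [(p,) for p in people])

    # The whole grounding step is one join; the optimizer chooses how to run it.
    rows = conn.execute(
        """
        SELECT s1.aid, s2.aid
        FROM friends AS f
        JOIN smokes AS s1 ON s1.p = f.x
        JOIN smokes AS s2 ON s2.p = f.y
        ORDER BY s1.aid, s2.aid
        """
    ).fetchall()
    conn.close()

    # The implication becomes the clause NOT smokes(x) OR smokes(y).
    return [(-a, b) for a, b in rows]


if __name__ == "__main__":
    people = ["anna", "bob", "carl"]
    friends = [("anna", "bob"), ("bob", "carl")]
    print(ground_clause(friends, people))
```

The point of the sketch is only that grounding reduces to relational joins; a real system would also handle clause weights, evidence on the query predicate, and much larger tables.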
Findings
Outperforms existing MLN implementations in speed and quality
Enables scalable inference on large real-world datasets
Introduces new partitioning and parallel algorithms for MLNs
Abstract
Markov Logic Networks (MLNs) have emerged as a powerful framework that combines statistical and logical reasoning; they have been applied to many data intensive problems including information extraction, entity resolution, and text mining. Current implementations of MLNs do not scale to large real-world data sets, which is preventing their wide-spread adoption. We present Tuffy that achieves scalability via three novel contributions: (1) a bottom-up approach to grounding that allows us to leverage the full power of the relational optimizer, (2) a novel hybrid architecture that allows us to perform AI-style local search efficiently using an RDBMS, and (3) a theoretical insight that shows when one can (exponentially) improve the efficiency of stochastic local search. We leverage (3) to build novel partitioning, loading, and parallel algorithms. We show that our approach outperforms…
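The abstract does not spell out how the partitioning works. As one plausible illustration, the sketch below assumes that partitions are the connected components of the ground clauses (clauses that share no atoms land in different groups), so that a local search could then process each group independently or in parallel. The function name partition_clauses and the signed-integer clause encoding are hypothetical, and clauses are assumed to be non-empty.

```python
def partition_clauses(clauses):
    """Group ground clauses into connected components.

    Each clause is a non-empty tuple of signed integer literals; abs(literal)
    is the atom id. Two clauses end up in the same group if they are linked
    by a chain of shared atoms.
    """
    parent = {}

    def find(atom):
        # Union-find lookup with path halving.
        parent.setdefault(atom, atom)
        while parent[atom] != atom:
            parent[atom] = parent[parent[atom]]
            atom = parent[atom]
        return atom

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Atoms that appear in the same clause belong to the same component.
    for clause in clauses:
        atoms = [abs(lit) for lit in clause]
        for atom in atoms[1:]:
            union(atoms[0], atom)

    # Bucket every clause under the root of its first atom.
    groups = {}
    for clause in clauses:
        groups.setdefault(find(abs(clause[0])), []).append(clause)
    return list(groups.values())


if __name__ == "__main__":
    print(partition_clauses([(-1, 2), (-2, 3), (4, -5)]))
```

Under this assumption, the search state of one group never affects the cost of another, which is why the groups can be loaded and searched separately.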
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Mining Algorithms and Applications · Data Management and Algorithms
