FactorBase: SQL for Learning A Multi-Relational Graphical Model
Oliver Schulte, Zhensong Qian

TL;DR
FactorBase is an SQL-based framework that learns multi-relational Bayesian network models directly inside a relational database, giving an integrated statistical analysis of the interdependent tables it stores.
Contribution
FactorBase is an SQL-based system for multi-relational model learning. Earlier systems such as BayesStore support inference with multi-relational models; FactorBase adds model discovery.
Findings
Supports learning a first-order Bayesian network for entire databases
Facilitates fast and scalable model discovery using SQL constructs
Demonstrates effectiveness on six benchmark databases
Abstract
We describe FactorBase, a new SQL-based framework that leverages a relational database management system to support multi-relational model discovery. A multi-relational statistical model provides an integrated analysis of the heterogeneous and interdependent data resources in the database. We adopt the BayesStore design philosophy: statistical models are stored and managed as first-class citizens inside a database. Whereas previous systems like BayesStore support multi-relational inference, FactorBase supports multi-relational learning. A case study on six benchmark databases evaluates how our system supports a challenging machine learning application, namely learning a first-order Bayesian network model for an entire database. Model learning in this setting has to examine a large number of potential statistical associations across data tables. Our implementation shows how the SQL…
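The idea of keeping both the sufficient statistics and the learned parameters as tables inside the database can be sketched with a toy example. The sketch below is an illustration only, not FactorBase's actual schema or code: the `Student`, `Course`, and `Registration` tables, the `grade_counts` and `grade_cpt` table names, and the choice of SQLite are all assumptions made for this example. It uses a join plus `GROUP BY` to count value combinations across tables, then a second query to normalize those counts into a conditional probability table for `grade` given its assumed parents `intelligence` and `difficulty`.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Student (s_id INTEGER PRIMARY KEY, intelligence TEXT);
        CREATE TABLE Course (c_id INTEGER PRIMARY KEY, difficulty TEXT);
        CREATE TABLE Registration (s_id INTEGER, c_id INTEGER, grade TEXT);
        INSERT INTO Student VALUES (1, 'high'), (2, 'low'), (3, 'high');
        INSERT INTO Course VALUES (10, 'hard'), (20, 'easy');
        INSERT INTO Registration VALUES
            (1, 10, 'A'), (1, 20, 'A'), (2, 10, 'C'),
            (3, 20, 'A'), (3, 10, 'B');
    """)
    return conn


def learn_counts(conn):
    """Store cross-table sufficient statistics as a table in the database."""
    conn.executescript("""
        DROP TABLE IF EXISTS grade_counts;
        CREATE TABLE grade_counts AS
        SELECT s.intelligence, c.difficulty, r.grade, COUNT(*) AS n
        FROM Registration r
        JOIN Student s ON s.s_id = r.s_id
        JOIN Course c ON c.c_id = r.c_id
        GROUP BY s.intelligence, c.difficulty, r.grade;
    """)


def learn_cpt(conn):
    """Normalize counts into P(grade | intelligence, difficulty), also as a table."""
    conn.executescript("""
        DROP TABLE IF EXISTS grade_cpt;
        CREATE TABLE grade_cpt AS
        SELECT g.intelligence, g.difficulty, g.grade,
               1.0 * g.n / t.total AS prob
        FROM grade_counts g
        JOIN (SELECT intelligence, difficulty, SUM(n) AS total
              FROM grade_counts
              GROUP BY intelligence, difficulty) t
          ON t.intelligence = g.intelligence
         AND t.difficulty = g.difficulty;
    """)


conn = build_demo_db()
learn_counts(conn)
learn_cpt(conn)

# Read the learned parameters back out of the database.
cpt = {
    (intelligence, difficulty, grade): prob
    for intelligence, difficulty, grade, prob in conn.execute(
        "SELECT intelligence, difficulty, grade, prob FROM grade_cpt"
    )
}
```

Because the counts and the parameters live in ordinary tables, later steps can query them with more SQL rather than exporting data to a separate tool. A structure learner would repeat this kind of counting query for many candidate parent sets, which is where the large number of potential cross-table associations mentioned in the abstract comes from.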
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Mining Algorithms and Applications · Advanced Database Systems and Queries
