Learning Tuple Probabilities
Maximilian Dylla, Martin Theobald

TL;DR
This paper introduces a method for learning tuple probabilities in probabilistic databases from labeled data, framing the task as the inverse of confidence computation, and evaluates it experimentally against existing techniques.
Contribution
It presents an approach for learning base tuple probabilities in PDBs by casting the task as an optimization problem, addressing a topic that is well studied in SRL but under-investigated for PDBs.
Findings
The proposed method learns base tuple probabilities from labeled lineage formulas.
Experimental results show competitive performance compared to SRL and optimization techniques.
The approach is validated on real-world and synthetic datasets.
Abstract
Learning the parameters of complex probabilistic-relational models from labeled training data is a standard technique in machine learning, which has been intensively studied in the subfield of Statistical Relational Learning (SRL), but, so far, this is still an under-investigated topic in the context of Probabilistic Databases (PDBs). In this paper, we focus on learning the probability values of base tuples in a PDB from labeled lineage formulas. The resulting learning problem can be viewed as the inverse problem to confidence computations in PDBs: given a set of labeled query answers, learn the probability values of the base tuples, such that the marginal probabilities of the query answers again yield the assigned probability labels. We analyze the learning problem from a theoretical perspective, cast it into an optimization problem, and provide an algorithm based on stochastic…
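The inverse problem described in the abstract can be illustrated with a small sketch. It assumes independent base tuples, computes the marginal probability of a lineage formula by enumerating possible worlds, and fits the tuple probabilities with a squared-error objective and plain projected gradient descent. The objective, the optimizer, the function names (`marginal`, `gradient`, `learn`), the tuple identifiers, and the labels are all illustrative assumptions, not the paper's algorithm or data.

```python
from itertools import product

def marginal(lineage, variables, probs):
    """Probability that a lineage formula holds, summed over all possible worlds.
    Assumes independent tuples; cost is exponential in len(variables)."""
    total = 0.0
    for world in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, world))
        if lineage(assignment):
            weight = 1.0
            for v in variables:
                weight *= probs[v] if assignment[v] else 1.0 - probs[v]
            total += weight
    return total

def gradient(examples, variables, probs):
    """Gradient of the squared-error loss sum((P(f) - label)^2).
    P(f) is linear in each tuple probability, so
    dP(f)/dp_v = P(f | v present) - P(f | v absent)."""
    grad = {v: 0.0 for v in variables}
    for lineage, label in examples:
        residual = marginal(lineage, variables, probs) - label
        for v in variables:
            present = marginal(lineage, variables, {**probs, v: 1.0})
            absent = marginal(lineage, variables, {**probs, v: 0.0})
            grad[v] += 2.0 * residual * (present - absent)
    return grad

def learn(examples, variables, steps=3000, lr=0.3):
    """Projected gradient descent (illustrative choice): step, then clip to [0, 1]."""
    probs = {v: 0.5 for v in variables}
    for _ in range(steps):
        grad = gradient(examples, variables, probs)
        for v in variables:
            probs[v] = min(1.0, max(0.0, probs[v] - lr * grad[v]))
    return probs

# Hypothetical base tuples and labeled lineage formulas (made up for illustration).
variables = ["t1", "t2", "t3"]
examples = [
    (lambda w: w["t1"] and w["t2"], 0.3),  # answer derived from t1 AND t2
    (lambda w: w["t1"] or w["t3"], 0.8),   # answer derived from t1 OR t3
    (lambda w: w["t2"], 0.5),              # answer derived from t2 alone
]

learned = learn(examples, variables)
for v in variables:
    print(v, round(learned[v], 3))
```

The labels in this toy instance were chosen to be consistent with the probabilities t1 = 0.6, t2 = 0.5, t3 = 0.5, so an exact fit exists. In general a set of labels may admit no exact solution or several, and possible-world enumeration does not scale beyond a handful of tuples.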
Taxonomy
Topics: Data Management and Algorithms · Bayesian Modeling and Causal Inference · Advanced Database Systems and Queries
