Incorporating Integrity Constraints in Uncertain Databases
Naveen Ashish, Sharad Mehrotra, Pouria Pirzadeh

TL;DR
This paper presents a method for incorporating integrity constraints into probabilistic databases, reducing uncertainty and improving answer quality without adding constraint-related complexity to query processing.
Contribution
It introduces an approach that maps an uncertain relation U with integrity constraints (ICs) to another uncertain relation U' approximating the consistent worlds of U, so that queries can be evaluated over U' with higher answer quality.
Findings
Effective reduction of uncertainty in large datasets
Improved query answer quality with integrity constraints
Scalable approach demonstrated on real-world data
Abstract
We develop an approach to incorporate additional knowledge, in the form of general-purpose integrity constraints (ICs), to reduce uncertainty in probabilistic databases. While incorporating ICs improves data quality (and hence the quality of answers to a query), it significantly complicates query processing. To overcome the additional complexity, we develop an approach to map an uncertain relation U with ICs to another uncertain relation U' that approximates the set of consistent worlds represented by U. Queries over U can instead be evaluated over U', achieving higher quality (due to reduced uncertainty in U') without additional complexity in query processing due to ICs. We demonstrate the effectiveness and scalability of our approach on large datasets with complex constraints. We also present experimental results demonstrating the utility of incorporating integrity constraints in…
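The idea of mapping U with ICs to an approximating relation U' can be illustrated with a small brute-force sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes a simple hypothetical model in which each tuple independently takes one of several alternative values, enumerates every possible world, keeps only the worlds that satisfy a constraint, renormalizes, and then reads off per-tuple marginals as U'. All names (`possible_worlds`, `condition_on_ic`, `approximate_relation`, the room example) are illustrative assumptions.

```python
from itertools import product


def possible_worlds(relation):
    """Yield (world, probability) pairs for an uncertain relation.

    Assumed model: `relation` maps a tuple key to {alternative: probability},
    and tuples choose their alternatives independently.
    """
    keys = list(relation)
    for choice in product(*(relation[k].items() for k in keys)):
        world = {k: value for k, (value, _) in zip(keys, choice)}
        prob = 1.0
        for _, p in choice:
            prob *= p
        yield world, prob


def condition_on_ic(relation, ic):
    """Keep only worlds satisfying the constraint `ic` and renormalize."""
    consistent = [(w, p) for w, p in possible_worlds(relation) if ic(w)]
    total = sum(p for _, p in consistent)
    if total == 0:
        raise ValueError("no possible world satisfies the constraint")
    return [(w, p / total) for w, p in consistent]


def approximate_relation(relation, ic):
    """Build U' as per-tuple marginals over the consistent worlds.

    Treating the marginals as independent again is an approximation:
    correlations introduced by the constraint are dropped.
    """
    marginals = {k: {v: 0.0 for v in alts} for k, alts in relation.items()}
    for world, p in condition_on_ic(relation, ic):
        for k, v in world.items():
            marginals[k][v] += p
    return marginals


# Hypothetical data: two people, each located in one of two rooms.
U = {
    "alice": {"room1": 0.9, "room2": 0.1},
    "bob": {"room1": 0.5, "room2": 0.5},
}


def distinct_rooms(world):
    """Example IC (an assumption): no two people share a room."""
    return len(set(world.values())) == len(world)


U_prime = approximate_relation(U, distinct_rooms)
```

In this toy example, Bob's location is a coin flip in U, but the constraint ties it to Alice's location, so U' assigns Bob a probability of about 0.9 for room2. Enumeration is exponential in the number of tuples, which is why a scalable method needs something other than this brute-force approach.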
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Data Quality and Management
