TL;DR
LinCQA introduces a linear-time method for consistent query answering on a class of acyclic queries, significantly improving efficiency over existing systems and extending Yannakakis's algorithm to inconsistent data.
Contribution
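The paper presents LinCQA, a system for consistent query answering that rewrites every query in a class of acyclic select-project-join (SPJ) queries into SQL or non-recursive Datalog with a linear-time guarantee. The rewriting generalizes Yannakakis's algorithm for acyclic joins to inconsistent data.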
Findings
LinCQA often outperforms existing CQA systems.
The method achieves linear time guarantees for the specified query class.
In some cases, the experimental speedups over existing CQA systems reach orders of magnitude.
Abstract
Data analytical pipelines often encounter the problem of querying inconsistent data that violate pre-determined integrity constraints. Data cleaning is an extensively studied paradigm that singles out a consistent repair of the inconsistent data. Consistent query answering (CQA) is an alternative approach to data cleaning that asks for all tuples guaranteed to be returned by a given query on all (in most cases, exponentially many) repairs of the inconsistent data. This paper identifies a class of acyclic select-project-join (SPJ) queries for which CQA can be solved via SQL rewriting with a linear time guarantee. Our rewriting method can be viewed as a generalization of Yannakakis's algorithm for acyclic joins to the inconsistent setting. We present LinCQA, a system that can output rewritings in both SQL and non-recursive Datalog rules for every query in this class. We show that…
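To make the definition of a consistent answer concrete, here is a minimal brute-force sketch in Python. It is not LinCQA's rewriting: it enumerates every repair explicitly, so it takes exponential time, which is the cost the paper's linear-time rewriting is meant to avoid. The sketch assumes the integrity constraints are primary keys and that each key is a prefix of the row. The relation names, the sample data, and the helper functions are hypothetical.

```python
from itertools import product


def blocks(rows, key_len):
    """Group rows that agree on their first key_len attributes (one block per key value)."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[:key_len], []).append(row)
    return list(grouped.values())


def repairs(db, key_lens):
    """Yield every repair: keep exactly one row from each block of each relation."""
    names = sorted(db)
    per_relation = []
    for name in names:
        relation_blocks = blocks(db[name], key_lens[name])
        # One candidate relation instance per way of picking a row from each block.
        per_relation.append([set(choice) for choice in product(*relation_blocks)])
    for combo in product(*per_relation):
        yield dict(zip(names, combo))


def consistent_answers(db, key_lens, query):
    """Return the tuples that the query produces on every repair (brute force)."""
    result = None
    for repair in repairs(db, key_lens):
        answers = query(repair)
        result = answers if result is None else result & answers
    return result or set()


# Hypothetical inconsistent database: "alice" and "hr" each violate their primary key.
db = {
    "Employee": [("alice", "sales"), ("alice", "hr"), ("bob", "sales")],  # key: name
    "Dept": [("sales", "Paris"), ("hr", "Paris"), ("hr", "Rome")],        # key: dept
}
key_lens = {"Employee": 1, "Dept": 1}


def paris_employees(repair):
    """Names of employees whose department is located in Paris."""
    return {
        name
        for (name, dept) in repair["Employee"]
        for (dept2, city) in repair["Dept"]
        if dept == dept2 and city == "Paris"
    }


print(consistent_answers(db, key_lens, paris_employees))  # prints {'bob'}
```

This database has four repairs. "bob" is returned on all of them, while "alice" is missing from the repair that keeps ("alice", "hr") and ("hr", "Rome"), so only "bob" is a consistent answer.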
