Matching Dependencies with Arbitrary Attribute Values: Semantics, Query Answering and Integrity Constraints
Jaffer Gardezi, Leopoldo Bertossi, Iluju Kiringa

TL;DR
This paper studies the semantics, properties, and computational complexity of enforcing matching dependencies (MDs) in the pure case, where any value from the data domain can serve as the common value that does the matching. It also covers query answering over the cleaned data and connections to database repairs.
Contribution
It characterizes the clean instances and the clean answers to queries, analyzes the complexity of computing both, and links matching dependencies with database repairs under integrity constraints.
Findings
Identifies tractable and intractable cases for data cleaning
Defines invariant 'clean answers' to queries after MD enforcement
Establishes connections between MDs and database repairs
Abstract
Matching dependencies (MDs) were introduced to specify the identification or matching of certain attribute values in pairs of database tuples when some similarity conditions are satisfied. Their enforcement can be seen as a natural generalization of entity resolution. In what we call the "pure case" of MDs, any value from the underlying data domain can be used for the value in common that does the matching. We investigate the semantics and properties of data cleaning through the enforcement of matching dependencies for the pure case. We characterize the intended clean instances and also the "clean answers" to queries as those that are invariant under the cleaning process. The complexity of computing clean instances and clean answers to queries is investigated. Tractable and intractable cases depending on the MDs and queries are identified. Finally, we establish connections with database…
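To make the abstract's central notions concrete, here is a minimal sketch of one MD enforcement step in the pure case, followed by a query answer that is invariant across two possible clean instances. The relation, the attribute names (name, addr), the string-similarity threshold, and the helpers similar and enforce_once are all illustrative assumptions and do not come from the paper. The sketch also ignores any minimality conditions the paper places on changes, and it compares only two candidate clean instances, whereas the pure case allows any domain value as the common value.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar(a, b, threshold=0.8):
    """Illustrative similarity relation; the threshold is an assumption."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def enforce_once(rows, sim_attr, match_attr, choose):
    """One enforcement step for a hypothetical MD: rows whose sim_attr
    values are similar must agree on match_attr afterwards."""
    # Group rows that are connected by similarity in the ORIGINAL instance.
    group = list(range(len(rows)))  # group[i] = label of row i's group
    for i, j in combinations(range(len(rows)), 2):
        if similar(rows[i][sim_attr], rows[j][sim_attr]):
            old, new = group[j], group[i]
            group = [new if g == old else g for g in group]

    cleaned = [dict(r) for r in rows]  # leave the input untouched
    for label in set(group):
        members = [i for i, g in enumerate(group) if g == label]
        values = [rows[i][match_attr] for i in members]
        if len(set(values)) > 1:
            # Pure case: the common value is arbitrary; `choose` picks one.
            common = choose(values)
            for i in members:
                cleaned[i][match_attr] = common
    return cleaned


people = [
    {"name": "John Smith", "addr": "10 Main St"},
    {"name": "Jon Smith", "addr": "10 Main Street"},
    {"name": "Alice Jones", "addr": "5 Oak Ave"},
]

# Two different choices of common value give two different clean instances.
instance_a = enforce_once(people, "name", "addr", choose=min)
instance_b = enforce_once(people, "name", "addr", choose=max)


def addresses(rows):
    """Toy query: the set of addresses in the instance."""
    return {r["addr"] for r in rows}


# Answers that survive in every considered clean instance.
invariant = addresses(instance_a) & addresses(instance_b)
```

In this toy run the two Smith tuples are matched, but their shared address depends on the arbitrary choice, so only Alice Jones's address appears in both cleaned instances. Because the single MD here does not modify the attribute used for similarity, one step is enough; with several interacting MDs the step would presumably have to be repeated until no violation remains.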
Taxonomy
Topics: Data Quality and Management · Advanced Database Systems and Queries · Semantic Web and Ontologies
