Controlling Entity Integrity with Key Sets
Miika Hannula, Xinyi Li, Sebastian Link

TL;DR
This paper introduces efficient algorithms and theoretical foundations for managing key sets to control entity integrity in databases, addressing practical limitations of primary keys.
Contribution
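It presents a linear-time algorithm for validating whether a key set holds on a data set, a binary axiomatization for the implication problem of key sets (shown to be coNP-complete), and a quadratic-time decision procedure for the implication of unary key sets by arbitrary key sets.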
Findings
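Linear-time algorithm for validating a key set on a data set, with efficiency demonstrated on real-world data.
Binary axiomatization for the implication problem of key sets, which is coNP-complete.
Unary axiomatization and quadratic-time decision procedure for the implication of unary key sets by arbitrary key sets.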
Abstract
Codd's rule of entity integrity stipulates that every table has a primary key. Hence, the attributes of the primary key carry unique and complete value combinations. In practice, data cannot always meet such requirements. Previous work proposed the superior notion of key sets for controlling entity integrity. We establish a linear-time algorithm for validating whether a given key set holds on a given data set, and demonstrate its efficiency on real-world data. We establish a binary axiomatization for the associated implication problem, and prove its coNP-completeness. However, the implication of unary by arbitrary key sets has better properties. The fragment enjoys a unary axiomatization and is decidable in quadratic time. Hence, we can minimize overheads before validating key sets. While perfect models do not always exist in general, we show how to compute them for any instance of our…
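To make the validation problem concrete, here is a minimal Python sketch. It assumes the usual definition of a key set, which this page does not spell out: a collection of attribute sets is satisfied when every pair of distinct rows is separated by at least one member key on which both rows are complete (no missing values) and differ. The names `separates` and `satisfies_key_set` and the sample data are hypothetical. The check compares all pairs of rows, so it is quadratic in the number of rows; it is not the paper's linear-time algorithm.

```python
from itertools import combinations
from typing import Hashable, Iterable, Mapping, Optional, Sequence

# Assumption: a row maps attribute names to values; None marks a missing value.
Row = Mapping[str, Optional[Hashable]]


def separates(key: Iterable[str], t1: Row, t2: Row) -> bool:
    """True if both rows are complete on `key` and differ on some attribute of it."""
    attrs = list(key)
    total = all(t1[a] is not None and t2[a] is not None for a in attrs)
    return total and any(t1[a] != t2[a] for a in attrs)


def satisfies_key_set(rows: Sequence[Row], key_set: Iterable[Iterable[str]]) -> bool:
    """Naive pairwise check (illustrative only, quadratic in the number of rows).

    Every pair of rows must be separated by at least one key in the key set.
    Two identical rows can never be separated, so duplicates count as violations.
    """
    keys = [tuple(k) for k in key_set]
    return all(
        any(separates(k, t1, t2) for k in keys)
        for t1, t2 in combinations(rows, 2)
    )


# Hypothetical data: no single attribute set works as a primary key here.
rows = [
    {"email": "a@example.org", "phone": "1"},
    {"email": "b@example.org", "phone": None},
    {"email": "a@example.org", "phone": "2"},
]

print(satisfies_key_set(rows, [["email"], ["phone"]]))  # key set {{email}, {phone}}
print(satisfies_key_set(rows, [["email"]]))             # single key {email}
print(satisfies_key_set(rows, [["email", "phone"]]))    # composite key {email, phone}
```

In this sample the key set {{email}, {phone}} holds although neither {email}, {phone}, nor the composite {email, phone} qualifies as a key on its own: different pairs of rows are told apart by different keys, which is the flexibility the abstract contrasts with primary keys.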
Taxonomy
Topics: Advanced Database Systems and Queries · Semantic Web and Ontologies · Data Quality and Management
