Controlling Entity Integrity with Key Sets

Miika Hannula; Xinyi Li; Sebastian Link

arXiv:2101.02472·cs.DB·January 8, 2021

TL;DR

This paper develops the algorithmic and theoretical foundations of key sets, which control entity integrity on data that cannot satisfy the uniqueness and completeness requirements of a primary key.

Contribution

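It presents a linear-time algorithm for validating a key set on a given data set, a binary axiomatization of the key set implication problem (shown to be coNP-complete), and a quadratic-time decision procedure for the implication of unary key sets by arbitrary key sets.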

Findings

01 Linear-time algorithm for validating whether a key set holds on a data set.

02 Binary axiomatization of the implication problem, which is coNP-complete.

03 Quadratic-time decision procedure for the implication of unary key sets by arbitrary key sets.

Abstract

Codd's rule of entity integrity stipulates that every table has a primary key. Hence, the attributes of the primary key carry unique and complete value combinations. In practice, data cannot always meet such requirements. Previous work proposed the superior notion of key sets for controlling entity integrity. We establish a linear-time algorithm for validating whether a given key set holds on a given data set, and demonstrate its efficiency on real-world data. We establish a binary axiomatization for the associated implication problem, and prove its coNP-completeness. However, the implication of unary key sets by arbitrary key sets has better properties. The fragment enjoys a unary axiomatization and is decidable in quadratic time. Hence, we can minimize overheads before validating key sets. While perfect models do not always exist in general, we show how to compute them for any instance of our…
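To make the notion concrete, here is a minimal Python sketch. It assumes the definition of key sets from prior work: a key set is a collection of attribute sets, and a table satisfies it when every pair of distinct rows is separated by at least one key in the set, meaning both rows have no missing value on that key and differ on some attribute of it. The names `separates` and `satisfies_key_set` and the sample rows are illustrative assumptions. The pairwise check is a naive baseline, not the linear-time algorithm the paper establishes.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Optional, Set

# A row maps attribute names to values; None stands in for a missing value.
Row = Dict[str, Optional[object]]


def separates(key: Set[str], t1: Row, t2: Row) -> bool:
    """True if both rows are complete on `key` and differ on some attribute of it."""
    complete = all(t1[a] is not None and t2[a] is not None for a in key)
    return complete and any(t1[a] != t2[a] for a in key)


def satisfies_key_set(rows: List[Row], key_set: Iterable[Set[str]]) -> bool:
    """Naive pairwise check (quadratic in the number of rows).

    Every pair of rows at distinct positions must be separated by at
    least one key of the key set.
    """
    keys = list(key_set)
    return all(
        any(separates(key, t1, t2) for key in keys)
        for t1, t2 in combinations(rows, 2)
    )


# Hypothetical sample data: no single key works, but the key set does.
rows: List[Row] = [
    {"a": 1, "b": None},
    {"a": 2, "b": 7},
    {"a": 2, "b": 8},
]

print(satisfies_key_set(rows, [{"a"}, {"b"}]))  # key set {{a}, {b}}
print(satisfies_key_set(rows, [{"a"}]))         # single key {a}
print(satisfies_key_set(rows, [{"a", "b"}]))    # single key {a, b}
```

In this sample, the key `{a}` fails because the last two rows agree on `a`, and `{a, b}` fails because the first row is incomplete on `b`, yet the key set `{{a}, {b}}` separates every pair of rows. This is the sense in which key sets can control entity integrity where a primary key cannot.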

Taxonomy

Topics: Advanced Database Systems and Queries · Semantic Web and Ontologies · Data Quality and Management