Discovery of Approximate Differential Dependencies

Jixue Liu; Selasi Kwashie; Jiuyong Li; Feiyue Ye; Millist Vincent

arXiv:1309.3733 · cs.DB · September 17, 2013 · 1 cites

Open Access

TL;DR

This paper introduces an algorithm for discovering differential dependencies (DDs) in data, which generalize functional dependencies by considering value distances, and analyzes how sampling impacts DD discovery.

Contribution

The paper proposes a novel algorithm for discovering DDs and provides a comprehensive analysis of their properties and the effects of sampling on DD discovery.

Findings

01 The algorithm effectively discovers DDs from data.

02 Sampling influences the accuracy of DD discovery.

03 Properties of DDs are characterized and analyzed.

Abstract

Differential dependencies (DDs) capture relationships between the columns of a relation. They are more general than functional dependencies (FDs): a DD is defined on the distances between the values of two tuples, not directly on the values themselves. Because of this difference, algorithms for discovering FDs from data find only special cases of DDs, not all DDs, and are therefore not applicable to DD discovery. In this paper, we propose an algorithm that discovers DDs from data by fixing the left-hand side of a candidate DD and then determining the right-hand side. We also show some properties of DDs and conduct a comprehensive analysis of how sampling affects the DDs discovered from data.
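To make the distance-based definition concrete, here is a minimal Python sketch. It is an illustration only, not the algorithm proposed in the paper. It assumes numeric attributes, absolute difference as the distance, and closed intervals (lo, hi) as distance constraints. The names `holds` and `tightest_rhs` and the toy `rows` data are hypothetical. The sketch checks whether a DD holds on a relation, then fixes a left-hand side and reads off the tightest right-hand-side interval the data supports.

```python
from itertools import combinations

# Toy relation (hypothetical data): each tuple is a dict of numeric attributes.
rows = [
    {"day": 1, "price": 100},
    {"day": 2, "price": 105},
    {"day": 3, "price": 112},
    {"day": 10, "price": 180},
]


def _within(t1, t2, constraints):
    """True if, for every attribute, the distance |t1[a] - t2[a]| lies in [lo, hi]."""
    return all(lo <= abs(t1[a] - t2[a]) <= hi for a, (lo, hi) in constraints.items())


def holds(relation, lhs, rhs):
    """Check a DD lhs -> rhs: every tuple pair whose distances satisfy lhs
    must also have distances satisfying rhs."""
    for t1, t2 in combinations(relation, 2):
        if _within(t1, t2, lhs) and not _within(t1, t2, rhs):
            return False
    return True


def tightest_rhs(relation, lhs, attr):
    """Fix the left-hand side, then return the smallest interval (min, max) of
    distances on attr over all tuple pairs that satisfy lhs.
    Returns None if no pair satisfies lhs."""
    dists = [
        abs(t1[attr] - t2[attr])
        for t1, t2 in combinations(relation, 2)
        if _within(t1, t2, lhs)
    ]
    return (min(dists), max(dists)) if dists else None


lhs = {"day": (0, 2)}                       # tuples at most 2 days apart
full = tightest_rhs(rows, lhs, "price")     # interval supported by all rows
sample = tightest_rhs(rows[:2], lhs, "price")  # interval seen in a 2-row sample

print(full, holds(rows, lhs, {"price": full}))
print(sample, holds(rows, lhs, {"price": sample}))
```

On this toy data the full relation supports the interval (5, 12) for price, while the two-row sample yields the narrower (5, 5), which does not hold on the full relation. A sample sees only a subset of the tuple pairs, so a right-hand side derived from it can be too tight, which is one simple way sampling can affect the DDs that are discovered. An FD corresponds to the special case where every interval on both sides is (0, 0).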

Taxonomy

Topics: Data Quality and Management · Advanced Database Systems and Queries · Data Mining Algorithms and Applications