ABC of Order Dependencies
Pei Li, Michael Bohlen, Jaroslaw Szlichta, Divesh Srivastava

TL;DR
This paper introduces approximate band conditional order dependencies (abcODs) to model attribute relationships with small variations, and presents efficient algorithms for their discovery, significantly improving runtime and applicability in real-world data quality tasks.
Contribution
The paper proposes more efficient algorithms for discovering approximate band order dependencies (abODs) and abcODs, and extends the discovery framework to bidirectional dependencies.
Findings
New O(n log n) algorithm for longest monotonic band discovery.
Efficient O(n^3 log n) algorithm for abcOD discovery considering codependencies.
Experimental validation on real-world and synthetic datasets.
Abstract
We enhance constraint-based data quality with approximate band conditional order dependencies (abcODs). Band ODs model the semantics of attributes that are monotonically related with small variations without there being an intrinsic violation of semantics. The class of abcODs generalizes band ODs to make them more relevant to real-world applications by relaxing them to hold approximately (abODs) with some exceptions and conditionally (bcODs) on subsets of the data. We study the problem of automatic dependency discovery over a hierarchy of abcODs. First, we propose a more efficient algorithm to discover abODs than in recent prior work. The algorithm is based on a new optimization to compute a longest monotonic band (longest subsequence of tuples that satisfy a band OD) through dynamic programming, decreasing the runtime from O(n^2) to O(n log n). We then illustrate that while…
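To make the notion of a longest monotonic band concrete, here is a minimal sketch under a simplified, assumed definition: a sequence of values is an increasing band of width delta if every value is at least the running maximum of the values before it, minus delta. The function name, the parameter names, and this definition are illustrative assumptions, not taken from the paper. The sketch is a plain quadratic dynamic program; it is not the O(n log n) algorithm the abstract describes.

```python
def longest_increasing_band_length(values, delta):
    """Length of the longest subsequence in which every element is
    >= (maximum of the earlier chosen elements) - delta.

    Assumed, simplified band definition; quadratic-time illustration only.
    """
    n = len(values)
    if n == 0:
        return 0

    # best[i]: longest band subsequence ending at i in which values[i]
    # is the running maximum (ties allowed).
    best = [1] * n
    answer = 0

    for j in range(n):
        # best[j] is final here: it is only updated from indices < j.
        in_band = 0  # later elements lying within [values[j] - delta, values[j]]
        for i in range(j + 1, n):
            if values[i] >= values[j]:
                # values[i] may become the next running maximum after j,
                # keeping every in-band element seen between j and i.
                best[i] = max(best[i], best[j] + in_band + 1)
            if values[j] - delta <= values[i] <= values[j]:
                in_band += 1
        # Close the band with j as the last running maximum.
        answer = max(answer, best[j] + in_band)

    return answer


if __name__ == "__main__":
    # One longest band for delta = 1 is 1, 3, 4, 6, 5.
    print(longest_increasing_band_length([1, 5, 3, 4, 2, 6, 5], 1))  # 5
```

With delta = 0 the function reduces to the longest non-decreasing subsequence, which is a quick sanity check on the definition used here.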
Taxonomy
Topics: Data Quality and Management · Advanced Database Systems and Queries · Data Management and Algorithms
