Inference of Common Multidimensional Equally-Distributed Attributes
Alejandro Alvarez-Ayllon, Manuel Palomo-Duarte, Juan-Manuel Dodero

TL;DR
This paper introduces a statistical method for inferring common multidimensional attributes between relations, especially useful when metadata is incomplete or data volume is large, by replacing set-theoretic rules with a hierarchy of null hypotheses.
Contribution
It formalizes the impact of statistical tests on inclusion dependency inference and proposes a novel hierarchy-based approach for discovering equally distributed attribute sets.
Findings
Statistical tests can replace set-theoretic rules, though with limitations.
A hierarchy of null hypotheses enables multi-dimensional attribute discovery.
The method is effective on incomplete or large-scale data samples.
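The hierarchy idea in the findings above can be illustrated with a level-wise candidate search. The following is a minimal sketch, not the paper's algorithm: it assumes an Apriori-style rule, analogous to level-wise IND discovery, in which a (k+1)-dimensional candidate is tested only if none of its k-dimensional projections was rejected. The function name `next_level` and the attribute names are hypothetical.

```python
from itertools import combinations

def next_level(accepted):
    """Build (k+1)-dimensional candidates from k-dimensional matches.

    `accepted` is a set of frozensets of (left_attr, right_attr) pairs whose
    null hypothesis ("equally distributed") was not rejected at level k.
    """
    candidates = set()
    for x, y in combinations(accepted, 2):
        union = x | y
        if len(union) != len(x) + 1:
            continue  # the two matches must differ in exactly one pair
        lefts = [left for left, _ in union]
        rights = [right for _, right in union]
        if len(set(lefts)) != len(lefts) or len(set(rights)) != len(rights):
            continue  # an attribute may appear only once on each side
        # Keep the candidate only if every k-dimensional projection survived.
        if all(frozenset(s) in accepted for s in combinations(union, len(x))):
            candidates.add(union)
    return candidates

# Hypothetical unary matches between two relations.
unary = {
    frozenset({("ra", "alpha")}),
    frozenset({("dec", "delta")}),
    frozenset({("ra", "lon")}),
}
for candidate in sorted(next_level(unary), key=sorted):
    print(sorted(candidate))
```

Each surviving candidate would then need its own multidimensional statistical test; the sketch covers only the pruning step.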
Abstract
Given two relations containing multiple measurements (possibly with uncertainties), our objective is to find which sets of attributes from the first have a corresponding set in the second, using only a sample of the data. This approach can be used even when the associated metadata is damaged, missing, or incomplete, or when the volume is too large for exact methods. The problem is similar to the search for Inclusion Dependencies (INDs), a type of rule over two relations asserting that, for a set of attributes X from the first, every combination of values appears in a set Y from the second. Existing INDs can be found by exploiting a partial order relation called specialization. However, this relation is based on set theory and requires the values to be directly comparable. Statistical tests are an intuitive possible replacement, but it has not been studied how they would…
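To make the idea of replacing a set-theoretic check with a statistical test concrete, here is a minimal sketch for the one-dimensional case. It assumes a two-sample Kolmogorov-Smirnov test with an asymptotic p-value; the abstract as shown here does not say which test the authors use, so the choice of test, the function names, and the significance level `alpha` are all illustrative assumptions.

```python
import math

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_pvalue(d, n, m):
    """Asymptotic Kolmogorov p-value; an approximation suited to larger samples."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.3:
        return 1.0  # the series converges slowly here and the value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))

def not_rejected(a, b, alpha=0.05):
    """True if H0 'a and b are equally distributed' is not rejected."""
    a, b = list(a), list(b)
    return ks_pvalue(ks_statistic(a, b), len(a), len(b)) >= alpha

if __name__ == "__main__":
    sample = list(range(100))
    print(not_rejected(sample, sample))                # identical samples
    print(not_rejected(sample, list(range(50, 150))))  # shifted sample
```

Unlike a set-inclusion check, a test like this can fail to reject two attributes that are not truly related, or reject two that are, which is one reason the substitution comes with limitations.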
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Semantic Web and Ontologies · Advanced Database Systems and Queries
