An epistemic approach to model uncertainty in data-graphs
Sergio Abriola, Santiago Cifuentes, María Vanina Martínez, Nina Pardal, Edwin Pin

TL;DR
This paper introduces a probabilistic framework for modeling and addressing uncertainties and errors in graph databases, focusing on data cleaning and query answering under noisy observations.
Contribution
It extends the concept of probabilistic unclean databases to graph data, defining new computational problems and analyzing their complexity.
Findings
Defined probabilistic models for unclean graph databases.
Formulated data cleaning and query answering problems.
Analyzed computational complexity for different error types.
Abstract
Graph databases are becoming widely successful as data models that can effectively represent and process complex relationships among various types of data. As with any other type of data repository, graph databases may suffer from errors and discrepancies with respect to the real-world data they intend to represent. In this work we explore the notion of probabilistic unclean graph databases, previously proposed for relational databases, in order to capture the idea that the observed (unclean) graph database is actually the noisy version of a clean one that correctly models the world but that we know only partially. As the factors that may be involved in the observation can be many, e.g., all different types of clerical errors or unintended transformations of the data, we assume a probabilistic model that describes the distribution over all possible ways in which the clean (uncertain)…
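To make the idea in the abstract concrete, here is a minimal Python sketch of an observed graph treated as a noisy version of an unknown clean one. It is a toy illustration under stated assumptions, not the paper's formalism: the uniform prior, the deletion-only noise model, the parameter `p_drop`, and all function names are hypothetical choices made for this example.

```python
from itertools import combinations


def subsets(edges):
    """Yield every subset of `edges` as a frozenset (candidate clean graphs)."""
    edges = list(edges)
    for k in range(len(edges) + 1):
        for combo in combinations(edges, k):
            yield frozenset(combo)


def noise_likelihood(clean, observed, p_drop):
    """P(observed | clean) under a hypothetical deletion-only noise model.

    Assumption: each clean edge is dropped independently with probability
    p_drop, and no edge is ever invented by the noise.
    """
    if not observed <= clean:
        return 0.0
    kept = len(observed)
    dropped = len(clean) - kept
    return (1 - p_drop) ** kept * p_drop ** dropped


def posterior(prior, observed, p_drop):
    """Apply Bayes' rule over a finite set of candidate clean graphs."""
    joint = {g: pr * noise_likelihood(g, observed, p_drop)
             for g, pr in prior.items()}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("observed graph is impossible under this model")
    return {g: w / total for g, w in joint.items()}


# Edges are (source node, label, target node) triples; the data is made up.
e1 = ("alice", "knows", "bob")
e2 = ("bob", "knows", "carol")

# Assumed uniform prior over all subgraphs of a two-edge universe.
candidates = list(subsets([e1, e2]))
prior = {g: 1 / len(candidates) for g in candidates}

observed = frozenset({e1})
post = posterior(prior, observed, p_drop=0.2)

# "Data cleaning" in this toy setting: the most probable clean graph.
most_likely = max(post, key=post.get)

# "Query answering" in this toy setting: probability that e2 is a clean edge.
prob_e2 = sum(p for g, p in post.items() if e2 in g)
```

In this sketch, cleaning amounts to picking the candidate with the highest posterior, and query answering sums posterior mass over the candidates that satisfy the query. Enumerating every candidate is only feasible for tiny graphs.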
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Database Systems and Queries · Data Quality and Management
