Querying Incomplete Numerical Data: Between Certain and Possible Answers
Marco Console, Leonid Libkin, Liat Peterfreund

TL;DR
This paper develops a probabilistic framework for handling incomplete numerical data in databases, enabling meaningful query answers with arithmetic and aggregation by modeling missing values with probability distributions.
Contribution
It introduces a principled probabilistic model for numerical nulls, providing a compositional query answering framework with efficient approximation algorithms.
Findings
Queries with arithmetic and aggregation are measurable in this model, and their outputs have finite representations.
Probability intervals offer meaningful answers where classical methods fail.
Approximation algorithms efficiently estimate probabilities for numerical query answers.
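To make the idea of interval probabilities concrete, here is a minimal sketch, assuming plain Monte Carlo sampling rather than the paper's own approximation algorithms, which this summary does not describe. The names `Null`, `instantiate`, `estimate_interval_probability`, and the toy salary table are all hypothetical. Each trial draws one value per marked null, so a null that repeats in several tuples receives the same value everywhere, and the estimate is the fraction of trials in which the aggregate falls inside the interval.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Null:
    """A marked null (hypothetical representation); equal names denote the same unknown value."""
    name: str


def instantiate(rows, valuation):
    """Replace every marked null in the rows by the value the valuation assigns to it."""
    return [
        tuple(valuation[v.name] if isinstance(v, Null) else v for v in row)
        for row in rows
    ]


def estimate_interval_probability(rows, samplers, query, low, high,
                                  trials=20000, seed=0):
    """Estimate P(low <= query(world) <= high) by naive Monte Carlo sampling.

    This is an illustrative stand-in, not the algorithm from the paper.
    `samplers` maps each null name to a function that draws one value
    from that null's distribution using the supplied random generator.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One draw per null per trial keeps repeated nulls correlated.
        valuation = {name: draw(rng) for name, draw in samplers.items()}
        if low <= query(instantiate(rows, valuation)) <= high:
            hits += 1
    return hits / trials


def total_salary(world):
    """Aggregate query: sum of the second column of a complete table."""
    return sum(salary for _, salary in world)


# Toy table: two salaries are unknown but known to be equal (same null x).
rows = [("ann", 50.0), ("bob", Null("x")), ("eve", Null("x"))]
# Assumed distribution for x: uniform on [40, 60].
samplers = {"x": lambda rng: rng.uniform(40.0, 60.0)}

if __name__ == "__main__":
    estimate = estimate_interval_probability(rows, samplers, total_salary,
                                             150.0, 170.0)
    print(f"estimated P(150 <= SUM(salary) <= 170) = {estimate:.3f}")
```

In this toy table the sum equals 50 + 2x, so it lies in [150, 170] exactly when x is at least 50, and the true probability is 0.5; the estimate should land close to that value. A certain answer would be empty here and the set of possible answers is infinite, which is the gap the interval probability is meant to fill.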
Abstract
Queries with aggregation and arithmetic operations, as well as incomplete data, are common in real-world databases, but we lack a good understanding of how they should interact. On the one hand, systems based on SQL provide ad-hoc rules for numerical nulls; on the other, theoretical research largely concentrates on the standard notions of certain and possible answers. In the presence of numerical attributes and aggregates, however, these answers are often meaningless, returning either too little or too much. Our goal is to define a principled framework for databases with numerical nulls and for answering queries with arithmetic and aggregation over them. Towards this goal, we assume that missing values in numerical attributes are given by probability distributions associated with marked nulls. This yields a model of probabilistic bag databases in which tuples are not necessarily…
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Logic, Reasoning, and Knowledge
