A Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases
Thomas Bernecker, Tobias Emrich, Hans-Peter Kriegel, Nikos Mamoulis, Matthias Renz, Andreas Zuefle

TL;DR
This paper introduces a new probabilistic pruning method that efficiently speeds up similarity queries in uncertain databases by estimating domination counts with tight probability bounds, supporting complex uncertainty models.
Contribution
It presents a novel geometric pruning filter and an iterative filter-refinement strategy for probabilistic domination count estimation in uncertain data.
Findings
Enables fast computation of probability bounds for domination counts
Supports general continuous probabilistic models including correlations
Effective on large uncertain databases
Abstract
In this paper, we propose a novel, effective and efficient probabilistic pruning criterion for probabilistic similarity queries on uncertain data. Our approach supports a general uncertainty model using continuous probability density functions to describe the (possibly correlated) uncertain attributes of objects. In a nutshell, the problem to be solved is to compute the PDF of the random variable denoted by the probabilistic domination count: Given an uncertain database object B, an uncertain reference object R and a set D of uncertain database objects in a multi-dimensional space, the probabilistic domination count denotes the number of uncertain objects in D that are closer to R than B is. This domination count can be used to answer a wide range of probabilistic similarity queries. Specifically, we propose a novel geometric pruning filter and introduce an iterative filter-refinement…
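To make the definition of the domination count concrete, the sketch below estimates its distribution by brute-force Monte Carlo sampling. This is not the paper's geometric pruning filter; it is a naive baseline of the kind such a filter is meant to avoid. The function names, the independent Gaussian uncertainty model, and the parameters `sigma`, `trials`, and `seed` are all assumptions made for illustration. Because the count is discrete, the result is a probability mass function over the values 0..|D|.

```python
import random
from collections import Counter

def sample_location(mean, sigma, rng):
    # Assumed uncertainty model: independent Gaussian noise per dimension.
    return tuple(rng.gauss(m, sigma) for m in mean)

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def domination_count_pmf(b, r, d, sigma=0.5, trials=20000, seed=0):
    """Estimate P(count = k) for k = 0..len(d), where count is the number
    of objects in d that are closer to r than b is."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        # Draw one possible world: a concrete location for every object.
        r_loc = sample_location(r, sigma, rng)
        b_dist = squared_distance(sample_location(b, sigma, rng), r_loc)
        k = sum(
            1
            for a in d
            if squared_distance(sample_location(a, sigma, rng), r_loc) < b_dist
        )
        counts[k] += 1
    return [counts[k] / trials for k in range(len(d) + 1)]

# Hypothetical 2-D example: B, R, and three database objects given by their means.
pmf = domination_count_pmf(
    b=(2.0, 0.0),
    r=(0.0, 0.0),
    d=[(0.5, 0.0), (2.0, 0.5), (6.0, 0.0)],
)
print(pmf)
```

Each trial samples every object, so the cost grows with both the number of trials and |D|. Bounding the count geometrically, as the abstract describes, is aimed at avoiding exactly this kind of exhaustive sampling.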
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Constraint Satisfaction and Optimization
