Scalable Probabilistic Similarity Ranking in Uncertain Databases (Technical Report)

Thomas Bernecker, Hans-Peter Kriegel, Nikos Mamoulis, Matthias Renz, and Andreas Zuefle

arXiv:0907.2868·cs.DB·July 17, 2009

TL;DR

This paper presents a scalable, linear-time framework for probabilistic top-k similarity ranking in uncertain vector data, significantly improving efficiency over previous quadratic approaches.

Contribution

It introduces an incremental algorithm that computes rank probabilities in linear time, down from the quadratic time of earlier dynamic-programming approaches, while maintaining accuracy.

Findings

01 Achieves linear-time complexity for probabilistic ranking

02 Demonstrates efficiency on synthetic and real datasets

03 Maintains the same memory requirements as previous methods

Abstract

This paper introduces a scalable approach for probabilistic top-k similarity ranking on uncertain vector data. Each uncertain object is represented by a set of vector instances that are assumed to be mutually exclusive. The objective is to rank the uncertain data according to their distance to a reference object. We propose a framework that incrementally computes, for each object instance and ranking position, the probability of the object falling at that ranking position. The resulting rank probability distribution can serve as input for several state-of-the-art probabilistic ranking models. Existing approaches compute this probability distribution with a dynamic programming approach of quadratic complexity. In this paper, we show both theoretically and experimentally that our framework reduces this to linear-time complexity while having the same memory requirements,…
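To make the rank probability distribution concrete, here is a minimal Python sketch of the plain dynamic-programming recurrence that the abstract attributes to existing approaches. It is not the paper's linear-time framework. The function name `rank_distributions`, the `(object_id, distance, probability)` tuple layout, and the assumptions that objects are independent of each other and that no two instances have the same distance are all illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def rank_distributions(instances, k):
    """Return {object_id: [P(rank 1), ..., P(rank k)]}.

    instances: iterable of (object_id, distance, probability) tuples
    (hypothetical layout). Instances of one object are mutually
    exclusive; different objects are assumed independent; distances
    to the reference object are assumed to be distinct.
    """
    ordered = sorted(instances, key=lambda t: t[1])
    # closer[o] = probability that object o lies closer to the reference
    # than the instance currently being processed.
    closer = defaultdict(float)
    result = defaultdict(lambda: [0.0] * k)

    for obj, _dist, prob in ordered:
        # counts[j] = P(exactly j *other* objects are closer), for j < k.
        counts = [1.0] + [0.0] * (k - 1)
        for other, p in closer.items():
            if other == obj:
                continue  # an object never competes with itself
            # Fold in one more object, updating from high j to low j
            # so each step reads the previous round's values.
            for j in range(k - 1, 0, -1):
                counts[j] = counts[j] * (1.0 - p) + counts[j - 1] * p
            counts[0] *= 1.0 - p
        # The object takes rank j + 1 through this instance if the
        # instance occurs and exactly j other objects are closer.
        for j in range(k):
            result[obj][j] += prob * counts[j]
        closer[obj] += prob

    return dict(result)
```

For example, with object A having a single instance at distance 1.0 (probability 1.0) and object B having instances at distances 0.5 and 2.0 (probability 0.5 each), `rank_distributions(..., 2)` assigns both objects probability 0.5 for rank 1 and 0.5 for rank 2. The sketch rebuilds the `counts` array from scratch for every instance, which is the kind of repeated work an incremental scheme would avoid.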

Taxonomy

Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Time Series Analysis and Forecasting