Answering Top-k Queries Over a Mixture of Attractive and Repulsive Dimensions
Sayan Ranu, Ambuj K. Singh

TL;DR
This paper introduces a novel top-k query framework that combines attractive and repulsive dimensions in a scoring function, enabling more effective data retrieval and revealing hidden data characteristics.
Contribution
It proposes a new scoring function integrating attractive and repulsive dimensions, along with scalable index structures and empirical validation showing significant performance improvements.
Findings
Querying time improves by 10-100x (one to two orders of magnitude) over existing state-of-the-art top-k techniques
Effective discovery of hidden data characteristics
Versatile application scenarios
Abstract
In this paper, we formulate a top-k query that compares objects in a database to a user-provided query object on a novel scoring function. The proposed scoring function combines the idea of attractive and repulsive dimensions into a general framework to overcome the weakness of traditional distance or similarity measures. We study the properties of the proposed class of scoring functions and develop efficient and scalable index structures that index the isolines of the function. We demonstrate various scenarios where the query finds application. Empirical evaluation demonstrates a performance gain of one to two orders of magnitude on querying time over existing state-of-the-art top-k techniques. Further, a qualitative analysis is performed on a real dataset to highlight the potential of the proposed query in discovering hidden data characteristics.
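The idea of mixing attractive and repulsive dimensions can be illustrated with a small brute-force sketch. This summary does not give the paper's actual scoring function or its isoline-based index, so everything below is an assumption made for illustration: the score is taken to be the absolute difference summed over repulsive dimensions minus the absolute difference summed over attractive dimensions, and the names `score`, `top_k`, `attractive`, and `repulsive` are hypothetical. A linear scan stands in for the index structures the abstract describes.

```python
import heapq
from typing import List, Sequence, Tuple


def score(obj: Sequence[float], query: Sequence[float],
          attractive: Sequence[int], repulsive: Sequence[int]) -> float:
    """Assumed scoring function (not taken from the paper): reward distance
    from the query on repulsive dimensions and penalize distance on
    attractive dimensions."""
    attract = sum(abs(obj[i] - query[i]) for i in attractive)
    repel = sum(abs(obj[i] - query[i]) for i in repulsive)
    return repel - attract


def top_k(objects: Sequence[Sequence[float]], query: Sequence[float],
          attractive: Sequence[int], repulsive: Sequence[int],
          k: int) -> List[Tuple[float, int]]:
    """Return the k highest-scoring (score, object index) pairs by scanning
    every object; an index would avoid this full scan."""
    scored = ((score(o, query, attractive, repulsive), idx)
              for idx, o in enumerate(objects))
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    # Dimension 0 is attractive (should match the query),
    # dimension 1 is repulsive (should differ from the query).
    objects = [(1.0, 1.0), (1.0, 5.0), (4.0, 5.0)]
    query = (1.0, 1.0)
    print(top_k(objects, query, attractive=[0], repulsive=[1], k=2))
```

Under this assumed score, an object that is identical to the query on every dimension scores zero, while an object that matches on the attractive dimension but differs on the repulsive one ranks highest. A plain distance or similarity measure cannot express that preference.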
Taxonomy
Topics: Data Management and Algorithms · Automated Road and Building Extraction · Advanced Image and Video Retrieval Techniques
