An LSH Index for Computing Kendall's Tau over Top-k Lists
Koninika Pal, Sebastian Michel

TL;DR
This paper introduces a locality-sensitive hashing (LSH) scheme tailored to the Kendall's Tau distance for efficient similarity search among top-k lists, addressing the limitations of traditional inverted indexing.
Contribution
It proposes a novel LSH index specifically designed for Kendall's Tau distance and evaluates its effectiveness on real-world datasets.
Findings
The LSH scheme makes similarity search over top-k lists more efficient.
Experiments on two real-world datasets demonstrate the method's effectiveness.
The approach captures Kendall's Tau similarity more tightly than an inverted index, which only detects overlapping items.
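To illustrate why item overlap alone is a loose filter, the following is a minimal sketch that compares candidate lookup through a plain inverted index with lookup through buckets keyed by ordered item pairs. The ordered-pair keying is an assumption made for illustration and is not necessarily the hashing scheme used in the paper; all function names and the sample lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_item_index(lists):
    """Inverted index: item -> ids of the lists that contain it."""
    index = defaultdict(set)
    for list_id, ranking in lists.items():
        for item in ranking:
            index[item].add(list_id)
    return index


def build_ordered_pair_index(lists):
    """Illustrative assumption: bucket lists by ordered pairs (i, j), i ranked ahead of j."""
    index = defaultdict(set)
    for list_id, ranking in lists.items():
        # combinations() preserves positional order, so i precedes j in the ranking.
        for i, j in combinations(ranking, 2):
            index[(i, j)].add(list_id)
    return index


def item_candidates(index, query):
    """All lists sharing at least one item with the query."""
    return set().union(*(index.get(item, set()) for item in query))


def pair_candidates(index, query):
    """All lists sharing at least one identically ordered pair with the query."""
    return set().union(*(index.get(pair, set()) for pair in combinations(query, 2)))


# Hypothetical data: B shares two items with the query but ranks them in reverse order.
lists = {
    "A": ["x", "y", "z"],
    "B": ["z", "y", "x"],
    "C": ["x", "q", "r"],
    "D": ["m", "n", "o"],
}
query = ["x", "y", "w"]

by_item = item_candidates(build_item_index(lists), query)
by_pair = pair_candidates(build_ordered_pair_index(lists), query)
```

In this toy example the item index returns A, B, and C, while the ordered-pair buckets return only A, the one list that agrees with the query on the relative order of a shared pair.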
Abstract
We consider the problem of similarity search within a set of top-k lists under the Kendall's Tau distance function. This distance describes how related two rankings are in terms of concordantly and discordantly ordered items. As top-k lists are usually very short compared to the global domain of possible items to be ranked, creating an inverted index to look up overlapping lists is possible but does not capture the similarity measure tightly enough. In this work, we investigate locality sensitive hashing schemes for the Kendall's Tau distance and evaluate the proposed methods using two real-world datasets.
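As a concrete reading of "concordantly and discordantly ordered items", here is a minimal sketch of a Kendall's Tau distance between two top-k lists. The abstract does not state how items missing from one list are handled, so this sketch assumes the generalization by Fagin, Kumar, and Sivakumar with a penalty parameter `p`; the function name and the default `p = 0` are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def kendall_tau_topk(a, b, p=0.0):
    """Kendall's Tau distance between two top-k lists (assumed Fagin-style penalties).

    a, b: lists of distinct items, best-ranked first.
    p: penalty for a pair that appears in one list while neither item is in the other.
    """
    rank_a = {item: r for r, item in enumerate(a)}
    rank_b = {item: r for r, item in enumerate(b)}
    distance = 0.0
    for i, j in combinations(set(a) | set(b), 2):
        in_a = (i in rank_a, j in rank_a)
        in_b = (i in rank_b, j in rank_b)
        if all(in_a) and all(in_b):
            # Both items ranked by both lists: count the pair if it is discordant.
            if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0:
                distance += 1
        elif all(in_a) and any(in_b):
            # Both in a, one missing from b: the missing item is implicitly ranked lower in b.
            present, absent = (i, j) if in_b[0] else (j, i)
            if rank_a[absent] < rank_a[present]:
                distance += 1
        elif all(in_b) and any(in_a):
            # Symmetric case: both in b, one missing from a.
            present, absent = (i, j) if in_a[0] else (j, i)
            if rank_b[absent] < rank_b[present]:
                distance += 1
        elif all(in_a) or all(in_b):
            # Both items in one list, neither in the other: order is unknown, charge p.
            distance += p
        else:
            # Each item appears in a different list only: the orders must disagree.
            distance += 1
    return distance


d_example = kendall_tau_topk(["x", "y", "z"], ["y", "x", "w"])
```

For the two example lists, one discordant pair (`x`, `y`) and one pair split across the lists (`z`, `w`) each contribute 1, so `d_example` is 2.0.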
Taxonomy
Topics: Data Management and Algorithms · Advanced Image and Video Retrieval Techniques · Geographic Information Systems Studies
