Diversifying Top-K Results

Lu Qin; Jeffrey Xu Yu; Lijun Chang

arXiv:1208.0076 · cs.DB · August 2, 2012 · 2 cites

TL;DR

This paper introduces a general framework and new algorithms for diversified top-k search, effectively reducing redundancy in search results by considering result similarity, and demonstrates high efficiency on large datasets.

Contribution

It proposes a flexible framework extending existing top-k solutions to diversified search, with three novel algorithms for optimal result selection.

Findings

1. div-cut algorithm finds optimal solutions in seconds for large k
2. Framework easily extends existing top-k methods to diversify results
3. Extensive experiments validate high efficiency and effectiveness

Abstract

Top-k query processing finds a list of k results that have the largest scores with respect to the user-given query, under the assumption that all k results are independent of each other. In practice, some of the returned top-k results can be very similar to each other and are therefore redundant. In the literature, diversified top-k search has been studied to return k results that take both score and diversity into consideration. Most existing solutions to diversified top-k search assume that the scores of all search results are given, and some address diversity only for a specific problem and can hardly be extended to the general case. In this paper, we study the diversified top-k search problem. We define a general diversified top-k search problem that only considers the similarity of the search results themselves. We propose a framework, such that most…
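
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It rests on an assumption that the truncated abstract above does not spell out: that a diversified top-k answer is a set of at most k results, no two of which are similar, with the largest total score. The names `diversified_top_k`, `scores`, and `similar` are hypothetical, and the exhaustive search is only an illustration of the objective, not the paper's framework or its div-cut algorithm.

```python
from itertools import combinations


def diversified_top_k(scores, similar, k):
    """Exhaustively find the best-scoring set of at most k mutually dissimilar results.

    scores:  dict mapping result id -> score (assumed positive)
    similar: set of frozenset pairs marking which results count as similar
    k:       maximum number of results to return
    """
    items = list(scores)
    best_set, best_score = (), 0.0
    for size in range(1, k + 1):
        for subset in combinations(items, size):
            # Skip any subset that contains a pair flagged as similar.
            if any(frozenset(pair) in similar for pair in combinations(subset, 2)):
                continue
            total = sum(scores[r] for r in subset)
            if total > best_score:
                best_set, best_score = subset, total
    return set(best_set), best_score


# Toy data (invented for illustration): "a" is similar to both "b" and "c".
scores = {"a": 10, "b": 9, "c": 8, "d": 3}
similar = {frozenset(("a", "b")), frozenset(("a", "c"))}

result, total = diversified_top_k(scores, similar, k=2)
print(result, total)
```

On this toy input, plain top-2 by score would return "a" and "b", which are similar to each other. Taking the highest-scoring result first and then the best remaining dissimilar one gives "a" and "d" with total 13, whereas the exhaustive search returns "b" and "c" with total 17. This gap is why, under the assumed objective, a simple greedy pass is not guaranteed to be optimal. The exhaustive search is exponential in k and only suitable for tiny inputs.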

Taxonomy

Topics: Data Management and Algorithms · Advanced Image and Video Retrieval Techniques · Advanced Database Systems and Queries