Bootstrapped Grouping of Results to Ambiguous Person Name Queries
Toni Gruetze, Gjergji Kasneci, Zhe Zuo, Felix Naumann

TL;DR
This paper introduces a bootstrapped clustering method leveraging Web 2.0 platform data to improve disambiguation of search results for ambiguous person names, enhancing search accuracy and user experience.
Contribution
It proposes a novel approach that uses information from Web 2.0 platforms to effectively cluster search results for ambiguous person queries, addressing limitations of traditional ranking-based methods.
Findings
Achieved improved clustering accuracy on a dataset of 5,000 web pages.
Demonstrated effective disambiguation for 50 ambiguous person names.
Showed that leveraging Web 2.0 data enhances search result relevance.
Abstract
Some of the main ranking features of today's search engines reflect result popularity and are based on ranking models, such as PageRank, implicit feedback aggregation, and more. While such features yield satisfactory results for a wide range of queries, they aggravate the problem of search for ambiguous entities: Searching for a person yields satisfactory results only if the person we are looking for is represented by a high-ranked Web page and all required information is contained in this page. Otherwise, the user has to either reformulate/refine the query or manually inspect low-ranked results to find the person in question. A possible approach to solve this problem is to cluster the results, so that each cluster represents one of the persons occurring in the answer set. However, clustering search results has proven to be a difficult endeavor by itself, where the clusters are…
Taxonomy
Topics: Data Quality and Management · Web Data Mining and Analysis · Data Management and Algorithms
