Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia
Robert West, Ashwin Paranjape, Jure Leskovec

TL;DR
This paper presents a method for identifying missing hyperlinks in Wikipedia by analyzing human navigation traces collected through a Wikipedia-based human-computation game, with the aim of improving the site's navigability.
Contribution
The paper introduces an approach that uses human navigation data to find and rank missing links, rather than relying only on the structure of existing links.
Findings
The method effectively identifies high-quality missing links.
Navigation traces improve link prediction accuracy.
The suggested links enhance Wikipedia's navigability.
Abstract
Hyperlinks are an essential feature of the World Wide Web. They are especially important for online encyclopedias such as Wikipedia: an article can often only be understood in the context of related articles, and hyperlinks make it easy to explore this context. But important links are often missing, and several methods have been proposed to alleviate this problem by learning a linking model based on the structure of the existing links. Here we propose a novel approach to identifying missing links in Wikipedia. We build on the fact that the ultimate purpose of Wikipedia links is to aid navigation. Rather than merely suggesting new links that are in tune with the structure of existing links, our method finds missing links that would immediately enhance Wikipedia's navigability. We leverage data sets of navigation paths collected through a Wikipedia-based human-computation game in which…
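The general idea of mining navigation paths for missing links can be illustrated with a minimal sketch. This is not the authors' method, which the truncated abstract does not spell out. It assumes, purely for illustration, that each path is a list of page titles ending at the player's goal page, and that a pair (source, target) is a candidate when the source was visited on the way to the target but has no direct link to it. The function name `rank_missing_links`, the data layout, and the toy page titles are all hypothetical.

```python
from collections import Counter


def rank_missing_links(paths, existing_links):
    """Rank candidate missing links by how many paths support them.

    paths: list of navigation paths, each a list of page titles whose
           last element is the goal page (an assumed data layout).
    existing_links: set of (source, target) pairs already linked.
    Returns a list of ((source, target), count), most frequent first.
    """
    counts = Counter()
    for path in paths:
        if len(path) < 2:
            continue  # a single-page path carries no navigation signal
        target = path[-1]
        # Count each visited page at most once per path.
        for source in set(path[:-1]):
            if source != target and (source, target) not in existing_links:
                counts[(source, target)] += 1
    return counts.most_common()


# Toy data, invented for this example.
paths = [
    ["Cat", "Mammal", "Animal", "Zoo"],
    ["Cat", "Pet", "Zoo"],
    ["Dog", "Animal", "Zoo"],
]
existing_links = {
    ("Cat", "Mammal"), ("Mammal", "Animal"), ("Animal", "Zoo"),
    ("Cat", "Pet"), ("Pet", "Zoo"), ("Dog", "Animal"),
}

ranked = rank_missing_links(paths, existing_links)
for (source, target), count in ranked:
    print(f"{source} -> {target}: seen on {count} path(s)")
```

Ranking by raw path counts is the simplest possible scoring choice. A real system would likely need to normalize for page popularity and path length, which this sketch leaves out.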
