Anchor Prediction: Automatic Refinement of Internet Links
Nelson F. Liu, Kenton Lee, Kristina Toutanova

TL;DR
This paper introduces the task of anchor prediction to identify specific relevant parts of linked webpages, supported by new datasets and a T5-based baseline, aiming to improve how users find information within linked content.
Contribution
It defines the anchor prediction task, releases two new datasets (AuthorAnchors and ReaderAnchors), and reports the first benchmark results for the task using a T5-based ranking approach.
Findings
Effective anchor prediction requires reasoning over lengthy source and target pages.
The T5-based ranking approach provides a strong baseline but leaves room for improvement.
Datasets reflect both author and reader relevance judgments.
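To make the ranking formulation concrete, here is a minimal sketch that treats anchor prediction as scoring each candidate section of the target page against the source linking context and returning the best match. It is not the paper's T5-based model: a bag-of-words cosine similarity stands in for the learned scorer, and the names `tokenize`, `cosine`, and `predict_anchor`, as well as the example texts, are hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_anchor(source_context, target_sections):
    """Rank candidate sections of the target page against the source context.

    Returns (index of the best-scoring section, list of all scores).
    The lexical scorer is an assumption for illustration only; a learned
    ranker would replace `cosine` here.
    """
    if not target_sections:
        raise ValueError("target_sections must contain at least one candidate")
    query = Counter(tokenize(source_context))
    scores = [cosine(query, Counter(tokenize(s))) for s in target_sections]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical source context and target-page sections.
source = "The bridge was damaged in the 1989 earthquake and later retrofitted."
sections = [
    "History: The bridge opened in 1936 after three years of construction.",
    "Earthquake damage: A section collapsed in the 1989 earthquake, "
    "prompting a seismic retrofit.",
    "Tolls: Drivers pay a toll when travelling westbound.",
]
best, scores = predict_anchor(source, sections)
print(best, [round(s, 3) for s in scores])
```

A lexical scorer like this only sees surface word overlap, so it would likely struggle on the long source and target pages that the findings above describe as central to the task.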
Abstract
Internet links enable users to deepen their understanding of a topic by providing convenient access to related information. However, the majority of links are unanchored -- they link to a target webpage as a whole, and readers may expend considerable effort localizing the specific parts of the target webpage that enrich their understanding of the link's source context. To help readers effectively find information in linked webpages, we introduce the task of anchor prediction, where the goal is to identify the specific part of the linked target webpage that is most related to the source linking context. We release the AuthorAnchors dataset, a collection of 34K naturally-occurring anchored links, which reflect relevance judgments by the authors of the source article. To model reader relevance judgments, we annotate and release ReaderAnchors, an evaluation set of anchors that readers find…
Taxonomy
Topics: Topic Modeling · Wikis in Education and Collaboration · Information Retrieval and Search Behavior
