Anchor Prediction: A Topic Modeling Approach
Jean Dupuy, Adrien Guille, Julien Jacques

TL;DR
This paper introduces CRTM, a novel topic modeling approach for automatically predicting hyperlinks in documents by modeling local context and content, improving hyperlink annotation without external resources.
Contribution
The paper presents CRTM, a contextualized relational topic model designed specifically for anchor prediction, a task distinct from traditional link prediction.
Findings
CRTM effectively predicts anchors in Wikipedia articles across multiple languages.
The model outperforms baseline methods in anchor prediction accuracy.
Experiments demonstrate practical usefulness in real-world document networks.
Abstract
Networks of documents connected by hyperlinks, such as Wikipedia, are ubiquitous. Authors insert hyperlinks to enrich the text and facilitate navigation through the network. However, authors tend to insert only a fraction of the relevant hyperlinks, mainly because this is a time-consuming task. In this paper we address an annotation task, which we refer to as anchor prediction. Even though it is conceptually close to link prediction or entity linking, it is a different task that requires a specific method to solve it. Given a source document and a target document, this task consists in automatically identifying anchors in the source document, i.e., words or terms that should carry a hyperlink pointing towards the target document. We propose a contextualized relational topic model, CRTM, that models directed links between documents as a function of the local context…
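To make the task's input and output concrete, here is a minimal sketch of anchor prediction as a ranking problem. It is not CRTM: it is a naive frequency baseline, assumed purely for illustration, that scores each word of the source document by how often it occurs in the target document. The function names, the stopword list, and the example texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the illustration.
STOPWORDS = frozenset({"the", "is", "of", "and", "a", "in"})


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def predict_anchors(source, target, top_k=3):
    """Rank distinct source words as candidate anchors for the target.

    Naive baseline (an assumption, not the paper's model): a word's score
    is its relative frequency in the target document.
    """
    target_counts = Counter(tokenize(target))
    total = sum(target_counts.values()) or 1
    scores = {}
    for word in tokenize(source):
        if word in STOPWORDS:
            continue
        if target_counts[word] > 0:
            scores[word] = target_counts[word] / total
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


source_doc = "Paris is the capital of France and hosts the Louvre museum."
target_doc = "The Louvre is a museum in Paris. The Louvre holds the Mona Lisa."
candidates = predict_anchors(source_doc, target_doc, top_k=2)
```

A baseline like this ignores the local context around each candidate word, which is the signal the abstract says CRTM models.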
