A Multilingual Entity Linking System for Wikipedia with a Machine-in-the-Loop Approach
Martin Gerlach, Marshall Miller, Rita Ho, Kosta Harlan, and Djellel Difallah

TL;DR
This paper presents a multilingual entity linking system with a machine-in-the-loop approach to enhance Wikipedia's hyperlink coverage, especially in low-resource languages, by combining data-driven models with interactive editor feedback.
Contribution
It introduces a language-agnostic entity linking model integrated with an interactive interface for editors, improving link coverage and editor experience in Wikipedia.
Findings
Achieves over 80% precision in link recommendations.
Ensures at least 50% recall across six diverse languages.
Facilitates continuous evaluation through active editor feedback.
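The precision and recall figures above can be read as set overlaps between the links a model recommends and a reference set of links. The following is a minimal sketch of that arithmetic only, not the paper's evaluation code. The function name `precision_recall`, the (anchor text, target page) pair representation, and the toy data are all assumptions made for illustration.

```python
def precision_recall(recommended, gold):
    """Compute precision and recall for a set of recommended links.

    Both arguments are sets of (anchor_text, target_page) pairs.
    This pair representation is an illustrative assumption.
    """
    true_positives = len(recommended & gold)
    # Precision: share of recommended links that are in the reference set.
    precision = true_positives / len(recommended) if recommended else 0.0
    # Recall: share of reference links that the model recommended.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data (hypothetical): four reference links, three recommendations,
# two of which match the reference.
gold = {
    ("Paris", "Paris"),
    ("France", "France"),
    ("Seine", "Seine"),
    ("Louvre", "Louvre"),
}
recommended = {
    ("Paris", "Paris"),
    ("France", "France"),
    ("Seine", "Seine (department)"),  # wrong target, counts as an error
}

p, r = precision_recall(recommended, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```

A recommender tuned for high precision typically raises its confidence threshold, which shrinks the recommended set and lowers recall. That trade-off is why the two figures are reported together.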
Abstract
Hyperlinks constitute the backbone of the Web; they enable user navigation, information discovery, content ranking, and many other crucial services on the Internet. In particular, hyperlinks found within Wikipedia allow readers to navigate from one page to another to expand their knowledge on a given subject of interest or to discover a new one. However, despite Wikipedia editors' efforts to add and maintain its content, the distribution of links remains sparse in many language editions. This paper introduces a machine-in-the-loop entity linking system that can comply with community guidelines for adding a link and aims at increasing link coverage in new pages and low-resource wiki-projects. To tackle these challenges, we build a context and language agnostic entity linking model that combines data collected from millions of anchors found across wiki-projects, as well as…
