Approaches for Enriching and Improving Textual Knowledge Bases
Besnik Fetahu

TL;DR
This paper discusses automated methods to enhance Wikipedia's verifiability by updating citations and adding relevant news references to improve content completeness and accuracy.
Contribution
It introduces novel automated approaches for verifying citations and suggesting missing news references to enrich Wikipedia entries.
Findings
Improved citation accuracy and timeliness.
Enhanced coverage of news sources in Wikipedia.
Automated identification of missing references.
Abstract
Verifiability is one of the core editing principles in Wikipedia, where editors are encouraged to provide citations for the statements they add. A statement can be any piece of text, ranging from a sentence up to a paragraph. However, in many cases citations are outdated, missing, or link to non-existent references (e.g. dead URLs, moved content, etc.). In total, in 20% of the cases such citations refer to news articles, which represent the second most cited source. Even where citations are provided, there are no explicit indicators of the span of text that a citation covers. In addition to issues related to the verifiability principle, many Wikipedia entity pages are incomplete, missing relevant information that is already available in online news sources. Even for the already existing citations, there is often a delay between the news publication time…
