# How much is Wikipedia Lagging Behind News?

**Authors:** Besnik Fetahu, Abhijit Anand, Avishek Anand

arXiv: 1703.10345 · 2017-03-31

## TL;DR

This paper measures how far the English Wikipedia lags behind news in covering entities and events, comparing Wikipedia's revision history against 20 years of *The New York Times* articles to trace how information flows from news into Wikipedia.

## Contribution

The paper models the lag between the first appearance of an entity or event in *The New York Times* and its first appearance in Wikipedia, and quantifies how much Wikipedia relies on news, both as a cited source and as a trigger for creating new entity pages.

## Key findings

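- Almost 20% of the external references in entity pages are news articles
- Entity lag follows a normal distribution with a high standard deviation, whereas the lag for news-based events is typically very low
- Events drive the creation of emergent entities: as many as 12% of the entities mentioned in an event page are created after the event page itself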

## Abstract

Wikipedia, rich in entities and events, is an invaluable resource for various knowledge harvesting, extraction and mining tasks. Numerous resources like DBpedia, YAGO and other knowledge bases are based on extracting entity and event based knowledge from it. Online news, on the other hand, is an authoritative and rich source for emerging entities, events and facts relating to existing entities. In this work, we study the creation of entities in Wikipedia with respect to news by studying how entity and event based information flows from news to Wikipedia. We analyze the lag of Wikipedia (based on the revision history of the English Wikipedia) with 20 years of *The New York Times* dataset (NYT). We model and analyze the lag of entities and events, namely their first appearance in Wikipedia and in NYT, respectively. In our extensive experimental analysis, we find that almost 20% of the external references in entity pages are news articles, underscoring the importance of news to Wikipedia. Second, we observe that the entity-based lag follows a normal distribution with a high standard deviation, whereas the lag for news-based events is typically very low. Finally, we find that events are responsible for the creation of emergent entities, with as many as 12% of the entities mentioned in an event page being created after the creation of the event page.
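
The abstract defines lag in terms of an entity's first appearance in Wikipedia and in NYT. A minimal sketch of that comparison follows, assuming lag is measured in days and signed so that a positive value means Wikipedia trails the news; the sign convention, the `lag_days` helper, and the sample dates are illustrative assumptions, not taken from the paper.

```python
from datetime import date
from statistics import mean, stdev


def lag_days(first_in_wikipedia: date, first_in_news: date) -> int:
    """Days between first appearances (assumed convention).

    Positive: Wikipedia covered the entity after the news source.
    Negative: Wikipedia covered the entity before the news source.
    """
    return (first_in_wikipedia - first_in_news).days


# Hypothetical first-appearance dates as (wikipedia, news) pairs.
# These values are made up for illustration only.
first_seen = {
    "entity_a": (date(2005, 3, 10), date(2005, 3, 1)),
    "entity_b": (date(2004, 1, 1), date(2006, 1, 1)),
    "entity_c": (date(2007, 6, 15), date(2007, 6, 15)),
}

lags = {name: lag_days(wiki, news) for name, (wiki, news) in first_seen.items()}

# Summary statistics over the lags, e.g. to inspect the spread
# of the distribution across entities.
lag_values = list(lags.values())
lag_mean = mean(lag_values)
lag_std = stdev(lag_values)
```

With real data, the two dates per entity would come from the page's earliest revision and the earliest article mentioning it; how mentions are matched to entities is outside the scope of this sketch.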

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.10345/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1703.10345/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1703.10345/full.md

---
Source: https://tomesphere.com/paper/1703.10345