Mining Meaning from Wikipedia
Olena Medelyan, David Milne, Catherine Legg, Ian H. Witten

TL;DR
Wikipedia serves as a valuable resource for extracting concepts, relations, and facts, supporting diverse NLP and information retrieval tasks, while researchers enhance and adapt it for ontology building and new resource creation.
Contribution
This paper comprehensively reviews research leveraging Wikipedia for NLP, information retrieval, and ontology development, highlighting recent advancements and open-source tools.
Findings
Wikipedia is widely used in NLP and information retrieval.
Research has focused on extracting and organizing knowledge from Wikipedia.
Open-source software supports various applications of Wikipedia data.
Abstract
Wikipedia is a goldmine of information, not just for its many readers but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval; using it to facilitate information extraction; and using it as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other…
Taxonomy
Topics: Wikis in Education and Collaboration · Natural Language Processing Techniques · Topic Modeling
