Wikinformetrics: Construction and description of an open Wikipedia knowledge graph dataset for informetric purposes
Wenceslao Arroyo-Machado, Daniel Torres-Salinas, Rodrigo Costas

TL;DR
This paper introduces an open knowledge graph and a methodological framework for large-scale informetric analysis of Wikipedia, comparing its features with scientific publications and demonstrating its analytical potential through a case study.
Contribution
It presents a novel open Wikipedia knowledge graph and a comprehensive framework for informetric analysis, enabling new large-scale research opportunities.
Findings
Comparison of Wikipedia pages with scientific publications highlights similarities and differences.
A new set of metrics is proposed for studying Wikipedia along different analytical dimensions.
A case study demonstrates the analytical potential of the dataset and metrics.
Abstract
Wikipedia is one of the most visited websites in the world and is also a frequent subject of scientific research. However, the analytical possibilities of Wikipedia information have not yet been analyzed considering at the same time both a large volume of pages and attributes. The main objective of this work is to offer a methodological framework and an open knowledge graph for the informetric large-scale study of Wikipedia. Features of Wikipedia pages are compared with those of scientific publications to highlight the (di)similarities between the two types of documents. Based on this comparison, different analytical possibilities that Wikipedia and its various data sources offer are explored, ultimately offering a set of metrics meant to study Wikipedia from different analytical dimensions. In parallel, a complete dedicated dataset of the English Wikipedia was built (and shared)…
Taxonomy
Topics: Wikis in Education and Collaboration · Cancer-related gene regulation
