Investigating Document Type, Language, Publication Year, and Author Count Discrepancies Between OpenAlex and Web of Science
Philippe Mongeon, Madelaine Hare, Poppy Riddle, Summer Wilson, Geoff Krause, Rebecca Marjoram, and Rémi Toupin

TL;DR
This study compares metadata quality between OpenAlex and Web of Science, focusing on discrepancies in document type, language, publication year, and author count to improve bibliometric data reliability.
Contribution
It provides a detailed analysis of metadata discrepancies between OpenAlex and WoS, highlighting data quality issues affecting bibliometric research and evaluation.
Findings
Significant discrepancies in document type classifications
Notable differences in publication year data
Variations in author count and language metadata
Abstract
Bibliometrics, whether used for research or research evaluation, relies on large multidisciplinary databases of research outputs and citation indices. The Web of Science (WoS) was the main supporting infrastructure of the field for more than 30 years until several new competitors emerged. OpenAlex, a bibliographic database launched in 2022, has distinguished itself for its openness and extensive coverage. While OpenAlex may reduce or eliminate barriers to accessing bibliometric data, one of the concerns that hinders its broader adoption for research and research evaluation is the quality of its metadata. This study aims to assess metadata quality in OpenAlex and WoS, focusing on document type, publication year, language, and number of authors. By addressing discrepancies and misattributions in metadata, this research seeks to enhance awareness of data quality issues that could impact…
