Quantifying the consistency of scientific databases
Lovro Šubelj, Marko Bajec, Biljana Mileva Boshkoska, Andrej Kastrin, Zoran Levnajić

TL;DR
This paper systematically analyzes the consistency among six major scientific databases using the framework of complex networks, revealing appreciable differences in their mutual consistency that matter for bibliometric research.
Contribution
It provides a novel systematic comparison of scientific databases' consistency, highlighting challenges in identifying a single authoritative source.
Findings
Significant differences in database consistency
No single 'best' database identified
Implications for future bibliometric studies
Abstract
Science is a social process with far-reaching impact on modern society. In recent years, it has become possible for the first time to study science itself scientifically. This is enabled by the massive amounts of data on scientific publications that are increasingly becoming available. The data is contained in several databases such as Web of Science or PubMed, maintained by various public and private entities. Unfortunately, these databases are not always consistent, which considerably hinders such studies. Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases. We find that identifying a single "best" database is far from easy. Nevertheless, our results indicate appreciable differences in the mutual consistency of different databases, which we interpret as recipes for future bibliometric studies.
