Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic
Martijn Visser, Nees Jan van Eck, and Ludo Waltman

TL;DR
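This study compares five multidisciplinary bibliographic data sources (Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic) for scientific documents from 2008-2017. It examines document coverage and the completeness and accuracy of citation links, and discusses the strengths and weaknesses of each source.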
Contribution
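It offers a large-scale comparison in which Scopus is compared pairwise with each of the other four data sources, and it argues for combining comprehensive coverage of the scientific literature with a flexible set of filters for selecting subsets of that literature.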
Findings
Differences between the data sources in document coverage over time, per document type, and per discipline.
Differences in the completeness and accuracy of citation links.
An emphasis on pairing comprehensive coverage of the literature with flexible filters for making selections of it.
Abstract
We present a large-scale comparison of five multidisciplinary bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. The comparison considers scientific documents from the period 2008-2017 covered by these data sources. Scopus is compared in a pairwise manner with each of the other data sources. We first analyze differences between the data sources in the coverage of documents, focusing for instance on differences over time, differences per document type, and differences per discipline. We then study differences in the completeness and accuracy of citation links. Based on our analysis, we discuss strengths and weaknesses of the different data sources. We emphasize the importance of combining a comprehensive coverage of the scientific literature with a flexible set of filters for making selections of the literature.
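To make the idea of a pairwise coverage comparison concrete, here is a minimal sketch in Python. It assumes documents can be matched by DOI alone, which is a simplification made for illustration and not a description of the authors' matching procedure. The function names and the sample DOIs are hypothetical.

```python
def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip a common resolver or scheme prefix."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def pairwise_coverage(reference: set[str], other: set[str]) -> dict[str, int]:
    """Count documents found in both sources and in only one of them.

    Assumption: a shared (normalized) DOI means the same document.
    """
    ref = {normalize_doi(d) for d in reference}
    oth = {normalize_doi(d) for d in other}
    return {
        "both": len(ref & oth),
        "only_reference": len(ref - oth),
        "only_other": len(oth - ref),
    }


# Hypothetical DOI sets standing in for two data sources.
scopus_like = {"10.1000/a", "10.1000/b", "https://doi.org/10.1000/C"}
crossref_like = {"10.1000/b", "10.1000/c", "10.1000/d"}

result = pairwise_coverage(scopus_like, crossref_like)
print(result)
```

Repeating this with one source held fixed as the reference mirrors the pairwise design described in the abstract. Documents without a DOI would need a different matching strategy, which this sketch does not attempt.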
Taxonomy
Topics: scientometrics and bibliometrics research
