Merits and Limits: Applying open data to monitor open access publications in bibliometric databases
Aliakbar Akbaritabar, Stephan Stahlschmidt

TL;DR
This study evaluates the challenges of accurately identifying open access publications in bibliometric databases by comparing multiple data sources and highlighting inconsistencies and limitations in current metadata and licensing information.
Contribution
It systematically analyzes the reliability of open access status assignment using various sources like Unpaywall, Crossref, DOAJ, and ROAD, revealing significant discrepancies and data gaps.
Findings
Only about 50% of articles and reviews from WOS and Scopus could be matched via DOI to Unpaywall.
Over 25% of OA status cases are contradictory across sources.
Approximately 17% of OA publications are not accessible as expected.
Abstract
Identifying and monitoring Open Access (OA) publications might seem a trivial task, but practical efforts prove otherwise. Contradictory information often arises depending on the metadata employed. We strive to assign OA status to publications in Web of Science (WOS) and Scopus while complementing it with different sources of OA information to resolve contradicting cases. We linked publications from WOS and Scopus via DOIs and ISSNs to Unpaywall, Crossref, DOAJ and ROAD. Only about 50% of articles and reviews from WOS and Scopus could be matched via a DOI to Unpaywall. Matching with Crossref yielded 56 distinct licences, which in many cases define the legally binding access status of publications. But only 44% of publications hold a single licence on Crossref, while more than 50% have no licence information submitted to Crossref. Contrasting OA information from Crossref licences with…
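The DOI-based linking and cross-source comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `normalize_doi` and `match_and_compare`, the record layout, and the two lookup tables (one standing in for an Unpaywall-style OA flag, one for an OA flag derived from licences) are all hypothetical, and the sample DOIs are made up. The sketch only shows how a match rate and a list of contradictory cases might be computed once each source has been reduced to a DOI-to-status mapping.

```python
from typing import Optional

# Common textual forms a DOI may take in bibliographic records (assumed list).
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: Optional[str]) -> Optional[str]:
    """Trim, lower-case, and strip a leading URL or 'doi:' prefix.

    Returns None when no usable DOI remains.
    """
    if not raw:
        return None
    doi = raw.strip().lower()
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi or None


def match_and_compare(publications, status_a, status_b):
    """Match publications to source A by DOI and flag disagreements with source B.

    publications: list of dicts with keys "id" and "doi" (hypothetical layout).
    status_a, status_b: dicts mapping a normalized DOI to True (OA) or False.
    """
    matched = 0
    contradictory = []
    for pub in publications:
        doi = normalize_doi(pub.get("doi"))
        if doi is None or doi not in status_a:
            continue  # no DOI, or DOI unknown to source A
        matched += 1
        # A contradiction needs a verdict from both sources.
        if doi in status_b and status_a[doi] != status_b[doi]:
            contradictory.append(pub["id"])
    rate = matched / len(publications) if publications else 0.0
    return {"matched": matched, "match_rate": rate, "contradictory": contradictory}


# Made-up sample records, for illustration only.
sample_pubs = [
    {"id": "p1", "doi": "https://doi.org/10.1000/ABC"},
    {"id": "p2", "doi": "10.1000/def"},
    {"id": "p3", "doi": None},
    {"id": "p4", "doi": "doi:10.1000/ghi"},
]
source_a = {"10.1000/abc": True, "10.1000/def": False}
source_b = {"10.1000/abc": True, "10.1000/def": True}

print(match_and_compare(sample_pubs, source_a, source_b))
```

Normalizing DOIs before the lookup matters because the same identifier can appear with different casing or as a resolver URL. Records without a DOI are counted in the denominator here, so they lower the match rate rather than being silently dropped.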
Taxonomy
Topics: Scientometrics and Bibliometrics Research · Scientific Computing and Data Management · Biomedical Text Mining and Ontologies
