Open Reproducible Publication Research
Diomidis Spinellis

TL;DR
This paper introduces an open-source Python tool for reproducible, in-depth analysis of scientific publication data using open datasets, addressing the limited repeatability and transparency of traditional online database searches.
Contribution
It presents a new software package and command-line tool for reproducible bibliometric analysis using open data sources like Crossref and ORCID.
Findings
Analyzed publication trends across scientific fields.
Visualized relationships among publications and authors.
Replicated common bibliometric measures.
Abstract
Considerable scientific work involves locating, analyzing, systematizing, and synthesizing other publications. Its results end up in a paper's "background" section or in standalone articles, which include meta-analyses and systematic literature reviews. The required research is aided through the use of online scientific publication databases and search engines, such as Web of Science, Scopus, and Google Scholar. However, use of online databases suffers from a lack of repeatability and transparency, as well as from technical restrictions. Thankfully, open data, powerful personal computers, and open source software now make it possible to run sophisticated publication studies on the desktop in a self-contained environment that peers can readily reproduce. Here we report a Python software package and an associated command-line tool that can populate embedded relational databases with…
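The following is a minimal sketch of the kind of self-contained desktop analysis the abstract describes, assuming Python's built-in sqlite3 module as the embedded relational database. The table names, columns, DOIs, and author name are hypothetical and hand-made for illustration; they are not the package's actual schema, nor real Crossref or ORCID data. The example loads a few records and computes an h-index, one common bibliometric measure.

```python
import sqlite3

# Hypothetical, hand-made records (not real Crossref or ORCID data).
WORKS = [
    ("10.5555/w1", "A. Author"),
    ("10.5555/w2", "A. Author"),
    ("10.5555/w3", "A. Author"),
]
# (citing DOI, cited DOI) pairs, also made up for illustration.
CITATIONS = [
    ("10.5555/x1", "10.5555/w1"),
    ("10.5555/x2", "10.5555/w1"),
    ("10.5555/x3", "10.5555/w1"),
    ("10.5555/x1", "10.5555/w2"),
    ("10.5555/x2", "10.5555/w2"),
]


def build_db(works, citations):
    """Populate an in-memory SQLite database with an assumed two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE works (doi TEXT PRIMARY KEY, author TEXT);
        CREATE TABLE citations (citing_doi TEXT, cited_doi TEXT);
        """
    )
    conn.executemany("INSERT INTO works VALUES (?, ?)", works)
    conn.executemany("INSERT INTO citations VALUES (?, ?)", citations)
    conn.commit()
    return conn


def citation_counts(conn, author):
    """Return the author's per-work citation counts, highest first."""
    rows = conn.execute(
        """
        SELECT works.doi, COUNT(citations.cited_doi) AS n
        FROM works
        LEFT JOIN citations ON citations.cited_doi = works.doi
        WHERE works.author = ?
        GROUP BY works.doi
        ORDER BY n DESC
        """,
        (author,),
    ).fetchall()
    return [n for _doi, n in rows]


def h_index(counts):
    """Largest h such that at least h works have h or more citations each."""
    h = 0
    for rank, n in enumerate(sorted(counts, reverse=True), start=1):
        if n < rank:
            break
        h = rank
    return h


conn = build_db(WORKS, CITATIONS)
counts = citation_counts(conn, "A. Author")
print(counts, h_index(counts))
```

Because the data and the query both live in a local file or in memory, a peer can rerun the same script and obtain the same result, which is the repeatability property the abstract contrasts with online database searches.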
Taxonomy
Topics: Scientific Computing and Data Management · Data Visualization and Analytics · Complex Network Analysis Techniques
