A multi-level analysis of data quality for formal software citation
David Schindler, Tazin Hossain, Sascha Spors, Frank Krüger

TL;DR
This paper critically evaluates formal software citation practices, revealing that current standards and bibliographic representations hinder accurate software identification and impact assessment in scientific research.
Contribution
It provides an in-depth analysis of formal software citations, highlighting deficiencies in current practices and arguing for improved bibliographic modeling of software references.
Findings
Software articles are the most cited resource for software.
Direct software citations better identify software versions.
Current practices hinder large-scale software impact analysis.
Abstract
Software is a central part of modern science, and knowledge of its use is crucial for the scientific community with respect to reproducibility and attribution of its developers. Several studies have investigated in-text mentions of software and its quality, while the quality of formal software citations has only been analyzed superficially. This study performs an in-depth evaluation of formal software citation based on a set of manually annotated software references. It examines which resources are cited for software usage, to what extent they allow proper identification of software and its specific version, how this information is made available by scientific publishers, and how well it is represented in large-scale bibliographic databases. The results show that software articles are the most cited resource for software, while direct software citations are better suited for…
Taxonomy
Topics: Scientific Computing and Data Management · Scientometrics and Bibliometrics Research · Research Data Management Practices
