How do software citation formats evolve over time? A longitudinal analysis of R programming language packages
Yuzhuo Wang, Kai Li

TL;DR
This study analyzes how citation formats for R packages evolve over time, revealing changes in citation types and metadata, and providing insights for improving software citation practices and policies.
Contribution
It offers a longitudinal analysis of R package citations, highlighting evolving citation formats and metadata, and discusses implications for software citation standards.
Findings
Citation formats vary over time and include different document types.
Metadata elements in citations have changed significantly.
Disciplinarity of citing journal articles varies and impacts citation practices.
Abstract
Under the data-driven research paradigm, research software has come to play a crucial role in nearly every stage of scientific inquiry. Scholars are advocating for the formal citation of software in academic publications, treating it on par with traditional research outputs. However, software is rarely cited consistently: a single software entity can be cited as different objects, and its citations can change over time. These issues are largely overlooked in existing empirical research on software citation. To fill these gaps, the present study analyzes a longitudinal dataset of the citation formats of all R packages, collected in 2021 and 2022, to understand how R packages, important members of the open-source software family, are cited and how those citations evolve over time. In particular, we investigate the different document types…
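The comparison of two yearly snapshots described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the snapshot structure (package name mapped to a dictionary of citation fields), the function names `diff_citation` and `compare_snapshots`, and all records, including the package names and the DOI, are hypothetical and made up for illustration. The entry types `Manual` and `Article` are standard BibTeX-style types of the kind R package citations use.

```python
from typing import Dict, Optional, Tuple

# Hypothetical snapshot shape: package name -> citation metadata fields.
Citation = Dict[str, str]
Snapshot = Dict[str, Citation]
Change = Tuple[Optional[str], Optional[str]]


def diff_citation(old: Citation, new: Citation) -> Dict[str, Change]:
    """Return the fields whose values differ between two citation records.

    A field missing from one record is reported with None on that side.
    """
    changes: Dict[str, Change] = {}
    for field in sorted(set(old) | set(new)):
        before, after = old.get(field), new.get(field)
        if before != after:
            changes[field] = (before, after)
    return changes


def compare_snapshots(snap_a: Snapshot, snap_b: Snapshot) -> Dict[str, Dict[str, Change]]:
    """Compare packages present in both snapshots; keep only those that changed."""
    result: Dict[str, Dict[str, Change]] = {}
    for pkg in sorted(set(snap_a) & set(snap_b)):
        changes = diff_citation(snap_a[pkg], snap_b[pkg])
        if changes:
            result[pkg] = changes
    return result


# Made-up illustrative records, NOT data from the study.
snapshot_2021: Snapshot = {
    "examplepkg": {"type": "Manual", "title": "examplepkg: An Example", "year": "2020"},
    "stablepkg": {"type": "Article", "title": "Stable Methods", "year": "2015"},
}
snapshot_2022: Snapshot = {
    "examplepkg": {
        "type": "Article",
        "title": "examplepkg: An Example",
        "year": "2021",
        "doi": "10.0000/example",  # placeholder DOI, not a real identifier
    },
    "stablepkg": {"type": "Article", "title": "Stable Methods", "year": "2015"},
}

changed = compare_snapshots(snapshot_2021, snapshot_2022)
for pkg, fields in changed.items():
    print(pkg, fields)
```

In this toy input only `examplepkg` is reported: its document type, year, and DOI differ between the two snapshots, while its unchanged title and the entirely unchanged `stablepkg` are omitted. A real analysis would also need to handle packages that appear in only one snapshot and packages that offer several citation entries.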
Taxonomy
Topics: Scientific Computing and Data Management · Software Engineering Research · Software System Performance and Reliability
