Analyzing data citation practices using the Data Citation Index
Nicolas Robinson-Garcia, Evaristo Jiménez-Contreras, Daniel Torres-Salinas

TL;DR
This study analyzes data citation practices across research fields using the Data Citation Index, revealing low overall citation rates but noting growth in specific disciplines and highlighting the potential for standardization to improve data attribution.
Contribution
It provides an empirical analysis of data citation practices using the DCI, highlighting field differences and the need for standardized citation methods.
Findings
88.1% of data records have no citations
Data citation practices vary by discipline
Growth observed in crystallography and genomics
Abstract
We present an analysis of data citation practices based on the Data Citation Index (DCI) from Thomson Reuters. This database, launched in 2012, aims to link data sets and data studies with the citations they receive from the other citation indexes. The DCI harvests citations to research data from papers indexed in the Web of Science. It relies on the information provided by the data repository, as data citation practices are inconsistent or nonexistent in many cases. The findings of this study show that data citation practices are far from common in most research fields. Some differences have been reported in the way researchers cite data: while in Science and in Engineering and Technology data sets were the most cited, in Social Sciences and in Arts and Humanities data studies play a greater role. A total of 88.1 percent of the records have received no citation, but some repositories show very…
