
TL;DR
This paper discusses the importance, principles, and methods of data citation, working toward a comprehensive theory and practice that treats data as first-class research objects.
Contribution
It provides an integrated overview of the theoretical foundations and practical approaches for data citation, addressing a gap in the current fragmented landscape.
Findings
Highlights the need for a unified data citation framework
Reviews existing principles and computational methods for data citation
Proposes a comprehensive view combining theory and practice
Abstract
Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality as traditional scientific products. Many works in recent years have discussed data citation from different…
