The hunt for research data: Development of an open-source workflow for tracking institutionally-affiliated research data publications
Bryan M. Gee

TL;DR
This paper presents an open-source workflow for tracking institutionally-affiliated research data publications across multiple platforms to improve data findability and support research data stewardship.
Contribution
It introduces a multi-API workflow for institutional data discovery that addresses the challenges of inconsistent metadata and makes it possible to find datasets whether or not they carry a DOI or affiliation metadata.
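As a rough illustration of what combining results from several sources can involve, the sketch below filters records by affiliation-name variants and deduplicates them by normalized DOI. It is not the paper's implementation: the record fields (`doi`, `title`, `affiliations`), the source names, the function names, and the institution name variants are all hypothetical and chosen only for the example.

```python
import re

# Hypothetical name variants for an illustrative institution (not from the paper).
AFFILIATION_VARIANTS = ("example university", "univ. of example")


def normalize_doi(doi):
    """Lowercase a DOI and strip resolver prefixes so duplicates compare equal."""
    if not doi:
        return None
    cleaned = doi.strip().lower()
    cleaned = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", cleaned)
    return cleaned or None


def is_affiliated(record, variants=AFFILIATION_VARIANTS):
    """Return True if any recorded affiliation contains a known name variant."""
    affiliations = record.get("affiliations") or []
    return any(v in a.lower() for a in affiliations for v in variants)


def merge_records(sources):
    """Combine records from several sources, keeping the first copy of each DOI.

    `sources` maps a source name to a list of record dicts. Records without a
    DOI are keyed by (source, title) so they are kept rather than dropped.
    """
    merged = {}
    for source, records in sources.items():
        for record in records:
            if not is_affiliated(record):
                continue
            doi = normalize_doi(record.get("doi"))
            title = (record.get("title") or "").strip().lower()
            key = doi if doi else (source, title)
            # setdefault keeps the first record seen for each key.
            merged.setdefault(key, {**record, "doi": doi, "source": source})
    return list(merged.values())
```

Matching on name variants is only one possible approach to inconsistent affiliation metadata. Records that omit affiliation entirely would be missed by this filter and would need a different discovery route.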
Findings
Retrieved over 4,000 datasets across 70 platforms
Identified major gaps due to inconsistent metadata practices
Demonstrated feasibility of multi-API institutional data tracking
Abstract
The ability to find data is central to the FAIR principles underlying research data stewardship. As with the ability to reuse data, efforts to ensure and enhance findability have historically focused on discoverability of data by other researchers, but there is a growing recognition of the importance of stewarding data in a fashion that makes them FAIR for a wide range of potential reusers and stakeholders. Research institutions are one such stakeholder and have a range of motivations for discovering data, specifically those affiliated with a focal institution, from facilitating compliance with funder provisions to gathering data to inform research data services. However, many research datasets and repositories are not optimized for institutional discovery (e.g., not recording or standardizing affiliation metadata), which creates downstream obstacles to workflows designed for…
Taxonomy
Topics: Research Data Management Practices · Scientific Computing and Data Management · Academic Publishing and Open Access
