Specimens as research objects: reconciliation across distributed repositories to enable metadata propagation
Nicky Nicolson, Alan Paton, Sarah Phillips, Allan Tucker

TL;DR
This paper presents a method to link duplicate botanical specimens held in distributed repositories and propagate metadata among them, supporting data reconciliation, collaboration between repositories, and recognition of scientific contributions.
Contribution
The paper introduces a data mining approach that identifies duplicate specimens and uses the resulting links to propagate metadata across repositories, improving data integration and scholarly credit.
Findings
36% of specimens (7,102,710 of 19,827,998 records from 292 repositories) participate in duplication relationships
Metadata such as georeferences and images can be propagated among duplicates
Networks of repositories can be identified for collaboration
Abstract
Botanical specimens are shared as long-term consultable research objects in a global network of specimen repositories. Multiple specimens are generated from a shared field collection event; generated specimens are then managed individually in separate repositories and independently augmented with research and management metadata which could be propagated to their duplicate peers. Establishing a data-derived network for metadata propagation will enable the reconciliation of closely related specimens which are currently dispersed, unconnected and managed independently. Following a data mining exercise applied to an aggregated dataset of 19,827,998 specimen records from 292 separate specimen repositories, 36% or 7,102,710 specimens are assessed to participate in duplication relationships, allowing the propagation of metadata among the participants in these relationships, totalling: 93,044…
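The idea of linking duplicates and propagating metadata among them can be illustrated with a minimal Python sketch. It is not the authors' data mining method, which this summary does not describe in detail. It assumes, purely for illustration, that duplicates can be found by exact match on a normalised collector name, collector number, and collection date; the field names (`collector`, `number`, `date`, `georeference`, `image`), the helper functions, and the sample records are all hypothetical.

```python
from collections import defaultdict

def event_key(rec):
    # Hypothetical matching key for a shared field collection event:
    # normalised collector name, collector number, and collection date.
    return (rec["collector"].strip().lower(), rec["number"].strip(), rec["date"])

def find_duplicate_groups(records):
    # Group records by event key and keep groups with more than one specimen.
    groups = defaultdict(list)
    for rec in records:
        groups[event_key(rec)].append(rec)
    return [g for g in groups.values() if len(g) > 1]

def propagate(group, fields=("georeference", "image")):
    # Return copies of the records in which missing fields are filled from
    # the first duplicate that has a value, recording where each came from.
    enriched = [dict(r) for r in group]
    for field in fields:
        donors = [r for r in group if r.get(field) is not None]
        if not donors:
            continue
        donor = donors[0]
        for rec in enriched:
            if rec.get(field) is None:
                rec[field] = donor[field]
                rec.setdefault("propagated_from", {})[field] = donor["id"]
    return enriched

# Made-up sample records from three hypothetical repositories.
records = [
    {"id": "A:1", "repository": "REPO_A", "collector": "Smith", "number": "1234",
     "date": "1950-03-02", "georeference": (-1.29, 36.82), "image": None},
    {"id": "B:7", "repository": "REPO_B", "collector": " smith ", "number": "1234",
     "date": "1950-03-02", "georeference": None, "image": "img-B7"},
    {"id": "C:3", "repository": "REPO_C", "collector": "Jones", "number": "12",
     "date": "1961-07-15", "georeference": None, "image": None},
]

groups = find_duplicate_groups(records)
enriched = propagate(groups[0])

# Share of specimens in duplication relationships, from the abstract's counts.
share = 7_102_710 / 19_827_998
```

In this sketch each record keeps a `propagated_from` entry, so metadata inherited from a duplicate stays distinguishable from metadata recorded by the holding repository. Real specimen data would need fuzzier matching than the exact key used here, since collector names and dates are transcribed inconsistently.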
