Ensuring Adherence to Standards in Experiment-Related Metadata Entered Via Spreadsheets
Martin J. O'Connor, Josef Hardi, Marcos Martínez-Romero, Sowmya Somasundaram, Brendan Honick, Stephen A. Fisher, Ajay Pillai, Mark A. Musen

TL;DR
This paper presents an integrated system that enables scientists to enter experiment metadata via spreadsheets while ensuring compliance with community standards and improving data quality through validation tools.
Contribution
It introduces a comprehensive approach combining customizable templates, controlled vocabularies, and an interactive web tool to enhance spreadsheet-based metadata entry and validation.
Findings
Successful deployment in the HuBMAP consortium
Improved metadata consistency and standard adherence
Enhanced error detection and correction in spreadsheets
Abstract
Scientists increasingly recognize the importance of providing rich, standards-adherent metadata to describe their experimental results. Although sophisticated tools are available to assist with data annotation, investigators generally seem to prefer spreadsheets when supplying metadata, even though spreadsheets are limited in their ability to ensure metadata consistency and compliance with formal specifications. In this paper, we describe an end-to-end approach that supports spreadsheet-based entry of metadata while ensuring rigorous adherence to community-based metadata standards and providing quality control. Our methods employ several key components, including customizable templates that represent metadata standards and that can inform the spreadsheets that investigators use to author metadata, controlled terminologies and ontologies for defining metadata values that…
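The kind of check the abstract describes, validating spreadsheet rows against a template with controlled terms, can be illustrated with a minimal sketch. This is not the authors' implementation: the `TEMPLATE` structure, the field names, the permitted terms, and the `validate_rows` helper are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical template (not from the paper): field name -> rule.
# "allowed" stands in for a controlled terminology; None means free text.
TEMPLATE = {
    "sample_id": {"required": True, "allowed": None},
    "organ": {"required": True, "allowed": {"kidney", "heart", "lung"}},
    "preservation": {"required": False, "allowed": {"fresh", "frozen"}},
}


def validate_rows(csv_text, template):
    """Return a list of (row_number, field, message) for every violation."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for field, rule in template.items():
            value = (row.get(field) or "").strip()
            if not value:
                # Empty cells are only an error for required fields.
                if rule["required"]:
                    errors.append((row_number, field, "missing required value"))
                continue
            allowed = rule["allowed"]
            if allowed is not None and value not in allowed:
                errors.append(
                    (row_number, field, f"'{value}' is not a permitted term")
                )
    return errors


# Illustrative spreadsheet content exported as CSV.
SHEET = """sample_id,organ,preservation
S1,kidney,frozen
S2,Kidny,fresh
,heart,
"""

for error in validate_rows(SHEET, TEMPLATE):
    print(error)
```

Reporting the spreadsheet row number and field with each violation is one plausible way to let an investigator locate and correct errors in the original sheet.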
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Data Quality and Management
