Metadata Management in Scientific Computing

Eric L. Seidel

arXiv:1203.4135 · cs.DL · March 20, 2012

TL;DR

This paper proposes a new open, writable metadata management system for scientific computing datasets and codes, enabling community annotations and collaboration while maintaining data security.

Contribution

It introduces a novel approach using Fluidinfo for dynamic, social metadata management in scientific computing, exemplified with the Einstein Toolkit.

Findings

01. Fluidinfo enables open, writable metadata for scientific datasets.

02. Community annotations improve data discoverability and collaboration.

03. The system maintains data security through permissions.
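
The permission idea in the findings above can be illustrated with a small in-memory sketch. It is not the Fluidinfo API and not the author's implementation. It assumes that every tag lives under its owner's namespace (written here as "owner/tag"), that anyone may attach tags from their own namespace to any object, and that writing into another user's namespace requires an explicit grant. All names (MetadataStore, annotate, grant, search) are hypothetical.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a user writes to a namespace they may not modify."""


@dataclass
class MetadataStore:
    # Hypothetical in-memory store; not the Fluidinfo API.
    # object id -> {"owner/tag": value}
    tags: dict = field(default_factory=dict)
    # namespace owner -> set of other users allowed to write there
    writers: dict = field(default_factory=dict)

    def grant(self, owner, user):
        """Owner explicitly allows another user to write in their namespace."""
        self.writers.setdefault(owner, set()).add(user)

    def can_write(self, user, owner):
        return user == owner or user in self.writers.get(owner, set())

    def annotate(self, user, obj_id, tag_path, value):
        """Attach a tag to any object, subject to namespace permissions."""
        owner = tag_path.split("/", 1)[0]
        if not self.can_write(user, owner):
            raise PermissionDenied(f"{user} may not write under {owner}/")
        self.tags.setdefault(obj_id, {})[tag_path] = value

    def search(self, tag_path):
        """Return ids of all objects carrying the given tag, sorted."""
        return sorted(o for o, t in self.tags.items() if tag_path in t)


store = MetadataStore()
# The author publishes a description; a reader adds a tag of their own.
store.annotate("alice", "dataset-1", "alice/description", "example run")
store.annotate("bob", "dataset-1", "bob/cited-in", "example-citation")
try:
    # Bob cannot overwrite Alice's tag unless she grants him access.
    store.annotate("bob", "dataset-1", "alice/description", "overwritten")
except PermissionDenied as err:
    print("denied:", err)
```

In this sketch the dataset stays open to annotation because each user writes only into their own namespace by default, so one user's tags cannot be altered by another without a grant.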

Abstract

Complex scientific codes and the datasets they generate need a sophisticated categorization environment that allows the community to store, search, and enhance metadata in an open, dynamic system. Currently, data is often presented in a read-only format, distilled and curated by a select group of researchers. We envision a more open and dynamic system, where authors can publish their data in a writable format, allowing users to annotate the datasets with their own comments and data. This would enable the scientific community to collaborate at a higher level than before; researchers could, for example, annotate a published dataset with their citations. Such a system would require a complete set of permissions to ensure that any individual's data cannot be altered by others unless they specifically allow it. For this reason datasets and codes are generally presented…
