ERMrest: an entity-relationship data storage service for web-based, data-oriented collaboration
Karl Czajkowski, Carl Kesselman, Robert Schuler, Hongsuda Tangmunarunkit

TL;DR
ERMrest is a collaborative, RESTful data management service enabling entity-relationship modeling of scientific metadata, facilitating data sharing and integration across diverse research communities.
Contribution
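The paper introduces ERMrest, a system that combines relational modeling with web-based collaboration for scientific data management. It targets two gaps: file and asset metadata systems that hide the relational structure of the data, and traditional relational databases that do not extend to distributed, collaborative settings.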
Findings
Deployed to hundreds of users across multiple scientific communities.
Supported complex, evolving relationships in scientific metadata.
Enabled end-to-end scientific data lifecycle management.
Abstract
Scientific discovery is increasingly dependent on a scientist's ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. While the details vary from domain to domain, these data often consist of diverse digital assets (e.g. image files, sequence data, or simulation outputs) that are organized with complex relationships and context which may evolve over the course of an investigation. In addition, discovery is often collaborative, such that sharing of the data and its organizational context is highly desirable. Common systems for managing file or asset metadata hide their inherent relational structures, while traditional relational database systems do not extend to the distributed collaborative environment often seen in scientific investigations. To address these issues, we introduce ERMrest, a collaborative data management service which allows…
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Distributed and Parallel Computing Systems
