DataJoint: A Simpler Relational Data Model
Dimitri Yatsenko, Edgar Y. Walker, Andreas S. Tolias

TL;DR
DataJoint introduces a simplified, normalized relational data model with a dedicated language and visualization tools, enhancing usability in scientific data pipelines, especially in neuroscience.
Contribution
The paper presents a refined relational model with its own schema definition language, query language, and diagramming notation, making relational databases more accessible for scientific data management.
Findings
Adoption in neuroscience labs demonstrates practical utility.
The model simplifies relational data handling for scientific applications.
The query language offers clear, normalized data operations.
Abstract
The relational data model offers unrivaled rigor and precision in defining data structure and querying complex data. Yet the use of relational databases in scientific data pipelines is limited due to their perceived unwieldiness. We propose a simplified and conceptually refined relational data model named DataJoint. The model includes a language for schema definition, a language for data queries, and diagramming notation for visualizing entities and relationships among them. The model adheres to the principle of entity normalization, which requires that all data -- both stored and derived -- must be represented by well-formed entity sets. DataJoint's data query language is an algebra on entity sets with five operators that provide matching capabilities to those of other relational query languages with greater clarity due to entity normalization. Practical implementations of DataJoint…
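To make the idea of entity normalization concrete, here is a minimal sketch in plain Python. It is not the DataJoint API and not the authors' implementation. The class `EntitySet`, its methods `restrict` and `join`, and the `mouse`/`session` sample data are all hypothetical names chosen for illustration. The abstract does not name the five operators, so the two shown here are generic relational operations rather than a claim about DataJoint's operator set. The sketch only illustrates the stated principle that every query result is itself a well-formed entity set with an identifying primary key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntitySet:
    """Toy entity set (hypothetical, for illustration only).

    rows is a tuple of dicts; primary_key names the attributes that
    uniquely identify each entity.
    """
    primary_key: tuple
    rows: tuple

    def __post_init__(self):
        # Well-formedness check: no two entities may share a primary key value.
        keys = [tuple(r[k] for k in self.primary_key) for r in self.rows]
        if len(keys) != len(set(keys)):
            raise ValueError("duplicate primary key: not a well-formed entity set")

    def restrict(self, **condition):
        """Keep entities whose attributes equal the given values.

        The result has the same primary key, so it is still an entity set.
        """
        kept = tuple(
            r for r in self.rows
            if all(r[attr] == value for attr, value in condition.items())
        )
        return EntitySet(self.primary_key, kept)

    def join(self, other):
        """Natural join on identically named attributes.

        The result's primary key is the ordered union of both primary keys,
        which keeps every joined row uniquely identified.
        """
        joined = tuple(
            {**a, **b}
            for a in self.rows
            for b in other.rows
            if all(a[k] == b[k] for k in set(a) & set(b))
        )
        key = tuple(dict.fromkeys(self.primary_key + other.primary_key))
        return EntitySet(key, joined)


# Hypothetical sample data: mice and their recording sessions.
mouse = EntitySet(
    ("mouse_id",),
    ({"mouse_id": 1, "sex": "F"}, {"mouse_id": 2, "sex": "M"}),
)
session = EntitySet(
    ("mouse_id", "session"),
    (
        {"mouse_id": 1, "session": 1, "duration": 30},
        {"mouse_id": 1, "session": 2, "duration": 45},
        {"mouse_id": 2, "session": 1, "duration": 20},
    ),
)

females = mouse.restrict(sex="F")
female_sessions = females.join(session)
```

In this sketch, `female_sessions` is again an `EntitySet` whose primary key is `("mouse_id", "session")`, so it can be restricted or joined further without losing track of what each row represents.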
Taxonomy
Topics: Advanced Database Systems and Queries · Semantic Web and Ontologies · Data Management and Algorithms
