# Categorical Data Integration for Computational Science

**Authors:** Kristopher Brown, David I. Spivak, Ryan Wisnesky

arXiv: 1903.10579 · 2019-03-27

## TL;DR

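This paper applies Categorical Query Language (CQL), an open-source query and data integration scripting language, to data sharing problems in computational science. CQL migrations preserve the structure of the data they move, which guards against misinterpretation of shared data. The approach is demonstrated by integrating the Open Quantum Materials Database with alternative materials databases.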

## Contribution

The paper presents CQL, a structure-preserving data migration language, as a way to address data sharing challenges in computational science within the paradigm of functorial data migration.

## Key findings

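- CQL integrates data from the Open Quantum Materials Database with alternative materials databases.
- Structure-preserving migrations protect those who publicly share data from having that data misinterpreted.
- Those who draw on public data sources can be sure that only data meeting their specification is transferred.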

## Abstract

Categorical Query Language (CQL) is an open-source query and data integration scripting language that can be applied to common challenges in the field of computational science. We discuss how the structure-preserving nature of CQL data migrations protects those who publicly share data from the misinterpretation of their data. Likewise, this feature of CQL migrations allows those who draw from public data sources to be sure that only data that meets their specification will actually be transferred. We argue that some open problems in the field of data sharing in computational science are addressable by working within this paradigm of functorial data migration. We demonstrate these tools by integrating data from the Open Quantum Materials Database with some alternative materials databases.
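
To make the idea of a structure-preserving migration concrete, here is a minimal sketch in plain Python. It is not CQL syntax and not the authors' implementation. It shows the simplest kind of functorial data migration: pulling an instance back along a schema mapping, where each table of one schema is sent to a table of another and each foreign key is sent to a path of foreign keys. All table names, arrow names, and row ids (`Calculation`, `structure_of`, `Entry`, and so on) are hypothetical and are not taken from the Open Quantum Materials Database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    objects: frozenset  # table names
    arrows: dict        # arrow name -> (source table, target table)


@dataclass(frozen=True)
class Instance:
    rows: dict  # table name -> set of row ids
    maps: dict  # arrow name -> {source row id: target row id}


def is_valid(schema, inst):
    """An instance is valid when every arrow is a total function
    from the rows of its source table into the rows of its target table."""
    for name, (src, tgt) in schema.arrows.items():
        fn = inst.maps.get(name, {})
        if set(fn) != inst.rows.get(src, set()):
            return False
        if not set(fn.values()) <= inst.rows.get(tgt, set()):
            return False
    return True


def check_mapping(source, target, on_objects, on_arrows):
    """Check that each source arrow is sent to a well-typed path in target."""
    for name, (src, tgt) in source.arrows.items():
        at = on_objects[src]
        for step in on_arrows[name]:
            step_src, step_tgt = target.arrows[step]
            if step_src != at:
                return False
            at = step_tgt
        if at != on_objects[tgt]:
            return False
    return True


def delta(source, target, on_objects, on_arrows, inst):
    """Pull a valid instance on `target` back to an instance on `source`."""
    if not check_mapping(source, target, on_objects, on_arrows):
        raise ValueError("schema mapping does not preserve structure")
    if not is_valid(target, inst):
        raise ValueError("input instance violates its schema")
    rows = {obj: set(inst.rows[on_objects[obj]]) for obj in source.objects}
    maps = {}
    for name, (src, _) in source.arrows.items():
        fn = {}
        for row in rows[src]:
            current = row
            for step in on_arrows[name]:  # follow the path of foreign keys
                current = inst.maps[step][current]
            fn[row] = current
        maps[name] = fn
    return Instance(rows, maps)


# Hypothetical schemas, for illustration only.
target = Schema(
    frozenset({"Calculation", "Structure", "Composition"}),
    {"structure_of": ("Calculation", "Structure"),
     "composition_of": ("Structure", "Composition")},
)
source = Schema(frozenset({"Entry", "Formula"}), {"formula": ("Entry", "Formula")})

inst = Instance(
    rows={"Calculation": {"c1", "c2"}, "Structure": {"s1"}, "Composition": {"Fe2O3"}},
    maps={"structure_of": {"c1": "s1", "c2": "s1"},
          "composition_of": {"s1": "Fe2O3"}},
)

migrated = delta(
    source, target,
    on_objects={"Entry": "Calculation", "Formula": "Composition"},
    on_arrows={"formula": ["structure_of", "composition_of"]},
    inst=inst,
)

# An instance with a dangling reference ("s9" is not a Structure row).
broken = Instance(
    rows=inst.rows,
    maps={"structure_of": {"c1": "s1", "c2": "s9"},
          "composition_of": {"s1": "Fe2O3"}},
)
```

In this sketch, `delta` refuses to run when the mapping is ill-typed or the input instance violates its schema, and the instance it returns satisfies the `source` schema by construction. That is the sense in which only data meeting the specification is transferred; CQL itself supports richer schemas and constraints than this toy model.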

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.10579/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1903.10579/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1903.10579/full.md

---
Source: https://tomesphere.com/paper/1903.10579