A Formal Category Theoretical Framework for Multi-model Data Transformations

Valter Uotila; Jiaheng Lu

arXiv:2201.04905·cs.DB·January 14, 2022

TL;DR

This paper develops a category theoretical framework for modeling data and schema transformations across various data models, providing a rigorous mathematical foundation for data integration and migration processes.

Contribution

It introduces a formal category theoretical approach, using functors and Kan lifts, to model data and schema transformations in relational, graph, and hierarchical data models.

Findings

01 Formal category theoretical foundations for data models

02 Representation of data instances as functors

03 Use of Kan lifts to model transformations

Abstract

Data integration and migration processes in polystores and multi-model database management systems benefit greatly from data and schema transformations. Rigorous modeling of these transformations is a complex problem. The data and schema transformation field is fragmented across many transformation frameworks, tools, and mappings, which are usually domain-specific and lack solid theoretical foundations. Our first goal is to define category theoretical foundations for relational, graph, and hierarchical data models and instances. Each data instance is represented as a category theoretical mapping called a functor. We formalize data and schema transformations as Kan lifts, utilizing the functorial representation of the instances. A Kan lift is a category theoretical construction consisting of two mappings that satisfy a certain universal property. In this work, the two mappings…
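To make the phrase "each data instance is represented as a functor" more concrete, the following is a minimal sketch and not the authors' construction. It assumes a toy relational schema, treated as a small graph with two objects and one arrow, and an instance that assigns a finite set to each object and a total function to each arrow. The names `Arrow`, `is_instance`, `apply_path`, and the `Employee`/`Department` schema are all hypothetical and chosen only for illustration. Kan lifts are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    """A generating arrow of the schema (for example, a foreign key)."""
    name: str
    source: str
    target: str


# Hypothetical schema: Employee --works_in--> Department
OBJECTS = {"Employee", "Department"}
ARROWS = [Arrow("works_in", "Employee", "Department")]


def is_instance(objects, arrows, sets, functions):
    """Check that (sets, functions) determines a functor from the free
    category on the schema graph to the category of finite sets.

    Because the schema category is assumed to be free (no path equations),
    it suffices to check the generating arrows: each one must be sent to a
    total function between the sets assigned to its source and target.
    """
    if set(sets) != set(objects):
        return False
    for arrow in arrows:
        f = functions.get(arrow.name)
        if f is None:
            return False
        # Total: defined on every element of the source set, and only those.
        if set(f) != set(sets[arrow.source]):
            return False
        # Well-typed: every value lands in the target set.
        if not set(f.values()) <= set(sets[arrow.target]):
            return False
    return True


def apply_path(path, functions, x):
    """Image of a composite arrow: apply the functions along the path in order."""
    for arrow in path:
        x = functions[arrow.name][x]
    return x


# A small instance: two employees, two departments, one foreign key.
sets = {"Employee": {"alice", "bob"}, "Department": {"db", "ml"}}
functions = {"works_in": {"alice": "db", "bob": "ml"}}
```

Under these assumptions, `is_instance(OBJECTS, ARROWS, sets, functions)` holds, while an instance with a dangling reference (an employee mapped to a department that is not in the `Department` set) is rejected, which mirrors a violated foreign-key constraint in the relational reading.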
