A Formal Category Theoretical Framework for Multi-model Data Transformations
Valter Uotila, Jiaheng Lu

TL;DR
This paper develops a category theoretical framework for modeling data and schema transformations across various data models, providing a rigorous mathematical foundation for data integration and migration processes.
Contribution
It introduces a formal category theoretical approach, using functors and Kan lifts, to model data and schema transformations in relational, graph, and hierarchical data models.
Findings
Formal category theoretical foundations for data models
Representation of data instances as functors
Use of Kan lifts to model transformations
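To illustrate what "data instances as functors" can mean, here is a minimal Python sketch. It is not the paper's construction: it assumes a toy relational-style schema viewed as a free category (objects are entity types, arrows are foreign-key-like references), and an instance that assigns a set to each object and a total function to each arrow. All names (`Arrow`, `is_instance`, `apply_path`, the sample schema and data) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    """A generating arrow of the (hypothetical) schema category."""
    name: str
    src: str
    dst: str


# Hypothetical schema: Employee --works_in--> Department --located_in--> City
OBJECTS = {"Employee", "Department", "City"}
ARROWS = [
    Arrow("works_in", "Employee", "Department"),
    Arrow("located_in", "Department", "City"),
]

# Hypothetical instance: a set per object, a dict (function) per arrow.
SETS = {
    "Employee": {"alice", "bob"},
    "Department": {"db", "ml"},
    "City": {"helsinki"},
}
MAPS = {
    "works_in": {"alice": "db", "bob": "ml"},
    "located_in": {"db": "helsinki", "ml": "helsinki"},
}


def is_instance(objects, arrows, sets, maps):
    """Check that (sets, maps) defines a functor from the free category
    on the schema graph to Set: every object gets a set, and every arrow
    gets a total function from its source set into its target set."""
    if set(sets) != set(objects):
        return False
    for a in arrows:
        f = maps.get(a.name)
        if f is None:
            return False
        if set(f) != sets[a.src]:  # function must be total on the source set
            return False
        if not set(f.values()) <= sets[a.dst]:  # and land in the target set
            return False
    return True


def apply_path(maps, path, x):
    """Image of a composite arrow: apply the generating functions in order."""
    for a in path:
        x = maps[a.name][x]
    return x
```

Because the schema is treated as a free category, the functor is determined by its values on the generating arrows, so composites such as `apply_path(MAPS, ARROWS, "alice")` need no separate check. A schema with path equations would require additionally verifying that equal paths have equal images.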
Abstract
Data integration and migration processes in polystores and multi-model database management systems benefit greatly from data and schema transformations. Rigorous modeling of these transformations is a complex problem. The data and schema transformation field is fragmented across many transformation frameworks, tools, and mappings. These are usually domain-specific and lack solid theoretical foundations. Our first goal is to define category theoretical foundations for relational, graph, and hierarchical data models and instances. Each data instance is represented as a category theoretical mapping called a functor. We formalize data and schema transformations as Kan lifts utilizing the functorial representation for the instances. A Kan lift is a category theoretical construction consisting of two mappings satisfying a certain universal property. In this work, the two mappings…
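For reference, the standard textbook definition of a Kan lift can be stated as follows. The abstract is truncated here, so this does not say which variant (left or right) the paper uses or how the paper instantiates the two mappings; the symbols $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, $F$, $G$, $L$, $R$, $H$, $\eta$, $\varepsilon$, $\theta$, $\sigma$ are notation assumed for this sketch only.

```latex
Let $F \colon \mathcal{A} \to \mathcal{C}$ and $G \colon \mathcal{B} \to \mathcal{C}$ be functors.

A \emph{left Kan lift} of $F$ through $G$ is a pair $(L, \eta)$ consisting of a functor
$L \colon \mathcal{A} \to \mathcal{B}$ and a natural transformation
$\eta \colon F \Rightarrow G \circ L$ such that for every pair $(H, \theta)$ with
$H \colon \mathcal{A} \to \mathcal{B}$ and $\theta \colon F \Rightarrow G \circ H$
there is a unique $\sigma \colon L \Rightarrow H$ with
\[
  \theta = (G\sigma) \circ \eta .
\]

Dually, a \emph{right Kan lift} of $F$ through $G$ is a pair $(R, \varepsilon)$ with
$R \colon \mathcal{A} \to \mathcal{B}$ and $\varepsilon \colon G \circ R \Rightarrow F$
such that for every pair $(H, \theta)$ with $\theta \colon G \circ H \Rightarrow F$
there is a unique $\sigma \colon H \Rightarrow R$ with
\[
  \theta = \varepsilon \circ (G\sigma) .
\]
```

In this standard formulation, the "two mappings" are a functor and a natural transformation, and the universal property is the unique factorization through $\eta$ or $\varepsilon$.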
