Baseline: Operation-Based Evolution and Versioning of Data
Jonathan Edwards, Tomas Petricek

TL;DR
Baseline introduces an operation-based approach to data versioning that enables fine-grained diffing, merging, and schema evolution management across diverse data structures, simplifying collaboration and change tracking.
Contribution
The paper presents Operational Differencing, a technique for managing structural data changes in terms of high-level operations. It enables version control without repositories and supports flexible sharing and schema evolution.
Findings
Supports fine-grained diffing and merging despite schema changes
Enables simplified version control with no repositories or branching complexity
Rewrites queries to adapt to schema changes automatically
Abstract
Baseline is a platform for richly structured data supporting change in multiple dimensions: mutation over time, collaboration across space, and evolution through design changes. It is built upon Operational Differencing, a new technique for managing data in terms of high-level operations that include refactorings and schema changes. We use operational differencing to construct an operation-based form of version control on data structures used in programming languages and relational databases. This approach to data version control does fine-grained diffing and merging despite intervening structural transformations like schema changes. It offers users a simplified conceptual model of version control for ad hoc usage: there is no repo; branching is just copying. The information maintained in a repo can be synthesized more precisely from the append-only histories of branches. Branches can be…
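To make the abstract's model concrete, the following is a minimal sketch of operation-based versioning, not the paper's actual algorithm or API. It assumes a single flat record and only two operation types; the names `SetField`, `RenameField`, `branch`, `diff`, `transform`, and `merge` are hypothetical and chosen for illustration. It shows a branch as a plain copy of an append-only history, a diff as the operations after the shared prefix, and a merge that retargets an edit across an intervening field rename.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SetField:
    """Hypothetical data operation: assign a value to a field."""
    field: str
    value: object


@dataclass(frozen=True)
class RenameField:
    """Hypothetical schema operation: rename a field."""
    old: str
    new: str


Op = Union[SetField, RenameField]


def apply(state: dict, op: Op) -> dict:
    """Return a new state with one operation applied."""
    state = dict(state)
    if isinstance(op, SetField):
        state[op.field] = op.value
    elif isinstance(op, RenameField) and op.old in state:
        state[op.new] = state.pop(op.old)
    return state


def replay(history: list) -> dict:
    """Rebuild the current state from an append-only history."""
    state: dict = {}
    for op in history:
        state = apply(state, op)
    return state


def branch(history: list) -> list:
    """Branching is just copying the history; there is no repo object."""
    return list(history)


def diff(a: list, b: list) -> tuple:
    """Return the operations each history has after their common prefix."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[n:], b[n:]


def transform(op: Op, against: list) -> Op:
    """Retarget a SetField so it follows any renames in `against`."""
    if isinstance(op, SetField):
        for other in against:
            if isinstance(other, RenameField) and other.old == op.field:
                op = SetField(other.new, op.value)
    return op


def merge(ours: list, theirs: list) -> list:
    """Append their divergent operations, transformed through ours."""
    ours_only, theirs_only = diff(ours, theirs)
    return ours + [transform(op, ours_only) for op in theirs_only]


# Two branches diverge: one renames a field, the other edits it.
main = [SetField("name", "Ada"), SetField("tel", "555")]
alice = branch(main) + [RenameField("tel", "phone")]
bob = branch(main) + [SetField("tel", "777")]

merged = merge(alice, bob)
print(replay(merged))  # {'name': 'Ada', 'phone': '777'}
```

Because Bob's edit is recorded as an operation on `tel` rather than as a changed snapshot, it can be rewritten to target `phone` once Alice's rename is known. A snapshot-based diff would instead see an added field and a deleted field. This sketch ignores conflicts, nested structures, and chains of renames on both sides, all of which a real system would need to handle.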
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Logic, programming, and type systems
