MigCast in Monte Carlo: The Impact of Data Model Evolution in NoSQL Databases
Andrea Hillenbrand, Uta Störl, Shamil Nabiyev, Stefanie Scherzinger

TL;DR
This paper investigates how data model evolution affects migration costs and latency in NoSQL databases, using Monte Carlo simulations to analyze various migration scenarios and inform software release strategies.
Contribution
It introduces a probabilistic Monte Carlo approach to evaluate the impact of schema changes on migration costs and performance in NoSQL databases, aiding decision-making.
Findings
Migration costs vary significantly with data access patterns and schema change types.
Monte Carlo sampling effectively manages the complexity of migration scenario analysis.
The study provides insights into optimizing release strategies to balance migration costs and latency.
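To make the idea of sampling migration scenarios concrete, here is a minimal Python sketch. It is not the paper's implementation: the scenario parameters (`entities`, `accessed_share`, `changes`), the per-change cost table, and the cost formula are all hypothetical and chosen only for illustration. It shows how a cost distribution might be estimated from random samples instead of enumerating every combination of scenario characteristics.

```python
import random
import statistics

# Hypothetical relative cost of migrating one entity, per schema change type.
# These numbers are illustrative assumptions, not measurements from the paper.
CHANGE_COST = {"add": 1.0, "rename": 1.0, "delete": 1.0, "copy": 2.0, "move": 2.0}


def sample_scenario(rng):
    """Draw one hypothetical migration scenario at random."""
    n_changes = rng.randint(1, 5)
    return {
        "entities": rng.randint(1_000, 100_000),  # legacy entities stored
        "accessed_share": rng.random(),           # share of entities that get accessed
        "changes": [rng.choice(list(CHANGE_COST)) for _ in range(n_changes)],
    }


def migration_cost(scenario):
    """Toy cost model (assumption): accessed entities are migrated once per change."""
    per_entity = sum(CHANGE_COST[c] for c in scenario["changes"])
    return scenario["entities"] * scenario["accessed_share"] * per_entity


def monte_carlo(n_samples, seed=0):
    """Estimate mean and standard deviation of the cost over random scenarios."""
    rng = random.Random(seed)
    costs = [migration_cost(sample_scenario(rng)) for _ in range(n_samples)]
    return statistics.mean(costs), statistics.stdev(costs)


if __name__ == "__main__":
    mean, spread = monte_carlo(10_000)
    print(f"mean cost: {mean:.0f}, std dev: {spread:.0f}")
```

Fixing the seed makes a run reproducible, and increasing `n_samples` tightens the estimate at the price of more computation. A realistic model would also need to account for latency, which this sketch leaves out.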
Abstract
During the development of NoSQL-backed software, the data model evolves naturally alongside the application code. Especially in agile development, new application releases are deployed frequently causing schema changes. Eventually, decisions have to be made regarding the migration of versioned legacy data which is persisted in the cloud-hosted production database. We solve this schema evolution problem and present the results of near-exhaustive calculations by means of which software project stakeholders can manage the operative costs for data model evolution and adapt their software release strategy accordingly in order to comply with service-level agreements regarding the competing metrics of migration costs and latency. We clarify conclusively how data model evolution in NoSQL databases impacts the metrics while taking all relevant characteristics of migration scenarios into account.…
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Cloud Computing and Resource Management
