OrpheusDB: Bolt-on Versioning for Relational Databases
Silu Huang, Liqi Xu, Jialin Liu, Aaron Elmore, Aditya Parameswaran

TL;DR
OrpheusDB adds version control to a traditional relational database, letting data science teams store, track, and query dataset versions efficiently through versioned data models and a lightweight partitioning scheme.
Contribution
It develops and evaluates multiple data models for representing versioned data on top of a traditional relational database, and introduces LyreSplit, a lightweight partitioning scheme that optimizes those models for lower query latency.
Findings
LyreSplit reduces query latency by up to 20x.
With LyreSplit, OrpheusDB finds effective (and better) partitionings on average 1000x faster than competing approaches.
Version retrieval latency is significantly reduced with partitioning.
Abstract
Data science teams often collaboratively analyze datasets, generating dataset versions at each stage of iterative exploration and analysis. There is a pressing need for a system that can support dataset versioning, enabling such teams to efficiently store, track, and query across dataset versions. We introduce OrpheusDB, a dataset version control system that "bolts on" versioning capabilities to a traditional relational database system, thereby gaining the analytics capabilities of the database "for free". We develop and evaluate multiple data models for representing versioned data, as well as a light-weight partitioning scheme, LyreSplit, to further optimize the models for reduced query latencies. With LyreSplit, OrpheusDB is on average 1000x faster in finding effective (and better) partitionings than competing approaches, while also reducing the latency of version retrieval by up to…
Taxonomy
Topics: Advanced Database Systems and Queries · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
