The Sloan Digital Sky Survey Science Archive: Migrating a Multi-Terabyte Astronomical Archive from Object to Relational DBMS
Aniruddha R. Thakar, Alexander S. Szalay, Peter Z. Kunszt, Jim Gray

TL;DR
The paper describes the challenges of managing a multi-terabyte astronomical data archive that was initially built on an object-oriented database, and the subsequent migration to a relational database to improve query support and data mining capabilities.
Contribution
It provides a detailed case study of migrating a large-scale scientific archive from object-oriented to relational database technology, highlighting technical challenges and solutions.
Findings
Object database performance was insufficient for data mining needs.
Relational database migration improved query support and performance.
Vendor support limitations influenced the migration decision.
Abstract
The Sloan Digital Sky Survey Science Archive is the first in a series of multi-Terabyte digital archives in Astronomy and other data-intensive sciences. To facilitate data mining in the SDSS archive, we adapted a commercial database engine and built specialized tools on top of it. Originally we chose an object-oriented database management system due to its data organization capabilities, platform independence, query performance and conceptual fit to the data. However, after using the object database for the first couple of years of the project, it soon began to fall short in terms of its query support and data mining performance. This was as much due to the inability of the database vendor to respond to our demands for features and bug fixes as it was due to their failure to keep up with the rapid improvements in hardware performance, particularly faster RAID disk systems. In the end, we…
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Data Mining Algorithms and Applications
