Large-scale Biological Meta-database Management
Edvard Pedersen, Lars Ailo Bongo

TL;DR
The paper introduces GeStore, a system for managing large biological meta-databases that enables efficient storage, versioning, and updates, facilitating biological data analysis workflows amid rapidly growing data sizes.
Contribution
It presents GeStore, a system that manages large-scale biological meta-databases with versioned storage and incremental updates, and that is transparent to the analysis tools it serves.
Findings
GeStore significantly reduces storage requirements for meta-databases.
It enables fast retrieval and updating of specific database versions.
The system improves workflow efficiency in biological data analysis.
Abstract
Up-to-date meta-databases are vital for the analysis of biological data. However, the current exponential increase in biological data leads to exponentially increasing meta-database sizes. Large-scale meta-database management is therefore an important challenge for production platforms providing services for biological data analysis. In particular, there is often a need either to run an analysis with a particular version of a meta-database, or to rerun an analysis with an updated meta-database. We present our GeStore approach for biological meta-database management. It provides efficient storage and runtime generation of specific meta-database versions, and efficient incremental updates for biological data analysis tools. The approach is transparent to the tools, and we provide a framework that makes it easy to integrate GeStore with biological data analysis frameworks. We present the…
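The two operations the abstract names, generating a specific meta-database version at runtime and producing an incremental update between two versions, can be illustrated with a toy timestamp-versioned store. This is a minimal sketch and not GeStore's implementation: the class `VersionedStore`, its methods, the integer timestamps, and the entry IDs are all hypothetical, and everything is held in an in-memory dictionary rather than in whatever storage backend the paper uses.

```python
from bisect import bisect_right
from collections import defaultdict


class VersionedStore:
    """Hypothetical toy store: every write is kept, keyed by entry ID."""

    def __init__(self):
        # entry_id -> list of (timestamp, value), kept sorted by timestamp
        self._history = defaultdict(list)

    def put(self, entry_id, timestamp, value):
        """Record a value for entry_id at timestamp (None marks a deletion)."""
        versions = self._history[entry_id]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0])

    def _value_at(self, entry_id, timestamp):
        """Latest value written at or before timestamp, or None if absent."""
        versions = self._history.get(entry_id, [])
        times = [t for t, _ in versions]
        i = bisect_right(times, timestamp)
        return versions[i - 1][1] if i else None

    def snapshot(self, timestamp):
        """Materialize the whole meta-database as of timestamp."""
        result = {}
        for entry_id in self._history:
            value = self._value_at(entry_id, timestamp)
            if value is not None:
                result[entry_id] = value
        return result

    def delta(self, old_ts, new_ts):
        """Entries whose value differs between two versions.

        A deleted entry maps to None in the returned dictionary.
        """
        changed = {}
        for entry_id in self._history:
            old = self._value_at(entry_id, old_ts)
            new = self._value_at(entry_id, new_ts)
            if old != new:
                changed[entry_id] = new
        return changed


store = VersionedStore()
store.put("P1", 1, "MKT")
store.put("P2", 1, "GAV")
store.put("P1", 2, "MKTA")   # P1 updated in version 2
store.put("P3", 2, "LLS")    # P3 added in version 2

version_1 = store.snapshot(1)    # full database as of version 1
update_1_to_2 = store.delta(1, 2)  # only what changed since version 1
```

Under these assumptions, an analysis pinned to version 1 reads `version_1`, while a rerun against version 2 only needs to process the entries in `update_1_to_2` rather than the full database.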
