Large-Scale Query and XMatch, Entering the Parallel Zone
Maria A. Nieto-Santisteban, Aniruddha R. Thakar, Alexander S. Szalay, Jim Gray

TL;DR
This paper demonstrates how a Relational Database Management System (RDBMS) combined with a zoning algorithm can improve large-scale astronomical queries and cross-matching, enabling efficient handling of massive catalogs within the Virtual Observatory framework.
Contribution
It introduces a zoning-based parallelization approach combined with RDBMS to enhance large-scale astronomical data querying and cross-matching capabilities.
Findings
Zoning algorithm improves parallel processing efficiency.
RDBMS-based approach enables handling billions of objects.
Performance tests validate scalability and speed improvements.
Abstract
Current and future astronomical surveys are producing catalogs with millions and billions of objects. On-line access to such big datasets for data mining and cross-correlation is usually as highly desired as it is infeasible. Providing these capabilities is becoming critical for the Virtual Observatory framework. In this paper we present various performance tests that show how, by using Relational Database Management Systems (RDBMS) and a Zoning algorithm to partition and parallelize the computation, we can facilitate large-scale query and cross-match.
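The abstract names a Zoning algorithm without describing it, so the following is only a minimal Python sketch of the general idea, not the authors' RDBMS implementation. It assumes zones are fixed-height declination stripes, that catalogs are lists of `(id, ra, dec)` tuples in degrees, and that a match means an angular separation within `radius_deg`. All function names and parameters here are hypothetical.

```python
import math
from collections import defaultdict


def zone_id(dec_deg, zone_height_deg):
    """Map a declination to the index of its horizontal stripe (zone)."""
    return int(math.floor((dec_deg + 90.0) / zone_height_deg))


def build_zones(catalog, zone_height_deg):
    """Bucket (id, ra, dec) rows by zone; a database would index (zone, ra)."""
    zones = defaultdict(list)
    for obj_id, ra, dec in catalog:
        zones[zone_id(dec, zone_height_deg)].append((ra, dec, obj_id))
    return zones


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(cat_a, cat_b, radius_deg, zone_height_deg):
    """Return (id_a, id_b) pairs whose separation is within radius_deg."""
    zones_b = build_zones(cat_b, zone_height_deg)
    matches = []
    for id_a, ra_a, dec_a in cat_a:
        # Only zones overlapping [dec - r, dec + r] can contain a match.
        z_lo = zone_id(dec_a - radius_deg, zone_height_deg)
        z_hi = zone_id(dec_a + radius_deg, zone_height_deg)
        # Coarse RA window, widened toward the poles where meridians converge.
        cos_dec = math.cos(math.radians(min(90.0, abs(dec_a) + radius_deg)))
        ra_win = radius_deg / cos_dec if cos_dec > 1e-9 else 360.0
        for z in range(z_lo, z_hi + 1):
            for ra_b, dec_b, id_b in zones_b.get(z, ()):
                d_ra = abs(ra_b - ra_a)
                d_ra = min(d_ra, 360.0 - d_ra)  # handle the 0/360 wrap
                if d_ra > ra_win:
                    continue
                # Exact test only for candidates that pass the coarse filter.
                if angular_sep_deg(ra_a, dec_a, ra_b, dec_b) <= radius_deg:
                    matches.append((id_a, id_b))
    return matches
```

Because each object only needs the zones within `radius_deg` of its own declination, contiguous ranges of zones can be assigned to separate workers or servers, which is the sense in which zoning partitions and parallelizes the computation. Calling `cross_match(cat_a, cat_b, radius_deg=0.001, zone_height_deg=0.01)` would compare two small catalogs with a 3.6 arcsecond match radius.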
Taxonomy
Topics: Advanced Database Systems and Queries · Data Mining Algorithms and Applications · Mobile Agent-Based Network Management
