Mining Knowledge in Astrophysical Massive Data Sets
M. Brescia, G. Longo, F. Pasian

TL;DR
This paper discusses the challenges and progress in applying data mining techniques to massive, heterogeneous astronomical datasets within the Virtual Observatory framework, enabling advanced multi-wavelength scientific analysis.
Contribution
It summarizes the current status of, and future plans for, applying advanced data mining methodologies to massive astronomical data sets through the DAME project.
Findings
Progress in federating astronomical archives under common standards
Development of scalable data mining algorithms for massive data sets (MDS)
Enhancement of multi-wavelength, multi-epoch scientific analysis
Abstract
Modern scientific data mainly consist of huge datasets gathered by a very large number of techniques and stored in highly diversified and often incompatible data repositories. More generally, in the e-science environment, integrating services across distributed, heterogeneous, dynamic "virtual organizations" formed by different resources within a single enterprise is considered a critical and urgent requirement. In the last decade, astronomy has become an immensely data-rich field due to the evolution of detectors (plates to digital to mosaics), telescopes and space instruments. The Virtual Observatory approach consists of federating, under common standards, all astronomical archives available worldwide, as well as data analysis, data mining and data exploration applications. The main drive behind this effort is that, once the infrastructure is completed, it will…
