Thalia: A Global, Multi-Modal Dataset for Volcanic Activity Monitoring
Nikolas Papadopoulos, Nikolaos Ioannis Bountos, Maria Sdraka, Andreas Karavias, Gustau Camps-Valls, Ioannis Papoutsis

TL;DR
Thalia is a comprehensive, multi-modal dataset combining satellite, topographic, and atmospheric data over 7 years, designed to improve automated volcanic activity monitoring through machine learning.
Contribution
The paper introduces Thalia, a large-scale, multi-source dataset with expert annotations, addressing previous data limitations and enabling advanced ML-based volcanic deformation analysis.
Findings
Benchmark results demonstrate that state-of-the-art models can effectively classify volcanic deformation.
Thalia enables improved detection and segmentation of volcanic activity signals.
The dataset fosters collaboration between machine learning and Earth science communities.
Abstract
Monitoring volcanic activity is of paramount importance to safeguarding lives, infrastructure, and ecosystems. However, only a small fraction of known volcanoes are continuously monitored. Satellite-based Interferometric Synthetic Aperture Radar (InSAR) enables systematic, global-scale deformation monitoring, yet the complexity of InSAR data challenges traditional remote sensing methods. Deep learning offers a powerful means to automate and enhance InSAR interpretation, advancing volcanology and geohazard assessment. Despite its promise, progress has been limited by the scarcity of well-curated datasets. In this work, we build on the existing Hephaestus dataset and introduce Thalia, addressing crucial limitations and enriching its scope with higher-resolution, multi-source, and multi-temporal data. Thalia is a global collection of 38 spatiotemporal datacubes covering 7 years and integrating…
