GlobalTomo: A global dataset for physics-ML seismic wavefield modeling and FWI
Shiqian Li, Zhi Li, Zhancun Mu, Shiji Xin, Zhixiang Dai, Kuangdai Leng, Ruihua Zhang, Xiaodong Song, Yixin Zhu

TL;DR
This paper introduces GlobalTomo, a comprehensive 3D synthetic dataset for seismic wavefield modeling and FWI, demonstrating ML's potential to accelerate global seismic tomography and improve our understanding of the Earth's interior.
Contribution
It provides the first large-scale 3D synthetic dataset for global seismic FWI, integrating physics-based modeling with ML benchmarks to enhance computational efficiency.
Findings
ML approaches are suitable for global FWI.
The dataset enables rapid forward modeling.
ML improves inversion flexibility.
Abstract
Global seismic tomography, taking advantage of seismic waves from natural earthquakes, provides essential insights into the earth's internal dynamics. Advanced Full-waveform Inversion (FWI) techniques, whose aim is to meticulously interpret every detail in seismograms, confront formidable computational demands in forward modeling and adjoint simulations on a global scale. Recent advancements in Machine Learning (ML) offer a transformative potential for accelerating the computational efficiency of FWI and extending its applicability to larger scales. This work presents the first 3D global synthetic dataset tailored for seismic wavefield modeling and full-waveform tomography, referred to as the GlobalTomo dataset. This dataset is uniquely comprehensive, incorporating explicit wave physics and robust geophysical parameterization at realistic global scales, generated through…
Peer Reviews
Decision: Submitted to ICLR 2025
This paper addresses a key gap by providing a global-scale dataset that integrates ML-friendly features with robust physical modeling for seismic applications. The dataset’s design across three tiers (acoustic, elastic, and real Earth) allows it to support scalable and complex seismic modeling tasks.
1. Wavefield Estimation. The necessity of including the wavefield as part of the dataset is not entirely convincing. While the authors provide further discussion on this in Supplementary Section F.2, this inclusion could be misleading for two primary reasons. First, from a practical inverse problem perspective, only surface wavefield measurements (seismograms) are typically available in real-world FWI applications. The full wavefield throughout the Earth’s interior is generally inaccessible and
- The dataset itself is good and could benefit the community a lot.
- The paper is well-written and easy to follow.
- The figures and tables are clear.
I think the machine learning models are rather simple. In my experience, MLPs are not the best choice for 3D problems compared with the other methods chosen in the paper (CNN or transformer), yet the results show that MLP / InversionMLP performs best in most cases. I am concerned about whether all methods were well trained and run with their best hyperparameters.
1. The idea to generate a large-scale dataset for global seismic tomography is interesting and essential to bridge the gap between machine learning and seismic tomography.
2. The proposed dataset has three tiers to model seismic wave propagation from simple to complex settings, and for each tier the dataset contains seismic wavefield and seismogram information.
3. The paper explores multiple ML methods to establish a benchmark for both forward and inverse problems in seismic tomography.
1. Although the paper is well written overall, certain sections are somewhat obscure, such as those on model configuration, spherical harmonics, and data generation. These sections need improvement so that readers can understand how the dataset was generated, how the models were trained, and what the inference results show. (a) When generating different examples, do we always perturb the same 1D background model, or different ones? How does the data generation algorithm create varying geological settings in the dataset? How do we measure
Taxonomy
Topics: Seismology and Earthquake Studies · Seismic Imaging and Inversion Techniques · Geological Modeling and Analysis
