The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
Ruben Ohana, Michael McCabe, Lucas Meyer, Rudy Morel, Fruzsina J. Agocs, Miguel Beneitez, Marsha Berger, Blakesley Burkhart, Keaton Burns, Stuart B. Dalziel, Drummond B. Fielding, Daniel Fortunato, Jared A. Goldberg, Keiya Hirashima, Yan-Fei Jiang, Rich R. Kerswell

TL;DR
The paper introduces 'The Well,' a comprehensive collection of large-scale, diverse physics simulation datasets designed to advance machine learning surrogate models and provide a broad benchmark for evaluating new approaches.
Contribution
It provides a large, diverse, and unified dataset collection with a PyTorch interface, addressing limitations of existing small, domain-specific datasets in physics-based machine learning.
Findings
Demonstrated the utility of the dataset with baseline models.
Highlighted new challenges in modeling complex physical dynamics.
Provided open access to data and tools for the community.
Abstract
Machine learning based surrogate models offer researchers powerful tools for accelerating simulation-based workflows. However, as standard datasets in this space often cover small classes of physical behavior, it can be difficult to evaluate the efficacy of new approaches. To address this gap, we introduce the Well: a large-scale collection of datasets containing numerical simulations of a wide variety of spatiotemporal physical systems. The Well draws from domain experts and numerical software developers to provide 15TB of data across 16 datasets covering diverse domains such as biological systems, fluid dynamics, acoustic scattering, as well as magneto-hydrodynamic simulations of extra-galactic fluids or supernova explosions. These datasets can be used individually or as part of a broader benchmark suite. To facilitate usage of the Well, we provide a unified PyTorch interface for…
Code & Models
- polymathic-ai/acoustic_scattering_discontinuous · dataset · 237 dl
- polymathic-ai/helmholtz_staircase · dataset · 102 dl
- polymathic-ai/acoustic_scattering_inclusions · dataset · 174 dl
- polymathic-ai/acoustic_scattering_maze · dataset · 137 dl
- polymathic-ai/post_neutron_star_merger · dataset · 100 dl
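The repositories listed above could be fetched programmatically along the following lines. This is a minimal sketch, not the paper's own interface: it assumes the identifiers are dataset repositories on the Hugging Face Hub (the page does not name the host) and that the `huggingface_hub` package is installed. `WELL_REPOS`, `short_name`, and `fetch_dataset` are hypothetical names introduced for illustration.

```python
# Repository identifiers copied from the list above.
WELL_REPOS = [
    "polymathic-ai/acoustic_scattering_discontinuous",
    "polymathic-ai/helmholtz_staircase",
    "polymathic-ai/acoustic_scattering_inclusions",
    "polymathic-ai/acoustic_scattering_maze",
    "polymathic-ai/post_neutron_star_merger",
]


def short_name(repo_id: str) -> str:
    """Return the dataset name without the organization prefix."""
    return repo_id.split("/", 1)[1]


def fetch_dataset(repo_id: str, local_dir: str) -> str:
    """Download one dataset repository and return its local path.

    Assumption: the repository is hosted on the Hugging Face Hub
    with repo_type="dataset". The import is done lazily so that the
    rest of this module works without huggingface_hub installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Only list the names here; the actual downloads can be very large.
    for repo in WELL_REPOS:
        print(short_name(repo))
```

Since the full collection is described as 15TB, it is probably wise to call `fetch_dataset` on one repository at a time and to check available disk space first.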
Taxonomy
Topics: Scientific Computing and Data Management · Computational Physics and Python Applications · Advanced Data Storage Technologies
Methods: Lib
