A Novel Diffusion Model for Pairwise Geoscience Data Generation with Unbalanced Training Dataset
Junhuan Yang, Yuzhou Zhang, Yi Sheng, Youzuo Lin, Lei Yang

TL;DR
This paper introduces UB-Diff, a diffusion model designed for generating multi-modal pairwise scientific data, effectively handling unbalanced datasets in geoscience applications like seismic imaging.
Contribution
The paper presents a novel encoder-decoder diffusion model that leverages co-latent representations to generate paired multi-modal data from unbalanced datasets, improving over existing methods.
Findings
UB-Diff outperforms existing models in FID scores.
It produces reliable multi-modal pairwise data.
Experimental results validate its effectiveness on the OpenFWI dataset.
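For readers unfamiliar with the FID metric cited in the findings, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors. It assumes the features have already been extracted (FID conventionally uses Inception-network activations) and have at least two dimensions; the function name `frechet_distance` and the NumPy-only implementation are illustrative choices, not the evaluation code used in the paper.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input has shape (n_samples, n_features) with n_features >= 2.
    Lower values mean the two feature distributions are closer.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b. Those eigenvalues are real and
    # non-negative in exact arithmetic; clip round-off negatives to zero.
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(
        diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt
    )
```

Under this definition, two identical feature sets give a distance of approximately zero, and shifting one set by a constant vector adds the squared length of that shift.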
Abstract
Recently, the advent of generative AI technologies has had a transformational impact on our daily lives, yet their use in scientific applications remains in its early stages. Data scarcity is a major, well-known barrier in data-driven scientific computing, so physics-guided generative AI holds significant promise. In scientific computing, most tasks study the conversion between multiple data modalities that describe physical phenomena, for example, spatial and waveform in seismic imaging, time and frequency in signal processing, and temporal and spectral in climate modeling; as such, multi-modal pairwise data generation is needed rather than the single-modal data generation usually used for natural images (e.g., faces, scenery). Moreover, in real-world applications, the available data is commonly unbalanced across modalities; for example, the spatial data (i.e.,…
Taxonomy
Topics: Artificial Intelligence in Healthcare · Big Data Technologies and Applications · Advanced Clustering Algorithms Research
Methods: Diffusion
