TL;DR
This paper investigates copula-based models for generating synthetic data to augment the training sets of machine-learning emulators in weather and climate. On a toy test case, augmentation reduces the emulator's mean absolute error by up to 62%.
Contribution
It introduces a copula-based data augmentation method specifically for enhancing machine-learning emulators in weather and climate modeling.
Findings
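Mean absolute error reduced by up to 62% (from 1.17 to 0.44 W m⁻²).
Emulators trained on copula-augmented datasets predict better than those trained on the original data alone.
Method tested on a toy physical model of downwelling longwave radiation and its neural network emulator.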
Abstract
Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may be useful for augmenting training sets cheaply. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m⁻²).
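To make the idea of copula-based generation concrete, here is a minimal sketch of a two-variable Gaussian copula sampler using only the Python standard library. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which copula family or software was used, and the variable names (`temperature`, `humidity`), the toy data, and all function names are invented for this example. The sketch fits the dependence between two inputs from their ranks, draws new correlated samples, and maps them back to the original scale through the empirical quantiles of each input.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def normal_scores(values):
    """Replace each value by the normal quantile of its rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def empirical_quantile(sorted_values, u):
    """Linearly interpolated empirical quantile at probability u in [0, 1]."""
    pos = u * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def sample_gaussian_copula(x, y, n_samples, seed=0):
    """Draw synthetic (x, y) pairs that keep the marginals and rank dependence."""
    rng = random.Random(seed)
    # Dependence is estimated on the normal scores, not on the raw values.
    rho = pearson(normal_scores(x), normal_scores(y))
    residual_scale = max(0.0, 1.0 - rho ** 2) ** 0.5
    xs, ys = sorted(x), sorted(y)
    samples = []
    for _ in range(n_samples):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + residual_scale * rng.gauss(0.0, 1.0)
        # Normal CDF gives uniforms; empirical quantiles restore data units.
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(xs, u1), empirical_quantile(ys, u2)))
    return samples


# Hypothetical toy inputs (invented for illustration, not from the paper).
data_rng = random.Random(1)
temperature = [data_rng.gauss(280.0, 10.0) for _ in range(200)]
humidity = [0.05 * (t - 280.0) + data_rng.gauss(5.0, 0.3) for t in temperature]

synthetic = sample_gaussian_copula(temperature, humidity, 500, seed=2)
augmented_inputs = list(zip(temperature, humidity)) + synthetic
```

In a setting like the one the abstract describes, the synthetic inputs would presumably then be passed through the physical model to obtain targets before training the emulator. Because this sketch interpolates empirical quantiles, it never produces values outside the observed range of each input.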
