Data Assimilation with Machine Learning Surrogate Models: A Case Study with FourCastNet
Melissa Adrian, Daniel Sanz-Alonso, Rebecca Willett

TL;DR
This paper demonstrates that machine learning surrogate models like FourCastNet can be effectively used for long-term weather data assimilation, maintaining accuracy over extended periods despite model instability and observational sparsity.
Contribution
It integrates FourCastNet within a variational data assimilation framework driven by partial, noisy ERA5 observations, and shows both empirically and theoretically that the resulting filtering estimates remain accurate over long time horizons.
Findings
Filtering estimates remain accurate over a year-long window.
The filtering estimates provide effective initial conditions for forecasting tasks, including extreme event prediction.
Long-horizon accuracy is achievable despite the long-time instability of the surrogate and the sparsity of the observations.
Abstract
Modern data-driven surrogate models for weather forecasting provide accurate short-term predictions but inaccurate and nonphysical long-term forecasts. This paper investigates online weather prediction using machine learning surrogates supplemented with partial and noisy observations. We empirically demonstrate and theoretically justify that, despite the long-time instability of the surrogates and the sparsity of the observations, filtering estimates can remain accurate in the long-time horizon. As a case study, we integrate FourCastNet, a weather surrogate model, within a variational data assimilation framework using partial, noisy ERA5 data. Our results show that filtering estimates remain accurate over a year-long assimilation window and provide effective initial conditions for forecasting tasks, including extreme event prediction.
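The filtering idea in the abstract can be illustrated with a minimal sketch. The abstract only says "variational data assimilation", so this sketch assumes a 3DVar-style analysis step with fixed background covariance B and observation noise covariance R. FourCastNet and ERA5 are replaced by hypothetical toy stand-ins: the "truth" is a 2D rotation, the "surrogate" is the same rotation with a 2% amplitude growth per step (so it is unstable over long horizons), and only the first state component is observed, with noise. All names and parameter values below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

THETA = 1.0  # rotation angle per step (arbitrary toy value)
ROT = np.array([[np.cos(THETA), -np.sin(THETA)],
                [np.sin(THETA),  np.cos(THETA)]])


def truth_step(x):
    """Toy 'true' dynamics: a pure rotation (norm-preserving)."""
    return ROT @ x


def surrogate_step(x):
    """Toy surrogate: rotation with 2% growth per step, so free runs blow up."""
    return 1.02 * (ROT @ x)


def analysis_step(x_forecast, y, H, B, R):
    """3DVar analysis: closed-form minimizer of
    0.5*(x - xf)^T B^{-1} (x - xf) + 0.5*(y - Hx)^T R^{-1} (y - Hx)."""
    S = H @ B @ H.T + R                # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)     # gain matrix
    return x_forecast + K @ (y - H @ x_forecast)


def run_experiment(n_steps=365, obs_std=0.1, seed=0):
    """Compare a free-running surrogate with a surrogate corrected by 3DVar."""
    rng = np.random.default_rng(seed)
    H = np.array([[1.0, 0.0]])         # observe only the first component
    R = np.array([[obs_std ** 2]])     # observation noise covariance
    B = np.eye(2)                      # fixed background covariance (assumed)

    x_true = np.array([1.0, 0.0])
    x_free = x_true.copy()             # surrogate started from the exact state
    x_filter = np.zeros(2)             # filter started from a wrong state

    filter_errors, free_errors = [], []
    for _ in range(n_steps):
        x_true = truth_step(x_true)
        x_free = surrogate_step(x_free)
        y = H @ x_true + obs_std * rng.standard_normal(1)   # partial, noisy obs
        x_filter = analysis_step(surrogate_step(x_filter), y, H, B, R)
        filter_errors.append(np.linalg.norm(x_filter - x_true))
        free_errors.append(np.linalg.norm(x_free - x_true))
    return np.array(filter_errors), np.array(free_errors)


if __name__ == "__main__":
    filter_errors, free_errors = run_experiment()
    print("mean filter error:", filter_errors.mean())
    print("final free-run error:", free_errors[-1])
```

In this toy setting the free-running surrogate's error should grow geometrically, while the assimilated estimate should stay bounded even though only one of the two components is ever observed. This mirrors the qualitative claim of the abstract but says nothing about the paper's actual experiments or error levels.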
Taxonomy
Topics: Big Data Technologies and Applications
