WeatherBench Probability: A benchmark dataset for probabilistic medium-range weather forecasting along with deep learning baseline models
Sagar Garg, Stephan Rasp, Nils Thuerey

TL;DR
WeatherBench Probability provides a benchmark dataset and evaluation framework for probabilistic medium-range weather forecasting, so that machine learning models can be compared against an operational baseline, the ECMWF IFS ensemble forecast.
Contribution
It extends the WeatherBench dataset to probabilistic forecasting by adding probabilistic verification metrics (continuous ranked probability score, spread-skill ratio and rank histograms) and baseline models, including deep learning approaches.
Findings
Parametric and categorical models produce reliable forecasts.
Monte Carlo dropout underestimates uncertainty.
None of the models match the operational IFS model's skill.
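To illustrate how underestimated uncertainty shows up in verification, here is a minimal sketch of a spread-skill ratio and a rank histogram on synthetic data. The function names, array layout, the unbiased variance estimator and the synthetic ensembles are all assumptions made for illustration. They are not taken from the paper's code.

```python
import numpy as np

def spread_skill_ratio(ens, obs):
    """ens: (n_members, n_samples); obs: (n_samples,). Hypothetical helper."""
    # Spread: square root of the mean ensemble variance (ddof=1 is an assumption).
    spread = np.sqrt(np.mean(np.var(ens, axis=0, ddof=1)))
    # Skill: RMSE of the ensemble mean.
    rmse = np.sqrt(np.mean((ens.mean(axis=0) - obs) ** 2))
    return spread / rmse

def rank_histogram(ens, obs):
    """Count how often the observation falls at each rank among the members."""
    # Rank = number of members below the observation, so it lies in 0..n_members.
    ranks = np.sum(ens < obs[None, :], axis=0)
    return np.bincount(ranks, minlength=ens.shape[0] + 1)

rng = np.random.default_rng(0)
n_members, n_samples = 20, 5000

# Synthetic data (assumption): a predictable signal plus unit-variance noise.
signal = rng.normal(size=n_samples)
obs = signal + rng.normal(size=n_samples)

# Calibrated ensemble: members carry the same noise level as the observations.
calibrated = signal + rng.normal(size=(n_members, n_samples))
# Underdispersive ensemble: member noise is too small.
underdispersive = signal + 0.3 * rng.normal(size=(n_members, n_samples))

ssr_cal = spread_skill_ratio(calibrated, obs)
ssr_under = spread_skill_ratio(underdispersive, obs)
hist_cal = rank_histogram(calibrated, obs)
hist_under = rank_histogram(underdispersive, obs)

print(f"spread-skill ratio, calibrated:      {ssr_cal:.2f}")
print(f"spread-skill ratio, underdispersive: {ssr_under:.2f}")
print("rank histogram, calibrated:     ", hist_cal)
print("rank histogram, underdispersive:", hist_under)
```

In this toy setup the calibrated ensemble gives a ratio close to 1 and a roughly flat histogram. The underdispersive ensemble gives a ratio well below 1 and a U-shaped histogram, because the observation often falls outside the ensemble range.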
Abstract
WeatherBench is a benchmark dataset for medium-range weather forecasting of geopotential, temperature and precipitation, consisting of preprocessed data, predefined evaluation metrics and a number of baseline models. WeatherBench Probability extends this to probabilistic forecasting by adding a set of established probabilistic verification metrics (continuous ranked probability score, spread-skill ratio and rank histograms) and a state-of-the-art operational baseline using the ECMWF IFS ensemble forecast. In addition, we test three different probabilistic machine learning methods: Monte Carlo dropout, parametric prediction, and categorical prediction, in which the probability distribution is discretized. We find that plain Monte Carlo dropout severely underestimates uncertainty. The parametric and categorical models both produce fairly reliable forecasts of similar quality. The…
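As a rough illustration of the continuous ranked probability score (CRPS) named above, the sketch below computes it in two standard ways: an empirical estimator for an ensemble (as one might use for Monte Carlo dropout samples or an ensemble forecast) and the closed form for a Gaussian predictive distribution (as one might use for a parametric model). The function names and array shapes are assumptions. Whether the paper uses exactly these estimators is not stated in the abstract.

```python
import math
import numpy as np

def crps_ensemble(ens, obs):
    """Empirical CRPS per sample. ens: (n_members, n_samples); obs: (n_samples,).

    Uses CRPS = E|X - y| - 0.5 * E|X - X'|, with expectations replaced by
    averages over members (and over all member pairs).
    """
    mae = np.mean(np.abs(ens - obs[None, :]), axis=0)
    pair = np.mean(np.abs(ens[:, None, :] - ens[None, :, :]), axis=(0, 1))
    return mae - 0.5 * pair

def crps_gaussian(mu, sigma, obs):
    """Closed-form CRPS of a normal distribution N(mu, sigma^2) for scalar inputs."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# A single-member "ensemble" reduces to the absolute error.
single = crps_ensemble(np.array([[2.0]]), np.array([0.5]))[0]

# A large sample from N(0, 1) should approximate the closed-form value.
rng = np.random.default_rng(0)
sample = rng.normal(size=(2000, 1))
estimate = crps_ensemble(sample, np.array([0.0]))[0]
exact = crps_gaussian(0.0, 1.0, 0.0)

print(f"single-member CRPS: {single:.3f}")
print(f"ensemble estimate:  {estimate:.3f}")
print(f"Gaussian closed form: {exact:.3f}")
```

Lower CRPS is better. The score is in the units of the forecast variable and reduces to the mean absolute error for a deterministic forecast, which makes probabilistic and deterministic models comparable on one scale.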
Taxonomy
Topics: Hydrological Forecasting Using AI · Meteorological Phenomena and Simulations · Hydrology and Drought Analysis
Methods: Monte Carlo Dropout · Dropout
