Extreme Weather Bench: A framework and benchmark for evaluation of high-impact weather
Amy McGovern, Taylor Mandelbaum, Daniel Rothenberg, Nicholas Loveday, Corey Potvin, Montgomery Flora, Linus Magnusson, Eric Gilleland, John Allen

TL;DR
Extreme Weather Bench (EWB) is an open-source benchmark suite designed to evaluate high-impact weather prediction models across various scales and phenomena, promoting standardized validation and comparison.
Contribution
The paper introduces EWB, a comprehensive community-driven benchmark suite with case studies, data, metrics, and code for evaluating high-impact weather models.
Findings
EWB provides a standard set of case studies for high-impact weather events.
EWB enables model validation across multiple spatial and temporal scales.
EWB promotes transparent comparison of weather prediction models.
Abstract
Forecasting the wide variety of high-impact weather events experienced globally is a challenge for both Artificial Intelligence (AI) and Numerical Weather Prediction (NWP) models, and it is critical that such models be properly verified before deployment. Although AI weather models are rapidly evolving, much of their evaluation is currently done either at the global scale or by hand-picking a small number of case studies or a region. A widely used open-source benchmark suite focusing on high-impact weather will help to drive the science forward for all scales of weather models, as it has for other AI fields. Here we introduce Extreme Weather Bench (EWB), a new community-driven benchmark suite that facilitates model validation and verification on a variety of high-impact hazards that matter to people around the globe. EWB provides a standard set of case studies (spanning across…
