scores: A Python package for verifying and evaluating models and predictions with xarray
Tennessee Leeuwenburg, Nicholas Loveday, Elizabeth E. Ebert, Harrison Cook, Mohammadreza Khanarmuei, Robert J. Taggart, Nikeeth Ramanathan, Maree Carroll, Stephanie Chong, Aidan Griffiths, John Sharples

TL;DR
The paper introduces 'scores', a comprehensive Python package for verifying and evaluating models using advanced metrics and statistical tests, supporting multidimensional Earth system data with scalable performance.
Contribution
It presents a new Python package that includes novel and complex scores, statistical tests, and data processing tools tailored for geoscience model verification.
Findings
Includes over 50 metrics and tools for forecast verification.
Supports multidimensional data formats like NetCDF, HDF5, Zarr, and GRIB.
Provides tutorials and thorough scientific review of all features.
Abstract
`scores` is a Python package containing mathematical functions for the verification, evaluation and optimisation of forecasts, predictions or models. It supports labelled n-dimensional (multidimensional) data, which is used in many scientific fields and in machine learning. At present, `scores` primarily supports the geoscience communities; in particular, the meteorological, climatological and oceanographic communities. `scores` not only includes common scores (e.g., Mean Absolute Error), it also includes novel scores not commonly found elsewhere (e.g., FIxed Risk Multicategorical (FIRM) score, Flip-Flop Index), complex scores (e.g., threshold-weighted continuous ranked probability score), and statistical tests (such as the Diebold-Mariano test). It also contains isotonic regression, which is becoming an increasingly important tool in forecast verification and can be used to generate…
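To make the idea of scoring labelled multidimensional data concrete, here is a minimal sketch of Mean Absolute Error computed while keeping one named dimension. It uses only the Python standard library and is not the `scores` API; the helper `mae_by_dim`, the dimension names `station` and `lead_day`, and the sample values are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def mae_by_dim(forecast, observed, dims, preserve):
    """Mean absolute error, averaged over every dimension except `preserve`.

    forecast / observed: dicts mapping coordinate tuples to floats.
    dims: dimension names, in the same order as the coordinate tuples.
    preserve: name of the single dimension to keep in the result.
    (Hypothetical helper for illustration; not part of the scores package.)
    """
    axis = dims.index(preserve)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for coords, fcst_value in forecast.items():
        label = coords[axis]
        # Accumulate the absolute error under the preserved dimension's label.
        totals[label] += abs(fcst_value - observed[coords])
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Made-up forecasts and observations, labelled by (station, lead_day).
dims = ("station", "lead_day")
forecast = {("A", 1): 20.0, ("A", 2): 22.0, ("B", 1): 15.0, ("B", 2): 18.0}
observed = {("A", 1): 21.0, ("A", 2): 25.0, ("B", 1): 15.0, ("B", 2): 16.0}

# Average over stations, keeping one error value per lead day.
mae_per_lead_day = mae_by_dim(forecast, observed, dims, preserve="lead_day")
# Average over lead days, keeping one error value per station.
mae_per_station = mae_by_dim(forecast, observed, dims, preserve="station")
```

Because the dimensions are referred to by name rather than by position, the same call can answer different verification questions (error by lead day, error by station) without reshaping the data. Labelled array libraries such as xarray provide this kind of named-dimension reduction natively.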
