Assessing the reliability of ensemble forecasting systems under serial dependence
Jochen Bröcker

TL;DR
This paper develops new statistical tests to assess the reliability of ensemble forecasting systems by accounting for serial dependence in rank histograms, improving upon traditional methods that assume independence.
Contribution
The paper introduces tests that incorporate temporal correlations in rank histograms, providing a more accurate assessment of ensemble forecast reliability.
Findings
For a reliable ensemble, the ranks are serially dependent but still exhibit strong decay of correlations over time.
Proposed tests are valid under minimal assumptions.
Traditional independence-based tests may be misleading.
Abstract
The problem of testing the reliability of ensemble forecasting systems is revisited. A popular tool to assess the reliability of ensemble forecasting systems (for scalar verifications) is the rank histogram; this histogram is expected to be more or less flat, since for a reliable ensemble the ranks are uniformly distributed among their possible outcomes. Quantitative tests for flatness (e.g. Pearson's goodness-of-fit test) have been suggested; without exception, though, these tests assume the ranks to be a sequence of independent random variables, which is not the case in general, as can be demonstrated with simple toy examples. In this paper, tests are developed that take the temporal correlations between the ranks into account. A refined analysis, exploiting the reliability property, shows that the ranks still exhibit strong decay of correlations. This property is key to the…
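To make the rank histogram and the classical flatness test concrete, here is a minimal Python sketch. It illustrates only the traditional Pearson goodness-of-fit statistic that the abstract criticises, not the tests developed in the paper. The function names, the ensemble size, and the toy data (ensemble members and verification drawn independently from the same normal distribution, with no dependence between forecast times) are assumptions made for illustration.

```python
import random


def rank_of_verification(ensemble, verification):
    """Rank of the verification among the ensemble members (1 .. K+1)."""
    return 1 + sum(1 for member in ensemble if member < verification)


def rank_histogram(ranks, n_members):
    """Count how often each possible rank 1 .. n_members + 1 occurs."""
    counts = [0] * (n_members + 1)
    for r in ranks:
        counts[r - 1] += 1
    return counts


def pearson_statistic(counts):
    """Pearson chi-square statistic of the counts against a flat histogram."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Hypothetical toy setup: a reliable ensemble in which members and
# verification are exchangeable draws from a standard normal distribution.
rng = random.Random(0)
n_members, n_forecasts = 9, 1000

ranks = []
for _ in range(n_forecasts):
    ensemble = [rng.gauss(0.0, 1.0) for _ in range(n_members)]
    verification = rng.gauss(0.0, 1.0)
    ranks.append(rank_of_verification(ensemble, verification))

counts = rank_histogram(ranks, n_members)
stat = pearson_statistic(counts)
print("rank histogram:", counts)
print("Pearson statistic:", stat)
```

If the ranks were independent and uniform, this statistic would be approximately chi-square distributed with `n_members` degrees of freedom (one fewer than the number of bins). In this toy setup the forecast times really are independent, so that reference distribution is appropriate; the abstract's point is that for real forecast sequences the ranks are in general serially dependent, so the same comparison can mislead.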
