Regions of Reliability in the Evaluation of Multivariate Probabilistic Forecasts
Étienne Marcotte, Valentina Zantedeschi, Alexandre Drouin, Nicolas Chapados

TL;DR
This paper investigates the reliability of proper scoring rules for evaluating multivariate probabilistic forecasts in finite samples, identifying conditions where they effectively detect forecasting errors and highlighting their limitations.
Contribution
The paper provides the first systematic finite-sample analysis of proper scoring rules for evaluating time-series forecasts. It introduces the concept of a scoring rule's 'region of reliability' and assesses the rules on a synthetic benchmark and on a real-world electricity production task.
Findings
Propriety alone does not guarantee that a scoring rule discriminates well in finite samples; each rule can be relied on only within its region of reliability.
The study identifies specific scenarios where scoring rules fail to discriminate forecast errors.
Empirical results demonstrate critical shortcomings in current evaluation practices.
Abstract
Multivariate probabilistic time series forecasts are commonly evaluated via proper scoring rules, i.e., functions that are minimal in expectation for the ground-truth distribution. However, this property is not sufficient to guarantee good discrimination in the non-asymptotic regime. In this paper, we provide the first systematic finite-sample study of proper scoring rules for time-series forecasting evaluation. Through a power analysis, we identify the "region of reliability" of a scoring rule, i.e., the set of practical conditions where it can be relied on to identify forecasting errors. We carry out our analysis on a comprehensive synthetic benchmark, specifically designed to test several key discrepancies between ground-truth and forecast distributions, and we gauge the generalizability of our findings to real-world tasks with an application to an electricity production problem. Our…
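The power analysis described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' benchmark: the ground truth is taken to be a standard multivariate Gaussian, the forecasting error is a constant mean shift, the scoring rule is the sample-based energy score, and the test is a one-sided z-test on paired score differences. The names `energy_score` and `estimate_power` are hypothetical and chosen for illustration only.

```python
import numpy as np
from statistics import NormalDist


def energy_score(samples, y):
    """Sample-based energy score of a forecast (lower is better).

    samples: (m, d) array of draws from the forecast distribution, m >= 2.
    y: (d,) observed vector.
    """
    m = samples.shape[0]
    # Mean distance between forecast samples and the observation.
    term_obs = np.linalg.norm(samples - y, axis=1).mean()
    # Mean pairwise distance between forecast samples (zero diagonal excluded).
    diffs = samples[:, None, :] - samples[None, :, :]
    term_spread = np.linalg.norm(diffs, axis=2).sum() / (m * (m - 1))
    return term_obs - 0.5 * term_spread


def estimate_power(shift, n_obs, dim=4, m=64, n_trials=200, alpha=0.05, seed=0):
    """Estimate how often the score prefers the true distribution.

    Assumption: ground truth is N(0, I); the wrong forecast is N(shift, I).
    Each trial runs a one-sided z-test on the paired score differences.
    """
    rng = np.random.default_rng(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    rejections = 0
    for _ in range(n_trials):
        score_diffs = np.empty(n_obs)
        for i in range(n_obs):
            y = rng.standard_normal(dim)                 # observation
            good = rng.standard_normal((m, dim))         # ground-truth forecast
            bad = rng.standard_normal((m, dim)) + shift  # misspecified forecast
            score_diffs[i] = energy_score(bad, y) - energy_score(good, y)
        z = score_diffs.mean() / (score_diffs.std(ddof=1) / np.sqrt(n_obs))
        rejections += int(z > z_crit)
    return rejections / n_trials


if __name__ == "__main__":
    for shift in (0.0, 0.1, 0.5, 1.0):
        print(f"shift={shift}: power ~ {estimate_power(shift, n_obs=20):.2f}")
```

Sweeping `shift`, `n_obs`, and `dim` over a grid and recording where the estimated power stays high gives a rough picture of what a region of reliability looks like for this particular score and this particular type of error.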
Taxonomy
Topics: Forecasting Techniques and Applications · Energy Load and Power Forecasting · Stock Market Forecasting Methods
Methods: Test
