TL;DR
This paper introduces randomization tests of whether one forecast outperforms another across an entire class of scoring functions, rather than under a single chosen score. The tests do not require assumptions about the particular dynamic properties of forecasts and realizations.
Contribution
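It develops a sign randomization testing framework for forecast dominance and studies the asymptotic behavior of the test statistics under mild conditions, without assuming particular dynamics for forecasts and realizations. The properties of the one-sided tests rest on a version of Anderson's inequality, which the authors state as a conjecture.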
Findings
Tests exhibit good size and power in simulations
Method is applicable across a range of scoring functions
A data example, alongside the numerical experiments, indicates that the tests behave well in practically relevant situations
Abstract
We propose randomization tests of whether forecast 1 outperforms forecast 2 across a class of scoring functions. This hypothesis is of applied interest: While the prediction context often prescribes a certain class of scoring functions, it is typically hard to motivate a specific choice on statistical or substantive grounds. We investigate the asymptotic behavior of the test statistics under mild conditions, avoiding the need to assume particular dynamic properties of forecasts and realizations. The properties of the one-sided tests depend on a corresponding version of Anderson's inequality, which we state as a conjecture of independent interest. Numerical experiments and a data example indicate that the tests have good size and power properties in practically relevant situations.
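To make the idea of a sign randomization test over a class of scoring functions concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction: the elementary-score family for mean forecasts, the max-type statistic over a threshold grid, and the function names `elementary_scores` and `sign_randomization_test` are all choices made for this example, and the abstract does not specify the exact test statistic.

```python
import numpy as np


def elementary_scores(x, y, thetas):
    """Elementary scores for mean forecasts on a grid of thresholds.

    Assumed parametrization (illustrative): S_theta(x, y) = |y - theta| when
    theta lies in [min(x, y), max(x, y)), and 0 otherwise.
    Returns an array of shape (n, len(thetas)).
    """
    lo = np.minimum(x, y)[:, None]
    hi = np.maximum(x, y)[:, None]
    inside = (lo <= thetas[None, :]) & (thetas[None, :] < hi)
    return np.abs(y[:, None] - thetas[None, :]) * inside


def sign_randomization_test(d, n_draws=999, seed=0):
    """One-sided sign randomization test on score differences.

    d[t, j] = score of forecast 1 minus score of forecast 2 at time t under
    scoring function j (lower scores are better). Large positive values of
    the statistic speak against "forecast 1 outperforms forecast 2".
    The max-over-grid statistic is an assumption of this sketch.
    """
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    rng = np.random.default_rng(seed)

    # Observed statistic: largest scaled mean score difference over the grid.
    observed = np.sqrt(n) * d.mean(axis=0).max()

    # One random sign per time point, shared across all scoring functions,
    # so the dependence across the grid is preserved in each draw.
    signs = rng.choice([-1.0, 1.0], size=(n_draws, n))
    randomized = (np.sqrt(n) * (signs @ d) / n).max(axis=1)

    # Randomization p-value with the usual +1 correction.
    p_value = (1 + np.sum(randomized >= observed)) / (n_draws + 1)
    return observed, p_value


# Simulated example (hypothetical data): y = mu + noise.
rng = np.random.default_rng(1)
n = 500
mu = rng.normal(size=n)
y = mu + rng.normal(size=n)
x1 = mu            # forecast 1 uses the signal mu
x2 = np.zeros(n)   # forecast 2 is the unconditional mean

thetas = np.linspace(-2.0, 2.0, 41)
s1 = elementary_scores(x1, y, thetas)
s2 = elementary_scores(x2, y, thetas)

# H0: forecast 1 outperforms forecast 2 across the grid.
_, p_forward = sign_randomization_test(s1 - s2)
# H0: forecast 2 outperforms forecast 1 across the grid.
_, p_reverse = sign_randomization_test(s2 - s1)
print(p_forward, p_reverse)
```

In this simulated setup forecast 1 is the more informative forecast, so one would expect `p_forward` to be large and `p_reverse` to be small. Whether a max-type statistic of this form matches the statistic studied in the paper is not something the abstract settles.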
