An Imbalance-Robust Evaluation Framework for Extreme Risk Forecasts
Sotirios D. Nikolopoulos

TL;DR
This paper introduces a new set of evaluation metrics for rare-event forecasts that remain stable and interpretable even at extremely low event prevalence, where traditional metrics such as F1-score, AUPRC, MCC, and accuracy break down.
Contribution
The author develops rare-event-stable (RES) metrics that maintain stable thresholds and consistent rankings as event rarity increases, improving evaluation of extreme risk forecasts.
Findings
RES metrics maintain stable thresholds at very low event probabilities
Traditional metrics exhibit threshold drift and collapse under extreme rarity
Application to credit-default prediction confirms RES metrics' robustness and interpretability
Abstract
Evaluating rare-event forecasts is challenging because standard metrics collapse as event prevalence declines. Measures such as F1-score, AUPRC, MCC, and accuracy induce degenerate thresholds -- converging to zero or one -- and their values become dominated by class imbalance rather than tail discrimination. We develop a family of rare-event-stable (RES) metrics whose optimal thresholds remain strictly interior as the event probability approaches zero, ensuring coherent decision rules under extreme rarity. Simulations spanning event probabilities from 0.01 down to one in a million show that RES metrics maintain stable thresholds, consistent model rankings, and near-complete prevalence invariance, whereas traditional metrics exhibit statistically significant threshold drift and structural collapse. A credit-default application confirms these results: RES metrics yield interpretable…
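The threshold drift described in the abstract can be illustrated with a small sketch. It is not the paper's simulation design and does not implement the RES metrics, which this summary does not define. It assumes a hypothetical score model in which non-events score N(0, 1) and events score N(2, 1), computes the population F1-score on a grid of raw-score thresholds, and reports the F1-maximizing threshold at several prevalences. The function names, the shift of 2.0, and the grid are illustrative assumptions.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal; erfc stays accurate far in the tail."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def f1_at(threshold, prevalence, shift=2.0):
    """Population F1 when forecasts with score > threshold are flagged as events.

    Assumed (hypothetical) score model: non-events ~ N(0, 1), events ~ N(shift, 1).
    """
    tpr = upper_tail(threshold - shift)  # share of events flagged
    fpr = upper_tail(threshold)          # share of non-events flagged
    tp = prevalence * tpr
    fp = (1.0 - prevalence) * fpr
    fn = prevalence * (1.0 - tpr)
    return 2.0 * tp / (2.0 * tp + fp + fn)


def f1_optimal_threshold(prevalence, shift=2.0):
    """Grid search for the raw-score threshold that maximizes F1."""
    grid = [i / 100.0 for i in range(-200, 1001)]  # -2.00 to 10.00 in steps of 0.01
    return max(grid, key=lambda t: f1_at(t, prevalence, shift))


# Prevalences from 0.01 down to one in a million, as in the abstract.
prevalences = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
thresholds = [f1_optimal_threshold(p) for p in prevalences]

for p, t in zip(prevalences, thresholds):
    print(f"prevalence={p:.0e}  F1-optimal threshold={t:.2f}  F1={f1_at(t, p):.4f}")
```

Under these assumptions the class-conditional score distributions never change, yet the F1-maximizing threshold moves further into the tail and the attainable F1 shrinks as prevalence falls. This is one concrete form of the prevalence dependence the abstract attributes to traditional metrics.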
Taxonomy
Topics: Financial Risk and Volatility Modeling · Probability and Risk Models · Insurance, Mortality, Demography, Risk Management
