Examining marginal properness in the external validation of survival models with squared and logarithmic losses
Raphael Sonabend, John Zobolas, Riccardo De Bin, Philipp Kopper, Lukas Burk, Andreas Bender

TL;DR
This paper evaluates the theoretical and empirical properness of common survival analysis scoring rules, introduces a new properness definition, and recommends RCLL and ISBS for model validation despite practical estimation challenges.
Contribution
It introduces a marginal properness definition for survival scoring rules and assesses the properness of ISBS and RCLL, advocating their use in external validation.
Findings
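Both RCLL and ISBS are theoretically improper under the marginal definition, yet behave as proper scoring rules in simulation.
RCLL shows no violations in any simulated setting.
ISBS shows only minor, negligible violations, and only at extremely small sample sizes.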
Both scores are reliable for model validation despite practical challenges in estimating them.
Abstract
Scoring rules promote rational and honest decision-making, which is important for model evaluation and becoming increasingly important for automated procedures such as 'AutoML'. In this paper we survey common squared and logarithmic scoring rules for survival analysis, with a focus on their theoretical and empirical properness. We introduce a marginal definition of properness and show that both the Integrated Survival Brier Score (ISBS) and the Right-Censored Log-Likelihood (RCLL) are theoretically improper under this definition. We also investigate a new class of losses that may inform future survival scoring rules. Simulation experiments reveal that both the ISBS and RCLL behave as proper scoring rules in practice. The RCLL showed no violations across all settings, while ISBS exhibited only minor, negligible violations at extremely small sample sizes, suggesting one can trust results…
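To make the idea of an empirical properness check concrete, the following is a minimal sketch and not the paper's simulation design. It assumes the commonly used form of the RCLL, which scores an observed event at time t by -log f(t) and a censored observation by -log S(t), and it assumes exponential event and censoring times with independent censoring. The function names (`rcll`, `simulate`, `mean_rcll`) and all rates are hypothetical choices made for illustration. It also checks only whether the true model gets the lowest average score, not the marginal properness definition the paper introduces.

```python
import math
import random


def rcll(rate, time, event):
    """RCLL of an exponential prediction with the given rate (assumed form)."""
    log_survival = -rate * time  # log S(t) for an exponential distribution
    if event:
        # Observed event: -log f(t), where log f(t) = log(rate) + log S(t)
        return -(math.log(rate) + log_survival)
    # Censored observation: -log S(t)
    return -log_survival


def simulate(n, event_rate, censor_rate, seed=0):
    """Draw (observed time, event indicator) pairs under independent censoring."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t_event = rng.expovariate(event_rate)
        t_censor = rng.expovariate(censor_rate)
        data.append((min(t_event, t_censor), t_event <= t_censor))
    return data


def mean_rcll(rate, data):
    """Average RCLL of one candidate prediction over the sample."""
    return sum(rcll(rate, t, d) for t, d in data) / len(data)


# Illustrative setting: true event rate 1.0, censoring rate 0.5.
data = simulate(20000, event_rate=1.0, censor_rate=0.5)

# Score the true rate against two misspecified candidates (lower is better).
scores = {rate: mean_rcll(rate, data) for rate in (0.5, 1.0, 2.0)}
best_rate = min(scores, key=scores.get)
```

In this setting the average loss should be lowest for the candidate that matches the data-generating rate, which is the behaviour a proper scoring rule is expected to show. Repeating the comparison over many seeds and small sample sizes would give a rough count of violations, in the spirit of the simulations described in the abstract.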
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques
