TL;DR
This paper examines how to properly evaluate survival distribution predictions using discrimination measures, highlighting the lack of standard methods and proposing a robust approach based on summing predicted cumulative hazards.
Contribution
It surveys existing methods for evaluating survival distributions with discrimination measures and recommends a standardized, transparent approach for deriving risk predictions from distributions.
Findings
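Summing over the predicted cumulative hazard is the most robust method of reducing a distribution prediction to a risk prediction.
The method used to evaluate distributions with discrimination measures is rarely described in the literature and often leads to unfair comparisons.
Machine learning survival analysis software should implement clear transformations between distribution and risk predictions.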
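The first finding can be illustrated with a minimal sketch. It assumes each model prediction is a row of survival probabilities S(t) on a shared time grid and uses the standard identity H(t) = -log S(t). The function name `risk_from_survival`, the clipping constant `eps`, and the toy numbers are hypothetical and are not taken from the paper or from any particular package.

```python
import numpy as np

def risk_from_survival(surv, eps=1e-12):
    """Reduce predicted survival curves to one risk score per subject.

    surv: array of shape (n_subjects, n_times) holding S_i(t_j).
    Returns an array of shape (n_subjects,); a larger value means higher risk.
    """
    # Clip so that S(t) = 0 does not produce an infinite cumulative hazard.
    surv = np.clip(np.asarray(surv, dtype=float), eps, 1.0)
    # Cumulative hazard at each time point: H(t) = -log S(t).
    cum_hazard = -np.log(surv)
    # Sum the cumulative hazard over the time grid to get a scalar risk.
    return cum_hazard.sum(axis=1)

# Toy predictions (illustrative values only): subject 0 declines faster.
surv = np.array([
    [0.90, 0.70, 0.40],
    [0.95, 0.90, 0.80],
])
risk = risk_from_survival(surv)
```

Here the subject whose survival curve drops faster receives the larger score, which is the ordering a discrimination measure needs.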
Abstract
In this paper we consider how to evaluate survival distribution predictions with measures of discrimination. This is a non-trivial problem as discrimination measures are the most commonly used in survival analysis and yet there is no clear method to derive a risk prediction from a distribution prediction. We survey methods proposed in literature and software and consider their respective advantages and disadvantages. Whilst distributions are frequently evaluated by discrimination measures, we find that the method for doing so is rarely described in the literature and often leads to unfair comparisons. We find that the most robust method of reducing a distribution to a risk is to sum over the predicted cumulative hazard. We recommend that machine learning survival analysis software implements clear transformations between distribution and risk predictions in order to allow more…
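To make concrete why a scalar risk is needed, the sketch below computes Harrell's concordance index, one widely used discrimination measure. The abstract does not say which measures the paper covers, so this choice is an assumption, and the function name `concordance_index` and the toy data are hypothetical. The measure compares pairs of subjects by a single number each, so it cannot consume a full distribution directly.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    time:  observed times (event or censoring).
    event: 1 if the event was observed, 0 if censored.
    risk:  scalar risk per subject; higher means earlier expected event.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable if comparable else float("nan")

# Toy data (illustrative only): risk is perfectly ordered against time.
time = [2, 5, 3, 8]
event = [1, 0, 1, 1]
risk = np.array([3.0, 1.0, 2.5, 0.5])
c_index = concordance_index(time, event, risk)
```

Because the result depends only on the ordering of `risk`, two different reductions of the same distribution prediction can give different scores, which is why the transformation used should be reported.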
