TL;DR
This large-scale, neutral benchmark study evaluates 19 survival models on 34 public datasets of low-dimensional, right-censored data and finds that the Cox model remains a robust choice even when compared with many more advanced methods.
Contribution
It provides the first comprehensive, empirical comparison of diverse survival models on a large dataset collection with standardized tuning and evaluation procedures.
Findings
No method significantly outperforms the Cox model in predictive accuracy.
Oblique random survival forests and boosting methods show strong average ranks.
The Cox proportional hazards model remains sufficient for most low-dimensional survival tasks.
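To illustrate what an "average rank" across datasets means, here is a minimal sketch, not the study's actual procedure. The function name `average_ranks`, the model names, and all scores are made up for illustration. The sketch assumes a higher-is-better metric and gives tied models the mean of the ranks they span.

```python
def average_ranks(scores, higher_is_better=True):
    """Average per-dataset rank of each model.

    scores: dict mapping model name -> list of scores, one per dataset,
            with datasets in the same order for every model.
    Ties within a dataset receive the mean of the ranks they occupy.
    """
    models = list(scores)
    n_datasets = len(next(iter(scores.values())))
    totals = {m: 0.0 for m in models}

    for d in range(n_datasets):
        # Sort models by their score on dataset d (best first).
        column = sorted(
            ((scores[m][d], m) for m in models),
            key=lambda t: t[0],
            reverse=higher_is_better,
        )
        i = 0
        while i < len(column):
            # Find the run of tied scores starting at position i.
            j = i
            while j + 1 < len(column) and column[j + 1][0] == column[i][0]:
                j += 1
            mean_rank = (i + j) / 2 + 1  # ranks are 1-based
            for k in range(i, j + 1):
                totals[column[k][1]] += mean_rank
            i = j + 1

    return {m: totals[m] / n_datasets for m in models}


# Hypothetical scores for three placeholder models on three datasets.
toy_scores = {
    "model_a": [0.70, 0.65, 0.80],
    "model_b": [0.72, 0.65, 0.78],
    "model_c": [0.68, 0.60, 0.79],
}
ranks = average_ranks(toy_scores)
for name, rank in sorted(ranks.items(), key=lambda t: t[1]):
    print(f"{name}: {rank:.2f}")
```

A lower average rank means a model tends to place near the top across datasets. It says nothing by itself about whether the differences between models are statistically significant, which is a separate question.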
Abstract
This work presents the first large-scale neutral benchmark experiment focused on single-event, right-censored, low-dimensional survival data. Benchmark experiments are essential in methodological research to scientifically compare new and existing model classes through proper empirical evaluation. Existing benchmarks in the survival literature are smaller in scale regarding the number of used datasets and extent of empirical evaluation. They often lack appropriate tuning or evaluation procedures, while other comparison studies focus on qualitative reviews rather than quantitative comparisons. This comprehensive study aims to fill the gap by neutrally evaluating a broad range of methods and providing generalizable guidelines for practitioners. We benchmark 19 models, ranging from classical statistical approaches to many common machine learning methods, on 34 publicly available datasets.…
