Accommodating heterogeneous missing data patterns for prostate cancer risk prediction
Matthias Neumair (1), Michael W. Kattan (2), Stephen J. Freedland (3, 4), Alexander Haese (5), Lourdes Guerrios-Rivera (6), Amanda M. De Hoedt (3), Michael A. Liss (7), Robin J. Leach (8), Stephen A. Boorjian (9), Matthew R. Cooperberg (10), Cedric Poyet (11)

TL;DR
This study compares methods for handling missing data in prostate cancer risk prediction across diverse cohorts, developing an online tool that effectively manages missing risk factors and demonstrates good predictive performance.
Contribution
It identifies the available cases method as the best-performing approach to handling missing data in prostate cancer risk prediction, validates it across multiple cohorts, and implements it in an online tool.
Findings
Available cases method achieved best calibration and discrimination.
The online risk tool requires only PSA and age as mandatory inputs.
Imputation methods performed poorly in calibration.
Abstract
Objective: We compared six commonly used logistic regression methods for accommodating missing risk factor data from multiple heterogeneous cohorts, in which some cohorts do not collect some risk factors at all, and developed an online risk prediction tool that accommodates missing risk factors from the end-user. Study Design and Setting: Ten North American and European cohorts from the Prostate Biopsy Collaborative Group (PBCG) were used for fitting a risk prediction tool for clinically significant prostate cancer, defined as Gleason grade group greater than or equal to 2 on standard TRUS prostate biopsy. One large European PBCG cohort was withheld for external validation, where calibration-in-the-large (CIL), calibration curves, and area under the receiver operating characteristic curve (AUC) were evaluated. Ten-fold leave-one-cohort-out internal validation further validated the optimal…
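The two summary metrics named in the abstract, CIL and AUC, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy data are hypothetical, and the sign convention for CIL (mean predicted risk minus observed event rate) is an assumption, since the abstract does not state which definition the study used.

```python
from typing import Sequence


def calibration_in_the_large(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean predicted risk minus observed event rate.

    Assumed sign convention: 0 means calibrated on average,
    positive means risk is overestimated on average.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random case receives a higher predicted risk
    than a random non-case; tied predictions count as one half."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                wins += 1.0
            elif p_case == p_control:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    outcomes = [0, 0, 1, 1, 0, 1]          # 1 = clinically significant cancer
    predicted = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
    print(f"CIL = {calibration_in_the_large(outcomes, predicted):.3f}")
    print(f"AUC = {auc(outcomes, predicted):.3f}")
```

The pairwise AUC loop is quadratic in sample size, which is fine for a small illustration; a rank-based implementation from an established statistics library would be the more practical choice for a full validation cohort.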
Taxonomy
Topics: Prostate Cancer Diagnosis and Treatment · Prostate Cancer Treatment and Research · Statistical Methods in Clinical Trials
