A decomposition of Fisher's information to inform sample size for developing fair and precise clinical prediction models -- Part 2: time-to-event outcomes
Richard D Riley, Gary S Collins, Lucinda Archer, Rebecca Whittle, Amardeep Legha, Laura Kirton, Paula Dhiman, Mohsen Sadatsafavi, Nicola J Adderley, Joseph Alderman, Glen P Martin, Joie Ensor

TL;DR
This paper introduces a method, based on a decomposition of Fisher's information matrix, to determine the sample size required to develop time-to-event clinical prediction models that give precise and fair individual risk estimates.
Contribution
It proposes a decomposition of Fisher's information matrix to calculate the sample size needed for precise and fair individual-level risk estimates from time-to-event models.
Findings
Closed-form solutions for variance decomposition of individual risk estimates
Illustrative example in breast cancer demonstrating clinical relevance
Empirical evaluations showing uncertainty intervals align with flexible models
Abstract
Background: When developing a clinical prediction model using time-to-event data, previous research focuses on the sample size to minimise overfitting and precisely estimate the overall risk. However, instability of individual-level risk estimates may still be large. Methods: We propose a decomposition of Fisher's information matrix to examine and calculate the sample size required for developing a model that aims for precise and fair risk estimates. We propose a six-step process which can be used before data collection or when an existing dataset is available. Steps (1) to (5) require researchers to specify the overall risk in the target population at a key time-point of interest; an assumed pragmatic 'core model' in the form of an exponential regression model; the (anticipated) joint distribution of core predictors included in that model; and the distribution of any censoring.…
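The link between sample size and the precision of an individual's risk estimate can be sketched in code. The example below is a minimal illustration, not the authors' procedure or software: it assumes an exponential regression core model with hazard exp(x'β), administrative censoring at a single fixed time, and a delta-method approximation to the variance of the predicted risk. The function names (`unit_information`, `risk_and_se`), the predictor distribution, the coefficient values, and the 5-year time-point are all hypothetical choices made for illustration.

```python
import numpy as np

def unit_information(X, beta, censor_time):
    """Expected Fisher information per individual, averaged over rows of X.

    Assumes an exponential model with hazard exp(x'beta) and administrative
    censoring at censor_time. Under that assumption each individual
    contributes P(event observed | x) * x x' to the expected information.
    """
    rate = np.exp(X @ beta)
    p_event = 1.0 - np.exp(-rate * censor_time)
    return (X * p_event[:, None]).T @ X / X.shape[0]

def risk_and_se(x_new, beta, unit_info, n, t):
    """Predicted risk by time t and its approximate standard error at sample size n."""
    rate = np.exp(x_new @ beta)
    surv = np.exp(-rate * t)
    risk = 1.0 - surv
    # Approximate covariance of the coefficient estimates: inverse of total information.
    cov_beta = np.linalg.inv(n * unit_info)
    # Delta method: gradient of risk = 1 - exp(-exp(x'beta) * t) with respect to beta.
    grad = surv * rate * t * x_new
    return risk, float(np.sqrt(grad @ cov_beta @ grad))

# Hypothetical inputs: intercept plus one standardised predictor.
rng = np.random.default_rng(0)
m = 100_000
X = np.column_stack([np.ones(m), rng.standard_normal(m)])
beta = np.array([np.log(0.05), 0.5])  # assumed baseline hazard and log hazard ratio
info = unit_information(X, beta, censor_time=5.0)

# Individual whose predictor value is one standard deviation above the mean.
x_new = np.array([1.0, 1.0])
risk_500, se_500 = risk_and_se(x_new, beta, info, n=500, t=5.0)
risk_2000, se_2000 = risk_and_se(x_new, beta, info, n=2000, t=5.0)

print(f"risk at t=5: {risk_500:.3f}")
print(f"approx. SE with n=500:  {se_500:.4f}")
print(f"approx. SE with n=2000: {se_2000:.4f}")
```

Because the total information scales linearly with n in this sketch, quadrupling the sample size halves the approximate standard error of the individual's risk estimate. Repeating the calculation over a grid of n and over individuals with different predictor values shows how large a dataset would need to be before the uncertainty for each kind of individual falls below a chosen threshold.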
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning in Healthcare · Statistical Methods in Clinical Trials
