A decomposition of Fisher's information to inform sample size for developing fair and precise clinical prediction models -- part 1: binary outcomes
Richard D Riley, Gary S Collins, Rebecca Whittle, Lucinda Archer, Kym IE Snell, Paula Dhiman, Laura Kirton, Amardeep Legha, Xiaoxuan Liu, Alastair Denniston, Frank E Harrell Jr, Laure Wynants, Glen P Martin, Joie Ensor

TL;DR
This paper introduces a method to determine the necessary sample size for developing clinical prediction models that provide precise individual risk estimates, enhancing fairness and reliability.
Contribution
It presents a novel approach using Fisher's information to decompose risk estimate variance, guiding sample size decisions for fairer, more stable clinical models.
Findings
Decomposition of risk estimate variance using Fisher's information.
Closed-form solutions for sample size calculation.
Software implementation for practical application.
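To make the variance decomposition concrete, here is a minimal Python sketch for a logistic regression model. It is not the authors' software. It relies on the standard result that Fisher's information for logistic regression is n times a "unit" information matrix, E[p(1 - p) x xᵀ], so the variance of an individual's estimated logit risk is approximately xᵀ I⁻¹ x / n. The function names, the assumed coefficients, and the simulated predictor distribution are all hypothetical choices made for illustration.

```python
import numpy as np


def unit_information(X, beta):
    """Per-participant Fisher information for logistic regression.

    X    : (m, k) design matrix (including an intercept column) drawn from
           the anticipated predictor distribution (an assumption).
    beta : (k,) assumed model coefficients (an assumption).
    """
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)                      # Bernoulli variance per row
    return (X * w[:, None]).T @ X / X.shape[0]


def logit_variance(x_new, unit_info, n):
    """Approximate variance of one individual's logit-risk estimate at sample size n."""
    return float(x_new @ np.linalg.solve(unit_info, x_new)) / n


def risk_interval(x_new, beta, unit_info, n, z=1.96):
    """Approximate interval for the individual's risk, back-transformed from the logit scale."""
    eta = float(x_new @ beta)
    se = np.sqrt(logit_variance(x_new, unit_info, n))
    expit = lambda v: 1.0 / (1.0 + np.exp(-v))
    return expit(eta - z * se), expit(eta + z * se)


def n_for_target_variance(x_new, unit_info, target_var):
    """Closed-form sample size giving at most target_var on the logit scale."""
    return int(np.ceil(float(x_new @ np.linalg.solve(unit_info, x_new)) / target_var))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical setting: intercept plus one standard-normal predictor.
    X = np.column_stack([np.ones(10_000), rng.normal(size=10_000)])
    beta = np.array([-2.0, 0.5])
    info = unit_information(X, beta)
    x_new = np.array([1.0, 2.0])           # an individual far from the mean
    for n in (200, 1000, 5000):
        print(n, risk_interval(x_new, beta, info, n))
```

Because the unit information matrix does not depend on n, the same matrix can be reused to compare candidate sample sizes, and individuals with unusual predictor values can be checked directly to see how much wider their intervals are.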
Abstract
When developing a clinical prediction model, the sample size of the development dataset is a key consideration. Small sample sizes lead to greater concerns of overfitting, instability, poor performance and lack of fairness. Previous research has outlined minimum sample size calculations to minimise overfitting and precisely estimate the overall risk. However, even when these criteria are met, the uncertainty (instability) in individual-level risk estimates may be considerable. In this article, we propose how to examine and calculate the sample size required for developing a model with acceptably precise individual-level risk estimates to inform decisions and improve fairness. We outline a five-step process to be used before data collection or when an existing dataset is available. It requires researchers to specify the overall risk in the target population, the (anticipated) distribution…
Taxonomy
Topics: Health Systems, Economic Evaluations, Quality of Life
