Adaptive Gaussian Process Search for Simulation-Based Sample Size Estimation in Clinical Prediction Models: Validation of the pmsims R Package
Oyebayo Ridwan Olaniran, Diana Shamsutdinova, Sarah Markham, Felix Zimmer, Daniel Stahl, Gordon Forbes, Ewan Carr

TL;DR
This paper introduces pmsims, an R package that uses adaptive Gaussian process modelling to determine sample sizes for clinical prediction models efficiently, matching or exceeding existing methods in challenging scenarios.
Contribution
The paper presents a novel adaptive Gaussian process-based framework for sample size estimation, validated through extensive simulations and benchmarking against existing methods.
Findings
pmsims's Gaussian process method yields stable sample size estimates.
pmsims matches or exceeds performance of existing methods in challenging scenarios.
The framework requires fewer evaluations than non-adaptive approaches.
Abstract
Background: Determining an adequate sample size is essential for developing reliable and generalisable clinical prediction models, yet practical guidance on selecting appropriate methods remains limited. Existing analytical and simulation-based approaches often rely on restrictive assumptions and focus on mean-based criteria. We present and validate pmsims, an R package that uses Gaussian process surrogate modelling to provide a flexible and computationally efficient simulation-based framework for sample size determination across diverse prediction settings.
Methods: We conducted a comprehensive simulation study with two aims. First, we compared three search engines implemented in pmsims: a Gaussian process-based adaptive method, a deterministic bisection method, and a hybrid approach, across binary, continuous, and survival outcomes. Second, we benchmarked the best-performing pmsims…
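To make the idea of a simulation-based bisection search concrete, here is a minimal Python sketch. It is not the pmsims implementation or its API, and it does not show the Gaussian process method. The data-generating model (one standard-normal predictor, a linear outcome), the R² criterion, and every function name and default value are assumptions chosen for illustration. The search assumes that mean performance increases with sample size and that the lower bound does not meet the target.

```python
import random
import statistics


def simulate_r2(n, rng, beta=0.5, sigma=1.0):
    """Fit a simple linear regression on n simulated points and return
    its R^2 on new data from the same (hypothetical) population."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, sigma) for x in xs]
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With x ~ N(0, 1), the expected squared error on new data has a
    # closed form, so no separate test set is simulated.
    mse = (beta - slope) ** 2 + intercept ** 2 + sigma ** 2
    return 1.0 - mse / (beta ** 2 + sigma ** 2)


def mean_performance(n, reps, rng):
    """Average out-of-sample R^2 over `reps` simulated development sets."""
    return statistics.fmean([simulate_r2(n, rng) for _ in range(reps)])


def bisection_min_n(target, lo=10, hi=2000, reps=200, seed=1):
    """Approximate the smallest n whose mean simulated R^2 meets `target`.

    Assumes performance is non-decreasing in n and that `lo` fails the
    target. Each evaluation is noisy, so the result is an estimate.
    """
    rng = random.Random(seed)
    if mean_performance(hi, reps, rng) < target:
        raise ValueError("target not reached at the upper bound")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mean_performance(mid, reps, rng) >= target:
            hi = mid  # target met: the minimum is at or below mid
        else:
            lo = mid  # target missed: the minimum is above mid
    return hi
```

Calling `bisection_min_n(0.19)` searches for the sample size at which the mean R² comes within 0.01 of the largest value attainable under this toy model (0.2). Each bisection step spends a full batch of simulations at a single candidate size, which is the cost an adaptive surrogate-based search aims to reduce.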
Taxonomy
Topics: Machine Learning in Healthcare · Sepsis Diagnosis and Treatment · Artificial Intelligence in Healthcare and Education
