TL;DR
This paper introduces a multiple testing framework for diagnostic accuracy studies with sensitivity and specificity as co-primary endpoints, improving the evaluation and selection of AI-based diagnostic models through family-wise error rate control and Bayesian planning.
Contribution
It proposes a novel simultaneous testing procedure for co-primary endpoints and a Bayesian method for optimal model evaluation planning, enhancing diagnostic accuracy assessment.
Findings
Asymptotic control of family-wise error rate achieved
Simulation shows improved model selection and power
Bayesian approach optimizes number of models evaluated
Abstract
Major advances have been made regarding the utilization of artificial intelligence in health care. In particular, deep learning approaches have been successfully applied for automated and assisted disease diagnosis and prognosis based on complex and high-dimensional data. However, despite all justified enthusiasm, overoptimistic assessments of predictive performance are still common. Automated medical testing devices based on machine-learned prediction models should thus undergo a thorough evaluation before being implemented into clinical practice. In this work, we propose a multiple testing framework for (comparative) phase III diagnostic accuracy studies with sensitivity and specificity as co-primary endpoints. Our approach challenges the frequent recommendation to strictly separate model selection and evaluation, i.e., to only assess a single diagnostic model in the evaluation…
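To make the co-primary endpoint idea concrete, here is a minimal sketch of evaluating several candidate models at once, where a model counts as successful only if both its sensitivity and its specificity significantly exceed a benchmark. It is not the paper's procedure: it substitutes a plain Bonferroni correction across models and a one-sided score test per endpoint, and the function names, the 0.8 benchmarks, the 0.025 level, and the counts are all hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p(successes, n, p0):
    """Score-test p-value for H0: p <= p0 against H1: p > p0."""
    p_hat = successes / n
    # Standard error under the null, so it is never zero for 0 < p0 < 1.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return 1 - NormalDist().cdf(z)


def coprimary_p(tp, n_diseased, tn, n_healthy, se0, sp0):
    """Co-primary endpoints: take the larger of the two p-values,
    so a model succeeds only if BOTH endpoints are significant."""
    p_sensitivity = one_sided_p(tp, n_diseased, se0)
    p_specificity = one_sided_p(tn, n_healthy, sp0)
    return max(p_sensitivity, p_specificity)


def select_models(results, se0=0.8, sp0=0.8, alpha=0.025):
    """Return the models passing both endpoints after a Bonferroni
    correction over the number of models evaluated (an assumption,
    standing in for the paper's simultaneous test)."""
    m = len(results)
    return [
        name
        for name, (tp, n_diseased, tn, n_healthy) in results.items()
        if coprimary_p(tp, n_diseased, tn, n_healthy, se0, sp0) <= alpha / m
    ]


# Hypothetical evaluation counts: (true positives, diseased cases,
# true negatives, healthy cases) for three candidate models.
results = {
    "model_a": (95, 100, 190, 200),  # strong on both endpoints
    "model_b": (95, 100, 165, 200),  # specificity too close to 0.8
    "model_c": (82, 100, 185, 200),  # sensitivity too close to 0.8
}
selected = select_models(results)
print(selected)
```

Bonferroni ignores the correlation between models evaluated on the same test data, so it is typically more conservative than a simultaneous procedure that accounts for that dependence.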
Taxonomy
MethodsTest
