Sample size and power analysis for ROC AUC differences in diagnostic tests: a methodological evaluation of the Obuchowski-McClish and Hanley-McNeil methods
Busra Emir, Fatma Ezgi Can, Elif Kaymaz, Zeynep Ozel, Mehmet Goktug Efgan, Mustafa Agah Tekindal, Ferhan Elmali

TL;DR
This paper compares methods for calculating sample sizes in diagnostic test studies, showing how factors like data type and test correlation affect required participant numbers.
Contribution
The study provides evidence-based guidance on optimal methodological choices for efficient sample size planning in diagnostic accuracy studies.
Findings
Required sample sizes varied from 36 to 3,709 participants per group depending on AUC difference, data type, and correlation.
Continuous data models outperformed discrete models, requiring 24–53% fewer participants.
Strong inter-test correlation reduced sample sizes by up to 68% in continuous models.
Abstract
Sample size determination for area under the curve (AUC) comparisons in diagnostic accuracy studies requires the consideration of multiple methodological parameters. The type of diagnostic test, the nature of the data (discrete or continuous), the correlation structure between tests, and the size of the AUC difference all influence optimal study design and planning. To address these factors, this study provides sample size and power calculations for comparing AUCs between diagnostic tests across clinically relevant scenarios. We evaluated sample size and power for AUC comparisons under varying correlation levels (ρ = 0.30, 0.50, 0.80), data types (discrete vs. continuous), and AUC differences (ΔAUC = 0.02–0.10). The Obuchowski–McClish method was applied for discrete data, and the Hanley–McNeil approach was applied for continuous data,…
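To make the role of correlation and ΔAUC concrete, the following is a minimal Python sketch of a Hanley–McNeil style calculation. It is not the authors' code and is not expected to reproduce the figures reported above. It uses the Hanley–McNeil (1982) variance approximation for a single AUC and the usual standard error for a difference between two correlated AUCs. Several things here are assumptions made for illustration: the function names, equal numbers of diseased and non-diseased subjects, a two-sided normal-approximation test that uses the same standard error under the null and the alternative, and the parameter r, which is treated directly as the correlation between the two AUC estimates rather than derived from the between-test correlation ρ.

```python
import math
from statistics import NormalDist


def hanley_mcneil_var(auc, n_pos, n_neg):
    """Hanley-McNeil (1982) approximate variance of a single AUC estimate."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    numerator = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    )
    return numerator / (n_pos * n_neg)


def power_auc_difference(auc1, auc2, n_per_group, r, alpha=0.05):
    """Approximate two-sided power for detecting auc1 != auc2.

    Assumption: r is the correlation between the two AUC estimates, and
    both tests are measured on the same n_per_group diseased and
    n_per_group non-diseased subjects.
    """
    v1 = hanley_mcneil_var(auc1, n_per_group, n_per_group)
    v2 = hanley_mcneil_var(auc2, n_per_group, n_per_group)
    # Standard error of the difference between two correlated estimates.
    se_diff = math.sqrt(v1 + v2 - 2.0 * r * math.sqrt(v1 * v2))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(abs(auc1 - auc2) / se_diff - z_crit)


def n_per_group_for_power(auc1, auc2, r, alpha=0.05, target=0.80, n_max=100_000):
    """Smallest n per group whose approximate power reaches the target."""
    for n in range(2, n_max + 1):
        if power_auc_difference(auc1, auc2, n, r, alpha) >= target:
            return n
    raise ValueError("target power not reached within n_max")


if __name__ == "__main__":
    # Illustrative inputs only; these are not scenarios taken from the paper.
    for r in (0.30, 0.50, 0.80):
        print(r, n_per_group_for_power(0.80, 0.90, r))
```

Under these assumptions, a larger r shrinks the standard error of the difference, so fewer participants are needed for the same power, and a smaller ΔAUC increases the required n. This matches the direction of the effects described in the findings, though not necessarily their magnitude.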
Taxonomy
Topics: Sepsis Diagnosis and Treatment · Meta-analysis and systematic reviews · Reliability and Agreement in Measurement
