Statistical Tests and Research Assessments: A comment on Schneider (2012)
Lutz Bornmann, Loet Leydesdorff

TL;DR
This paper discusses the role of statistical tests in research assessments, supporting the addition of effect size measures alongside significance testing to improve evaluation accuracy.
Contribution
It advocates for combining statistical power analysis and effect size measures with significance tests in research evaluations.
Findings
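Effect size measures were added to the Excel sheets provided online for testing performance differences between institutions in the Leiden Ranking and the SCImago Institutions Ranking.
Statistical power analysis and effect size measures are endorsed as additions to research evaluations.
Significance testing should be complemented by these procedures, not replaced by them.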
Abstract
In a recent presentation at the 17th International Conference on Science and Technology Indicators, Schneider (2012) criticised the proposal of Bornmann, de Moya Anegon, and Leydesdorff (2012) and Leydesdorff and Bornmann (2012) to use statistical tests in order to evaluate research assessments and university rankings. We agree with Schneider's proposal to add statistical power analysis and effect size measures to research evaluations, but disagree that these procedures should replace significance testing. Accordingly, effect size measures were added to the Excel sheets that we make available online for testing performance differences between institutions in the Leiden Ranking and the SCImago Institutions Ranking.
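To illustrate why an effect size is worth reporting next to a significance test, the following minimal sketch compares two hypothetical institutions on their share of highly cited papers. It is not the authors' Excel implementation: the choice of a pooled two-proportion z-test, the choice of Cohen's h as the effect size, the function names, and all counts are assumptions made for this example.

```python
import math


def two_proportion_z_test(k1, n1, k2, n2):
    """Pooled two-sided z-test for the difference between two proportions.

    k1, k2: number of highly cited papers (hypothetical counts)
    n1, n2: total number of papers per institution
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# Hypothetical data: institution A has 1200 of 10000 papers in the top
# citation class, institution B has 1100 of 10000.
z, p_value = two_proportion_z_test(1200, 10000, 1100, 10000)
h = cohens_h(1200 / 10000, 1100 / 10000)

print(f"z = {z:.3f}, p = {p_value:.4f}, Cohen's h = {h:.4f}")
```

With these made-up counts the difference is significant at the 5% level, yet Cohen's h lies far below 0.2, the conventional threshold for a "small" effect. With large publication sets the significance test alone says little about whether a difference matters in practice, which is the case for reporting both quantities together.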
Taxonomy
Topics: Advanced Statistical Methods and Models · Diverse Academic Research Areas · Reliability and Agreement in Measurement
