statcheck is flawed by design and no valid spell checker for statistical results
Ingmar Böschen

TL;DR
This paper critically evaluates the R package statcheck, revealing it is inherently flawed by design and unsuitable as a reliable spell checker for statistical results due to its strict adherence to APA reporting standards.
Contribution
The study demonstrates the limitations of statcheck's detection heuristic and argues for more flexible algorithms, such as those using LLMs, to improve statistical result checking.
Findings
statcheck fails to detect many variations of reported results
its heuristic is limited to APA-style reporting
more adaptable tools are needed for effective checking
Abstract
The R package statcheck is designed to extract statistical test results from text and check the consistency of the reported test statistics and corresponding p-values. Recently, it has also been featured as a spell checker for statistical results, aimed at improving reporting accuracy in scientific publications. In this study, I perform a check on statcheck using a non-exhaustive list of 187 simple text strings with arbitrary statistical test results. These strings represent a wide range of textual representations of results including correctly manageable results, non-targeted test statistics, variable reporting styles, and common typos. Since statcheck's detection heuristic is tied to a specific set of statistical test results that strictly adhere to the American Psychological Association (APA) reporting guidelines, it is unable to detect and check any reported result that even…
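To make the abstract's point concrete, here is a minimal sketch of a pattern-based consistency check. It is written in Python for illustration and is not statcheck's code: the regular expression `APA_Z` and the function `check_z_report` are hypothetical names, the pattern covers only a z-test in one strict APA-like layout, and the rounding rule is a simplifying assumption. It shows how a check tied to a fixed text pattern recomputes the p-value when the pattern matches and silently skips any result written differently.

```python
import math
import re

# Hypothetical, simplified pattern for a strictly APA-like z-test report,
# e.g. "z = 2.10, p = .036". This is an assumption for illustration only
# and is NOT the expression used by statcheck.
APA_Z = re.compile(
    r"\bz\s*=\s*(?P<z>-?\d*\.?\d+),\s*p\s*(?P<cmp>[<>=])\s*(?P<p>0?\.\d+)"
)


def check_z_report(text):
    """Return None if no matching z result is found; otherwise a dict with
    the reported p, the recomputed two-sided p, and a consistency flag."""
    m = APA_Z.search(text)
    if m is None:
        return None  # result is never checked, whether right or wrong

    z = float(m.group("z"))
    reported = float(m.group("p"))
    # Two-sided p-value under the standard normal distribution.
    recomputed = math.erfc(abs(z) / math.sqrt(2))

    cmp = m.group("cmp")
    if cmp == "=":
        # Assumed rule: accept differences explained by rounding to the
        # number of decimals that were reported.
        decimals = len(m.group("p").split(".")[1])
        consistent = abs(recomputed - reported) <= 0.5 * 10 ** -decimals
    elif cmp == "<":
        consistent = recomputed < reported
    else:
        consistent = recomputed > reported

    return {"reported": reported, "recomputed": recomputed,
            "consistent": consistent}


if __name__ == "__main__":
    for s in ["z = 2.10, p = .036",          # matches, consistent
              "z = 2.10, p = .360",          # matches, inconsistent
              "z-value = 2.10 (p = .036)",   # not matched
              "Z=2.10; p=.036"]:             # not matched
        print(repr(s), "->", check_z_report(s))
```

In this sketch the last two strings return `None`, so an error in either of them would go unnoticed. That is the failure mode the abstract describes for results that deviate from the expected reporting style.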
Taxonomy
Topics: Data Analysis with R
