Six textbook mistakes in data analysis
Alexandros Gezerlis, Martin Williams

TL;DR
This paper identifies and corrects six common misconceptions in data analysis textbooks, emphasizing the importance of accurate statistical understanding for scientific and engineering data interpretation.
Contribution
It highlights widespread textbook errors in statistical methods and provides clear corrections, improving foundational understanding in data analysis education.
Findings
Six common textbook mistakes identified and corrected
Corrections applicable to a wide range of scientific and engineering data analysis
Aims to improve the accuracy of statistical teaching and practice
Abstract
This article discusses a number of incorrect statements appearing in textbooks on data analysis, machine learning, or computational methods; the common theme in all these cases is the relevance and application of statistics to the study of scientific or engineering data; these mistakes are also quite prevalent in the research literature. Crucially, we do not address errors made by an individual author, focusing instead on mistakes that are widespread in the introductory literature. After some background on frequentist and Bayesian linear regression, we turn to our six paradigmatic cases, providing in each instance a specific example of the textbook mistake, pointers to the specialist literature where the topic is handled properly, along with a correction that summarizes the salient points. The mistakes (and corrections) are broadly relevant to any technical setting where statistical…
Taxonomy
Topics: Statistics Education and Methodologies · Multidisciplinary Science and Engineering Research · Machine Learning and Data Classification
