A Statistical Significance Simulation Study for the General Scientist
Jacob Levman

TL;DR
This study visually demonstrates that the common p < 0.05 threshold for statistical significance can lead to false positives, especially at large sample sizes, by analyzing random Gaussian noise data with t-tests.
Contribution
It reveals that standard significance criteria may produce false positives at large sample sizes, challenging conventional interpretation of statistical results.
Findings
Statistically significant results can occur by chance in random data.
Large sample sizes increase the likelihood of false positives.
The study highlights potential misinterpretations in scientific research.
Abstract
When a scientist performs an experiment they normally acquire a set of measurements and are expected to demonstrate that their results are "statistically significant" thus confirming whatever hypothesis they are testing. The main method for establishing statistical significance involves demonstrating that there is a low probability that the observed experimental results were the product of random chance. This is typically defined as p < 0.05, which indicates there is less than a 5% chance that the observed results occurred randomly. This research study visually demonstrates that the commonly used definition for "statistical significance" can erroneously imply a significant finding. This is demonstrated by generating random Gaussian noise data and analyzing that data using statistical testing based on the established two-sample t-test. This study demonstrates that insignificant yet…
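The procedure the abstract describes (draw two groups of random Gaussian noise, compare them with a two-sample t-test, and count how often p < 0.05) can be sketched as follows. This is a minimal illustration, not the author's code. The function names, group sizes, trial count, and seed are all hypothetical. To stay within the Python standard library, the p-value uses a normal approximation to the t distribution, which is only reasonable when the groups are fairly large.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic for groups a and b."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / (na + nb - 2)
    return (fmean(a) - fmean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def approx_p_value(t):
    """Two-sided p-value. ASSUMPTION: normal approximation to the
    t distribution, adequate only for large degrees of freedom."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def false_positive_rate(n_per_group, n_trials=1000, alpha=0.05, seed=0):
    """Fraction of pure-noise experiments that reach p < alpha.

    Both groups come from the same N(0, 1) distribution, so every
    'significant' result counted here is a false positive.
    """
    rng = random.Random(seed)  # hypothetical seed, for repeatability
    hits = 0
    for _ in range(n_trials):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if approx_p_value(two_sample_t(a, b)) < alpha:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for n in (50, 200):  # illustrative group sizes
        print(f"n per group = {n}: rate = {false_positive_rate(n):.3f}")
```

Calling `false_positive_rate` with different values of `n_per_group` lets a reader check how often noise alone crosses the threshold at each sample size. For small groups, an exact t-distribution p-value from a statistics library would be more appropriate than the approximation used here.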
Taxonomy
Topics: Meta-analysis and systematic reviews · Health Sciences Research and Education · Clinical Reasoning and Diagnostic Skills
