Statistical methods for linguistic research: Foundational Ideas - Part I

Shravan Vasishth; Bruno Nicenboim

arXiv:1601.01126·stat.AP·December 14, 2016·UIST

TL;DR

This paper explains fundamental statistical hypothesis testing concepts within the frequentist framework, emphasizing proper study design, interpretation, and replication to improve linguistic research validity.

Contribution

It provides a clear, detailed explanation of hypothesis testing principles and addresses common pitfalls in linguistic experiments, promoting rigorous statistical practices.

Findings

01 Importance of adequately powered studies

02 Misconceptions about p-values clarified

03 Recommendations for best practices in linguistic research

Abstract

We present the fundamental ideas underlying statistical hypothesis testing using the frequentist framework. We begin with a simple example that builds up the one-sample t-test from the beginning, explaining important concepts such as the sampling distribution of the sample mean, and the iid assumption. Then we examine the p-value in detail, and discuss several important misconceptions about what a p-value does and does not tell us. This leads to a discussion of Type I, II error and power, and Type S and M error. An important conclusion from this discussion is that one should aim to carry out appropriately powered studies. Next, we discuss two common issues we have encountered in psycholinguistics and linguistics: running experiments until significance is reached, and the "garden-of-forking-paths" problem discussed by Gelman and others, whereby the researcher attempts to find statistical…
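As a rough illustration of the one-sample t-test mentioned in the abstract, the following minimal sketch computes the t statistic from the sample mean and its estimated standard error. It is not taken from the paper: the function name `one_sample_t`, the data values, and the null value `mu0` are all assumptions chosen for illustration, and the data are assumed to be iid.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0.

    Hypothetical helper for illustration; assumes iid observations.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)   # sample SD, with n - 1 in the denominator
    se = sd / math.sqrt(n)          # estimated SD of the sampling distribution of the mean
    return (mean - mu0) / se, n - 1


# Made-up data, not from the paper.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df = one_sample_t(sample, mu0=2.0)
print(f"t({df}) = {t_stat:.3f}")
```

The statistic is the distance between the sample mean and the hypothesized mean, measured in standard errors; a p-value would then be read off the t-distribution with `df` degrees of freedom.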
