Critical issues with the Pearson's chi-square test

Vladimir Gurvich; Mariya Naumova

arXiv:2505.06318·stat.ME·May 13, 2025

Open Access

TL;DR

This paper critically examines the widespread misuse of Pearson's chi-square test, highlighting its non-invariance under scaling and the implications for its validity in various scientific applications.

Contribution

It reveals fundamental issues with the invariance property of Pearson's chi-square test, questioning its reliability when applied to scaled contingency tables.

Findings

1. The chi-square statistic is not invariant under scaling of data.

2. Scaling data can arbitrarily affect the test outcome.

3. The current usage of chi-square tests may lead to incorrect conclusions.
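
The first two findings can be checked with a short computation. This is only a sketch, assuming the statistic is the standard Pearson formula with expected counts built from row and column totals; the symbols $a_{ij}$, $e_{ij}$, $r_i$, $c_j$, $N$ and the scale factor $k$ are notation introduced here, not taken from the paper. For a nonnegative $m \times n$ matrix $A = (a_{ij})$ with row sums $r_i$, column sums $c_j$ and grand total $N$,

$$
\chi^2_{\mathrm{stat}}(A) = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(a_{ij} - e_{ij})^2}{e_{ij}}, \qquad e_{ij} = \frac{r_i c_j}{N}.
$$

Replacing $A$ by $kA$ with $k > 0$ multiplies $r_i$, $c_j$ and $N$ by $k$, hence $e_{ij}$ by $k$ as well, so

$$
\chi^2_{\mathrm{stat}}(kA) = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{k^2 (a_{ij} - e_{ij})^2}{k \, e_{ij}} = k \, \chi^2_{\mathrm{stat}}(A).
$$

Since the critical value does not depend on the entries of $A$, any table with $\chi^2_{\mathrm{stat}}(A) > 0$ can be pushed above the threshold by a large enough $k$, or below it by a small enough $k$.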

Abstract

Pearson's chi-square tests are among the most commonly applied statistical tools across a wide range of scientific disciplines, including medicine, engineering, biology, sociology, marketing and business. However, their usage in some areas is not correct. For example, the chi-square test for homogeneity of proportions (that is, comparing proportions across groups in a contingency table) is frequently used to verify whether the rows of a given nonnegative $m \times n$ (contingency) matrix $A$ are proportional. The null hypothesis $H_{0}$: "the $m$ rows are proportional" (for the whole population) is rejected with confidence level $1 - \alpha$ if and only if $\chi^{2}_{\mathrm{stat}} > \chi^{2}_{\mathrm{crit}}$, where the first term is given by Pearson's formula, while the second one depends only on $m$, $n$, and $\alpha$, but not on the entries of $A$. It is immediate to notice that Pearson's formula is not…
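
The decision rule described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes the standard Pearson formula with expected counts $r_i c_j / N$, and the function names `pearson_chi2` and `scale`, the example table `A`, and the choice $\alpha = 0.05$ are all hypothetical choices made for this illustration. For a $2 \times 2$ table the test has one degree of freedom, whose 5% critical value is approximately 3.841.

```python
def pearson_chi2(table):
    """Pearson's chi-square statistic for a nonnegative m x n table.

    Assumes the standard formula: sum of (observed - expected)^2 / expected,
    with expected = row_sum * col_sum / grand_total.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def scale(table, k):
    """Multiply every entry of the table by the factor k."""
    return [[k * x for x in row] for row in table]


# Upper 5% point of the chi-square distribution with 1 degree of freedom
# (df = (m - 1) * (n - 1) = 1 for a 2 x 2 table); alpha = 0.05 is an assumption.
CHI2_CRIT = 3.841

# Hypothetical example table: the rows are not proportional.
A = [[12, 18], [18, 12]]

stat_A = pearson_chi2(A)             # about 2.4
stat_2A = pearson_chi2(scale(A, 2))  # about 4.8, twice the value for A

print("A : stat =", round(stat_A, 3), "reject H0:", stat_A > CHI2_CRIT)
print("2A: stat =", round(stat_2A, 3), "reject H0:", stat_2A > CHI2_CRIT)
```

Both tables have the same row proportions (40% versus 60% in the first column), yet under these assumptions the hypothesis is retained for `A` and rejected for `2A`, because the statistic doubles while the critical value stays fixed.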

Taxonomy

Topics: Advanced Statistical Methods and Models · Sensory Analysis and Statistical Methods · SAS software applications and methods