An introduction to how chi-square and classical exact tests often wildly misreport significance and how the remedy lies in computers

William Perkins, Mark Tygert, and Rachel Ward

arXiv:1201.1431 · stat.ME · April 9, 2024 · 1 citation

TL;DR

This paper highlights the limitations of traditional chi-square and classical exact tests in goodness-of-fit testing for discrete distributions. It advocates tests based on the Euclidean distance, which are often more powerful and whose significance can now be computed quickly by software.

Contribution

It shows through numerous examples that goodness-of-fit tests based on the Euclidean distance often outperform chi-square and classical exact tests when the model is a discrete distribution that is not close to uniform, and it emphasizes the role of computers in calculating the significance of these tests.

Findings

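01. Tests based on the Euclidean distance often outperform chi-square and classical exact tests by at least an order of magnitude when the model is a discrete distribution that is not close to uniform.

02. Modern computers make it practical to calculate the significance of Euclidean goodness-of-fit statistics.

03. Euclidean tests are more reliable for non-uniform discrete distributions.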

Abstract

Goodness-of-fit tests based on the Euclidean distance often outperform chi-square and other classical tests (including the standard exact tests) by at least an order of magnitude when the model being tested for goodness-of-fit is a discrete probability distribution that is not close to uniform. The present article discusses numerous examples of this. Goodness-of-fit tests based on the Euclidean metric are now practical and convenient: although the actual values taken by the Euclidean distance and similar goodness-of-fit statistics are seldom humanly interpretable, black-box computer programs can rapidly calculate their precise significance.
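The abstract's point that software can calculate significance even when the statistic itself is hard to interpret can be illustrated with a minimal sketch. It is not the authors' software. It assumes a fully specified model (no parameters estimated from the data), takes the Euclidean statistic to be the distance between the observed proportions and the model probabilities, and estimates the P-value by plain Monte Carlo simulation. The function names, the example counts, and the number of simulations are all hypothetical.

```python
import math
import random


def euclidean_distance(counts, probs):
    """Euclidean distance between observed proportions and model probabilities."""
    n = sum(counts)
    return math.sqrt(sum((c / n - p) ** 2 for c, p in zip(counts, probs)))


def chi_square(counts, probs):
    """Pearson chi-square statistic, included for comparison."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def simulate_counts(n, probs, rng):
    """Draw n i.i.d. observations from the model and tally them per bin."""
    counts = [0] * len(probs)
    for b in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[b] += 1
    return counts


def monte_carlo_p_value(counts, probs, statistic, n_sims=2000, seed=0):
    """Fraction of simulated data sets whose statistic is at least the observed one."""
    rng = random.Random(seed)
    n = sum(counts)
    observed = statistic(counts, probs)
    exceed = 0
    for _ in range(n_sims):
        if statistic(simulate_counts(n, probs, rng), probs) >= observed:
            exceed += 1
    return exceed / n_sims


if __name__ == "__main__":
    # Hypothetical non-uniform model and made-up observed counts.
    probs = [0.5, 0.25, 0.125, 0.0625, 0.0625]
    counts = [20, 15, 5, 5, 5]
    for name, stat in [("Euclidean", euclidean_distance), ("chi-square", chi_square)]:
        print(name, monte_carlo_p_value(counts, probs, stat))
```

Both statistics go through the same simulation routine, so the only difference between the two tests in this sketch is the discrepancy measure. The Monte Carlo estimate carries sampling error of order one over the square root of `n_sims`, so the number of simulations bounds how small a P-value can be resolved.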

Taxonomy

Topics: Diverse Scientific and Engineering Research · Advanced Statistical Modeling Techniques · Advanced Statistical Methods and Models