Evidence for goodness of fit in Karl Pearson chi-squared statistics
Robert G. Staudte

TL;DR
This paper proposes an equivalence testing approach for chi-squared goodness-of-fit, providing a new interpretation and calibration scale for the statistic, with applications to random number generators and digit analysis.
Contribution
It introduces an equivalence testing framework for chi-squared statistics, offering a new way to assess goodness of fit beyond traditional significance testing.
Findings
The evidence can distinguish between normal and nearby models.
The evidence can also distinguish between Poisson and over-dispersed models.
Sample size guidelines are given for reaching a desired level of evidence.
Abstract
Chi-squared tests for lack of fit are traditionally employed to find evidence against a hypothesized model, with the model accepted if the Karl Pearson statistic comparing observed and expected numbers of observations falling within cells is not significantly large. However, if one really wants evidence for goodness of fit, it is better to adopt an equivalence testing approach in which small values of the chi-squared statistic are evidence for the desired model. This method requires one to define what is meant by equivalence to the desired model, and guidelines are proposed. Then a simple extension of the classical normalizing transformation for the non-central chi-squared distribution places these values on a simple-to-interpret calibration scale for evidence. It is shown that the evidence can distinguish between normal and nearby models, as well as between the Poisson and over-dispersed…
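The reversal of hypotheses described in the abstract can be illustrated with a generic equivalence test. The sketch below is not the paper's calibration scale or its normalizing transformation, neither of which is reproduced on this page. It assumes the usual approximation that, under a fixed alternative, the Pearson statistic is roughly non-central chi-squared with non-centrality n times the squared discrepancy, and it treats "not equivalent" as the null hypothesis. The names `margin_w`, `equivalence_p_value`, `chi2_cdf`, and `ncx2_cdf` are hypothetical, and the cell counts are invented for illustration.

```python
import math

def pearson_statistic(observed, expected):
    """Pearson statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_cdf(x, df):
    """Central chi-squared CDF via the power series for the regularized
    lower incomplete gamma function (adequate for moderate x only)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 1
    while term > 1e-16 * total and k < 100000:
        term *= z / (a + k)
        total += term
        k += 1
    log_prefactor = a * math.log(z) - z - math.lgamma(a + 1.0)
    return min(1.0, total * math.exp(log_prefactor))

def ncx2_cdf(x, df, ncp, max_terms=1000):
    """Non-central chi-squared CDF as a Poisson(ncp / 2) mixture of
    central chi-squared CDFs with df + 2j degrees of freedom."""
    if ncp <= 0:
        return chi2_cdf(x, df)
    half = ncp / 2.0
    total = 0.0
    for j in range(max_terms):
        weight = math.exp(-half + j * math.log(half) - math.lgamma(j + 1.0))
        total += weight * chi2_cdf(x, df + 2 * j)
    return min(1.0, total)

def equivalence_p_value(observed, expected, margin_w):
    """Test H0: discrepancy >= margin_w against H1: discrepancy < margin_w.

    margin_w is an assumed equivalence margin on the effect-size scale,
    so the boundary non-centrality is n * margin_w**2. Small values of
    the statistic give small p-values, i.e. evidence FOR the model.
    """
    n = sum(observed)
    df = len(observed) - 1  # assumes no parameters were estimated
    stat = pearson_statistic(observed, expected)
    return stat, ncx2_cdf(stat, df, n * margin_w ** 2)

# Invented counts: 100 observations over 5 equally likely cells.
observed = [18, 22, 20, 19, 21]
expected = [20.0] * 5
stat, p_value = equivalence_p_value(observed, expected, margin_w=0.3)
print(f"statistic = {stat:.3f}, equivalence p-value = {p_value:.3g}")
```

The choice of `margin_w` is the step the abstract refers to as defining what equivalence means. The value 0.3 here is arbitrary and is not taken from the paper's guidelines.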
