A Monte Carlo comparison of categorical tests of independence
Abdulaziz Alenazi

TL;DR
This study compares chi-squared and G-squared tests for categorical independence, recommending chi-squared for small samples and permutation G-squared for conditional independence, based on extensive Monte Carlo simulations.
Contribution
To the best of the author's knowledge, it provides the first extensive Monte Carlo comparison of these tests, clarifying when each applies and where it breaks down across small and large sample sizes.
Findings
The chi-squared and G-squared tests are indistinguishable in large samples.
The chi-squared test is recommended for small samples, regardless of zero frequencies, when testing unconditional independence.
Permutation G-squared is suggested for conditional independence testing.
Abstract
The chi-squared and G-squared tests are the most frequently applied tests of independence between two categorical variables. However, to the best of our knowledge, no one has compared them extensively and ultimately answered the question of which to use and when. Further, their applicability in cases with zero frequencies has been debated, and (nonparametric) permutation tests have been suggested. In this work we perform extensive Monte Carlo simulation studies attempting to answer both aforementioned points. As expected, in large-sample cases the chi-squared and G-squared tests are indistinguishable. In small-sample cases, though, we provide strong evidence supporting the use of the chi-squared test, regardless of zero frequencies, for the case of unconditional independence. Also, we suggest the use of the permutation-based G-squared test for testing conditional independence, at…
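To make the two statistics concrete, the following is a minimal sketch of how the Pearson chi-squared and G-squared statistics, together with permutation p-values, could be computed for the unconditional case. It is not the author's code: the function names (`contingency_table`, `chi2_and_g2`, `permutation_pvalues`), the number of permutations, and the convention that a zero cell contributes nothing to G-squared are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def contingency_table(x, y):
    """Cross-tabulate two equal-length sequences of category labels."""
    rows = sorted(set(x))
    cols = sorted(set(y))
    counts = Counter(zip(x, y))
    return [[counts[(r, c)] for c in cols] for r in rows]


def chi2_and_g2(table):
    """Return (Pearson chi-squared, G-squared) for a table of counts.

    Assumes every row and every column has a positive total, so that
    all expected counts are positive.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = g2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            chi2 += (obs - exp) ** 2 / exp
            if obs > 0:  # assumed convention: 0 * log(0) is taken as 0
                g2 += 2.0 * obs * math.log(obs / exp)
    return chi2, g2


def permutation_pvalues(x, y, n_perm=999, seed=0):
    """Permutation p-values for both statistics, obtained by shuffling y."""
    rng = random.Random(seed)
    obs_chi2, obs_g2 = chi2_and_g2(contingency_table(x, y))
    y_perm = list(y)
    hits_chi2 = hits_g2 = 0
    tol = 1e-12  # guards against floating-point noise when statistics tie
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        chi2, g2 = chi2_and_g2(contingency_table(x, y_perm))
        hits_chi2 += chi2 >= obs_chi2 - tol
        hits_g2 += g2 >= obs_g2 - tol
    # The +1 terms count the observed data as one of the permutations.
    return (hits_chi2 + 1) / (n_perm + 1), (hits_g2 + 1) / (n_perm + 1)
```

Calling `permutation_pvalues(x, y)` on two label sequences returns one p-value per statistic. Shuffling `y` keeps both margins fixed, which is why the sketch needs no asymptotic reference distribution; the conditional-independence case would require permuting within each level of the conditioning variable and is not covered here.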
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Multi-Criteria Decision Making · Fuzzy Systems and Optimization
