Semiparametric clustered overdispersed multinomial goodness-of-fit of log-linear models
Juana M. Alonso-Revenga, Nirian Martin, Leandro Pardo

TL;DR
This paper develops new distribution-free goodness-of-fit tests for log-linear models applied to clustered contingency tables, accounting for potential intra-cluster correlation without relying on the Dirichlet-multinomial assumption.
Contribution
It introduces test statistics that require no distributional specification and that are built on an estimator of the intracluster correlation coefficient valid for clusters of different sizes.
Findings
The new test statistics test log-linear modeling hypotheses without a distributional specification
The proposed methods accommodate homogeneously correlated individuals within clusters
The approach extends goodness-of-fit testing beyond Dirichlet-multinomial models
Abstract
Traditionally, the Dirichlet-multinomial distribution has been recognized as a key model for contingency tables generated by cluster sampling schemes. There are, however, other possible distributions appropriate for these contingency tables. This paper introduces new test-statistics capable of testing log-linear modeling hypotheses with no distributional specification, when the individuals of the clusters are possibly homogeneously correlated. The estimator for the intracluster correlation coefficient proposed in Alonso-Revenga et al. (2016), valid for different cluster sizes, plays a crucial role in the construction of the goodness-of-fit test-statistic.
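To make the general idea concrete, the following is a minimal sketch of a design-effect-corrected Pearson test for clustered counts. It is not the paper's test-statistic, and `moment_icc` is not the Alonso-Revenga et al. (2016) estimator: it is a rough method-of-moments stand-in that assumes the overdispersed covariance n_l [1 + rho^2 (n_l - 1)] (diag(p) - p p^T) for a cluster of size n_l. The independence model for a two-way table stands in for a generic log-linear model, all function names are hypothetical, and every cell is assumed to have a positive pooled count.

```python
from math import fsum


def pooled_counts(clusters):
    """Sum the per-cluster cell counts into one pooled table (flat list)."""
    return [fsum(col) for col in zip(*clusters)]


def moment_icc(clusters):
    """Rough moment estimate of the intracluster correlation rho^2.

    Assumption: E[per-cluster Pearson statistic] is about
    (M - 1) * (1 + rho^2 * (n_l - 1)) for M cells and cluster size n_l.
    """
    sizes = [sum(y) for y in clusters]
    total = fsum(sizes)
    p = [c / total for c in pooled_counts(clusters)]
    n_clusters, n_cells = len(clusters), len(p)
    x2 = fsum((y_j - n * p_j) ** 2 / (n * p_j)
              for y, n in zip(clusters, sizes)
              for y_j, p_j in zip(y, p))
    # N - 1 rather than N roughly accounts for estimating p from the data.
    excess = x2 / (n_cells - 1) - (n_clusters - 1)
    rho2 = excess / fsum(n - 1 for n in sizes)
    return min(max(rho2, 0.0), 1.0)  # clip to the admissible range


def independence_fit(pooled, n_rows, n_cols):
    """Expected counts under row-column independence (row-major layout)."""
    total = fsum(pooled)
    rows = [fsum(pooled[i * n_cols:(i + 1) * n_cols]) for i in range(n_rows)]
    cols = [fsum(pooled[j::n_cols]) for j in range(n_cols)]
    return [rows[i] * cols[j] / total
            for i in range(n_rows) for j in range(n_cols)]


def adjusted_pearson(clusters, n_rows, n_cols):
    """Pearson statistic for independence divided by an estimated design effect.

    Returns (adjusted statistic, degrees of freedom, rho^2 estimate, design effect).
    """
    sizes = [sum(y) for y in clusters]
    total = fsum(sizes)
    pooled = pooled_counts(clusters)
    expected = independence_fit(pooled, n_rows, n_cols)
    x2 = fsum((o - e) ** 2 / e for o, e in zip(pooled, expected))
    rho2 = moment_icc(clusters)
    # Pooled-count variance inflation when cluster sizes differ.
    deff = 1.0 + rho2 * (fsum(n * n for n in sizes) / total - 1.0)
    df = (n_rows - 1) * (n_cols - 1)
    return x2 / deff, df, rho2, deff


# Illustrative, made-up data: five clusters, each a flattened 2x2 table.
example = [[3, 1, 0, 2], [1, 1, 1, 1], [4, 0, 0, 3], [2, 2, 1, 0], [0, 1, 2, 2]]
stat, df, rho2, deff = adjusted_pearson(example, n_rows=2, n_cols=2)
print(stat, df, rho2, deff)
```

The adjusted statistic would be referred to a chi-square distribution with `df` degrees of freedom. When the estimated correlation is zero the design effect is 1 and the usual Pearson test is recovered.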
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models · Data-Driven Disease Surveillance
