Goodness-of-Fit Tests for Latent Class Models with Ordinal Categorical Data
Huan Qing

TL;DR
This paper introduces a statistical test for determining the number of latent classes in latent class models for ordinal categorical data, a model-selection problem that arises throughout psychology, education, and other social sciences.
Contribution
It proposes a novel test statistic based on the largest singular value of a residual matrix, enabling consistent estimation of the true number of latent classes.
Findings
Under the correct number of classes, an upper bound on the test statistic converges to zero in probability
Under an under-fitted model, the statistic exceeds a fixed positive constant
Algorithms reliably estimate the true number of classes
Abstract
Ordinal categorical data are widely collected in psychology, education, and other social sciences, appearing commonly in questionnaires, assessments, and surveys. Latent class models provide a flexible framework for uncovering unobserved heterogeneity by grouping individuals into homogeneous classes based on their response patterns. A fundamental challenge in applying these models is determining the number of latent classes, which is unknown and must be inferred from data. In this paper, we propose a test statistic for this problem. The test statistic centers the largest singular value of a normalized residual matrix by a simple sample-size adjustment. Under the null hypothesis that the candidate number of latent classes is correct, its upper bound converges to zero in probability. Under an under-fitted alternative, the statistic itself exceeds a fixed positive constant with…
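The core idea in the abstract, comparing the largest singular value of a residual matrix under a correct fit and an under-fitted one, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the paper's construction: the function name `residual_top_singular_value`, the class-mean fit, the division by the square root of the number of respondents, the Binomial response model, and the use of known class labels in place of an estimation step. The paper's actual normalization and sample-size adjustment are not given in this summary and are not reproduced.

```python
import numpy as np

def residual_top_singular_value(R, labels):
    """Largest singular value of (R - class-mean fit), divided by sqrt(N).

    Illustrative only: the class-mean fit and the sqrt(N) scaling are
    assumptions, not the normalization used in the paper.
    """
    R = np.asarray(R, dtype=float)
    fitted = np.empty_like(R)
    for k in np.unique(labels):
        members = labels == k
        # Each class is fitted by its per-item mean response.
        fitted[members] = R[members].mean(axis=0)
    residual = R - fitted
    top = np.linalg.svd(residual, compute_uv=False)[0]
    return top / np.sqrt(R.shape[0])

# Synthetic ordinal data: responses in {0, ..., M} from two latent classes.
rng = np.random.default_rng(0)
N, J, M = 600, 20, 4  # respondents, items, highest response category
true_labels = rng.integers(0, 2, size=N)
# Class 0 answers low (p = 0.2), class 1 answers high (p = 0.8) on every item.
probs = np.where(np.arange(2)[:, None] == 0, 0.2, 0.8) * np.ones((2, J))
R = rng.binomial(M, probs[true_labels])

# Correct number of classes (true labels) versus an under-fitted single class.
stat_correct = residual_top_singular_value(R, true_labels)
stat_underfit = residual_top_singular_value(R, np.zeros(N, dtype=int))
print(f"correct fit: {stat_correct:.3f}, under-fit: {stat_underfit:.3f}")
```

In this toy setting the under-fitted residual retains the between-class structure as a strong low-rank component, so its largest singular value is much larger than under the correct fit. That separation is the qualitative behaviour the test exploits.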
Taxonomy
Topics: Imbalanced Data Classification Techniques · Psychometric Methodologies and Testing · Statistical Methods and Bayesian Inference
