Identifying the Most Appropriate Order for Categorical Responses
Tianmeng Wang, Jie Yang

TL;DR
This paper treats the order of response categories as part of the statistical model and shows that likelihood-based model selection criteria can recover the true order when one exists. A model with a chosen order may also predict better than a nominal-response model, even when no true order exists.
Contribution
It uses likelihood-based model selection criteria to choose the order of response categories and, for multinomial logistic models, shows that some orders are theoretically equivalent and determines how their maximum likelihood estimators are connected.
Findings
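Likelihood-based model selection criteria can identify the true response order when it exists.
A model with a chosen order may outperform nominal-response models in prediction, even if no true order exists.
For multinomial logistic models, certain orders are theoretically equivalent and cannot be differentiated by likelihood criteria.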
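The idea of likelihood-equivalent orders can be illustrated with a small sketch. It assumes a proportional-odds cumulative logit model with the parametrization logit P(Y <= j | x) = cutpoints[j-1] - beta * x, and it uses the reversed order as the example of an equivalent order; the parametrization, the toy data, and the function names are all assumptions made for illustration and are not taken from the paper.

```python
import math


def expit(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(x, cutpoints, beta):
    """P(Y = j | x) for j = 1..J, assuming logit P(Y <= j | x) = cutpoints[j-1] - beta * x."""
    cum = [expit(c - beta * x) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


def log_likelihood(data, cutpoints, beta):
    """Sum of log P(Y = y_i | x_i) over (x, y) pairs, with y coded 1..J."""
    return sum(math.log(category_probs(x, cutpoints, beta)[y - 1]) for x, y in data)


def reverse_order(data, cutpoints, beta):
    """Relabel y -> J + 1 - y and map the parameters to their reversed-order counterparts."""
    n_categories = len(cutpoints) + 1
    rev_data = [(x, n_categories + 1 - y) for x, y in data]
    rev_cutpoints = [-c for c in reversed(cutpoints)]  # still increasing
    return rev_data, rev_cutpoints, -beta


# Hypothetical toy data: (covariate, category in 1..4), plus arbitrary parameter values.
data = [(-1.2, 1), (-0.4, 2), (0.1, 2), (0.3, 3),
        (0.9, 4), (1.5, 4), (-0.7, 1), (0.6, 3)]
cutpoints = [-1.0, 0.2, 1.1]
beta = 0.8

ll = log_likelihood(data, cutpoints, beta)
rev_data, rev_cutpoints, rev_beta = reverse_order(data, cutpoints, beta)
ll_rev = log_likelihood(rev_data, rev_cutpoints, rev_beta)
print(ll, ll_rev)  # the two values agree up to floating-point error
```

Because the parameter map in `reverse_order` is one-to-one, the two orders also reach the same maximized likelihood under this assumed model, so no likelihood-based criterion can prefer one over the other.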
Abstract
Categorical responses arise naturally within various scientific disciplines. In many circumstances, there is no predetermined order for the response categories, and the response has to be modeled as nominal. In this study, we regard the order of response categories as part of the statistical model, and show that the true order, when it exists, can be selected using likelihood-based model selection criteria. For predictive purposes, a statistical model with a chosen order may outperform models based on nominal responses, even if a true order does not exist. For multinomial logistic models, widely used for categorical responses, we show the existence of theoretically equivalent orders that cannot be differentiated based on likelihood criteria, and determine the connections between their maximum likelihood estimators. We use simulation studies and a real-data analysis to confirm the need…
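The selection step described in the abstract can be written compactly. The notation below is assumed for illustration and is not the paper's: $\sigma$ is a permutation of the $J$ category labels, $\ell(\theta; \sigma)$ is the log-likelihood of a fixed model family fitted to the relabeled responses, $p$ is the number of parameters, and $n$ is the sample size.

```latex
\[
  \hat{\sigma} \;=\; \arg\min_{\sigma \in \mathcal{P}_J} \mathrm{AIC}(\sigma),
  \qquad
  \mathrm{AIC}(\sigma) \;=\; -2 \max_{\theta} \ell(\theta; \sigma) + 2p,
  \qquad
  \mathrm{BIC}(\sigma) \;=\; -2 \max_{\theta} \ell(\theta; \sigma) + p \log n .
\]
```

Within a single model family, relabeling the categories does not change $p$ or $n$, so under these assumptions AIC and BIC rank the candidate orders identically, and both reduce to comparing maximized log-likelihoods.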
