A finite mixture model approach to regression under covariate misclassification
P. Richard Hahn, Michelle Xia

TL;DR
This paper introduces a finite mixture model approach to regression that accounts for covariate misclassification, allows model parameters to be estimated even without validation data, and is applied to HIV-related health data.
Contribution
The paper develops a mixture model framework for regression with misclassified covariates, allowing valid inference without validation data and incorporating external error information.
Findings
After adjustment for misclassification, cocaine use has a significant effect on pulmonary complications.
Method performs well even without validation data on misclassification.
Bayesian inference enables integration of external error information.
Abstract
This paper considers the problem of mismeasured categorical covariates in the context of regression modeling; if unaccounted for, such misclassification is known to result in misestimation of model parameters. Here, we exploit the fact that explicitly modeling covariate misclassification leads to a mixture representation. Assuming common parametric families for the mixture components, and assuming that the misclassification occurrence is independent of the response variable, the mixture representation permits model parameters to be identified even when misclassification probabilities are unknown. Previous approaches to covariate misclassification use multiple surrogate covariates and/or validation data on the magnitude of errors. Based on this mixture structure, we demonstrate that valid inference can be performed on all the parameters even when no such additional information is…
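The mixture representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' estimation procedure. It assumes a binary true covariate X, a normal linear response, and misclassification that is independent of the response given X. All function names and numeric values (prevalence, sensitivity, specificity, coefficients) are hypothetical and chosen only for illustration. Under those assumptions, the response distribution given the observed covariate X* is a two-component mixture whose weights come from Bayes' rule, and a naive regression on X* recovers an attenuated slope.

```python
import math


def reclassification_weights(prevalence, sensitivity, specificity):
    """Return (P(X=1 | X*=1), P(X=1 | X*=0)) by Bayes' rule.

    prevalence  = P(X=1), sensitivity = P(X*=1 | X=1),
    specificity = P(X*=0 | X=0). All values are illustrative assumptions.
    """
    p_obs1 = sensitivity * prevalence + (1.0 - specificity) * (1.0 - prevalence)
    w1 = sensitivity * prevalence / p_obs1
    w0 = (1.0 - sensitivity) * prevalence / (1.0 - p_obs1)
    return w1, w0


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(y, weight, beta0, beta1, sd):
    """Density of Y given the observed X*, where weight = P(X=1 | X*).

    Assumes Y | X ~ Normal(beta0 + beta1 * X, sd) and that X* carries no
    information about Y beyond X (nondifferential misclassification).
    """
    return (weight * normal_pdf(y, beta0 + beta1, sd)
            + (1.0 - weight) * normal_pdf(y, beta0, sd))


def naive_slope(beta1, prevalence, sensitivity, specificity):
    """Slope obtained by regressing Y on X* as if it were X.

    E[Y | X*] = beta0 + beta1 * P(X=1 | X*), so the difference in means
    between X*=1 and X*=0 is beta1 * (w1 - w0).
    """
    w1, w0 = reclassification_weights(prevalence, sensitivity, specificity)
    return beta1 * (w1 - w0)


if __name__ == "__main__":
    w1, w0 = reclassification_weights(0.4, 0.8, 0.9)
    print("P(X=1 | X*=1):", w1)
    print("P(X=1 | X*=0):", w0)
    print("naive slope for true slope 2.0:", naive_slope(2.0, 0.4, 0.8, 0.9))
```

With the illustrative values above (prevalence 0.4, sensitivity 0.8, specificity 0.9), the naive slope is roughly 0.71 times the true slope, which is the kind of misestimation the abstract refers to. The sketch only runs the forward direction with known error rates. The paper's point is that the mixture structure lets the parameters be identified even when those rates are unknown.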
Taxonomy
Topics: Pneumonia and Respiratory Infections · Bayesian Methods and Mixture Models · Data-Driven Disease Surveillance
