COMBO and COMMA: R packages for regression modeling and inference in the presence of misclassified binary mediator or outcome variables
Kimberly A. Hochstedler Webb, Martin T. Wells

TL;DR
This paper introduces R packages COMBO and COMMA that implement bias-correction methods for misclassified binary variables in regression models, addressing data quality issues without requiring gold standard measures.
Contribution
The paper presents novel likelihood-based R packages that correct for misclassification bias in binary outcomes and mediators, including automatic label switching correction.
Findings
COMBO effectively corrects bias in a study of bar exam passage.
COMBO improves risk prediction models with noisy indicators.
COMMA evaluates mediating effects with misdiagnosed variables.
Abstract
Misclassified binary outcome or mediator variables can cause unpredictable bias in resulting parameter estimates. As more datasets that were not originally collected for research purposes are being used for studies in the social and health sciences, the need for methods that address data quality concerns is growing. In this paper, we describe two R packages, COMBO and COMMA, that implement bias-correction methods for misclassified binary outcome and mediator variables, respectively. These likelihood-based approaches do not require gold standard measures and allow for estimation of sensitivity and specificity rates for the misclassified variable(s). In addition, these R packages automatically apply crucial label switching corrections, allowing researchers to circumvent the inherent permutation invariance of the misclassification model likelihood. We demonstrate COMBO for single-outcome…
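The permutation invariance mentioned in the abstract can be illustrated with a small numerical sketch. It is written in plain NumPy and does not use the COMBO or COMMA API. The two-logistic parametrization, the function names, and the simulated data are all assumptions made for illustration. The tie-breaking rule shown (keep the labeling where mean sensitivity plus mean specificity exceeds 1) is one common identifying assumption and may differ from the packages' actual correction.

```python
import numpy as np

def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-t))

def observed_loglik(beta, gamma_y1, gamma_y0, x, z, y_star):
    """Log-likelihood of the observed (possibly misclassified) labels y_star.

    Hypothetical parametrization for this sketch:
      P(Y = 1 | x)          = expit(beta[0] + beta[1] * x)
      P(Y* = 1 | Y = 1, z)  = expit(gamma_y1[0] + gamma_y1[1] * z)   (sensitivity)
      P(Y* = 1 | Y = 0, z)  = expit(gamma_y0[0] + gamma_y0[1] * z)   (1 - specificity)
    """
    pi = expit(beta[0] + beta[1] * x)
    p_obs_y1 = expit(gamma_y1[0] + gamma_y1[1] * z)
    p_obs_y0 = expit(gamma_y0[0] + gamma_y0[1] * z)
    # Marginalize over the unobserved true outcome Y.
    p_star = pi * p_obs_y1 + (1.0 - pi) * p_obs_y0
    return float(np.sum(y_star * np.log(p_star) + (1 - y_star) * np.log(1 - p_star)))

def sens_plus_spec(gamma_y1, gamma_y0, z):
    """Mean sensitivity plus mean specificity implied by the gamma parameters."""
    sensitivity = np.mean(expit(gamma_y1[0] + gamma_y1[1] * z))
    specificity = np.mean(1.0 - expit(gamma_y0[0] + gamma_y0[1] * z))
    return float(sensitivity + specificity)

# Simulated data (illustrative values only).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
z = rng.normal(size=n)
beta = np.array([0.5, -1.0])
gamma_y1 = np.array([2.0, 0.3])
gamma_y0 = np.array([-2.0, 0.3])

y = rng.binomial(1, expit(beta[0] + beta[1] * x))
p_star_given_y = np.where(
    y == 1,
    expit(gamma_y1[0] + gamma_y1[1] * z),
    expit(gamma_y0[0] + gamma_y0[1] * z),
)
y_star = rng.binomial(1, p_star_given_y)

# Swapping the true-outcome labels (negate beta, exchange the gamma vectors)
# leaves the observed-data likelihood unchanged.
ll_original = observed_loglik(beta, gamma_y1, gamma_y0, x, z, y_star)
ll_switched = observed_loglik(-beta, gamma_y0, gamma_y1, x, z, y_star)

# Only one of the two labelings has sensitivity + specificity above 1.
j_original = sens_plus_spec(gamma_y1, gamma_y0, z)
j_switched = sens_plus_spec(gamma_y0, gamma_y1, z)

print(ll_original, ll_switched)
print(j_original, j_switched)
```

Because both parameter sets give the same likelihood, an optimizer can converge to either one. A label switching correction picks between them after estimation using a rule such as the one above.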
Taxonomy
Topics: Statistical Methods and Inference
