Quantifying disparities in intimate partner violence: a machine learning method to correct for underreporting
Divya Shanmugam, Kaihua Hou, Emma Pierson

TL;DR
This paper introduces a machine learning method to accurately estimate the relative prevalence of underreported medical conditions across groups, addressing reporting biases to inform equitable health policies.
Contribution
It extends positive unlabeled learning to estimate relative prevalence under covariate shift, even when absolute prevalence cannot be recovered, and demonstrates robustness and practical applications.
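Why a ratio can be recoverable when the absolute prevalence is not can be seen in a small numerical sketch. This is an illustration under stated assumptions, not the authors' implementation: symptoms are reduced to four discrete profiles, the diagnosis probability among true cases is assumed constant within each group (`c_a`, `c_b`), diagnoses are assumed to have no false positives, and all numbers and names are hypothetical.

```python
import numpy as np

# Hypothetical population with four symptom profiles and two groups, a and b.
# Covariate shift assumption: p(y=1 | x) is the same in both groups.
p_y_given_x = np.array([0.02, 0.10, 0.30, 0.60])

# The groups differ only in how often each symptom profile occurs.
p_x_a = np.array([0.40, 0.30, 0.20, 0.10])
p_x_b = np.array([0.70, 0.20, 0.05, 0.05])

# Assumed probability that a true case is diagnosed, differing by group.
c_a, c_b = 0.5, 0.2


def prevalence(p_x, rate_given_x):
    """Average a per-profile rate over a group's symptom distribution."""
    return float(p_x @ rate_given_x)


# Ground truth, unobservable in practice.
true_ratio = prevalence(p_x_a, p_y_given_x) / prevalence(p_x_b, p_y_given_x)

# What the records show: p(s=1 | x, g) = c_g * p(y=1 | x).
obs_rate_a = c_a * p_y_given_x
obs_rate_b = c_b * p_y_given_x

# Naive estimate: ratio of diagnosed rates, inflated by c_a / c_b.
naive_ratio = prevalence(p_x_a, obs_rate_a) / prevalence(p_x_b, obs_rate_b)

# Corrected estimate: apply ONE scaled copy of p(y=1 | x) to both groups.
# The unknown scale (here c_a) appears in numerator and denominator and cancels.
scaled_p_y = obs_rate_a
corrected_ratio = prevalence(p_x_a, scaled_p_y) / prevalence(p_x_b, scaled_p_y)

print(round(true_ratio, 3), round(naive_ratio, 3), round(corrected_ratio, 3))
```

In this toy setup the true relative prevalence is 2, the naive ratio of diagnosed rates is 5, and the corrected ratio is again 2, even though neither group's absolute prevalence is recovered. With real data the scaled estimate of p(y=1 | x) would have to be learned by a model rather than read off a table.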
Findings
The method outperforms baselines on both synthetic and real data
It accurately estimates relative prevalence despite underreporting
It remains robust to violations of the covariate shift assumption
Abstract
Estimating the prevalence of a medical condition, or the proportion of the population in which it occurs, is a fundamental problem in healthcare and public health. Accurate estimates of the relative prevalence across groups -- capturing, for example, that a condition affects women more frequently than men -- facilitate effective and equitable health policy which prioritizes groups who are disproportionately affected by a condition. However, it is difficult to estimate relative prevalence when a medical condition is underreported. In this work, we provide a method for accurately estimating the relative prevalence of underreported medical conditions, building upon the positive unlabeled learning framework. We show that under the commonly made covariate shift assumption -- i.e., that the probability of having a disease conditional on symptoms remains constant across groups -- we can…
Taxonomy
Topics: Viral Infections and Vectors · Data Analysis with R
