Analysis and Methods to Mitigate Effects of Under-reporting in Count Data
Jennifer Brennan, Marlena Bannick, Nicholas Kassebaum, Lauren Wilner, Azalea Thomson, Aleksandr Aravkin, Peng Zheng

TL;DR
This paper improves the Pogit model for count data under-reporting by adding constraints and robust uncertainty quantification, validated on synthetic and healthcare data, enabling better separation of true counts from under-reporting effects.
Contribution
It introduces enhanced estimation constraints and robust uncertainty quantification techniques for the Pogit model to better handle under-reporting in count data.
Findings
Improved model accuracy on synthetic data
Effective application to healthcare datasets
Open source Python implementation available
Abstract
Under-reporting of count data poses a major roadblock for prediction and inference. In this paper, we focus on the Pogit model, which deconvolves the generating Poisson process from the censoring process controlling under-reporting using a generalized linear modeling framework. We highlight the limitations of the Pogit model and address them by adding constraints to the estimation framework. We also develop uncertainty quantification techniques that are robust to model mis-specification. Our approach is evaluated using synthetic data and applied to real healthcare datasets, where we treat in-patient data as 'reported' counts and use held-out total injuries to validate the results. The methods make it possible to separate the Poisson process from the under-reporting process, given sufficient expert information. Code to implement the approach is available via an open source Python…
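As a rough illustration of the model structure described in the abstract, the sketch below writes the expected reported count as a log-linear Poisson rate multiplied by a logistic reporting probability, which is the standard Pogit form. It is not the authors' implementation: the function names (`pogit_mean`, `poisson_nll`), the intercept-only setup, and the numbers are assumptions chosen for this example. It shows that two different (rate, reporting probability) pairs can give the same expected count and therefore the same likelihood, which is one way to see why extra constraints or expert information are needed to separate the two processes.

```python
import math


def sigmoid(t):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def pogit_mean(x, z, beta, alpha):
    """Expected reported count: true Poisson rate times reporting probability.

    x, beta  : covariates and coefficients for the log-linear rate (hypothetical names)
    z, alpha : covariates and coefficients for the logistic reporting probability
    """
    true_rate = math.exp(sum(xi * bi for xi, bi in zip(x, beta)))
    report_prob = sigmoid(sum(zi * ai for zi, ai in zip(z, alpha)))
    return true_rate * report_prob


def poisson_nll(y, mean):
    """Negative log-likelihood of an observed count y under Poisson(mean)."""
    return mean - y * math.log(mean) + math.lgamma(y + 1)


# Intercept-only example: a single covariate fixed at 1 for each process.
x, z = [1.0], [1.0]

# Scenario A: true rate 100, reporting probability 0.5.
mean_a = pogit_mean(x, z, beta=[math.log(100.0)], alpha=[0.0])

# Scenario B: true rate 200, reporting probability 0.25 (logit = log(1/3)).
mean_b = pogit_mean(x, z, beta=[math.log(200.0)], alpha=[math.log(1.0 / 3.0)])

# Both scenarios imply the same expected reported count, so an observed
# count cannot distinguish them without additional information.
observed = 40
nll_a = poisson_nll(observed, mean_a)
nll_b = poisson_nll(observed, mean_b)
```

With covariates that enter only one of the two processes, or with bounds on the coefficients, the two scenarios would no longer be interchangeable; the sketch deliberately uses the simplest case where they are.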
Taxonomy
Topics: Health Systems, Economic Evaluations, Quality of Life · Healthcare Policy and Management · Statistical Methods and Bayesian Inference
