Sex, lies and self-reported counts: Bayesian mixture models for heaping in longitudinal count data via birth-death processes
Forrest W. Crawford, Robert E. Weiss, Marc A. Suchard

TL;DR
This paper introduces a Bayesian hierarchical model to correct for heaping in self-reported count data, improving the accuracy of estimates in longitudinal surveys with covariates.
Contribution
It proposes a novel interpretable reporting distribution and a Bayesian framework to model heaping and misremembering in longitudinal count data.
Findings
The model corrects for heaping in self-reported counts.
Applied to data on HIV-positive youth, the model improved inference.
The reporting process accommodates a variety of heaping grids as well as quasi-heaping.
Abstract
Surveys often ask respondents to report nonnegative counts, but respondents may misremember or round to a nearby multiple of 5 or 10. This phenomenon is called heaping, and the error inherent in heaped self-reported numbers can bias estimation. Heaped data may be collected cross-sectionally or longitudinally and there may be covariates that complicate the inferential task. Heaping is a well-known issue in many survey settings, and inference for heaped data is an important statistical problem. We propose a novel reporting distribution whose underlying parameters are readily interpretable as rates of misremembering and rounding. The process accommodates a variety of heaping grids and allows for quasi-heaping to values nearly but not equal to heaping multiples. We present a Bayesian hierarchical model for longitudinal samples with covariates to infer both the unobserved true distribution…
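To make the heaping phenomenon concrete, here is a minimal simulation sketch. It is not the paper's reporting distribution, which is built on birth-death processes. It is a deliberately simple stand-in in which each respondent rounds the true count to the nearest grid multiple with a fixed probability. The names `heap_report`, `round_prob`, `grid`, and `heaped_share`, the Poisson model for true counts, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def heap_report(true_count, round_prob, grid=5, rng=random):
    """Return a self-reported count.

    With probability round_prob the respondent rounds to the nearest
    multiple of grid (ties round up); otherwise the true count is reported.
    This is an illustrative assumption, not the paper's model.
    """
    if rng.random() < round_prob:
        return grid * ((true_count + grid // 2) // grid)
    return true_count


def heaped_share(counts, grid=5):
    """Fraction of counts that fall exactly on a multiple of grid."""
    return sum(c % grid == 0 for c in counts) / len(counts)


def simulate(n=10000, mean=12.0, round_prob=0.4, grid=5, seed=0):
    """Simulate n true counts and their heaped self-reports."""
    rng = random.Random(seed)
    true_counts = [poisson(rng, mean) for _ in range(n)]
    reported = [heap_report(c, round_prob, grid, rng) for c in true_counts]
    return true_counts, reported


if __name__ == "__main__":
    true_counts, reported = simulate()
    print("share on multiples of 5 (true):    ", heaped_share(true_counts))
    print("share on multiples of 5 (reported):", heaped_share(reported))
```

Under these assumptions the reported counts pile up on multiples of 5 far more than the true counts do. That excess mass is the signature a heaping model has to separate from the underlying count distribution.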
