Interval Estimation for Messy Observational Data
Paul Gustafson, Sander Greenland

TL;DR
This paper discusses Bayesian and frequentist interval estimation methods in complex observational data scenarios, highlighting Bayesian approaches with calibration diagnostics as effective tools for nonidentifiable models.
Contribution
It argues for Bayesian interval estimation, paired with a calibration-sensitivity simulation diagnostic, in messy observational data where realistic modeling leads to nonidentifiability and traditional methods struggle.
Findings
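Bayesian interval estimation can be supplied with a diagnostic, in the form of a calibration-sensitivity simulation analysis, in messy-data contexts.
Calibration-sensitivity analysis helps assess how robust the interval estimates are.
An application to a study of silica exposure and lung cancer demonstrates practical utility.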
Abstract
We review some aspects of Bayesian and frequentist interval estimation, focusing first on their relative strengths and weaknesses when used in "clean" or "textbook" contexts. We then turn attention to observational-data situations which are "messy," where modeling that acknowledges the limitations of study design and data collection leads to nonidentifiability. We argue, via a series of examples, that Bayesian interval estimation is an attractive way to proceed in this context even for frequentists, because it can be supplied with a diagnostic in the form of a calibration-sensitivity simulation analysis. We illustrate the basis for this approach in a series of theoretical considerations, simulations and an application to a study of silica exposure and lung cancer.
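As a rough illustration of what a calibration-sensitivity simulation might look like, the sketch below is not taken from the paper. It assumes a toy nonidentifiable model in which a sample mean estimates theta + b, with theta the target and b an unidentified bias, and independent zero-mean normal priors on both. The names credible_interval and coverage and all numeric settings are hypothetical. It compares the coverage of 95% posterior intervals when parameters are generated from the analysis prior (calibration) with the coverage when the bias is generated from a wider distribution than the analysis assumes (sensitivity).

```python
import math
import random

Z95 = 1.959963984540054  # 97.5% quantile of the standard normal


def credible_interval(ybar, v, tau, s_b):
    """95% posterior interval for theta in a toy model (an assumption, not the paper's):
    theta ~ N(0, tau^2), b ~ N(0, s_b^2), ybar | theta, b ~ N(theta + b, v)."""
    total = tau**2 + s_b**2 + v          # marginal variance of ybar
    post_mean = (tau**2 / total) * ybar  # conjugate normal posterior mean
    post_sd = math.sqrt(tau**2 - tau**4 / total)
    return post_mean - Z95 * post_sd, post_mean + Z95 * post_sd


def coverage(gen_tau, gen_s_b, fit_tau, fit_s_b, v=0.01, n_sim=100_000, seed=0):
    """Fraction of simulated data sets whose interval contains the true theta.

    Parameters are drawn from the generating distribution (gen_*), while the
    interval is computed under the analysis prior (fit_*)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        theta = rng.gauss(0.0, gen_tau)
        b = rng.gauss(0.0, gen_s_b)
        ybar = rng.gauss(theta + b, math.sqrt(v))
        lo, hi = credible_interval(ybar, v, fit_tau, fit_s_b)
        hits += lo <= theta <= hi
    return hits / n_sim


# Calibration: generating distribution equals the analysis prior.
matched = coverage(gen_tau=1.0, gen_s_b=0.5, fit_tau=1.0, fit_s_b=0.5)
# Sensitivity: the true bias is more variable than the analysis prior allows.
mismatched = coverage(gen_tau=1.0, gen_s_b=1.5, fit_tau=1.0, fit_s_b=0.5)
print(f"matched prior coverage:    {matched:.3f}")
print(f"mismatched prior coverage: {mismatched:.3f}")
```

In this toy setting the matched run should sit close to the nominal 95%, while the mismatched run falls well below it. Varying the generating distribution over a range of plausible alternatives gives a simple picture of how sensitive the interval's calibration is to the prior.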
