Non-parametric targeted Bayesian estimation of class proportions in unlabeled data
Iván Díaz, Oleksander Savenkov, Hooman Kamel

TL;DR
This paper presents a new Bayesian method for estimating class proportions in unlabeled data, focusing on low-dimensional parameters, with theoretical guarantees and practical applications demonstrated through simulations and a medical case study.
Contribution
It introduces a targeted Bayesian estimator that focuses on specific parameters, providing asymptotic efficiency and robustness, unlike traditional full-data Bayesian approaches.
Findings
Posterior distribution converges to an efficient, Gaussian limit.
Method is doubly robust and asymptotically normal.
Numerical studies confirm favorable frequentist properties.
Abstract
We introduce a novel Bayesian estimator for the class proportion in an unlabeled dataset, based on the targeted learning framework. Our procedure requires the specification of a prior (and outputs a posterior) only for the target of inference, instead of the prior (and posterior) on the full-data distribution employed by classical non-parametric Bayesian methods. When the scientific question can be characterized by a low-dimensional parameter functional, focusing on a prior and posterior distribution for that functional is more aligned with Bayesian subjectivism than focusing on entire data distributions. We prove a Bernstein-von Mises-type result for our proposed Bayesian procedure, which guarantees that the posterior distribution converges to the distribution of an efficient, asymptotically linear estimator. In particular, the posterior is Gaussian, doubly robust, and efficient in the limit,…
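For readers unfamiliar with Bernstein-von Mises-type results, the generic shape of such a statement can be sketched as follows. This is a textbook-style illustration, not the paper's theorem: the symbols $\psi_0$ (true class proportion), $\hat\psi_n$ (an efficient estimator), $D$ (its influence function), $O_1,\dots,O_n$ (the observed data), and $\Pi(\cdot \mid O_1,\dots,O_n)$ (the posterior for the target parameter) are notation assumed here, and the paper's exact conditions and mode of convergence may differ.

```latex
% Asymptotic linearity and efficiency of the estimator (assumed notation):
\sqrt{n}\,\bigl(\hat\psi_n - \psi_0\bigr)
  = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} D(O_i) + o_P(1)
  \;\rightsquigarrow\; N\bigl(0, \sigma^2\bigr),
\qquad \sigma^2 = \operatorname{Var}\bigl\{D(O)\bigr\}.

% Generic Bernstein-von Mises-type statement: the posterior of the
% centered and scaled target parameter approaches the same Gaussian law.
\sup_{A}\,
\Bigl|\,
  \Pi\bigl(\sqrt{n}\,(\psi - \hat\psi_n) \in A \,\big|\, O_1,\dots,O_n\bigr)
  - N\bigl(0, \sigma^2\bigr)(A)
\Bigr|
\;\xrightarrow{\;P\;}\; 0 .
```

Read this way, a posterior credible interval for the class proportion would asymptotically coincide with the frequentist confidence interval $\hat\psi_n \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$, which is one way to interpret the "favorable frequentist properties" reported in the findings.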
