Binomial Mixture Model With U-shape Constraint
Yuting Ye, Peter J. Bickel

TL;DR
This paper introduces a novel method called Ucut for estimating distributions in binomial mixture models with a U-shape constraint, especially when the binomial size is large relative to the sample size, improving accuracy in noisy data scenarios.
Contribution
It proposes a new estimation method Ucut utilizing the U-shape property and Grenander estimator, with theoretical convergence guarantees for large binomial sizes.
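The Grenander estimator mentioned above is the classical nonparametric maximum-likelihood estimate of a non-increasing density: the slope of the least concave majorant of the empirical CDF. The following is a minimal sketch of that building block only, not the authors' Ucut implementation. The function name `grenander` and the return format are assumptions made for illustration, and the samples are assumed to be strictly positive.

```python
import numpy as np

def grenander(samples):
    """Grenander estimate of a non-increasing density on [0, inf).

    Returns (knots, heights): the estimate equals heights[j] on the
    interval (knots[j], knots[j + 1]] and is zero beyond knots[-1].
    Assumes all samples are strictly positive.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Vertices of the empirical CDF, starting from the origin.
    px = np.concatenate(([0.0], x))
    py = np.arange(n + 1) / n
    hull = [0]  # indices of the least concave majorant's vertices
    for i in range(1, n + 1):
        # Drop the last vertex while it lies on or below the chord
        # joining the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = ((px[b] - px[a]) * (py[i] - py[a])
                     - (py[b] - py[a]) * (px[i] - px[a]))
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    knots = px[hull]
    # Slopes of the majorant are the piecewise-constant density values.
    heights = np.diff(py[hull]) / np.diff(knots)
    return knots, heights

# Usage on synthetic data with a decreasing density (exponential).
rng = np.random.default_rng(0)
knots, heights = grenander(rng.exponential(size=500))
print(len(heights), "constant pieces; first height:", heights[0])
```

By construction the returned heights are non-increasing and the step function integrates to one, which is what makes the estimator a natural fit for the decreasing arm of a U-shaped density.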
Findings
Ucut accurately recovers distributions in simulations
Convergence rate of O(n^{-1/3}) for cutoff estimation
Effective on real gene expression datasets
Abstract
In this article, we study the binomial mixture model under the regime that the binomial size can be relatively large compared to the sample size. This project is motivated by the GeneFishing method (Liu et al., 2019), whose output is a combination of the parameter of interest and the subsampling noise. To tackle the noise in the output, we utilize the observation that the density of the output has a U shape and model the output with the binomial mixture model under a U shape constraint. We first analyze the estimation of the underlying distribution F in the binomial mixture model under various conditions for F. Equipped with these theoretical understandings, we propose a simple method Ucut to identify the cutoffs of the U shape and recover the underlying distribution based on the Grenander estimator (Grenander, 1956). It has been shown that when , the…
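To make the model in the abstract concrete, here is a small simulation sketch of a binomial mixture: each observation draws a success probability from an underlying distribution F and then a binomial count given that probability. Everything specific here is an assumption for illustration: the symbols `n` (sample size) and `m` (binomial size), the function name `simulate_binomial_mixture`, and the choice F = Beta(0.5, 0.5), picked only because its density is U-shaped. It is not the GeneFishing pipeline or the paper's simulation setup.

```python
import numpy as np

def simulate_binomial_mixture(n, m, rng):
    """Draw n observations from a binomial mixture with binomial size m.

    Returns (s, y): the latent probabilities s_i ~ F and the observed
    rates y_i = X_i / m, where X_i | s_i ~ Binomial(m, s_i).
    F = Beta(0.5, 0.5) is an assumed U-shaped example, not the paper's F.
    """
    s = rng.beta(0.5, 0.5, size=n)   # latent parameters of interest
    x = rng.binomial(m, s)           # counts carrying binomial noise
    return s, x / m

# Usage: the histogram of observed rates piles up near 0 and 1.
rng = np.random.default_rng(1)
s, y = simulate_binomial_mixture(n=5000, m=100, rng=rng)
counts, edges = np.histogram(y, bins=10, range=(0.0, 1.0))
print(counts)
```

The observed rates `y` are a noisy version of the latent `s`; recovering the distribution of `s` from `y` is the estimation problem the abstract describes.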
Taxonomy
Topics: Bayesian Methods and Mixture Models · Crystallization and Solubility Studies
